* [gentoo-commits] proj/linux-patches:6.17 commit in: /
@ 2025-09-29 12:07 Arisu Tachibana
From: Arisu Tachibana @ 2025-09-29 12:07 UTC (permalink / raw)
To: gentoo-commits
commit: 5d874460f099a0d0d38a90c6bd1007810c6d3d20
Author: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
AuthorDate: Mon Sep 29 12:05:41 2025 +0000
Commit: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
CommitDate: Mon Sep 29 12:06:30 2025 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=5d874460
BMQ(BitMap Queue) Scheduler v6.17-r0
Signed-off-by: Arisu Tachibana <alicef <AT> gentoo.org>
0000_README | 8 +
5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch | 11455 +++++++++++++++++++++++++
5021_BMQ-and-PDS-gentoo-defaults.patch | 13 +
3 files changed, 11476 insertions(+)
diff --git a/0000_README b/0000_README
index 2e9aa3cc..46e37676 100644
--- a/0000_README
+++ b/0000_README
@@ -78,3 +78,11 @@ Patch: 4567_distro-Gentoo-Kconfig.patch
From: Tom Wijsman <TomWij@gentoo.org>
Desc: Add Gentoo Linux support config settings and defaults.
+
+Patch: 5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch
+From: https://gitlab.com/alfredchen/projectc
+Desc: BMQ(BitMap Queue) Scheduler. A new CPU scheduler developed from PDS (included). Inspired by the scheduler in Zircon.
+
+Patch: 5021_BMQ-and-PDS-gentoo-defaults.patch
+From: https://gitweb.gentoo.org/proj/linux-patches.git/
+Desc: Set defaults for BMQ. Add architectures as people test them; default to N.
diff --git a/5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch b/5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch
new file mode 100644
index 00000000..6b5e3269
--- /dev/null
+++ b/5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch
@@ -0,0 +1,11455 @@
+diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
+index 8b49eab937d0..c5d4901a9608 100644
+--- a/Documentation/admin-guide/sysctl/kernel.rst
++++ b/Documentation/admin-guide/sysctl/kernel.rst
+@@ -1716,3 +1716,12 @@ is 10 seconds.
+
+ The softlockup threshold is (``2 * watchdog_thresh``). Setting this
+ tunable to zero will disable lockup detection altogether.
++
++yield_type:
++===========
++
++BMQ/PDS CPU scheduler only. This determines what type of yield a call to
++sched_yield() will perform.
++
++ 0 - No yield.
++ 1 - Requeue task. (default)
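As a usage illustration (not part of the patch), a minimal C sketch that sets this tunable at runtime, assuming a BMQ/PDS kernel where the sysctl is exposed as /proc/sys/kernel/yield_type:

    /* Minimal sketch: write the yield_type sysctl from userspace.
     * Assumes the BMQ/PDS patch is applied; on other kernels the file
     * simply does not exist and the open fails. */
    #include <stdio.h>

    static int set_yield_type(int type)
    {
            FILE *f = fopen("/proc/sys/kernel/yield_type", "w");

            if (!f)
                    return -1;      /* not a BMQ/PDS kernel, or no permission */
            fprintf(f, "%d\n", type);
            return fclose(f);
    }

    int main(void)
    {
            /* 1 = requeue the task on sched_yield(), the documented default */
            return set_yield_type(1) ? 1 : 0;
    }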
+diff --git a/Documentation/scheduler/sched-BMQ.txt b/Documentation/scheduler/sched-BMQ.txt
+new file mode 100644
+index 000000000000..05c84eec0f31
+--- /dev/null
++++ b/Documentation/scheduler/sched-BMQ.txt
+@@ -0,0 +1,110 @@
++ BitMap queue CPU Scheduler
++ --------------------------
++
++CONTENT
++========
++
++ Background
++ Design
++ Overview
++ Task policy
++ Priority management
++ BitMap Queue
++ CPU Assignment and Migration
++
++
++Background
++==========
++
++The BitMap Queue CPU scheduler, referred to as BMQ from here on, is an
++evolution of the previous Priority and Deadline based Skiplist multiple
++queue scheduler (PDS), and is inspired by the Zircon scheduler. Its goal is
++to keep the scheduler code simple while remaining efficient and scalable for
++interactive workloads such as desktop use, movie playback and gaming.
++
++Design
++======
++
++Overview
++--------
++
++BMQ uses a per-CPU run queue design: each (logical) CPU has its own run
++queue and is responsible for scheduling the tasks that are put into its
++run queue.
++
++The run queue is a set of priority queues. In terms of data structure these
++are FIFO queues for non-rt tasks and a priority queue for rt tasks; see
++BitMap Queue below for details. BMQ is optimized for non-rt tasks, since
++most applications are non-rt. Whether a queue is FIFO or priority ordered,
++each queue is an ordered list of runnable tasks awaiting execution and the
++data structures are the same. When it is time for a new task to run, the
++scheduler simply looks up the lowest numbered queue that contains a task and
++runs the first task from the head of that queue. The per-CPU idle task is
++also kept in the run queue, so the scheduler can always find a task to run
++from its own run queue.
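To make the lookup above concrete, here is a simplified, illustration-only sketch of the bitmap-based pick; the names (SCHED_LEVELS_SKETCH, sched_queue_sketch, sched_pick_next_sketch) are loose stand-ins for the patch's own definitions, and sq_node is the list node this patch adds to task_struct:

    #include <linux/bitmap.h>
    #include <linux/list.h>
    #include <linux/sched.h>

    #define SCHED_LEVELS_SKETCH 64  /* one FIFO list head per priority level */

    struct sched_queue_sketch {
            DECLARE_BITMAP(bitmap, SCHED_LEVELS_SKETCH);
            struct list_head heads[SCHED_LEVELS_SKETCH];
    };

    /* Lowest set bit == highest-priority non-empty level. The per-CPU idle
     * task always occupies the lowest-priority level, so a runnable task is
     * always found. */
    static struct task_struct *sched_pick_next_sketch(struct sched_queue_sketch *q)
    {
            int prio = find_first_bit(q->bitmap, SCHED_LEVELS_SKETCH);

            return list_first_entry(&q->heads[prio], struct task_struct, sq_node);
    }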
++
++Each task is assigned the same timeslice (4 ms by default) when it is picked
++to start running. A task is reinserted at the end of the appropriate priority
++queue when it uses up its whole timeslice. When the scheduler selects a new
++task from the priority queue it sets the CPU's preemption timer for the
++remainder of the previous timeslice. When that timer fires, the scheduler
++stops execution of that task, selects another task and starts over again.
++
++If a task blocks waiting for a shared resource then it's taken out of its
++priority queue and is placed in a wait queue for the shared resource. When it
++is unblocked it will be reinserted in the appropriate priority queue of an
++eligible CPU.
++
++Task policy
++-----------
++
++BMQ supports the DEADLINE, FIFO, RR, NORMAL, BATCH and IDLE task policies,
++like the mainline CFS scheduler, but it is heavily optimized for non-rt
++tasks, that is, NORMAL/BATCH/IDLE policy tasks. Below are the implementation
++details for each policy.
++
++DEADLINE
++ It is squashed as priority 0 FIFO task.
++
++FIFO/RR
++ All RT tasks share one single priority queue in the BMQ run queue design.
++The complexity of the insert operation is O(n). BMQ is not designed for
++systems that run mostly rt policy tasks.
++
++NORMAL/BATCH/IDLE
++ BATCH and IDLE tasks are treated as the same policy. They compete for CPU
++with NORMAL policy tasks, but they just don't boost. To control the priority
++of NORMAL/BATCH/IDLE tasks, simply use nice levels.
++
++ISO
++ ISO policy is not supported in BMQ. Please use a nice -20 NORMAL policy
++task instead. (A short userspace example of selecting these policies follows.)
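As referenced above, a small userspace example of selecting one of these non-rt policies with the standard Linux sched_setscheduler() interface (a generic API, not specific to this patch):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
            /* Non-rt policies require sched_priority == 0. */
            struct sched_param sp = { .sched_priority = 0 };

            /* pid 0 == the calling thread. */
            if (sched_setscheduler(0, SCHED_BATCH, &sp) != 0) {
                    perror("sched_setscheduler");
                    return 1;
            }
            /* Under BMQ this task now competes like NORMAL but never boosts;
             * use nice(2)/setpriority(2) to tune it further. */
            return 0;
    }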
++
++Priority management
++-------------------
++
++RT tasks have priorities from 0 to 99. For non-rt tasks, there are three
++different factors used to determine the effective priority of a task; the
++effective priority is what determines which queue the task will be in.
++
++The first factor is simply the task's static priority, which is assigned
++from the task's nice level: [-20, 19] from userland's point of view and
++[0, 39] internally.
++
++The second factor is the priority boost. This is a value bounded within
++[-MAX_PRIORITY_ADJ, MAX_PRIORITY_ADJ] that is used to offset the base
++priority. It is modified in the following cases:
++
++* When a thread has used up its entire timeslice, it is always deboosted by
++increasing its boost value by one.
++* When a thread gives up CPU control (voluntarily or not) in order to
++reschedule, and the time it has run since it was last switched in is below a
++threshold based on its priority boost, it is boosted by decreasing its boost
++value by one, but the boost is capped at 0 (it won't go negative).
++
++The intent in this system is to ensure that interactive threads are serviced
++quickly. These are usually the threads that interact directly with the user
++and cause user-perceivable latency. These threads usually do little work and
++spend most of their time blocked awaiting another user event. So they get the
++priority boost from unblocking while background threads that do most of the
++processing receive the priority penalty for using their entire timeslice.
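For illustration, a hedged sketch of how an effective queue priority could be derived from the static priority and the boost value described above; MAX_PRIORITY_ADJ_SKETCH mirrors the BMQ value of 12 added elsewhere in this patch, and the helper names are hypothetical rather than the exact BMQ code:

    /* Illustration only, not the exact BMQ implementation. */
    #define MAX_PRIORITY_ADJ_SKETCH 12      /* matches BMQ's MAX_PRIORITY_ADJ */

    static int clamp_boost_sketch(int boost)
    {
            if (boost < -MAX_PRIORITY_ADJ_SKETCH)
                    return -MAX_PRIORITY_ADJ_SKETCH;
            if (boost > MAX_PRIORITY_ADJ_SKETCH)
                    return MAX_PRIORITY_ADJ_SKETCH;
            return boost;
    }

    /* static_prio: 0..39 mapped from nice -20..19; a lower result means a
     * higher-priority queue, so a negative (boosted) offset moves an
     * interactive task ahead of timeslice-exhausting background tasks. */
    static int effective_prio_sketch(int static_prio, int boost)
    {
            return static_prio + clamp_boost_sketch(boost);
    }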
+diff --git a/fs/bcachefs/io_write.c b/fs/bcachefs/io_write.c
+index 88b1eec8eff3..4619aa57cd9f 100644
+--- a/fs/bcachefs/io_write.c
++++ b/fs/bcachefs/io_write.c
+@@ -640,8 +640,14 @@ static inline void __wp_update_state(struct write_point *wp, enum write_point_st
+ if (state != wp->state) {
+ struct task_struct *p = current;
+ u64 now = ktime_get_ns();
++
++#ifdef CONFIG_SCHED_ALT
++ u64 runtime = tsk_seruntime(p) +
++ (now - p->last_ran);
++#else
+ u64 runtime = p->se.sum_exec_runtime +
+ (now - p->se.exec_start);
++#endif
+
+ if (state == WRITE_POINT_runnable)
+ wp->last_runtime = runtime;
+diff --git a/fs/proc/base.c b/fs/proc/base.c
+index 62d35631ba8c..eb1a57209822 100644
+--- a/fs/proc/base.c
++++ b/fs/proc/base.c
+@@ -515,7 +515,7 @@ static int proc_pid_schedstat(struct seq_file *m, struct pid_namespace *ns,
+ seq_puts(m, "0 0 0\n");
+ else
+ seq_printf(m, "%llu %llu %lu\n",
+- (unsigned long long)task->se.sum_exec_runtime,
++ (unsigned long long)tsk_seruntime(task),
+ (unsigned long long)task->sched_info.run_delay,
+ task->sched_info.pcount);
+
+diff --git a/include/asm-generic/resource.h b/include/asm-generic/resource.h
+index 8874f681b056..59eb72bf7d5f 100644
+--- a/include/asm-generic/resource.h
++++ b/include/asm-generic/resource.h
+@@ -23,7 +23,7 @@
+ [RLIMIT_LOCKS] = { RLIM_INFINITY, RLIM_INFINITY }, \
+ [RLIMIT_SIGPENDING] = { 0, 0 }, \
+ [RLIMIT_MSGQUEUE] = { MQ_BYTES_MAX, MQ_BYTES_MAX }, \
+- [RLIMIT_NICE] = { 0, 0 }, \
++ [RLIMIT_NICE] = { 30, 30 }, \
+ [RLIMIT_RTPRIO] = { 0, 0 }, \
+ [RLIMIT_RTTIME] = { RLIM_INFINITY, RLIM_INFINITY }, \
+ }
+diff --git a/include/linux/sched.h b/include/linux/sched.h
+index e4ce0a76831e..7414ebd6267c 100644
+--- a/include/linux/sched.h
++++ b/include/linux/sched.h
+@@ -843,7 +843,9 @@ struct task_struct {
+ #endif
+
+ int on_cpu;
++
+ struct __call_single_node wake_entry;
++#ifndef CONFIG_SCHED_ALT
+ unsigned int wakee_flips;
+ unsigned long wakee_flip_decay_ts;
+ struct task_struct *last_wakee;
+@@ -857,6 +859,7 @@ struct task_struct {
+ */
+ int recent_used_cpu;
+ int wake_cpu;
++#endif /* !CONFIG_SCHED_ALT */
+ int on_rq;
+
+ int prio;
+@@ -864,6 +867,19 @@ struct task_struct {
+ int normal_prio;
+ unsigned int rt_priority;
+
++#ifdef CONFIG_SCHED_ALT
++ u64 last_ran;
++ s64 time_slice;
++ struct list_head sq_node;
++#ifdef CONFIG_SCHED_BMQ
++ int boost_prio;
++#endif /* CONFIG_SCHED_BMQ */
++#ifdef CONFIG_SCHED_PDS
++ u64 deadline;
++#endif /* CONFIG_SCHED_PDS */
++ /* sched_clock time spent running */
++ u64 sched_time;
++#else /* !CONFIG_SCHED_ALT */
+ struct sched_entity se;
+ struct sched_rt_entity rt;
+ struct sched_dl_entity dl;
+@@ -878,6 +894,7 @@ struct task_struct {
+ unsigned long core_cookie;
+ unsigned int core_occupation;
+ #endif
++#endif /* !CONFIG_SCHED_ALT */
+
+ #ifdef CONFIG_CGROUP_SCHED
+ struct task_group *sched_task_group;
+@@ -914,9 +931,13 @@ struct task_struct {
+ const cpumask_t *cpus_ptr;
+ cpumask_t *user_cpus_ptr;
+ cpumask_t cpus_mask;
++#ifndef CONFIG_SCHED_ALT
+ void *migration_pending;
++#endif
+ unsigned short migration_disabled;
++#ifndef CONFIG_SCHED_ALT
+ unsigned short migration_flags;
++#endif
+
+ #ifdef CONFIG_PREEMPT_RCU
+ int rcu_read_lock_nesting;
+@@ -947,8 +968,10 @@ struct task_struct {
+ struct sched_info sched_info;
+
+ struct list_head tasks;
++#ifndef CONFIG_SCHED_ALT
+ struct plist_node pushable_tasks;
+ struct rb_node pushable_dl_tasks;
++#endif
+
+ struct mm_struct *mm;
+ struct mm_struct *active_mm;
+@@ -1672,6 +1695,15 @@ static inline bool sched_proxy_exec(void)
+ }
+ #endif
+
++#ifdef CONFIG_SCHED_ALT
++#define tsk_seruntime(t) ((t)->sched_time)
++/* replace the uncertain rt_timeout with 0UL */
++#define tsk_rttimeout(t) (0UL)
++#else /* !CONFIG_SCHED_ALT: */
++#define tsk_seruntime(t) ((t)->se.sum_exec_runtime)
++#define tsk_rttimeout(t) ((t)->rt.timeout)
++#endif /* !CONFIG_SCHED_ALT */
++
+ #define TASK_REPORT_IDLE (TASK_REPORT + 1)
+ #define TASK_REPORT_MAX (TASK_REPORT_IDLE << 1)
+
+@@ -2236,7 +2268,11 @@ static inline void set_task_cpu(struct task_struct *p, unsigned int cpu)
+
+ static inline bool task_is_runnable(struct task_struct *p)
+ {
++#ifdef CONFIG_SCHED_ALT
++ return p->on_rq;
++#else
+ return p->on_rq && !p->se.sched_delayed;
++#endif /* !CONFIG_SCHED_ALT */
+ }
+
+ extern bool sched_task_on_rq(struct task_struct *p);
+diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h
+index c40115d4e34d..ddc97ddeed47 100644
+--- a/include/linux/sched/deadline.h
++++ b/include/linux/sched/deadline.h
+@@ -2,6 +2,25 @@
+ #ifndef _LINUX_SCHED_DEADLINE_H
+ #define _LINUX_SCHED_DEADLINE_H
+
++#ifdef CONFIG_SCHED_ALT
++
++static inline int dl_task(struct task_struct *p)
++{
++ return 0;
++}
++
++#ifdef CONFIG_SCHED_BMQ
++#define __tsk_deadline(p) (0UL)
++#endif
++
++#ifdef CONFIG_SCHED_PDS
++#define __tsk_deadline(p) ((((u64) ((p)->prio))<<56) | (p)->deadline)
++#endif
++
++#else
++
++#define __tsk_deadline(p) ((p)->dl.deadline)
++
+ /*
+ * SCHED_DEADLINE tasks has negative priorities, reflecting
+ * the fact that any of them has higher prio than RT and
+@@ -23,6 +42,7 @@ static inline bool dl_task(struct task_struct *p)
+ {
+ return dl_prio(p->prio);
+ }
++#endif /* CONFIG_SCHED_ALT */
+
+ static inline bool dl_time_before(u64 a, u64 b)
+ {
+diff --git a/include/linux/sched/prio.h b/include/linux/sched/prio.h
+index 6ab43b4f72f9..ef1cff556c5e 100644
+--- a/include/linux/sched/prio.h
++++ b/include/linux/sched/prio.h
+@@ -19,6 +19,28 @@
+ #define MAX_PRIO (MAX_RT_PRIO + NICE_WIDTH)
+ #define DEFAULT_PRIO (MAX_RT_PRIO + NICE_WIDTH / 2)
+
++#ifdef CONFIG_SCHED_ALT
++
++/* Undefine MAX_PRIO and DEFAULT_PRIO */
++#undef MAX_PRIO
++#undef DEFAULT_PRIO
++
++/* +/- priority levels from the base priority */
++#ifdef CONFIG_SCHED_BMQ
++#define MAX_PRIORITY_ADJ (12)
++#endif
++
++#ifdef CONFIG_SCHED_PDS
++#define MAX_PRIORITY_ADJ (0)
++#endif
++
++#define MIN_NORMAL_PRIO (128)
++#define NORMAL_PRIO_NUM (64)
++#define MAX_PRIO (MIN_NORMAL_PRIO + NORMAL_PRIO_NUM)
++#define DEFAULT_PRIO (MAX_PRIO - MAX_PRIORITY_ADJ - NICE_WIDTH / 2)
++
++#endif /* CONFIG_SCHED_ALT */
++
+ /*
+ * Convert user-nice values [ -20 ... 0 ... 19 ]
+ * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ],
+diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
+index 4e3338103654..6dfef878fe3b 100644
+--- a/include/linux/sched/rt.h
++++ b/include/linux/sched/rt.h
+@@ -45,8 +45,10 @@ static inline bool rt_or_dl_task_policy(struct task_struct *tsk)
+
+ if (policy == SCHED_FIFO || policy == SCHED_RR)
+ return true;
++#ifndef CONFIG_SCHED_ALT
+ if (policy == SCHED_DEADLINE)
+ return true;
++#endif
+ return false;
+ }
+
+diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
+index 5263746b63e8..693e5e3b6b26 100644
+--- a/include/linux/sched/topology.h
++++ b/include/linux/sched/topology.h
+@@ -196,7 +196,8 @@ extern void sched_update_asym_prefer_cpu(int cpu, int old_prio, int new_prio);
+ #define SDTL_INIT(maskfn, flagsfn, dname) ((struct sched_domain_topology_level) \
+ { .mask = maskfn, .sd_flags = flagsfn, .name = #dname })
+
+-#if defined(CONFIG_ENERGY_MODEL) && defined(CONFIG_CPU_FREQ_GOV_SCHEDUTIL)
++#if defined(CONFIG_ENERGY_MODEL) && defined(CONFIG_CPU_FREQ_GOV_SCHEDUTIL) && \
++ !defined(CONFIG_SCHED_ALT)
+ extern void rebuild_sched_domains_energy(void);
+ #else
+ static inline void rebuild_sched_domains_energy(void)
+diff --git a/init/Kconfig b/init/Kconfig
+index ecddb94db8dc..a0afff9dbf4c 100644
+--- a/init/Kconfig
++++ b/init/Kconfig
+@@ -678,6 +678,7 @@ config TASK_IO_ACCOUNTING
+
+ config PSI
+ bool "Pressure stall information tracking"
++ depends on !SCHED_ALT
+ select KERNFS
+ help
+ Collect metrics that indicate how overcommitted the CPU, memory,
+@@ -901,6 +902,35 @@ config SCHED_PROXY_EXEC
+ This option enables proxy execution, a mechanism for mutex-owning
+ tasks to inherit the scheduling context of higher priority waiters.
+
++menuconfig SCHED_ALT
++ bool "Alternative CPU Schedulers"
++ default y
++ help
++	  This feature enables the alternative CPU schedulers.
++
++if SCHED_ALT
++
++choice
++ prompt "Alternative CPU Scheduler"
++ default SCHED_BMQ
++
++config SCHED_BMQ
++ bool "BMQ CPU scheduler"
++ help
++ The BitMap Queue CPU scheduler for excellent interactivity and
++ responsiveness on the desktop and solid scalability on normal
++ hardware and commodity servers.
++
++config SCHED_PDS
++ bool "PDS CPU scheduler"
++ help
++ The Priority and Deadline based Skip list multiple queue CPU
++ Scheduler.
++
++endchoice
++
++endif
++
+ endmenu
+
+ #
+@@ -966,6 +996,7 @@ config NUMA_BALANCING
+ depends on ARCH_SUPPORTS_NUMA_BALANCING
+ depends on !ARCH_WANT_NUMA_VARIABLE_LOCALITY
+ depends on SMP && NUMA && MIGRATION && !PREEMPT_RT
++ depends on !SCHED_ALT
+ help
+ This option adds support for automatic NUMA aware memory/task placement.
+ The mechanism is quite primitive and is based on migrating memory when
+@@ -1415,6 +1446,7 @@ config CHECKPOINT_RESTORE
+
+ config SCHED_AUTOGROUP
+ bool "Automatic process group scheduling"
++ depends on !SCHED_ALT
+ select CGROUPS
+ select CGROUP_SCHED
+ select FAIR_GROUP_SCHED
+diff --git a/init/init_task.c b/init/init_task.c
+index e557f622bd90..99e59c2082e0 100644
+--- a/init/init_task.c
++++ b/init/init_task.c
+@@ -72,9 +72,16 @@ struct task_struct init_task __aligned(L1_CACHE_BYTES) = {
+ .stack = init_stack,
+ .usage = REFCOUNT_INIT(2),
+ .flags = PF_KTHREAD,
++#ifdef CONFIG_SCHED_ALT
++ .on_cpu = 1,
++ .prio = DEFAULT_PRIO,
++ .static_prio = DEFAULT_PRIO,
++ .normal_prio = DEFAULT_PRIO,
++#else
+ .prio = MAX_PRIO - 20,
+ .static_prio = MAX_PRIO - 20,
+ .normal_prio = MAX_PRIO - 20,
++#endif
+ .policy = SCHED_NORMAL,
+ .cpus_ptr = &init_task.cpus_mask,
+ .user_cpus_ptr = NULL,
+@@ -87,6 +94,16 @@ struct task_struct init_task __aligned(L1_CACHE_BYTES) = {
+ .restart_block = {
+ .fn = do_no_restart_syscall,
+ },
++#ifdef CONFIG_SCHED_ALT
++ .sq_node = LIST_HEAD_INIT(init_task.sq_node),
++#ifdef CONFIG_SCHED_BMQ
++ .boost_prio = 0,
++#endif
++#ifdef CONFIG_SCHED_PDS
++ .deadline = 0,
++#endif
++ .time_slice = HZ,
++#else
+ .se = {
+ .group_node = LIST_HEAD_INIT(init_task.se.group_node),
+ },
+@@ -94,10 +111,13 @@ struct task_struct init_task __aligned(L1_CACHE_BYTES) = {
+ .run_list = LIST_HEAD_INIT(init_task.rt.run_list),
+ .time_slice = RR_TIMESLICE,
+ },
++#endif
+ .tasks = LIST_HEAD_INIT(init_task.tasks),
++#ifndef CONFIG_SCHED_ALT
+ #ifdef CONFIG_SMP
+ .pushable_tasks = PLIST_NODE_INIT(init_task.pushable_tasks, MAX_PRIO),
+ #endif
++#endif
+ #ifdef CONFIG_CGROUP_SCHED
+ .sched_task_group = &root_task_group,
+ #endif
+diff --git a/kernel/Kconfig.preempt b/kernel/Kconfig.preempt
+index 54ea59ff8fbe..a6d3560cef75 100644
+--- a/kernel/Kconfig.preempt
++++ b/kernel/Kconfig.preempt
+@@ -134,7 +134,7 @@ config PREEMPT_DYNAMIC
+
+ config SCHED_CORE
+ bool "Core Scheduling for SMT"
+- depends on SCHED_SMT
++ depends on SCHED_SMT && !SCHED_ALT
+ help
+ This option permits Core Scheduling, a means of coordinated task
+ selection across SMT siblings. When enabled -- see
+@@ -152,7 +152,7 @@ config SCHED_CORE
+
+ config SCHED_CLASS_EXT
+ bool "Extensible Scheduling Class"
+- depends on BPF_SYSCALL && BPF_JIT && DEBUG_INFO_BTF
++ depends on BPF_SYSCALL && BPF_JIT && DEBUG_INFO_BTF && !SCHED_ALT
+ select STACKTRACE if STACKTRACE_SUPPORT
+ help
+ This option enables a new scheduler class sched_ext (SCX), which
+diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
+index 27adb04df675..b88d31c14417 100644
+--- a/kernel/cgroup/cpuset.c
++++ b/kernel/cgroup/cpuset.c
+@@ -662,7 +662,7 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
+ return ret;
+ }
+
+-#ifdef CONFIG_SMP
++#if defined(CONFIG_SMP) && !defined(CONFIG_SCHED_ALT)
+ /*
+ * Helper routine for generate_sched_domains().
+ * Do cpusets a, b have overlapping effective cpus_allowed masks?
+@@ -1075,7 +1075,7 @@ void rebuild_sched_domains_locked(void)
+ /* Have scheduler rebuild the domains */
+ partition_sched_domains(ndoms, doms, attr);
+ }
+-#else /* !CONFIG_SMP */
++#else /* !CONFIG_SMP || CONFIG_SCHED_ALT */
+ void rebuild_sched_domains_locked(void)
+ {
+ }
+@@ -3049,12 +3049,15 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
+ goto out_unlock;
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ if (dl_task(task)) {
+ cs->nr_migrate_dl_tasks++;
+ cs->sum_migrate_dl_bw += task->dl.dl_bw;
+ }
++#endif
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ if (!cs->nr_migrate_dl_tasks)
+ goto out_success;
+
+@@ -3075,6 +3078,7 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
+ }
+
+ out_success:
++#endif
+ /*
+ * Mark attach is in progress. This makes validate_change() fail
+ * changes which zero cpus/mems_allowed.
+@@ -3096,12 +3100,14 @@ static void cpuset_cancel_attach(struct cgroup_taskset *tset)
+ mutex_lock(&cpuset_mutex);
+ dec_attach_in_progress_locked(cs);
+
++#ifndef CONFIG_SCHED_ALT
+ if (cs->nr_migrate_dl_tasks) {
+ int cpu = cpumask_any(cs->effective_cpus);
+
+ dl_bw_free(cpu, cs->sum_migrate_dl_bw);
+ reset_migrate_dl_data(cs);
+ }
++#endif
+
+ mutex_unlock(&cpuset_mutex);
+ }
+diff --git a/kernel/delayacct.c b/kernel/delayacct.c
+index 30e7912ebb0d..f6b7e29d2018 100644
+--- a/kernel/delayacct.c
++++ b/kernel/delayacct.c
+@@ -164,7 +164,7 @@ int delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
+ */
+ t1 = tsk->sched_info.pcount;
+ t2 = tsk->sched_info.run_delay;
+- t3 = tsk->se.sum_exec_runtime;
++ t3 = tsk_seruntime(tsk);
+
+ d->cpu_count += t1;
+
+diff --git a/kernel/exit.c b/kernel/exit.c
+index 343eb97543d5..bd34de061dff 100644
+--- a/kernel/exit.c
++++ b/kernel/exit.c
+@@ -207,7 +207,7 @@ static void __exit_signal(struct release_task_post *post, struct task_struct *ts
+ sig->inblock += task_io_get_inblock(tsk);
+ sig->oublock += task_io_get_oublock(tsk);
+ task_io_accounting_add(&sig->ioac, &tsk->ioac);
+- sig->sum_sched_runtime += tsk->se.sum_exec_runtime;
++ sig->sum_sched_runtime += tsk_seruntime(tsk);
+ sig->nr_threads--;
+ __unhash_process(post, tsk, group_dead);
+ write_sequnlock(&sig->stats_lock);
+@@ -291,8 +291,8 @@ void release_task(struct task_struct *p)
+ write_unlock_irq(&tasklist_lock);
+ /* @thread_pid can't go away until free_pids() below */
+ proc_flush_pid(thread_pid);
+- add_device_randomness(&p->se.sum_exec_runtime,
+- sizeof(p->se.sum_exec_runtime));
++ add_device_randomness((const void*) &tsk_seruntime(p),
++ sizeof(unsigned long long));
+ free_pids(post.pids);
+ release_thread(p);
+ /*
+diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
+index c80902eacd79..b1d388145968 100644
+--- a/kernel/locking/rtmutex.c
++++ b/kernel/locking/rtmutex.c
+@@ -366,7 +366,7 @@ waiter_update_prio(struct rt_mutex_waiter *waiter, struct task_struct *task)
+ lockdep_assert(RB_EMPTY_NODE(&waiter->tree.entry));
+
+ waiter->tree.prio = __waiter_prio(task);
+- waiter->tree.deadline = task->dl.deadline;
++ waiter->tree.deadline = __tsk_deadline(task);
+ }
+
+ /*
+@@ -387,16 +387,20 @@ waiter_clone_prio(struct rt_mutex_waiter *waiter, struct task_struct *task)
+ * Only use with rt_waiter_node_{less,equal}()
+ */
+ #define task_to_waiter_node(p) \
+- &(struct rt_waiter_node){ .prio = __waiter_prio(p), .deadline = (p)->dl.deadline }
++ &(struct rt_waiter_node){ .prio = __waiter_prio(p), .deadline = __tsk_deadline(p) }
+ #define task_to_waiter(p) \
+ &(struct rt_mutex_waiter){ .tree = *task_to_waiter_node(p) }
+
+ static __always_inline int rt_waiter_node_less(struct rt_waiter_node *left,
+ struct rt_waiter_node *right)
+ {
++#ifdef CONFIG_SCHED_PDS
++ return (left->deadline < right->deadline);
++#else
+ if (left->prio < right->prio)
+ return 1;
+
++#ifndef CONFIG_SCHED_BMQ
+ /*
+ * If both waiters have dl_prio(), we check the deadlines of the
+ * associated tasks.
+@@ -405,16 +409,22 @@ static __always_inline int rt_waiter_node_less(struct rt_waiter_node *left,
+ */
+ if (dl_prio(left->prio))
+ return dl_time_before(left->deadline, right->deadline);
++#endif
+
+ return 0;
++#endif
+ }
+
+ static __always_inline int rt_waiter_node_equal(struct rt_waiter_node *left,
+ struct rt_waiter_node *right)
+ {
++#ifdef CONFIG_SCHED_PDS
++ return (left->deadline == right->deadline);
++#else
+ if (left->prio != right->prio)
+ return 0;
+
++#ifndef CONFIG_SCHED_BMQ
+ /*
+ * If both waiters have dl_prio(), we check the deadlines of the
+ * associated tasks.
+@@ -423,8 +433,10 @@ static __always_inline int rt_waiter_node_equal(struct rt_waiter_node *left,
+ */
+ if (dl_prio(left->prio))
+ return left->deadline == right->deadline;
++#endif
+
+ return 1;
++#endif
+ }
+
+ static inline bool rt_mutex_steal(struct rt_mutex_waiter *waiter,
+diff --git a/kernel/locking/ww_mutex.h b/kernel/locking/ww_mutex.h
+index 31a785afee6c..0e7df1f689e0 100644
+--- a/kernel/locking/ww_mutex.h
++++ b/kernel/locking/ww_mutex.h
+@@ -247,6 +247,7 @@ __ww_ctx_less(struct ww_acquire_ctx *a, struct ww_acquire_ctx *b)
+
+ /* equal static prio */
+
++#ifndef CONFIG_SCHED_ALT
+ if (dl_prio(a_prio)) {
+ if (dl_time_before(b->task->dl.deadline,
+ a->task->dl.deadline))
+@@ -256,6 +257,7 @@ __ww_ctx_less(struct ww_acquire_ctx *a, struct ww_acquire_ctx *b)
+ b->task->dl.deadline))
+ return false;
+ }
++#endif
+
+ /* equal prio */
+ }
+diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
+index 8ae86371ddcd..a972ef1e31a7 100644
+--- a/kernel/sched/Makefile
++++ b/kernel/sched/Makefile
+@@ -33,7 +33,12 @@ endif
+ # These compilation units have roughly the same size and complexity - so their
+ # build parallelizes well and finishes roughly at once:
+ #
++ifdef CONFIG_SCHED_ALT
++obj-y += alt_core.o
++obj-$(CONFIG_SCHED_DEBUG) += alt_debug.o
++else
+ obj-y += core.o
+ obj-y += fair.o
++endif
+ obj-y += build_policy.o
+ obj-y += build_utility.o
+diff --git a/kernel/sched/alt_core.c b/kernel/sched/alt_core.c
+new file mode 100644
+index 000000000000..8f03f5312e4d
+--- /dev/null
++++ b/kernel/sched/alt_core.c
+@@ -0,0 +1,7648 @@
++/*
++ * kernel/sched/alt_core.c
++ *
++ * Core alternative kernel scheduler code and related syscalls
++ *
++ * Copyright (C) 1991-2002 Linus Torvalds
++ *
++ * 2009-08-13 Brainfuck deadline scheduling policy by Con Kolivas deletes
++ * a whole lot of those previous things.
++ * 2017-09-06 Priority and Deadline based Skip list multiple queue kernel
++ * scheduler by Alfred Chen.
++ * 2019-02-20 BMQ(BitMap Queue) kernel scheduler by Alfred Chen.
++ */
++#include <linux/sched/clock.h>
++#include <linux/sched/cputime.h>
++#include <linux/sched/debug.h>
++#include <linux/sched/hotplug.h>
++#include <linux/sched/init.h>
++#include <linux/sched/isolation.h>
++#include <linux/sched/loadavg.h>
++#include <linux/sched/mm.h>
++#include <linux/sched/nohz.h>
++#include <linux/sched/stat.h>
++#include <linux/sched/wake_q.h>
++
++#include <linux/blkdev.h>
++#include <linux/context_tracking.h>
++#include <linux/cpuset.h>
++#include <linux/delayacct.h>
++#include <linux/init_task.h>
++#include <linux/kcov.h>
++#include <linux/kprobes.h>
++#include <linux/nmi.h>
++#include <linux/rseq.h>
++#include <linux/scs.h>
++
++#include <uapi/linux/sched/types.h>
++
++#include <asm/irq_regs.h>
++#include <asm/switch_to.h>
++
++#define CREATE_TRACE_POINTS
++#include <trace/events/sched.h>
++#include <trace/events/ipi.h>
++#undef CREATE_TRACE_POINTS
++
++#include "sched.h"
++#include "smp.h"
++
++#include "pelt.h"
++
++#include "../../io_uring/io-wq.h"
++#include "../smpboot.h"
++
++EXPORT_TRACEPOINT_SYMBOL_GPL(ipi_send_cpu);
++EXPORT_TRACEPOINT_SYMBOL_GPL(ipi_send_cpumask);
++
++/*
++ * Export tracepoints that act as a bare tracehook (ie: have no trace event
++ * associated with them) to allow external modules to probe them.
++ */
++EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_irq_tp);
++
++#define sched_feat(x) (1)
++/*
++ * Print a warning if need_resched is set for the given duration (if
++ * LATENCY_WARN is enabled).
++ *
++ * If sysctl_resched_latency_warn_once is set, only one warning will be shown
++ * per boot.
++ */
++__read_mostly int sysctl_resched_latency_warn_ms = 100;
++__read_mostly int sysctl_resched_latency_warn_once = 1;
++
++#define ALT_SCHED_VERSION "v6.17-r0"
++
++#define STOP_PRIO (MAX_RT_PRIO - 1)
++
++/*
++ * Time slice
++ * (default: 4 msec, units: nanoseconds)
++ */
++unsigned int sysctl_sched_base_slice __read_mostly = (4 << 20);
++
++#include "alt_core.h"
++#include "alt_topology.h"
++
++/* Reschedule if less than this many μs left */
++#define RESCHED_NS (100 << 10)
++
++/**
++ * sched_yield_type - the type of yield that sched_yield() will perform.
++ * 0: No yield.
++ * 1: Requeue task. (default)
++ */
++int sched_yield_type __read_mostly = 1;
++
++cpumask_t sched_rq_pending_mask ____cacheline_aligned_in_smp;
++
++DEFINE_PER_CPU_ALIGNED(cpumask_t [NR_CPU_AFFINITY_LEVELS], sched_cpu_topo_masks);
++DEFINE_PER_CPU_ALIGNED(cpumask_t *, sched_cpu_llc_mask);
++DEFINE_PER_CPU_ALIGNED(cpumask_t *, sched_cpu_topo_end_mask);
++
++#ifdef CONFIG_SCHED_SMT
++DEFINE_STATIC_KEY_FALSE(sched_smt_present);
++EXPORT_SYMBOL_GPL(sched_smt_present);
++
++cpumask_t sched_smt_mask ____cacheline_aligned_in_smp;
++#endif
++
++/*
++ * Keep a unique ID per domain (we use the first CPUs number in the cpumask of
++ * the domain), this allows us to quickly tell if two cpus are in the same cache
++ * domain, see cpus_share_cache().
++ */
++DEFINE_PER_CPU(int, sd_llc_id);
++
++DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
++
++#ifndef prepare_arch_switch
++# define prepare_arch_switch(next) do { } while (0)
++#endif
++#ifndef finish_arch_post_lock_switch
++# define finish_arch_post_lock_switch() do { } while (0)
++#endif
++
++static cpumask_t sched_preempt_mask[SCHED_QUEUE_BITS + 2] ____cacheline_aligned_in_smp;
++
++cpumask_t *const sched_idle_mask = &sched_preempt_mask[SCHED_QUEUE_BITS - 1];
++cpumask_t *const sched_sg_idle_mask = &sched_preempt_mask[SCHED_QUEUE_BITS];
++cpumask_t *const sched_pcore_idle_mask = &sched_preempt_mask[SCHED_QUEUE_BITS];
++cpumask_t *const sched_ecore_idle_mask = &sched_preempt_mask[SCHED_QUEUE_BITS + 1];
++
++/* task function */
++static inline const struct cpumask *task_user_cpus(struct task_struct *p)
++{
++ if (!p->user_cpus_ptr)
++ return cpu_possible_mask; /* &init_task.cpus_mask */
++ return p->user_cpus_ptr;
++}
++
++/* sched_queue related functions */
++static inline void sched_queue_init(struct sched_queue *q)
++{
++ int i;
++
++ bitmap_zero(q->bitmap, SCHED_QUEUE_BITS);
++ for(i = 0; i < SCHED_LEVELS; i++)
++ INIT_LIST_HEAD(&q->heads[i]);
++}
++
++/*
++ * Init idle task and put into queue structure of rq
++ * IMPORTANT: may be called multiple times for a single cpu
++ */
++static inline void sched_queue_init_idle(struct sched_queue *q,
++ struct task_struct *idle)
++{
++ INIT_LIST_HEAD(&q->heads[IDLE_TASK_SCHED_PRIO]);
++ list_add_tail(&idle->sq_node, &q->heads[IDLE_TASK_SCHED_PRIO]);
++ idle->on_rq = TASK_ON_RQ_QUEUED;
++}
++
++#define CLEAR_CACHED_PREEMPT_MASK(pr, low, high, cpu) \
++ if (low < pr && pr <= high) \
++ cpumask_clear_cpu(cpu, sched_preempt_mask + pr);
++
++#define SET_CACHED_PREEMPT_MASK(pr, low, high, cpu) \
++ if (low < pr && pr <= high) \
++ cpumask_set_cpu(cpu, sched_preempt_mask + pr);
++
++static atomic_t sched_prio_record = ATOMIC_INIT(0);
++
++/* water mark related functions */
++static inline void update_sched_preempt_mask(struct rq *rq)
++{
++ int prio = find_first_bit(rq->queue.bitmap, SCHED_QUEUE_BITS);
++ int last_prio = rq->prio;
++ int cpu, pr;
++
++ if (prio == last_prio)
++ return;
++
++ rq->prio = prio;
++#ifdef CONFIG_SCHED_PDS
++ rq->prio_idx = sched_prio2idx(rq->prio, rq);
++#endif
++ cpu = cpu_of(rq);
++ pr = atomic_read(&sched_prio_record);
++
++ if (prio < last_prio) {
++ if (IDLE_TASK_SCHED_PRIO == last_prio) {
++ rq->clear_idle_mask_func(cpu, sched_idle_mask);
++ last_prio -= 2;
++ }
++ CLEAR_CACHED_PREEMPT_MASK(pr, prio, last_prio, cpu);
++
++ return;
++ }
++ /* last_prio < prio */
++ if (IDLE_TASK_SCHED_PRIO == prio) {
++ rq->set_idle_mask_func(cpu, sched_idle_mask);
++ prio -= 2;
++ }
++ SET_CACHED_PREEMPT_MASK(pr, last_prio, prio, cpu);
++}
++
++/* need a wrapper since we may need to trace from modules */
++EXPORT_TRACEPOINT_SYMBOL(sched_set_state_tp);
++
++/* Call via the helper macro trace_set_current_state. */
++void __trace_set_current_state(int state_value)
++{
++ trace_sched_set_state_tp(current, state_value);
++}
++EXPORT_SYMBOL(__trace_set_current_state);
++
++/*
++ * Serialization rules:
++ *
++ * Lock order:
++ *
++ * p->pi_lock
++ * rq->lock
++ * hrtimer_cpu_base->lock (hrtimer_start() for bandwidth controls)
++ *
++ * rq1->lock
++ * rq2->lock where: rq1 < rq2
++ *
++ * Regular state:
++ *
++ * Normal scheduling state is serialized by rq->lock. __schedule() takes the
++ * local CPU's rq->lock, it optionally removes the task from the runqueue and
++ * always looks at the local rq data structures to find the most eligible task
++ * to run next.
++ *
++ * Task enqueue is also under rq->lock, possibly taken from another CPU.
++ * Wakeups from another LLC domain might use an IPI to transfer the enqueue to
++ * the local CPU to avoid bouncing the runqueue state around [ see
++ * ttwu_queue_wakelist() ]
++ *
++ * Task wakeup, specifically wakeups that involve migration, are horribly
++ * complicated to avoid having to take two rq->locks.
++ *
++ * Special state:
++ *
++ * System-calls and anything external will use task_rq_lock() which acquires
++ * both p->pi_lock and rq->lock. As a consequence the state they change is
++ * stable while holding either lock:
++ *
++ * - sched_setaffinity()/
++ * set_cpus_allowed_ptr(): p->cpus_ptr, p->nr_cpus_allowed
++ * - set_user_nice(): p->se.load, p->*prio
++ * - __sched_setscheduler(): p->sched_class, p->policy, p->*prio,
++ * p->se.load, p->rt_priority,
++ * p->dl.dl_{runtime, deadline, period, flags, bw, density}
++ * - sched_setnuma(): p->numa_preferred_nid
++ * - sched_move_task(): p->sched_task_group
++ * - uclamp_update_active() p->uclamp*
++ *
++ * p->state <- TASK_*:
++ *
++ * is changed locklessly using set_current_state(), __set_current_state() or
++ * set_special_state(), see their respective comments, or by
++ * try_to_wake_up(). This latter uses p->pi_lock to serialize against
++ * concurrent self.
++ *
++ * p->on_rq <- { 0, 1 = TASK_ON_RQ_QUEUED, 2 = TASK_ON_RQ_MIGRATING }:
++ *
++ * is set by activate_task() and cleared by deactivate_task(), under
++ * rq->lock. Non-zero indicates the task is runnable, the special
++ * ON_RQ_MIGRATING state is used for migration without holding both
++ * rq->locks. It indicates task_cpu() is not stable, see task_rq_lock().
++ *
++ * Additionally it is possible to be ->on_rq but still be considered not
++ * runnable when p->se.sched_delayed is true. These tasks are on the runqueue
++ * but will be dequeued as soon as they get picked again. See the
++ * task_is_runnable() helper.
++ *
++ * p->on_cpu <- { 0, 1 }:
++ *
++ * is set by prepare_task() and cleared by finish_task() such that it will be
++ * set before p is scheduled-in and cleared after p is scheduled-out, both
++ * under rq->lock. Non-zero indicates the task is running on its CPU.
++ *
++ * [ The astute reader will observe that it is possible for two tasks on one
++ * CPU to have ->on_cpu = 1 at the same time. ]
++ *
++ * task_cpu(p): is changed by set_task_cpu(), the rules are:
++ *
++ * - Don't call set_task_cpu() on a blocked task:
++ *
++ * We don't care what CPU we're not running on, this simplifies hotplug,
++ * the CPU assignment of blocked tasks isn't required to be valid.
++ *
++ * - for try_to_wake_up(), called under p->pi_lock:
++ *
++ * This allows try_to_wake_up() to only take one rq->lock, see its comment.
++ *
++ * - for migration called under rq->lock:
++ * [ see task_on_rq_migrating() in task_rq_lock() ]
++ *
++ * o move_queued_task()
++ * o detach_task()
++ *
++ * - for migration called under double_rq_lock():
++ *
++ * o __migrate_swap_task()
++ * o push_rt_task() / pull_rt_task()
++ * o push_dl_task() / pull_dl_task()
++ * o dl_task_offline_migration()
++ *
++ */
++
++/*
++ * Context: p->pi_lock
++ */
++static inline struct rq *
++task_access_lock_irqsave(struct task_struct *p, raw_spinlock_t **plock, unsigned long *flags)
++{
++ struct rq *rq;
++ for (;;) {
++ rq = task_rq(p);
++ if (p->on_cpu || task_on_rq_queued(p)) {
++ raw_spin_lock_irqsave(&rq->lock, *flags);
++ if (likely((p->on_cpu || task_on_rq_queued(p)) && rq == task_rq(p))) {
++ *plock = &rq->lock;
++ return rq;
++ }
++ raw_spin_unlock_irqrestore(&rq->lock, *flags);
++ } else if (task_on_rq_migrating(p)) {
++ do {
++ cpu_relax();
++ } while (unlikely(task_on_rq_migrating(p)));
++ } else {
++ raw_spin_lock_irqsave(&p->pi_lock, *flags);
++ if (likely(!p->on_cpu && !p->on_rq && rq == task_rq(p))) {
++ *plock = &p->pi_lock;
++ return rq;
++ }
++ raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
++ }
++ }
++}
++
++static inline void
++task_access_unlock_irqrestore(struct task_struct *p, raw_spinlock_t *lock, unsigned long *flags)
++{
++ raw_spin_unlock_irqrestore(lock, *flags);
++}
++
++/*
++ * __task_rq_lock - lock the rq @p resides on.
++ */
++struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ struct rq *rq;
++
++ lockdep_assert_held(&p->pi_lock);
++
++ for (;;) {
++ rq = task_rq(p);
++ raw_spin_lock(&rq->lock);
++ if (likely(rq == task_rq(p) && !task_on_rq_migrating(p)))
++ return rq;
++ raw_spin_unlock(&rq->lock);
++
++ while (unlikely(task_on_rq_migrating(p)))
++ cpu_relax();
++ }
++}
++
++/*
++ * task_rq_lock - lock p->pi_lock and lock the rq @p resides on.
++ */
++struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
++ __acquires(p->pi_lock)
++ __acquires(rq->lock)
++{
++ struct rq *rq;
++
++ for (;;) {
++ raw_spin_lock_irqsave(&p->pi_lock, rf->flags);
++ rq = task_rq(p);
++ raw_spin_lock(&rq->lock);
++ /*
++ * move_queued_task() task_rq_lock()
++ *
++ * ACQUIRE (rq->lock)
++ * [S] ->on_rq = MIGRATING [L] rq = task_rq()
++ * WMB (__set_task_cpu()) ACQUIRE (rq->lock);
++ * [S] ->cpu = new_cpu [L] task_rq()
++ * [L] ->on_rq
++ * RELEASE (rq->lock)
++ *
++ * If we observe the old CPU in task_rq_lock(), the acquire of
++ * the old rq->lock will fully serialize against the stores.
++ *
++ * If we observe the new CPU in task_rq_lock(), the address
++ * dependency headed by '[L] rq = task_rq()' and the acquire
++ * will pair with the WMB to ensure we then also see migrating.
++ */
++ if (likely(rq == task_rq(p) && !task_on_rq_migrating(p))) {
++ return rq;
++ }
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, rf->flags);
++
++ while (unlikely(task_on_rq_migrating(p)))
++ cpu_relax();
++ }
++}
++
++static inline void rq_lock_irqsave(struct rq *rq, struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ raw_spin_lock_irqsave(&rq->lock, rf->flags);
++}
++
++static inline void rq_unlock_irqrestore(struct rq *rq, struct rq_flags *rf)
++ __releases(rq->lock)
++{
++ raw_spin_unlock_irqrestore(&rq->lock, rf->flags);
++}
++
++DEFINE_LOCK_GUARD_1(rq_lock_irqsave, struct rq,
++ rq_lock_irqsave(_T->lock, &_T->rf),
++ rq_unlock_irqrestore(_T->lock, &_T->rf),
++ struct rq_flags rf)
++
++void raw_spin_rq_lock_nested(struct rq *rq, int subclass)
++{
++ raw_spinlock_t *lock;
++
++ /* Matches synchronize_rcu() in __sched_core_enable() */
++ preempt_disable();
++
++ for (;;) {
++ lock = __rq_lockp(rq);
++ raw_spin_lock_nested(lock, subclass);
++ if (likely(lock == __rq_lockp(rq))) {
++ /* preempt_count *MUST* be > 1 */
++ preempt_enable_no_resched();
++ return;
++ }
++ raw_spin_unlock(lock);
++ }
++}
++
++void raw_spin_rq_unlock(struct rq *rq)
++{
++ raw_spin_unlock(rq_lockp(rq));
++}
++
++/*
++ * RQ-clock updating methods:
++ */
++
++static void update_rq_clock_task(struct rq *rq, s64 delta)
++{
++/*
++ * In theory, the compile should just see 0 here, and optimize out the call
++ * to sched_rt_avg_update. But I don't trust it...
++ */
++ s64 __maybe_unused steal = 0, irq_delta = 0;
++
++#ifdef CONFIG_IRQ_TIME_ACCOUNTING
++ if (irqtime_enabled()) {
++ irq_delta = irq_time_read(cpu_of(rq)) - rq->prev_irq_time;
++
++ /*
++ * Since irq_time is only updated on {soft,}irq_exit, we might run into
++ * this case when a previous update_rq_clock() happened inside a
++ * {soft,}IRQ region.
++ *
++ * When this happens, we stop ->clock_task and only update the
++ * prev_irq_time stamp to account for the part that fit, so that a next
++ * update will consume the rest. This ensures ->clock_task is
++ * monotonic.
++ *
++ * It does however cause some slight miss-attribution of {soft,}IRQ
++ * time, a more accurate solution would be to update the irq_time using
++ * the current rq->clock timestamp, except that would require using
++ * atomic ops.
++ */
++ if (irq_delta > delta)
++ irq_delta = delta;
++
++ rq->prev_irq_time += irq_delta;
++ delta -= irq_delta;
++ delayacct_irq(rq->curr, irq_delta);
++ }
++#endif
++#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
++	if (static_key_false((&paravirt_steal_rq_enabled))) {
++ u64 prev_steal;
++
++ steal = prev_steal = paravirt_steal_clock(cpu_of(rq));
++ steal -= rq->prev_steal_time_rq;
++
++ if (unlikely(steal > delta))
++ steal = delta;
++
++ rq->prev_steal_time_rq = prev_steal;
++ delta -= steal;
++ }
++#endif
++
++ rq->clock_task += delta;
++
++#ifdef CONFIG_HAVE_SCHED_AVG_IRQ
++ if ((irq_delta + steal))
++ update_irq_load_avg(rq, irq_delta + steal);
++#endif
++}
++
++static inline void update_rq_clock(struct rq *rq)
++{
++ s64 delta = sched_clock_cpu(cpu_of(rq)) - rq->clock;
++
++ if (unlikely(delta <= 0))
++ return;
++ rq->clock += delta;
++ sched_update_rq_clock(rq);
++ update_rq_clock_task(rq, delta);
++}
++
++/*
++ * RQ Load update routine
++ */
++#define RQ_LOAD_HISTORY_BITS (sizeof(s32) * 8ULL)
++#define RQ_UTIL_SHIFT (8)
++#define RQ_LOAD_HISTORY_TO_UTIL(l) (((l) >> (RQ_LOAD_HISTORY_BITS - 1 - RQ_UTIL_SHIFT)) & 0xff)
++
++#define LOAD_BLOCK(t) ((t) >> 17)
++#define LOAD_HALF_BLOCK(t) ((t) >> 16)
++#define BLOCK_MASK(t) ((t) & ((0x01 << 18) - 1))
++#define LOAD_BLOCK_BIT(b) (1UL << (RQ_LOAD_HISTORY_BITS - 1 - (b)))
++#define CURRENT_LOAD_BIT LOAD_BLOCK_BIT(0)
++
++static inline void rq_load_update(struct rq *rq)
++{
++ u64 time = rq->clock;
++ u64 delta = min(LOAD_BLOCK(time) - LOAD_BLOCK(rq->load_stamp), RQ_LOAD_HISTORY_BITS - 1);
++ u64 prev = !!(rq->load_history & CURRENT_LOAD_BIT);
++ u64 curr = !!rq->nr_running;
++
++ if (delta) {
++ rq->load_history = rq->load_history >> delta;
++
++ if (delta < RQ_UTIL_SHIFT) {
++ rq->load_block += (~BLOCK_MASK(rq->load_stamp)) * prev;
++ if (!!LOAD_HALF_BLOCK(rq->load_block) ^ curr)
++ rq->load_history ^= LOAD_BLOCK_BIT(delta);
++ }
++
++ rq->load_block = BLOCK_MASK(time) * prev;
++ } else {
++ rq->load_block += (time - rq->load_stamp) * prev;
++ }
++ if (prev ^ curr)
++ rq->load_history ^= CURRENT_LOAD_BIT;
++ rq->load_stamp = time;
++}
++
++unsigned long rq_load_util(struct rq *rq, unsigned long max)
++{
++ return RQ_LOAD_HISTORY_TO_UTIL(rq->load_history) * (max >> RQ_UTIL_SHIFT);
++}
++
++unsigned long sched_cpu_util(int cpu)
++{
++ return rq_load_util(cpu_rq(cpu), arch_scale_cpu_capacity(cpu));
++}
++
++#ifdef CONFIG_CPU_FREQ
++/**
++ * cpufreq_update_util - Take a note about CPU utilization changes.
++ * @rq: Runqueue to carry out the update for.
++ * @flags: Update reason flags.
++ *
++ * This function is called by the scheduler on the CPU whose utilization is
++ * being updated.
++ *
++ * It can only be called from RCU-sched read-side critical sections.
++ *
++ * The way cpufreq is currently arranged requires it to evaluate the CPU
++ * performance state (frequency/voltage) on a regular basis to prevent it from
++ * being stuck in a completely inadequate performance level for too long.
++ * That is not guaranteed to happen if the updates are only triggered from CFS
++ * and DL, though, because they may not be coming in if only RT tasks are
++ * active all the time (or there are RT tasks only).
++ *
++ * As a workaround for that issue, this function is called periodically by the
++ * RT sched class to trigger extra cpufreq updates to prevent it from stalling,
++ * but that really is a band-aid. Going forward it should be replaced with
++ * solutions targeted more specifically at RT tasks.
++ */
++static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
++{
++ struct update_util_data *data;
++
++ rq_load_update(rq);
++ data = rcu_dereference_sched(*per_cpu_ptr(&cpufreq_update_util_data, cpu_of(rq)));
++ if (data)
++ data->func(data, rq_clock(rq), flags);
++}
++#else /* !CONFIG_CPU_FREQ: */
++static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
++{
++ rq_load_update(rq);
++}
++#endif /* !CONFIG_CPU_FREQ */
++
++#ifdef CONFIG_NO_HZ_FULL
++/*
++ * Tick may be needed by tasks in the runqueue depending on their policy and
++ * requirements. If tick is needed, lets send the target an IPI to kick it out
++ * of nohz mode if necessary.
++ */
++static inline void sched_update_tick_dependency(struct rq *rq)
++{
++ int cpu = cpu_of(rq);
++
++ if (!tick_nohz_full_cpu(cpu))
++ return;
++
++ if (rq->nr_running < 2)
++ tick_nohz_dep_clear_cpu(cpu, TICK_DEP_BIT_SCHED);
++ else
++ tick_nohz_dep_set_cpu(cpu, TICK_DEP_BIT_SCHED);
++}
++#else /* !CONFIG_NO_HZ_FULL: */
++static inline void sched_update_tick_dependency(struct rq *rq) { }
++#endif /* !CONFIG_NO_HZ_FULL */
++
++static inline void add_nr_running(struct rq *rq, unsigned count)
++{
++ rq->nr_running += count;
++ if (rq->nr_running > 1) {
++ cpumask_set_cpu(cpu_of(rq), &sched_rq_pending_mask);
++ rq->prio_balance_time = rq->clock;
++ }
++
++ sched_update_tick_dependency(rq);
++}
++
++static inline void sub_nr_running(struct rq *rq, unsigned count)
++{
++ rq->nr_running -= count;
++ if (rq->nr_running < 2) {
++ cpumask_clear_cpu(cpu_of(rq), &sched_rq_pending_mask);
++ rq->prio_balance_time = 0;
++ }
++
++ sched_update_tick_dependency(rq);
++}
++
++bool sched_task_on_rq(struct task_struct *p)
++{
++ return task_on_rq_queued(p);
++}
++
++unsigned long get_wchan(struct task_struct *p)
++{
++ unsigned long ip = 0;
++ unsigned int state;
++
++ if (!p || p == current)
++ return 0;
++
++ /* Only get wchan if task is blocked and we can keep it that way. */
++ raw_spin_lock_irq(&p->pi_lock);
++ state = READ_ONCE(p->__state);
++ smp_rmb(); /* see try_to_wake_up() */
++ if (state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq)
++ ip = __get_wchan(p);
++ raw_spin_unlock_irq(&p->pi_lock);
++
++ return ip;
++}
++
++/*
++ * Add/Remove/Requeue task to/from the runqueue routines
++ * Context: rq->lock
++ */
++#define __SCHED_DEQUEUE_TASK(p, rq, flags, func) \
++ sched_info_dequeue(rq, p); \
++ \
++ __list_del_entry(&p->sq_node); \
++ if (p->sq_node.prev == p->sq_node.next) { \
++ clear_bit(sched_idx2prio(p->sq_node.next - &rq->queue.heads[0], rq), \
++ rq->queue.bitmap); \
++ func; \
++ }
++
++#define __SCHED_ENQUEUE_TASK(p, rq, flags, func) \
++ sched_info_enqueue(rq, p); \
++ { \
++ int idx, prio; \
++ TASK_SCHED_PRIO_IDX(p, rq, idx, prio); \
++ list_add_tail(&p->sq_node, &rq->queue.heads[idx]); \
++ if (list_is_first(&p->sq_node, &rq->queue.heads[idx])) { \
++ set_bit(prio, rq->queue.bitmap); \
++ func; \
++ } \
++ }
++
++static inline void __dequeue_task(struct task_struct *p, struct rq *rq)
++{
++#ifdef ALT_SCHED_DEBUG
++ lockdep_assert_held(&rq->lock);
++
++ /*printk(KERN_INFO "sched: dequeue(%d) %px %016llx\n", cpu_of(rq), p, p->deadline);*/
++ WARN_ONCE(task_rq(p) != rq, "sched: dequeue task reside on cpu%d from cpu%d\n",
++ task_cpu(p), cpu_of(rq));
++#endif
++
++ __SCHED_DEQUEUE_TASK(p, rq, flags, update_sched_preempt_mask(rq));
++}
++
++static inline void dequeue_task(struct task_struct *p, struct rq *rq, int flags)
++{
++ __dequeue_task(p, rq);
++ sub_nr_running(rq, 1);
++}
++
++static inline void __enqueue_task(struct task_struct *p, struct rq *rq)
++{
++#ifdef ALT_SCHED_DEBUG
++ lockdep_assert_held(&rq->lock);
++
++ /*printk(KERN_INFO "sched: enqueue(%d) %px %d\n", cpu_of(rq), p, p->prio);*/
++ WARN_ONCE(task_rq(p) != rq, "sched: enqueue task reside on cpu%d to cpu%d\n",
++ task_cpu(p), cpu_of(rq));
++#endif
++
++ __SCHED_ENQUEUE_TASK(p, rq, flags, update_sched_preempt_mask(rq));
++}
++
++static inline void enqueue_task(struct task_struct *p, struct rq *rq, int flags)
++{
++ __enqueue_task(p, rq);
++ add_nr_running(rq, 1);
++}
++
++void requeue_task(struct task_struct *p, struct rq *rq)
++{
++ struct list_head *node = &p->sq_node;
++ int deq_idx, idx, prio;
++
++ TASK_SCHED_PRIO_IDX(p, rq, idx, prio);
++#ifdef ALT_SCHED_DEBUG
++ lockdep_assert_held(&rq->lock);
++ /*printk(KERN_INFO "sched: requeue(%d) %px %016llx\n", cpu_of(rq), p, p->deadline);*/
++ WARN_ONCE(task_rq(p) != rq, "sched: cpu[%d] requeue task reside on cpu%d\n",
++ cpu_of(rq), task_cpu(p));
++#endif
++ if (list_is_last(node, &rq->queue.heads[idx]))
++ return;
++
++ __list_del_entry(node);
++ if (node->prev == node->next && (deq_idx = node->next - &rq->queue.heads[0]) != idx)
++ clear_bit(sched_idx2prio(deq_idx, rq), rq->queue.bitmap);
++
++ list_add_tail(node, &rq->queue.heads[idx]);
++ if (list_is_first(node, &rq->queue.heads[idx]))
++ set_bit(prio, rq->queue.bitmap);
++ update_sched_preempt_mask(rq);
++}
++
++/*
++ * try_cmpxchg based fetch_or() macro so it works for different integer types:
++ */
++#define fetch_or(ptr, mask) \
++ ({ \
++ typeof(ptr) _ptr = (ptr); \
++ typeof(mask) _mask = (mask); \
++ typeof(*_ptr) _val = *_ptr; \
++ \
++ do { \
++ } while (!try_cmpxchg(_ptr, &_val, _val | _mask)); \
++ _val; \
++})
++
++#ifdef TIF_POLLING_NRFLAG
++/*
++ * Atomically set TIF_NEED_RESCHED and test for TIF_POLLING_NRFLAG,
++ * this avoids any races wrt polling state changes and thereby avoids
++ * spurious IPIs.
++ */
++static inline bool set_nr_and_not_polling(struct thread_info *ti, int tif)
++{
++ return !(fetch_or(&ti->flags, 1 << tif) & _TIF_POLLING_NRFLAG);
++}
++
++/*
++ * Atomically set TIF_NEED_RESCHED if TIF_POLLING_NRFLAG is set.
++ *
++ * If this returns true, then the idle task promises to call
++ * sched_ttwu_pending() and reschedule soon.
++ */
++static bool set_nr_if_polling(struct task_struct *p)
++{
++ struct thread_info *ti = task_thread_info(p);
++ typeof(ti->flags) val = READ_ONCE(ti->flags);
++
++ do {
++ if (!(val & _TIF_POLLING_NRFLAG))
++ return false;
++ if (val & _TIF_NEED_RESCHED)
++ return true;
++ } while (!try_cmpxchg(&ti->flags, &val, val | _TIF_NEED_RESCHED));
++
++ return true;
++}
++
++#else /* !TIF_POLLING_NRFLAG: */
++static inline bool set_nr_and_not_polling(struct thread_info *ti, int tif)
++{
++ set_ti_thread_flag(ti, tif);
++ return true;
++}
++
++static inline bool set_nr_if_polling(struct task_struct *p)
++{
++ return false;
++}
++#endif /* !TIF_POLLING_NRFLAG */
++
++static bool __wake_q_add(struct wake_q_head *head, struct task_struct *task)
++{
++ struct wake_q_node *node = &task->wake_q;
++
++ /*
++ * Atomically grab the task, if ->wake_q is !nil already it means
++ * it's already queued (either by us or someone else) and will get the
++ * wakeup due to that.
++ *
++ * In order to ensure that a pending wakeup will observe our pending
++ * state, even in the failed case, an explicit smp_mb() must be used.
++ */
++ smp_mb__before_atomic();
++ if (unlikely(cmpxchg_relaxed(&node->next, NULL, WAKE_Q_TAIL)))
++ return false;
++
++ /*
++ * The head is context local, there can be no concurrency.
++ */
++ *head->lastp = node;
++ head->lastp = &node->next;
++ return true;
++}
++
++/**
++ * wake_q_add() - queue a wakeup for 'later' waking.
++ * @head: the wake_q_head to add @task to
++ * @task: the task to queue for 'later' wakeup
++ *
++ * Queue a task for later wakeup, most likely by the wake_up_q() call in the
++ * same context, _HOWEVER_ this is not guaranteed, the wakeup can come
++ * instantly.
++ *
++ * This function must be used as-if it were wake_up_process(); IOW the task
++ * must be ready to be woken at this location.
++ */
++void wake_q_add(struct wake_q_head *head, struct task_struct *task)
++{
++ if (__wake_q_add(head, task))
++ get_task_struct(task);
++}
++
++/**
++ * wake_q_add_safe() - safely queue a wakeup for 'later' waking.
++ * @head: the wake_q_head to add @task to
++ * @task: the task to queue for 'later' wakeup
++ *
++ * Queue a task for later wakeup, most likely by the wake_up_q() call in the
++ * same context, _HOWEVER_ this is not guaranteed, the wakeup can come
++ * instantly.
++ *
++ * This function must be used as-if it were wake_up_process(); IOW the task
++ * must be ready to be woken at this location.
++ *
++ * This function is essentially a task-safe equivalent to wake_q_add(). Callers
++ * that already hold reference to @task can call the 'safe' version and trust
++ * wake_q to do the right thing depending whether or not the @task is already
++ * queued for wakeup.
++ */
++void wake_q_add_safe(struct wake_q_head *head, struct task_struct *task)
++{
++ if (!__wake_q_add(head, task))
++ put_task_struct(task);
++}
++
++void wake_up_q(struct wake_q_head *head)
++{
++ struct wake_q_node *node = head->first;
++
++ while (node != WAKE_Q_TAIL) {
++ struct task_struct *task;
++
++ task = container_of(node, struct task_struct, wake_q);
++ node = node->next;
++ /* pairs with cmpxchg_relaxed() in __wake_q_add() */
++ WRITE_ONCE(task->wake_q.next, NULL);
++ /* Task can safely be re-inserted now. */
++
++ /*
++ * wake_up_process() executes a full barrier, which pairs with
++ * the queueing in wake_q_add() so as not to miss wakeups.
++ */
++ wake_up_process(task);
++ put_task_struct(task);
++ }
++}
++
++/*
++ * resched_curr - mark rq's current task 'to be rescheduled now'.
++ *
++ * On UP this means the setting of the need_resched flag, on SMP it
++ * might also involve a cross-CPU call to trigger the scheduler on
++ * the target CPU.
++ */
++static inline void __resched_curr(struct rq *rq, int tif)
++{
++ struct task_struct *curr = rq->curr;
++ struct thread_info *cti = task_thread_info(curr);
++ int cpu;
++
++ lockdep_assert_held(&rq->lock);
++
++ /*
++ * Always immediately preempt the idle task; no point in delaying doing
++ * actual work.
++ */
++ if (is_idle_task(curr) && tif == TIF_NEED_RESCHED_LAZY)
++ tif = TIF_NEED_RESCHED;
++
++ if (cti->flags & ((1 << tif) | _TIF_NEED_RESCHED))
++ return;
++
++ cpu = cpu_of(rq);
++
++ trace_sched_set_need_resched_tp(curr, cpu, tif);
++ if (cpu == smp_processor_id()) {
++ set_ti_thread_flag(cti, tif);
++ if (tif == TIF_NEED_RESCHED)
++ set_preempt_need_resched();
++ return;
++ }
++
++ if (set_nr_and_not_polling(cti, tif)) {
++ if (tif == TIF_NEED_RESCHED)
++ smp_send_reschedule(cpu);
++ } else {
++ trace_sched_wake_idle_without_ipi(cpu);
++ }
++}
++
++void __trace_set_need_resched(struct task_struct *curr, int tif)
++{
++ trace_sched_set_need_resched_tp(curr, smp_processor_id(), tif);
++}
++
++static inline void resched_curr(struct rq *rq)
++{
++ __resched_curr(rq, TIF_NEED_RESCHED);
++}
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++static DEFINE_STATIC_KEY_FALSE(sk_dynamic_preempt_lazy);
++static __always_inline bool dynamic_preempt_lazy(void)
++{
++ return static_branch_unlikely(&sk_dynamic_preempt_lazy);
++}
++#else /* !CONFIG_PREEMPT_DYNAMIC: */
++static __always_inline bool dynamic_preempt_lazy(void)
++{
++ return IS_ENABLED(CONFIG_PREEMPT_LAZY);
++}
++#endif /* !CONFIG_PREEMPT_DYNAMIC */
++
++static __always_inline int get_lazy_tif_bit(void)
++{
++ if (dynamic_preempt_lazy())
++ return TIF_NEED_RESCHED_LAZY;
++
++ return TIF_NEED_RESCHED;
++}
++
++static inline void resched_curr_lazy(struct rq *rq)
++{
++ __resched_curr(rq, get_lazy_tif_bit());
++}
++
++void resched_cpu(int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ if (cpu_online(cpu) || cpu == smp_processor_id())
++ resched_curr(cpu_rq(cpu));
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++}
++
++#ifdef CONFIG_NO_HZ_COMMON
++/*
++ * This routine will record that the CPU is going idle with tick stopped.
++ * This info will be used in performing idle load balancing in the future.
++ */
++void nohz_balance_enter_idle(int cpu) {}
++
++/*
++ * In the semi idle case, use the nearest busy CPU for migrating timers
++ * from an idle CPU. This is good for power-savings.
++ *
++ * We don't do similar optimization for completely idle system, as
++ * selecting an idle CPU will add more delays to the timers than intended
++ * (as that CPU's timer base may not be up to date wrt jiffies etc).
++ */
++int get_nohz_timer_target(void)
++{
++ int i, cpu = smp_processor_id(), default_cpu = -1;
++ struct cpumask *mask;
++ const struct cpumask *hk_mask;
++
++ if (housekeeping_cpu(cpu, HK_TYPE_KERNEL_NOISE)) {
++ if (!idle_cpu(cpu))
++ return cpu;
++ default_cpu = cpu;
++ }
++
++ hk_mask = housekeeping_cpumask(HK_TYPE_KERNEL_NOISE);
++
++ for (mask = per_cpu(sched_cpu_topo_masks, cpu);
++ mask < per_cpu(sched_cpu_topo_end_mask, cpu); mask++)
++ for_each_cpu_and(i, mask, hk_mask)
++ if (!idle_cpu(i))
++ return i;
++
++ if (default_cpu == -1)
++ default_cpu = housekeeping_any_cpu(HK_TYPE_KERNEL_NOISE);
++ cpu = default_cpu;
++
++ return cpu;
++}
++
++/*
++ * When add_timer_on() enqueues a timer into the timer wheel of an
++ * idle CPU then this timer might expire before the next timer event
++ * which is scheduled to wake up that CPU. In case of a completely
++ * idle system the next event might even be infinite time into the
++ * future. wake_up_idle_cpu() ensures that the CPU is woken up and
++ * leaves the inner idle loop so the newly added timer is taken into
++ * account when the CPU goes back to idle and evaluates the timer
++ * wheel for the next timer event.
++ */
++static inline void wake_up_idle_cpu(int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ if (cpu == smp_processor_id())
++ return;
++
++ /*
++ * Set TIF_NEED_RESCHED and send an IPI if in the non-polling
++ * part of the idle loop. This forces an exit from the idle loop
++ * and a round trip to schedule(). Now this could be optimized
++ * because a simple new idle loop iteration is enough to
++ * re-evaluate the next tick. Provided some re-ordering of tick
++ * nohz functions that would need to follow TIF_NR_POLLING
++ * clearing:
++ *
++ * - On most architectures, a simple fetch_or on ti::flags with a
++ * "0" value would be enough to know if an IPI needs to be sent.
++ *
++ * - x86 needs to perform a last need_resched() check between
++ * monitor and mwait which doesn't take timers into account.
++ * There a dedicated TIF_TIMER flag would be required to
++ * fetch_or here and be checked along with TIF_NEED_RESCHED
++ * before mwait().
++ *
++ * However, remote timer enqueue is not such a frequent event
++ * and testing of the above solutions didn't appear to report
++ * much benefit.
++ */
++ if (set_nr_and_not_polling(task_thread_info(rq->idle), TIF_NEED_RESCHED))
++ smp_send_reschedule(cpu);
++ else
++ trace_sched_wake_idle_without_ipi(cpu);
++}
++
++static inline bool wake_up_full_nohz_cpu(int cpu)
++{
++ /*
++ * We just need the target to call irq_exit() and re-evaluate
++ * the next tick. The nohz full kick at least implies that.
++ * If needed we can still optimize that later with an
++ * empty IRQ.
++ */
++ if (cpu_is_offline(cpu))
++ return true; /* Don't try to wake offline CPUs. */
++ if (tick_nohz_full_cpu(cpu)) {
++ if (cpu != smp_processor_id() ||
++ tick_nohz_tick_stopped())
++ tick_nohz_full_kick_cpu(cpu);
++ return true;
++ }
++
++ return false;
++}
++
++void wake_up_nohz_cpu(int cpu)
++{
++ if (!wake_up_full_nohz_cpu(cpu))
++ wake_up_idle_cpu(cpu);
++}
++
++static void nohz_csd_func(void *info)
++{
++ struct rq *rq = info;
++ int cpu = cpu_of(rq);
++ unsigned int flags;
++
++ /*
++ * Release the rq::nohz_csd.
++ */
++ flags = atomic_fetch_andnot(NOHZ_KICK_MASK, nohz_flags(cpu));
++ WARN_ON(!(flags & NOHZ_KICK_MASK));
++
++ rq->idle_balance = idle_cpu(cpu);
++ if (rq->idle_balance) {
++ rq->nohz_idle_balance = flags;
++ __raise_softirq_irqoff(SCHED_SOFTIRQ);
++ }
++}
++
++#endif /* CONFIG_NO_HZ_COMMON */
++
++static inline void wakeup_preempt(struct rq *rq)
++{
++ if (sched_rq_first_task(rq) != rq->curr)
++ resched_curr(rq);
++}
++
++static __always_inline
++int __task_state_match(struct task_struct *p, unsigned int state)
++{
++ if (READ_ONCE(p->__state) & state)
++ return 1;
++
++ if (READ_ONCE(p->saved_state) & state)
++ return -1;
++
++ return 0;
++}
++
++static __always_inline
++int task_state_match(struct task_struct *p, unsigned int state)
++{
++ /*
++ * Serialize against current_save_and_set_rtlock_wait_state(),
++ * current_restore_rtlock_saved_state(), and __refrigerator().
++ */
++ guard(raw_spinlock_irq)(&p->pi_lock);
++
++ return __task_state_match(p, state);
++}
++
++/*
++ * wait_task_inactive - wait for a thread to unschedule.
++ *
++ * Wait for the thread to block in any of the states set in @match_state.
++ * If it changes, i.e. @p might have woken up, then return zero. When we
++ * succeed in waiting for @p to be off its CPU, we return a positive number
++ * (its total switch count). If a second call a short while later returns the
++ * same number, the caller can be sure that @p has remained unscheduled the
++ * whole time.
++ *
++ * The caller must ensure that the task *will* unschedule sometime soon,
++ * else this function might spin for a *long* time. This function can't
++ * be called with interrupts off, or it may introduce deadlock with
++ * smp_call_function() if an IPI is sent by the same process we are
++ * waiting to become inactive.
++ */
++unsigned long wait_task_inactive(struct task_struct *p, unsigned int match_state)
++{
++ unsigned long flags;
++ int running, queued, match;
++ unsigned long ncsw;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++
++ for (;;) {
++ rq = task_rq(p);
++
++ /*
++ * If the task is actively running on another CPU
++ * still, just relax and busy-wait without holding
++ * any locks.
++ *
++ * NOTE! Since we don't hold any locks, it's not
++ * even sure that "rq" stays as the right runqueue!
++ * But we don't care, since this will return false
++ * if the runqueue has changed and p is actually now
++ * running somewhere else!
++ */
++ while (task_on_cpu(p)) {
++ if (!task_state_match(p, match_state))
++ return 0;
++ cpu_relax();
++ }
++
++ /*
++ * Ok, time to look more closely! We need the rq
++ * lock now, to be *sure*. If we're wrong, we'll
++ * just go back and repeat.
++ */
++ task_access_lock_irqsave(p, &lock, &flags);
++ trace_sched_wait_task(p);
++ running = task_on_cpu(p);
++ queued = p->on_rq;
++ ncsw = 0;
++ if ((match = __task_state_match(p, match_state))) {
++ /*
++ * When matching on p->saved_state, consider this task
++ * still queued so it will wait.
++ */
++ if (match < 0)
++ queued = 1;
++ ncsw = p->nvcsw | LONG_MIN; /* sets MSB */
++ }
++ task_access_unlock_irqrestore(p, lock, &flags);
++
++ /*
++ * If it changed from the expected state, bail out now.
++ */
++ if (unlikely(!ncsw))
++ break;
++
++ /*
++ * Was it really running after all now that we
++ * checked with the proper locks actually held?
++ *
++ * Oops. Go back and try again..
++ */
++ if (unlikely(running)) {
++ cpu_relax();
++ continue;
++ }
++
++ /*
++ * It's not enough that it's not actively running,
++ * it must be off the runqueue _entirely_, and not
++ * preempted!
++ *
++ * So if it was still runnable (but just not actively
++ * running right now), it's preempted, and we should
++ * yield - it could be a while.
++ */
++ if (unlikely(queued)) {
++ ktime_t to = NSEC_PER_SEC / HZ;
++
++ set_current_state(TASK_UNINTERRUPTIBLE);
++ schedule_hrtimeout(&to, HRTIMER_MODE_REL_HARD);
++ continue;
++ }
++
++ /*
++ * Ahh, all good. It wasn't running, and it wasn't
++ * runnable, which means that it will never become
++ * running in the future either. We're all done!
++ */
++ break;
++ }
++
++ return ncsw;
++}
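++
++/*
++ * Usage sketch (illustrative only), following the contract above: sample
++ * the returned switch count twice to confirm that @p stayed off the CPU
++ * in between:
++ *
++ *	unsigned long ncsw = wait_task_inactive(p, TASK_UNINTERRUPTIBLE);
++ *
++ *	if (ncsw && ncsw == wait_task_inactive(p, TASK_UNINTERRUPTIBLE))
++ *		pr_debug("task remained unscheduled\n");
++ */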
++
++#ifdef CONFIG_SCHED_HRTICK
++/*
++ * Use HR-timers to deliver accurate preemption points.
++ */
++
++static void hrtick_clear(struct rq *rq)
++{
++ if (hrtimer_active(&rq->hrtick_timer))
++ hrtimer_cancel(&rq->hrtick_timer);
++}
++
++/*
++ * High-resolution timer tick.
++ * Runs from hardirq context with interrupts disabled.
++ */
++static enum hrtimer_restart hrtick(struct hrtimer *timer)
++{
++ struct rq *rq = container_of(timer, struct rq, hrtick_timer);
++
++ WARN_ON_ONCE(cpu_of(rq) != smp_processor_id());
++
++ raw_spin_lock(&rq->lock);
++ resched_curr(rq);
++ raw_spin_unlock(&rq->lock);
++
++ return HRTIMER_NORESTART;
++}
++
++/*
++ * Use hrtick when:
++ * - enabled by features
++ * - hrtimer is actually high res
++ */
++static inline int hrtick_enabled(struct rq *rq)
++{
++ /**
++ * Alt schedule FW doesn't support sched_feat yet
++ if (!sched_feat(HRTICK))
++ return 0;
++ */
++ if (!cpu_active(cpu_of(rq)))
++ return 0;
++ return hrtimer_is_hres_active(&rq->hrtick_timer);
++}
++
++static void __hrtick_restart(struct rq *rq)
++{
++ struct hrtimer *timer = &rq->hrtick_timer;
++ ktime_t time = rq->hrtick_time;
++
++ hrtimer_start(timer, time, HRTIMER_MODE_ABS_PINNED_HARD);
++}
++
++/*
++ * called from hardirq (IPI) context
++ */
++static void __hrtick_start(void *arg)
++{
++ struct rq *rq = arg;
++
++ raw_spin_lock(&rq->lock);
++ __hrtick_restart(rq);
++ raw_spin_unlock(&rq->lock);
++}
++
++/*
++ * Called to set the hrtick timer state.
++ *
++ * called with rq->lock held and IRQs disabled
++ */
++static inline void hrtick_start(struct rq *rq, u64 delay)
++{
++ struct hrtimer *timer = &rq->hrtick_timer;
++ s64 delta;
++
++ /*
++ * Don't schedule slices shorter than 10000ns, that just
++ * doesn't make sense and can cause timer DoS.
++ */
++ delta = max_t(s64, delay, 10000LL);
++
++ rq->hrtick_time = ktime_add_ns(timer->base->get_time(), delta);
++
++ if (rq == this_rq())
++ __hrtick_restart(rq);
++ else
++ smp_call_function_single_async(cpu_of(rq), &rq->hrtick_csd);
++}
++
++static void hrtick_rq_init(struct rq *rq)
++{
++ INIT_CSD(&rq->hrtick_csd, __hrtick_start, rq);
++ hrtimer_setup(&rq->hrtick_timer, hrtick, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
++}
++#else /* !CONFIG_SCHED_HRTICK: */
++static inline int hrtick_enabled(struct rq *rq)
++{
++ return 0;
++}
++
++static inline void hrtick_clear(struct rq *rq)
++{
++}
++
++static inline void hrtick_rq_init(struct rq *rq)
++{
++}
++#endif /* !CONFIG_SCHED_HRTICK */
++
++/*
++ * activate_task - move a task to the runqueue.
++ *
++ * Context: rq->lock
++ */
++static void activate_task(struct task_struct *p, struct rq *rq)
++{
++ enqueue_task(p, rq, ENQUEUE_WAKEUP);
++
++ WRITE_ONCE(p->on_rq, TASK_ON_RQ_QUEUED);
++ ASSERT_EXCLUSIVE_WRITER(p->on_rq);
++
++ /*
++ * If in_iowait is set, the code below may not trigger any cpufreq
++ * utilization updates, so do it here explicitly with the IOWAIT flag
++ * passed.
++ */
++ cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT * p->in_iowait);
++}
++
++static void block_task(struct rq *rq, struct task_struct *p)
++{
++ dequeue_task(p, rq, DEQUEUE_SLEEP);
++
++ if (p->sched_contributes_to_load)
++ rq->nr_uninterruptible++;
++
++ if (p->in_iowait) {
++ atomic_inc(&rq->nr_iowait);
++ delayacct_blkio_start();
++ }
++
++ ASSERT_EXCLUSIVE_WRITER(p->on_rq);
++
++ /*
++ * The moment this write goes through, ttwu() can swoop in and migrate
++ * this task, rendering our rq->__lock ineffective.
++ *
++ * __schedule() try_to_wake_up()
++ * LOCK rq->__lock LOCK p->pi_lock
++ * pick_next_task()
++ * pick_next_task_fair()
++ * pick_next_entity()
++ * dequeue_entities()
++ * __block_task()
++ * RELEASE p->on_rq = 0 if (p->on_rq && ...)
++ * break;
++ *
++ * ACQUIRE (after ctrl-dep)
++ *
++ * cpu = select_task_rq();
++ * set_task_cpu(p, cpu);
++ * ttwu_queue()
++ * ttwu_do_activate()
++ * LOCK rq->__lock
++ * activate_task()
++ * STORE p->on_rq = 1
++ * UNLOCK rq->__lock
++ *
++ * Callers must ensure to not reference @p after this -- we no longer
++ * own it.
++ */
++ smp_store_release(&p->on_rq, 0);
++}
++
++static inline void __set_task_cpu(struct task_struct *p, unsigned int cpu)
++{
++ /*
++ * After ->cpu is set up to a new value, task_access_lock(p, ...) can be
++ * successfully executed on another CPU. We must ensure that updates of
++ * per-task data have been completed by this moment.
++ */
++ smp_wmb();
++
++ WRITE_ONCE(task_thread_info(p)->cpu, cpu);
++}
++
++void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
++{
++ unsigned int state = READ_ONCE(p->__state);
++
++ /*
++ * We should never call set_task_cpu() on a blocked task,
++ * ttwu() will sort out the placement.
++ */
++ WARN_ON_ONCE(state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq);
++
++#ifdef CONFIG_LOCKDEP
++ /*
++ * The caller should hold either p->pi_lock or rq->lock, when changing
++ * a task's CPU. ->pi_lock for waking tasks, rq->lock for runnable tasks.
++ *
++ * sched_move_task() holds both and thus holding either pins the cgroup,
++ * see task_group().
++ */
++ WARN_ON_ONCE(debug_locks && !(lockdep_is_held(&p->pi_lock) ||
++ lockdep_is_held(&task_rq(p)->lock)));
++#endif
++ /*
++ * Clearly, migrating tasks to offline CPUs is a fairly daft thing.
++ */
++ WARN_ON_ONCE(!cpu_online(new_cpu));
++
++ WARN_ON_ONCE(is_migration_disabled(p));
++ trace_sched_migrate_task(p, new_cpu);
++
++	if (task_cpu(p) != new_cpu) {
++ rseq_migrate(p);
++ sched_mm_cid_migrate_from(p);
++ perf_event_task_migrate(p);
++ }
++
++ __set_task_cpu(p, new_cpu);
++}
++
++static void
++__do_set_cpus_ptr(struct task_struct *p, const struct cpumask *new_mask)
++{
++ /*
++ * This here violates the locking rules for affinity, since we're only
++ * supposed to change these variables while holding both rq->lock and
++ * p->pi_lock.
++ *
++ * HOWEVER, it magically works, because ttwu() is the only code that
++ * accesses these variables under p->pi_lock and only does so after
++ * smp_cond_load_acquire(&p->on_cpu, !VAL), and we're in __schedule()
++ * before finish_task().
++ *
++ * XXX do further audits, this smells like something putrid.
++ */
++ WARN_ON_ONCE(!p->on_cpu);
++ p->cpus_ptr = new_mask;
++}
++
++void migrate_disable(void)
++{
++ struct task_struct *p = current;
++ int cpu;
++
++ if (p->migration_disabled) {
++#ifdef CONFIG_DEBUG_PREEMPT
++ /*
++ * Warn about overflow half-way through the range.
++ */
++ WARN_ON_ONCE((s16)p->migration_disabled < 0);
++#endif
++ p->migration_disabled++;
++ return;
++ }
++
++ guard(preempt)();
++ cpu = smp_processor_id();
++ if (cpumask_test_cpu(cpu, &p->cpus_mask)) {
++ cpu_rq(cpu)->nr_pinned++;
++ p->migration_disabled = 1;
++ /*
++ * Violates locking rules! see comment in __do_set_cpus_ptr().
++ */
++ if (p->cpus_ptr == &p->cpus_mask)
++ __do_set_cpus_ptr(p, cpumask_of(cpu));
++ }
++}
++EXPORT_SYMBOL_GPL(migrate_disable);
++
++void migrate_enable(void)
++{
++ struct task_struct *p = current;
++
++#ifdef CONFIG_DEBUG_PREEMPT
++ /*
++ * Check both overflow from migrate_disable() and superfluous
++ * migrate_enable().
++ */
++ if (WARN_ON_ONCE((s16)p->migration_disabled <= 0))
++ return;
++#endif
++
++ if (p->migration_disabled > 1) {
++ p->migration_disabled--;
++ return;
++ }
++
++ /*
++ * Ensure stop_task runs either before or after this, and that
++ * __set_cpus_allowed_ptr(SCA_MIGRATE_ENABLE) doesn't schedule().
++ */
++ guard(preempt)();
++ /*
++ * Assumption: current should be running on allowed cpu
++ */
++ WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &p->cpus_mask));
++ if (p->cpus_ptr != &p->cpus_mask)
++ __do_set_cpus_ptr(p, &p->cpus_mask);
++ /*
++ * Mustn't clear migration_disabled() until cpus_ptr points back at the
++ * regular cpus_mask, otherwise things that race (eg.
++ * select_fallback_rq) get confused.
++ */
++ barrier();
++ p->migration_disabled = 0;
++ this_rq()->nr_pinned--;
++}
++EXPORT_SYMBOL_GPL(migrate_enable);
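++
++/*
++ * Usage sketch (illustrative only): migrate_disable()/migrate_enable()
++ * nest and pin current to its CPU without disabling preemption, so
++ * sleeping locks remain legal inside the pinned section (the main point
++ * of this interface on PREEMPT_RT):
++ *
++ *	migrate_disable();
++ *	...			may sleep, but cannot change CPU
++ *	migrate_enable();
++ */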
++
++static void __migrate_force_enable(struct task_struct *p, struct rq *rq)
++{
++ if (likely(p->cpus_ptr != &p->cpus_mask))
++ __do_set_cpus_ptr(p, &p->cpus_mask);
++ p->migration_disabled = 0;
++ /* When p is migrate_disabled, rq->lock should be held */
++ rq->nr_pinned--;
++}
++
++static inline bool rq_has_pinned_tasks(struct rq *rq)
++{
++ return rq->nr_pinned;
++}
++
++/*
++ * Per-CPU kthreads are allowed to run on !active && online CPUs, see
++ * __set_cpus_allowed_ptr() and select_fallback_rq().
++ */
++static inline bool is_cpu_allowed(struct task_struct *p, int cpu)
++{
++ /* When not in the task's cpumask, no point in looking further. */
++ if (!cpumask_test_cpu(cpu, p->cpus_ptr))
++ return false;
++
++ /* migrate_disabled() must be allowed to finish. */
++ if (is_migration_disabled(p))
++ return cpu_online(cpu);
++
++ /* Non kernel threads are not allowed during either online or offline. */
++ if (!(p->flags & PF_KTHREAD))
++ return cpu_active(cpu) && task_cpu_possible(cpu, p);
++
++ /* KTHREAD_IS_PER_CPU is always allowed. */
++ if (kthread_is_per_cpu(p))
++ return cpu_online(cpu);
++
++ /* Regular kernel threads don't get to stay during offline. */
++ if (cpu_dying(cpu))
++ return false;
++
++ /* But are allowed during online. */
++ return cpu_online(cpu);
++}
++
++/*
++ * This is how migration works:
++ *
++ * 1) we invoke migration_cpu_stop() on the target CPU using
++ * stop_one_cpu().
++ * 2) stopper starts to run (implicitly forcing the migrated thread
++ * off the CPU)
++ * 3) it checks whether the migrated task is still in the wrong runqueue.
++ * 4) if it's in the wrong runqueue then the migration thread removes
++ * it and puts it into the right queue.
++ * 5) stopper completes and stop_one_cpu() returns and the migration
++ * is done.
++ */
++
++/*
++ * move_queued_task - move a queued task to new rq.
++ *
++ * Returns (locked) new rq. Old rq's lock is released.
++ */
++struct rq *move_queued_task(struct rq *rq, struct task_struct *p, int new_cpu)
++{
++ lockdep_assert_held(&rq->lock);
++
++ WRITE_ONCE(p->on_rq, TASK_ON_RQ_MIGRATING);
++ dequeue_task(p, rq, 0);
++ set_task_cpu(p, new_cpu);
++ raw_spin_unlock(&rq->lock);
++
++ rq = cpu_rq(new_cpu);
++
++ raw_spin_lock(&rq->lock);
++ WARN_ON_ONCE(task_cpu(p) != new_cpu);
++
++ sched_mm_cid_migrate_to(rq, p);
++
++ sched_task_sanity_check(p, rq);
++ enqueue_task(p, rq, 0);
++ WRITE_ONCE(p->on_rq, TASK_ON_RQ_QUEUED);
++ wakeup_preempt(rq);
++
++ return rq;
++}
++
++struct migration_arg {
++ struct task_struct *task;
++ int dest_cpu;
++};
++
++/*
++ * Move (not current) task off this CPU, onto the destination CPU. We're doing
++ * this because either it can't run here any more (set_cpus_allowed()
++ * away from this CPU, or CPU going down), or because we're
++ * attempting to rebalance this task on exec (sched_exec).
++ *
++ * So we race with normal scheduler movements, but that's OK, as long
++ * as the task is no longer on this CPU.
++ */
++static struct rq *__migrate_task(struct rq *rq, struct task_struct *p, int dest_cpu)
++{
++ /* Affinity changed (again). */
++ if (!is_cpu_allowed(p, dest_cpu))
++ return rq;
++
++ return move_queued_task(rq, p, dest_cpu);
++}
++
++/*
++ * migration_cpu_stop - this will be executed by a high-prio stopper thread
++ * and performs thread migration by bumping thread off CPU then
++ * 'pushing' onto another runqueue.
++ */
++static int migration_cpu_stop(void *data)
++{
++ struct migration_arg *arg = data;
++ struct task_struct *p = arg->task;
++ struct rq *rq = this_rq();
++ unsigned long flags;
++
++ /*
++ * The original target CPU might have gone down and we might
++ * be on another CPU but it doesn't matter.
++ */
++ local_irq_save(flags);
++ /*
++ * We need to explicitly wake pending tasks before running
++ * __migrate_task() such that we will not miss enforcing cpus_ptr
++ * during wakeups, see set_cpus_allowed_ptr()'s TASK_WAKING test.
++ */
++ flush_smp_call_function_queue();
++
++ raw_spin_lock(&p->pi_lock);
++ raw_spin_lock(&rq->lock);
++ /*
++ * If task_rq(p) != rq, it cannot be migrated here, because we're
++ * holding rq->lock, if p->on_rq == 0 it cannot get enqueued because
++ * we're holding p->pi_lock.
++ */
++ if (task_rq(p) == rq && task_on_rq_queued(p)) {
++ update_rq_clock(rq);
++ rq = __migrate_task(rq, p, arg->dest_cpu);
++ }
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++
++ return 0;
++}
++
++static inline void
++set_cpus_allowed_common(struct task_struct *p, struct affinity_context *ctx)
++{
++ cpumask_copy(&p->cpus_mask, ctx->new_mask);
++ p->nr_cpus_allowed = cpumask_weight(ctx->new_mask);
++
++ /*
++ * Swap in a new user_cpus_ptr if SCA_USER flag set
++ */
++ if (ctx->flags & SCA_USER)
++ swap(p->user_cpus_ptr, ctx->user_mask);
++}
++
++static void
++__do_set_cpus_allowed(struct task_struct *p, struct affinity_context *ctx)
++{
++ lockdep_assert_held(&p->pi_lock);
++ set_cpus_allowed_common(p, ctx);
++ mm_set_cpus_allowed(p->mm, ctx->new_mask);
++}
++
++/*
++ * Used for kthread_bind() and select_fallback_rq(), in both cases the user
++ * affinity (if any) should be destroyed too.
++ */
++void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
++{
++ struct affinity_context ac = {
++ .new_mask = new_mask,
++ .user_mask = NULL,
++ .flags = SCA_USER, /* clear the user requested mask */
++ };
++ union cpumask_rcuhead {
++ cpumask_t cpumask;
++ struct rcu_head rcu;
++ };
++
++ __do_set_cpus_allowed(p, &ac);
++
++ if (is_migration_disabled(p) && !cpumask_test_cpu(task_cpu(p), &p->cpus_mask))
++ __migrate_force_enable(p, task_rq(p));
++
++ /*
++ * Because this is called with p->pi_lock held, it is not possible
++ * to use kfree() here (when PREEMPT_RT=y), therefore punt to using
++ * kfree_rcu().
++ */
++ kfree_rcu((union cpumask_rcuhead *)ac.user_mask, rcu);
++}
++
++int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
++ int node)
++{
++ cpumask_t *user_mask;
++ unsigned long flags;
++
++ /*
++ * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
++ * may differ by now due to racing.
++ */
++ dst->user_cpus_ptr = NULL;
++
++ /*
++ * This check is racy and losing the race is a valid situation.
++ * It is not worth the extra overhead of taking the pi_lock on
++ * every fork/clone.
++ */
++ if (data_race(!src->user_cpus_ptr))
++ return 0;
++
++ user_mask = alloc_user_cpus_ptr(node);
++ if (!user_mask)
++ return -ENOMEM;
++
++ /*
++ * Use pi_lock to protect content of user_cpus_ptr
++ *
++ * Though unlikely, user_cpus_ptr can be reset to NULL by a concurrent
++ * do_set_cpus_allowed().
++ */
++ raw_spin_lock_irqsave(&src->pi_lock, flags);
++ if (src->user_cpus_ptr) {
++ swap(dst->user_cpus_ptr, user_mask);
++ cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
++ }
++ raw_spin_unlock_irqrestore(&src->pi_lock, flags);
++
++ if (unlikely(user_mask))
++ kfree(user_mask);
++
++ return 0;
++}
++
++static inline struct cpumask *clear_user_cpus_ptr(struct task_struct *p)
++{
++ struct cpumask *user_mask = NULL;
++
++ swap(p->user_cpus_ptr, user_mask);
++
++ return user_mask;
++}
++
++void release_user_cpus_ptr(struct task_struct *p)
++{
++ kfree(clear_user_cpus_ptr(p));
++}
++
++/**
++ * task_curr - is this task currently executing on a CPU?
++ * @p: the task in question.
++ *
++ * Return: 1 if the task is currently executing. 0 otherwise.
++ */
++inline int task_curr(const struct task_struct *p)
++{
++ return cpu_curr(task_cpu(p)) == p;
++}
++
++/***
++ * kick_process - kick a running thread to enter/exit the kernel
++ * @p: the to-be-kicked thread
++ *
++ * Cause a process which is running on another CPU to enter
++ * kernel-mode, without any delay. (to get signals handled.)
++ *
++ * NOTE: this function doesn't have to take the runqueue lock,
++ * because all it wants to ensure is that the remote task enters
++ * the kernel. If the IPI races and the task has been migrated
++ * to another CPU then no harm is done and the purpose has been
++ * achieved as well.
++ */
++void kick_process(struct task_struct *p)
++{
++ guard(preempt)();
++ int cpu = task_cpu(p);
++
++ if ((cpu != smp_processor_id()) && task_curr(p))
++ smp_send_reschedule(cpu);
++}
++EXPORT_SYMBOL_GPL(kick_process);
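++
++/*
++ * Usage sketch (illustrative only): after setting per-task state that a
++ * remotely running task must notice on its next kernel entry, poke it,
++ * e.g. as the signal code does:
++ *
++ *	set_tsk_thread_flag(p, TIF_SIGPENDING);
++ *	kick_process(p);
++ */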
++
++/*
++ * ->cpus_ptr is protected by both rq->lock and p->pi_lock
++ *
++ * A few notes on cpu_active vs cpu_online:
++ *
++ * - cpu_active must be a subset of cpu_online
++ *
++ * - on CPU-up we allow per-CPU kthreads on the online && !active CPU,
++ * see __set_cpus_allowed_ptr(). At this point the newly online
++ * CPU isn't yet part of the sched domains, and balancing will not
++ * see it.
++ *
++ * - on cpu-down we clear cpu_active() to mask the sched domains and
++ * avoid the load balancer to place new tasks on the to be removed
++ * CPU. Existing tasks will remain running there and will be taken
++ * off.
++ *
++ * This means that fallback selection must not select !active CPUs.
++ * And can assume that any active CPU must be online. Conversely
++ * select_task_rq() below may allow selection of !active CPUs in order
++ * to satisfy the above rules.
++ */
++static int select_fallback_rq(int cpu, struct task_struct *p)
++{
++ int nid = cpu_to_node(cpu);
++ const struct cpumask *nodemask = NULL;
++ enum { cpuset, possible, fail } state = cpuset;
++ int dest_cpu;
++
++ /*
++ * If the node that the CPU is on has been offlined, cpu_to_node()
++ * will return -1. There is no CPU on the node, and we should
++ * select the CPU on the other node.
++ */
++ if (nid != -1) {
++ nodemask = cpumask_of_node(nid);
++
++ /* Look for allowed, online CPU in same node. */
++ for_each_cpu(dest_cpu, nodemask) {
++ if (is_cpu_allowed(p, dest_cpu))
++ return dest_cpu;
++ }
++ }
++
++ for (;;) {
++ /* Any allowed, online CPU? */
++ for_each_cpu(dest_cpu, p->cpus_ptr) {
++ if (!is_cpu_allowed(p, dest_cpu))
++ continue;
++ goto out;
++ }
++
++ /* No more Mr. Nice Guy. */
++ switch (state) {
++ case cpuset:
++ if (cpuset_cpus_allowed_fallback(p)) {
++ state = possible;
++ break;
++ }
++ fallthrough;
++ case possible:
++ /*
++ * XXX When called from select_task_rq() we only
++ * hold p->pi_lock and again violate locking order.
++ *
++ * More yuck to audit.
++ */
++ do_set_cpus_allowed(p, task_cpu_fallback_mask(p));
++ state = fail;
++ break;
++
++ case fail:
++ BUG();
++ break;
++ }
++ }
++
++out:
++ if (state != cpuset) {
++ /*
++ * Don't tell them about moving exiting tasks or
++ * kernel threads (both mm NULL), since they never
++ * leave kernel.
++ */
++ if (p->mm && printk_ratelimit()) {
++ printk_deferred("process %d (%s) no longer affine to cpu%d\n",
++ task_pid_nr(p), p->comm, cpu);
++ }
++ }
++
++ return dest_cpu;
++}
++
++static inline void
++sched_preempt_mask_flush(cpumask_t *mask, int prio, int ref)
++{
++ int cpu;
++
++ cpumask_copy(mask, sched_preempt_mask + ref);
++ if (prio < ref) {
++ for_each_clear_bit(cpu, cpumask_bits(mask), nr_cpumask_bits) {
++ if (prio < cpu_rq(cpu)->prio)
++ cpumask_set_cpu(cpu, mask);
++ }
++ } else {
++ for_each_cpu_andnot(cpu, mask, sched_idle_mask) {
++ if (prio >= cpu_rq(cpu)->prio)
++ cpumask_clear_cpu(cpu, mask);
++ }
++ }
++}
++
++static inline int
++preempt_mask_check(cpumask_t *preempt_mask, const cpumask_t *allow_mask, int prio)
++{
++ cpumask_t *mask = sched_preempt_mask + prio;
++ int pr = atomic_read(&sched_prio_record);
++
++ if (pr != prio && SCHED_QUEUE_BITS - 1 != prio) {
++ sched_preempt_mask_flush(mask, prio, pr);
++ atomic_set(&sched_prio_record, prio);
++ }
++
++ return cpumask_and(preempt_mask, allow_mask, mask);
++}
++
++__read_mostly idle_select_func_t idle_select_func ____cacheline_aligned_in_smp = cpumask_and;
++
++static inline int select_task_rq(struct task_struct *p)
++{
++ cpumask_t allow_mask, mask;
++
++ if (unlikely(!cpumask_and(&allow_mask, p->cpus_ptr, cpu_active_mask)))
++ return select_fallback_rq(task_cpu(p), p);
++
++ if (idle_select_func(&mask, &allow_mask, sched_idle_mask) ||
++ preempt_mask_check(&mask, &allow_mask, task_sched_prio(p)))
++ return best_mask_cpu(task_cpu(p), &mask);
++
++ return best_mask_cpu(task_cpu(p), &allow_mask);
++}
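++
++/*
++ * Worked example (illustrative): for a wakee with task_sched_prio() == 20
++ * that is allowed on CPUs 0-3, any allowed idle CPU is preferred first;
++ * failing that, preempt_mask_check() intersects the allowed mask with
++ * sched_preempt_mask[20], i.e. CPUs whose rq->prio is numerically higher
++ * (lower priority) than 20, and best_mask_cpu() then picks the candidate
++ * nearest to task_cpu(p) in the topology masks.
++ */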
++
++void sched_set_stop_task(int cpu, struct task_struct *stop)
++{
++ static struct lock_class_key stop_pi_lock;
++ struct sched_param stop_param = { .sched_priority = STOP_PRIO };
++ struct sched_param start_param = { .sched_priority = 0 };
++ struct task_struct *old_stop = cpu_rq(cpu)->stop;
++
++ if (stop) {
++ /*
++	 * Make it appear like a SCHED_FIFO task, it's something
++ * userspace knows about and won't get confused about.
++ *
++ * Also, it will make PI more or less work without too
++ * much confusion -- but then, stop work should not
++ * rely on PI working anyway.
++ */
++ sched_setscheduler_nocheck(stop, SCHED_FIFO, &stop_param);
++
++ /*
++ * The PI code calls rt_mutex_setprio() with ->pi_lock held to
++ * adjust the effective priority of a task. As a result,
++ * rt_mutex_setprio() can trigger (RT) balancing operations,
++ * which can then trigger wakeups of the stop thread to push
++ * around the current task.
++ *
++ * The stop task itself will never be part of the PI-chain, it
++ * never blocks, therefore that ->pi_lock recursion is safe.
++ * Tell lockdep about this by placing the stop->pi_lock in its
++ * own class.
++ */
++ lockdep_set_class(&stop->pi_lock, &stop_pi_lock);
++ }
++
++ cpu_rq(cpu)->stop = stop;
++
++ if (old_stop) {
++ /*
++ * Reset it back to a normal scheduling policy so that
++ * it can die in pieces.
++ */
++ sched_setscheduler_nocheck(old_stop, SCHED_NORMAL, &start_param);
++ }
++}
++
++static int affine_move_task(struct rq *rq, struct task_struct *p, int dest_cpu,
++ raw_spinlock_t *lock, unsigned long irq_flags)
++ __releases(rq->lock)
++ __releases(p->pi_lock)
++{
++ /* Can the task run on the task's current CPU? If so, we're done */
++ if (!cpumask_test_cpu(task_cpu(p), &p->cpus_mask)) {
++ if (is_migration_disabled(p))
++ __migrate_force_enable(p, rq);
++
++ if (task_on_cpu(p) || READ_ONCE(p->__state) == TASK_WAKING) {
++ struct migration_arg arg = { p, dest_cpu };
++
++ /* Need help from migration thread: drop lock and wait. */
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
++ stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg);
++ return 0;
++ }
++ if (task_on_rq_queued(p)) {
++ /*
++ * OK, since we're going to drop the lock immediately
++ * afterwards anyway.
++ */
++ update_rq_clock(rq);
++ rq = move_queued_task(rq, p, dest_cpu);
++ lock = &rq->lock;
++ }
++ }
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
++ return 0;
++}
++
++static int __set_cpus_allowed_ptr_locked(struct task_struct *p,
++ struct affinity_context *ctx,
++ struct rq *rq,
++ raw_spinlock_t *lock,
++ unsigned long irq_flags)
++{
++ const struct cpumask *cpu_allowed_mask = task_cpu_possible_mask(p);
++ const struct cpumask *cpu_valid_mask = cpu_active_mask;
++ bool kthread = p->flags & PF_KTHREAD;
++ int dest_cpu;
++ int ret = 0;
++
++ if (kthread || is_migration_disabled(p)) {
++ /*
++ * Kernel threads are allowed on online && !active CPUs,
++ * however, during cpu-hot-unplug, even these might get pushed
++ * away if not KTHREAD_IS_PER_CPU.
++ *
++ * Specifically, migration_disabled() tasks must not fail the
++ * cpumask_any_and_distribute() pick below, esp. so on
++ * SCA_MIGRATE_ENABLE, otherwise we'll not call
++ * set_cpus_allowed_common() and actually reset p->cpus_ptr.
++ */
++ cpu_valid_mask = cpu_online_mask;
++ }
++
++ if (!kthread && !cpumask_subset(ctx->new_mask, cpu_allowed_mask)) {
++ ret = -EINVAL;
++ goto out;
++ }
++
++ /*
++ * Must re-check here, to close a race against __kthread_bind(),
++ * sched_setaffinity() is not guaranteed to observe the flag.
++ */
++ if ((ctx->flags & SCA_CHECK) && (p->flags & PF_NO_SETAFFINITY)) {
++ ret = -EINVAL;
++ goto out;
++ }
++
++ if (cpumask_equal(&p->cpus_mask, ctx->new_mask))
++ goto out;
++
++ dest_cpu = cpumask_any_and(cpu_valid_mask, ctx->new_mask);
++ if (dest_cpu >= nr_cpu_ids) {
++ ret = -EINVAL;
++ goto out;
++ }
++
++ __do_set_cpus_allowed(p, ctx);
++
++ return affine_move_task(rq, p, dest_cpu, lock, irq_flags);
++
++out:
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
++
++ return ret;
++}
++
++/*
++ * Change a given task's CPU affinity. Migrate the thread to a
++ * proper CPU and schedule it away if the CPU it's executing on
++ * is removed from the allowed bitmask.
++ *
++ * NOTE: the caller must have a valid reference to the task, the
++ * task must not exit() & deallocate itself prematurely. The
++ * call is not atomic; no spinlocks may be held.
++ */
++int __set_cpus_allowed_ptr(struct task_struct *p,
++ struct affinity_context *ctx)
++{
++ unsigned long irq_flags;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++
++ raw_spin_lock_irqsave(&p->pi_lock, irq_flags);
++ rq = __task_access_lock(p, &lock);
++ /*
++ * Masking should be skipped if SCA_USER or any of the SCA_MIGRATE_*
++ * flags are set.
++ */
++ if (p->user_cpus_ptr &&
++ !(ctx->flags & SCA_USER) &&
++ cpumask_and(rq->scratch_mask, ctx->new_mask, p->user_cpus_ptr))
++ ctx->new_mask = rq->scratch_mask;
++
++ return __set_cpus_allowed_ptr_locked(p, ctx, rq, lock, irq_flags);
++}
++
++int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask)
++{
++ struct affinity_context ac = {
++ .new_mask = new_mask,
++ .flags = 0,
++ };
++
++ return __set_cpus_allowed_ptr(p, &ac);
++}
++EXPORT_SYMBOL_GPL(set_cpus_allowed_ptr);
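++
++/*
++ * Usage sketch (illustrative only): in-kernel callers pass a plain
++ * cpumask, e.g. to pin a helper kthread to one CPU:
++ *
++ *	ret = set_cpus_allowed_ptr(tsk, cpumask_of(cpu));
++ *	if (ret)
++ *		pr_warn("could not pin worker to CPU%d\n", cpu);
++ */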
++
++/*
++ * Change a given task's CPU affinity to the intersection of its current
++ * affinity mask and @subset_mask, writing the resulting mask to @new_mask.
++ * If user_cpus_ptr is defined, use it as the basis for restricting CPU
++ * affinity or use cpu_online_mask instead.
++ *
++ * If the resulting mask is empty, leave the affinity unchanged and return
++ * -EINVAL.
++ */
++static int restrict_cpus_allowed_ptr(struct task_struct *p,
++ struct cpumask *new_mask,
++ const struct cpumask *subset_mask)
++{
++ struct affinity_context ac = {
++ .new_mask = new_mask,
++ .flags = 0,
++ };
++ unsigned long irq_flags;
++ raw_spinlock_t *lock;
++ struct rq *rq;
++ int err;
++
++ raw_spin_lock_irqsave(&p->pi_lock, irq_flags);
++ rq = __task_access_lock(p, &lock);
++
++ if (!cpumask_and(new_mask, task_user_cpus(p), subset_mask)) {
++ err = -EINVAL;
++ goto err_unlock;
++ }
++
++ return __set_cpus_allowed_ptr_locked(p, &ac, rq, lock, irq_flags);
++
++err_unlock:
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
++ return err;
++}
++
++/*
++ * Restrict the CPU affinity of task @p so that it is a subset of
++ * task_cpu_possible_mask() and point @p->user_cpus_ptr to a copy of the
++ * old affinity mask. If the resulting mask is empty, we warn and walk
++ * up the cpuset hierarchy until we find a suitable mask.
++ */
++void force_compatible_cpus_allowed_ptr(struct task_struct *p)
++{
++ cpumask_var_t new_mask;
++ const struct cpumask *override_mask = task_cpu_possible_mask(p);
++
++ alloc_cpumask_var(&new_mask, GFP_KERNEL);
++
++ /*
++ * __migrate_task() can fail silently in the face of concurrent
++ * offlining of the chosen destination CPU, so take the hotplug
++ * lock to ensure that the migration succeeds.
++ */
++ cpus_read_lock();
++ if (!cpumask_available(new_mask))
++ goto out_set_mask;
++
++ if (!restrict_cpus_allowed_ptr(p, new_mask, override_mask))
++ goto out_free_mask;
++
++ /*
++ * We failed to find a valid subset of the affinity mask for the
++ * task, so override it based on its cpuset hierarchy.
++ */
++ cpuset_cpus_allowed(p, new_mask);
++ override_mask = new_mask;
++
++out_set_mask:
++ if (printk_ratelimit()) {
++ printk_deferred("Overriding affinity for process %d (%s) to CPUs %*pbl\n",
++ task_pid_nr(p), p->comm,
++ cpumask_pr_args(override_mask));
++ }
++
++ WARN_ON(set_cpus_allowed_ptr(p, override_mask));
++out_free_mask:
++ cpus_read_unlock();
++ free_cpumask_var(new_mask);
++}
++
++/*
++ * Restore the affinity of a task @p which was previously restricted by a
++ * call to force_compatible_cpus_allowed_ptr().
++ *
++ * It is the caller's responsibility to serialise this with any calls to
++ * force_compatible_cpus_allowed_ptr(@p).
++ */
++void relax_compatible_cpus_allowed_ptr(struct task_struct *p)
++{
++ struct affinity_context ac = {
++ .new_mask = task_user_cpus(p),
++ .flags = 0,
++ };
++ int ret;
++
++ /*
++ * Try to restore the old affinity mask with __sched_setaffinity().
++ * Cpuset masking will be done there too.
++ */
++ ret = __sched_setaffinity(p, &ac);
++ WARN_ON_ONCE(ret);
++}
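++
++/*
++ * Usage sketch (illustrative): the two calls above bracket a window in
++ * which @p must stay within task_cpu_possible_mask(), e.g. while it runs
++ * a 32-bit image on an asymmetric arm64 system:
++ *
++ *	force_compatible_cpus_allowed_ptr(p);
++ *	...			affinity clamped, user mask preserved
++ *	relax_compatible_cpus_allowed_ptr(p);
++ */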
++
++static void
++ttwu_stat(struct task_struct *p, int cpu, int wake_flags)
++{
++ struct rq *rq;
++
++ if (!schedstat_enabled())
++ return;
++
++ rq = this_rq();
++
++ if (cpu == rq->cpu) {
++ __schedstat_inc(rq->ttwu_local);
++ __schedstat_inc(p->stats.nr_wakeups_local);
++ } else {
++ /** Alt schedule FW ToDo:
++ * How to do ttwu_wake_remote
++ */
++ }
++
++ __schedstat_inc(rq->ttwu_count);
++ __schedstat_inc(p->stats.nr_wakeups);
++}
++
++/*
++ * Mark the task runnable.
++ */
++static inline void ttwu_do_wakeup(struct task_struct *p)
++{
++ WRITE_ONCE(p->__state, TASK_RUNNING);
++ trace_sched_wakeup(p);
++}
++
++static inline void
++ttwu_do_activate(struct rq *rq, struct task_struct *p, int wake_flags)
++{
++ if (p->sched_contributes_to_load)
++ rq->nr_uninterruptible--;
++
++ if (!(wake_flags & WF_MIGRATED) && p->in_iowait) {
++ delayacct_blkio_end(p);
++ atomic_dec(&task_rq(p)->nr_iowait);
++ }
++
++ activate_task(p, rq);
++ wakeup_preempt(rq);
++
++ ttwu_do_wakeup(p);
++}
++
++/*
++ * Consider @p being inside a wait loop:
++ *
++ * for (;;) {
++ * set_current_state(TASK_UNINTERRUPTIBLE);
++ *
++ * if (CONDITION)
++ * break;
++ *
++ * schedule();
++ * }
++ * __set_current_state(TASK_RUNNING);
++ *
++ * between set_current_state() and schedule(). In this case @p is still
++ * runnable, so all that needs doing is change p->state back to TASK_RUNNING in
++ * an atomic manner.
++ *
++ * By taking task_rq(p)->lock we serialize against schedule(), if @p->on_rq
++ * then schedule() must still happen and p->state can be changed to
++ * TASK_RUNNING. Otherwise we lost the race, schedule() has happened, and we
++ * need to do a full wakeup with enqueue.
++ *
++ * Returns: %true when the wakeup is done,
++ * %false otherwise.
++ */
++static int ttwu_runnable(struct task_struct *p, int wake_flags)
++{
++ struct rq *rq;
++ raw_spinlock_t *lock;
++ int ret = 0;
++
++ rq = __task_access_lock(p, &lock);
++ if (task_on_rq_queued(p)) {
++ if (!task_on_cpu(p)) {
++ /*
++ * When on_rq && !on_cpu the task is preempted, see if
++ * it should preempt the task that is current now.
++ */
++ update_rq_clock(rq);
++ wakeup_preempt(rq);
++ }
++ ttwu_do_wakeup(p);
++ ret = 1;
++ }
++ __task_access_unlock(p, lock);
++
++ return ret;
++}
++
++void sched_ttwu_pending(void *arg)
++{
++ struct llist_node *llist = arg;
++ struct rq *rq = this_rq();
++ struct task_struct *p, *t;
++ struct rq_flags rf;
++
++ if (!llist)
++ return;
++
++ rq_lock_irqsave(rq, &rf);
++ update_rq_clock(rq);
++
++ llist_for_each_entry_safe(p, t, llist, wake_entry.llist) {
++ if (WARN_ON_ONCE(p->on_cpu))
++ smp_cond_load_acquire(&p->on_cpu, !VAL);
++
++ if (WARN_ON_ONCE(task_cpu(p) != cpu_of(rq)))
++ set_task_cpu(p, cpu_of(rq));
++
++ ttwu_do_activate(rq, p, p->sched_remote_wakeup ? WF_MIGRATED : 0);
++ }
++
++ /*
++ * Must be after enqueueing at least once task such that
++ * idle_cpu() does not observe a false-negative -- if it does,
++ * it is possible for select_idle_siblings() to stack a number
++ * of tasks on this CPU during that window.
++ *
++	 * It is OK to clear ttwu_pending when another task is pending.
++	 * We will receive an IPI after local IRQs are enabled and then enqueue it.
++ * Since now nr_running > 0, idle_cpu() will always get correct result.
++ */
++ WRITE_ONCE(rq->ttwu_pending, 0);
++ rq_unlock_irqrestore(rq, &rf);
++}
++
++/*
++ * Prepare the scene for sending an IPI for a remote smp_call
++ *
++ * Returns true if the caller can proceed with sending the IPI.
++ * Returns false otherwise.
++ */
++bool call_function_single_prep_ipi(int cpu)
++{
++ if (set_nr_if_polling(cpu_rq(cpu)->idle)) {
++ trace_sched_wake_idle_without_ipi(cpu);
++ return false;
++ }
++
++ return true;
++}
++
++/*
++ * Queue a task on the target CPU's wake_list and wake the CPU via IPI if
++ * necessary. The wakee CPU on receipt of the IPI will queue the task
++ * via sched_ttwu_wakeup() for activation so the wakee incurs the cost
++ * of the wakeup instead of the waker.
++ */
++static void __ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ p->sched_remote_wakeup = !!(wake_flags & WF_MIGRATED);
++
++ WRITE_ONCE(rq->ttwu_pending, 1);
++ __smp_call_single_queue(cpu, &p->wake_entry.llist);
++}
++
++static inline bool ttwu_queue_cond(struct task_struct *p, int cpu)
++{
++ /*
++ * Do not complicate things with the async wake_list while the CPU is
++ * in hotplug state.
++ */
++ if (!cpu_active(cpu))
++ return false;
++
++ /* Ensure the task will still be allowed to run on the CPU. */
++ if (!cpumask_test_cpu(cpu, p->cpus_ptr))
++ return false;
++
++ /*
++ * If the CPU does not share cache, then queue the task on the
++	 * remote rq's wakelist to avoid accessing remote data.
++ */
++ if (!cpus_share_cache(smp_processor_id(), cpu))
++ return true;
++
++ if (cpu == smp_processor_id())
++ return false;
++
++ /*
++ * If the wakee cpu is idle, or the task is descheduling and the
++ * only running task on the CPU, then use the wakelist to offload
++ * the task activation to the idle (or soon-to-be-idle) CPU as
++ * the current CPU is likely busy. nr_running is checked to
++ * avoid unnecessary task stacking.
++ *
++ * Note that we can only get here with (wakee) p->on_rq=0,
++ * p->on_cpu can be whatever, we've done the dequeue, so
++ * the wakee has been accounted out of ->nr_running.
++ */
++ if (!cpu_rq(cpu)->nr_running)
++ return true;
++
++ return false;
++}
++
++static bool ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
++{
++ if (__is_defined(ALT_SCHED_TTWU_QUEUE) && ttwu_queue_cond(p, cpu)) {
++ sched_clock_cpu(cpu); /* Sync clocks across CPUs */
++ __ttwu_queue_wakelist(p, cpu, wake_flags);
++ return true;
++ }
++
++ return false;
++}
++
++void wake_up_if_idle(int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ guard(rcu)();
++ if (is_idle_task(rcu_dereference(rq->curr))) {
++ guard(raw_spinlock_irqsave)(&rq->lock);
++ if (is_idle_task(rq->curr))
++ resched_curr(rq);
++ }
++}
++
++extern struct static_key_false sched_asym_cpucapacity;
++
++static __always_inline bool sched_asym_cpucap_active(void)
++{
++ return static_branch_unlikely(&sched_asym_cpucapacity);
++}
++
++bool cpus_equal_capacity(int this_cpu, int that_cpu)
++{
++ if (!sched_asym_cpucap_active())
++ return true;
++
++ if (this_cpu == that_cpu)
++ return true;
++
++ return arch_scale_cpu_capacity(this_cpu) == arch_scale_cpu_capacity(that_cpu);
++}
++
++bool cpus_share_cache(int this_cpu, int that_cpu)
++{
++ if (this_cpu == that_cpu)
++ return true;
++
++ return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
++}
++
++static inline void ttwu_queue(struct task_struct *p, int cpu, int wake_flags)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ if (ttwu_queue_wakelist(p, cpu, wake_flags))
++ return;
++
++ raw_spin_lock(&rq->lock);
++ update_rq_clock(rq);
++ ttwu_do_activate(rq, p, wake_flags);
++ raw_spin_unlock(&rq->lock);
++}
++
++/*
++ * Invoked from try_to_wake_up() to check whether the task can be woken up.
++ *
++ * The caller holds p::pi_lock if p != current or has preemption
++ * disabled when p == current.
++ *
++ * The rules of saved_state:
++ *
++ * The related locking code always holds p::pi_lock when updating
++ * p::saved_state, which means the code is fully serialized in both cases.
++ *
++ * For PREEMPT_RT, the lock wait and lock wakeups happen via TASK_RTLOCK_WAIT.
++ * No other bits set. This allows to distinguish all wakeup scenarios.
++ *
++ * For FREEZER, the wakeup happens via TASK_FROZEN. No other bits set. This
++ * allows us to prevent early wakeup of tasks before they can be run on
++ * asymmetric ISA architectures (eg ARMv9).
++ */
++static __always_inline
++bool ttwu_state_match(struct task_struct *p, unsigned int state, int *success)
++{
++ int match;
++
++ if (IS_ENABLED(CONFIG_DEBUG_PREEMPT)) {
++ WARN_ON_ONCE((state & TASK_RTLOCK_WAIT) &&
++ state != TASK_RTLOCK_WAIT);
++ }
++
++ *success = !!(match = __task_state_match(p, state));
++
++ /*
++ * Saved state preserves the task state across blocking on
++ * an RT lock or TASK_FREEZABLE tasks. If the state matches,
++ * set p::saved_state to TASK_RUNNING, but do not wake the task
++ * because it waits for a lock wakeup or __thaw_task(). Also
++ * indicate success because from the regular waker's point of
++ * view this has succeeded.
++ *
++ * After acquiring the lock the task will restore p::__state
++ * from p::saved_state which ensures that the regular
++ * wakeup is not lost. The restore will also set
++ * p::saved_state to TASK_RUNNING so any further tests will
++ * not result in false positives vs. @success
++ */
++ if (match < 0)
++ p->saved_state = TASK_RUNNING;
++
++ return match > 0;
++}
++
++/*
++ * Notes on Program-Order guarantees on SMP systems.
++ *
++ * MIGRATION
++ *
++ * The basic program-order guarantee on SMP systems is that when a task [t]
++ * migrates, all its activity on its old CPU [c0] happens-before any subsequent
++ * execution on its new CPU [c1].
++ *
++ * For migration (of runnable tasks) this is provided by the following means:
++ *
++ * A) UNLOCK of the rq(c0)->lock scheduling out task t
++ * B) migration for t is required to synchronize *both* rq(c0)->lock and
++ * rq(c1)->lock (if not at the same time, then in that order).
++ * C) LOCK of the rq(c1)->lock scheduling in task
++ *
++ * Transitivity guarantees that B happens after A and C after B.
++ * Note: we only require RCpc transitivity.
++ * Note: the CPU doing B need not be c0 or c1
++ *
++ * Example:
++ *
++ * CPU0 CPU1 CPU2
++ *
++ * LOCK rq(0)->lock
++ * sched-out X
++ * sched-in Y
++ * UNLOCK rq(0)->lock
++ *
++ * LOCK rq(0)->lock // orders against CPU0
++ * dequeue X
++ * UNLOCK rq(0)->lock
++ *
++ * LOCK rq(1)->lock
++ * enqueue X
++ * UNLOCK rq(1)->lock
++ *
++ * LOCK rq(1)->lock // orders against CPU2
++ * sched-out Z
++ * sched-in X
++ * UNLOCK rq(1)->lock
++ *
++ *
++ * BLOCKING -- aka. SLEEP + WAKEUP
++ *
++ * For blocking we (obviously) need to provide the same guarantee as for
++ * migration. However the means are completely different as there is no lock
++ * chain to provide order. Instead we do:
++ *
++ * 1) smp_store_release(X->on_cpu, 0) -- finish_task()
++ * 2) smp_cond_load_acquire(!X->on_cpu) -- try_to_wake_up()
++ *
++ * Example:
++ *
++ * CPU0 (schedule) CPU1 (try_to_wake_up) CPU2 (schedule)
++ *
++ * LOCK rq(0)->lock LOCK X->pi_lock
++ * dequeue X
++ * sched-out X
++ * smp_store_release(X->on_cpu, 0);
++ *
++ * smp_cond_load_acquire(&X->on_cpu, !VAL);
++ * X->state = WAKING
++ * set_task_cpu(X,2)
++ *
++ * LOCK rq(2)->lock
++ * enqueue X
++ * X->state = RUNNING
++ * UNLOCK rq(2)->lock
++ *
++ * LOCK rq(2)->lock // orders against CPU1
++ * sched-out Z
++ * sched-in X
++ * UNLOCK rq(2)->lock
++ *
++ * UNLOCK X->pi_lock
++ * UNLOCK rq(0)->lock
++ *
++ *
++ * However; for wakeups there is a second guarantee we must provide, namely we
++ * must observe the state that led to our wakeup. That is, not only must our
++ * task observe its own prior state, it must also observe the stores prior to
++ * its wakeup.
++ *
++ * This means that any means of doing remote wakeups must order the CPU doing
++ * the wakeup against the CPU the task is going to end up running on. This,
++ * however, is already required for the regular Program-Order guarantee above,
++ * since the waking CPU is the one issuing the ACQUIRE (smp_cond_load_acquire).
++ *
++ */
++
++/**
++ * try_to_wake_up - wake up a thread
++ * @p: the thread to be awakened
++ * @state: the mask of task states that can be woken
++ * @wake_flags: wake modifier flags (WF_*)
++ *
++ * Conceptually does:
++ *
++ * If (@state & @p->state) @p->state = TASK_RUNNING.
++ *
++ * If the task was not queued/runnable, also place it back on a runqueue.
++ *
++ * This function is atomic against schedule() which would dequeue the task.
++ *
++ * It issues a full memory barrier before accessing @p->state, see the comment
++ * with set_current_state().
++ *
++ * Uses p->pi_lock to serialize against concurrent wake-ups.
++ *
++ * Relies on p->pi_lock stabilizing:
++ * - p->sched_class
++ * - p->cpus_ptr
++ * - p->sched_task_group
++ * in order to do migration, see its use of select_task_rq()/set_task_cpu().
++ *
++ * Tries really hard to only take one task_rq(p)->lock for performance.
++ * Takes rq->lock in:
++ * - ttwu_runnable() -- old rq, unavoidable, see comment there;
++ * - ttwu_queue() -- new rq, for enqueue of the task;
++ * - psi_ttwu_dequeue() -- much sadness :-( accounting will kill us.
++ *
++ * As a consequence we race really badly with just about everything. See the
++ * many memory barriers and their comments for details.
++ *
++ * Return: %true if @p->state changes (an actual wakeup was done),
++ * %false otherwise.
++ */
++int try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
++{
++ guard(preempt)();
++ int cpu, success = 0;
++
++ if (p == current) {
++ /*
++ * We're waking current, this means 'p->on_rq' and 'task_cpu(p)
++ * == smp_processor_id()'. Together this means we can special
++ * case the whole 'p->on_rq && ttwu_runnable()' case below
++ * without taking any locks.
++ *
++ * In particular:
++ * - we rely on Program-Order guarantees for all the ordering,
++ * - we're serialized against set_special_state() by virtue of
++ * it disabling IRQs (this allows not taking ->pi_lock).
++ */
++ if (!ttwu_state_match(p, state, &success))
++ goto out;
++
++ trace_sched_waking(p);
++ ttwu_do_wakeup(p);
++ goto out;
++ }
++
++ /*
++ * If we are going to wake up a thread waiting for CONDITION we
++ * need to ensure that CONDITION=1 done by the caller can not be
++ * reordered with p->state check below. This pairs with smp_store_mb()
++ * in set_current_state() that the waiting thread does.
++ */
++ scoped_guard (raw_spinlock_irqsave, &p->pi_lock) {
++ smp_mb__after_spinlock();
++ if (!ttwu_state_match(p, state, &success))
++ break;
++
++ trace_sched_waking(p);
++
++ /*
++ * Ensure we load p->on_rq _after_ p->state, otherwise it would
++ * be possible to, falsely, observe p->on_rq == 0 and get stuck
++ * in smp_cond_load_acquire() below.
++ *
++ * sched_ttwu_pending() try_to_wake_up()
++ * STORE p->on_rq = 1 LOAD p->state
++ * UNLOCK rq->lock
++ *
++ * __schedule() (switch to task 'p')
++ * LOCK rq->lock smp_rmb();
++ * smp_mb__after_spinlock();
++ * UNLOCK rq->lock
++ *
++ * [task p]
++ * STORE p->state = UNINTERRUPTIBLE LOAD p->on_rq
++ *
++ * Pairs with the LOCK+smp_mb__after_spinlock() on rq->lock in
++ * __schedule(). See the comment for smp_mb__after_spinlock().
++ *
++ * A similar smp_rmb() lives in __task_needs_rq_lock().
++ */
++ smp_rmb();
++ if (READ_ONCE(p->on_rq) && ttwu_runnable(p, wake_flags))
++ break;
++
++ /*
++ * Ensure we load p->on_cpu _after_ p->on_rq, otherwise it would be
++ * possible to, falsely, observe p->on_cpu == 0.
++ *
++ * One must be running (->on_cpu == 1) in order to remove oneself
++ * from the runqueue.
++ *
++ * __schedule() (switch to task 'p') try_to_wake_up()
++ * STORE p->on_cpu = 1 LOAD p->on_rq
++ * UNLOCK rq->lock
++ *
++ * __schedule() (put 'p' to sleep)
++ * LOCK rq->lock smp_rmb();
++ * smp_mb__after_spinlock();
++ * STORE p->on_rq = 0 LOAD p->on_cpu
++ *
++ * Pairs with the LOCK+smp_mb__after_spinlock() on rq->lock in
++ * __schedule(). See the comment for smp_mb__after_spinlock().
++ *
++ * Form a control-dep-acquire with p->on_rq == 0 above, to ensure
++ * schedule()'s deactivate_task() has 'happened' and p will no longer
++		 * care about its own p->state. See the comment in __schedule().
++ */
++ smp_acquire__after_ctrl_dep();
++
++ /*
++ * We're doing the wakeup (@success == 1), they did a dequeue (p->on_rq
++ * == 0), which means we need to do an enqueue, change p->state to
++ * TASK_WAKING such that we can unlock p->pi_lock before doing the
++ * enqueue, such as ttwu_queue_wakelist().
++ */
++ WRITE_ONCE(p->__state, TASK_WAKING);
++
++ /*
++ * If the owning (remote) CPU is still in the middle of schedule() with
++		 * this task as prev, consider queueing p on the remote CPU's wake_list
++ * which potentially sends an IPI instead of spinning on p->on_cpu to
++ * let the waker make forward progress. This is safe because IRQs are
++ * disabled and the IPI will deliver after on_cpu is cleared.
++ *
++ * Ensure we load task_cpu(p) after p->on_cpu:
++ *
++ * set_task_cpu(p, cpu);
++ * STORE p->cpu = @cpu
++ * __schedule() (switch to task 'p')
++ * LOCK rq->lock
++ * smp_mb__after_spin_lock() smp_cond_load_acquire(&p->on_cpu)
++ * STORE p->on_cpu = 1 LOAD p->cpu
++ *
++ * to ensure we observe the correct CPU on which the task is currently
++ * scheduling.
++ */
++ if (smp_load_acquire(&p->on_cpu) &&
++ ttwu_queue_wakelist(p, task_cpu(p), wake_flags))
++ break;
++
++ /*
++ * If the owning (remote) CPU is still in the middle of schedule() with
++ * this task as prev, wait until it's done referencing the task.
++ *
++ * Pairs with the smp_store_release() in finish_task().
++ *
++ * This ensures that tasks getting woken will be fully ordered against
++ * their previous state and preserve Program Order.
++ */
++ smp_cond_load_acquire(&p->on_cpu, !VAL);
++
++ sched_task_ttwu(p);
++
++ if ((wake_flags & WF_CURRENT_CPU) &&
++ cpumask_test_cpu(smp_processor_id(), p->cpus_ptr))
++ cpu = smp_processor_id();
++ else
++ cpu = select_task_rq(p);
++
++ if (cpu != task_cpu(p)) {
++ if (p->in_iowait) {
++ delayacct_blkio_end(p);
++ atomic_dec(&task_rq(p)->nr_iowait);
++ }
++
++ wake_flags |= WF_MIGRATED;
++ set_task_cpu(p, cpu);
++ }
++
++ ttwu_queue(p, cpu, wake_flags);
++ }
++out:
++ if (success)
++ ttwu_stat(p, task_cpu(p), wake_flags);
++
++ return success;
++}
++
++static bool __task_needs_rq_lock(struct task_struct *p)
++{
++ unsigned int state = READ_ONCE(p->__state);
++
++ /*
++ * Since pi->lock blocks try_to_wake_up(), we don't need rq->lock when
++ * the task is blocked. Make sure to check @state since ttwu() can drop
++ * locks at the end, see ttwu_queue_wakelist().
++ */
++ if (state == TASK_RUNNING || state == TASK_WAKING)
++ return true;
++
++ /*
++ * Ensure we load p->on_rq after p->__state, otherwise it would be
++ * possible to, falsely, observe p->on_rq == 0.
++ *
++ * See try_to_wake_up() for a longer comment.
++ */
++ smp_rmb();
++ if (p->on_rq)
++ return true;
++
++ /*
++ * Ensure the task has finished __schedule() and will not be referenced
++ * anymore. Again, see try_to_wake_up() for a longer comment.
++ */
++ smp_rmb();
++ smp_cond_load_acquire(&p->on_cpu, !VAL);
++
++ return false;
++}
++
++/**
++ * task_call_func - Invoke a function on task in fixed state
++ * @p: Process for which the function is to be invoked, can be @current.
++ * @func: Function to invoke.
++ * @arg: Argument to function.
++ *
++ * Fix the task in its current state by avoiding wakeups and/or rq operations
++ * and call @func(@arg) on it. This function can use task_is_runnable() and
++ * task_curr() to work out what the state is, if required. Given that @func
++ * can be invoked with a runqueue lock held, it had better be quite
++ * lightweight.
++ *
++ * Returns:
++ * Whatever @func returns
++ */
++int task_call_func(struct task_struct *p, task_call_f func, void *arg)
++{
++ struct rq *rq = NULL;
++ struct rq_flags rf;
++ int ret;
++
++ raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
++
++ if (__task_needs_rq_lock(p))
++ rq = __task_rq_lock(p, &rf);
++
++ /*
++ * At this point the task is pinned; either:
++ * - blocked and we're holding off wakeups (pi->lock)
++ * - woken, and we're holding off enqueue (rq->lock)
++ * - queued, and we're holding off schedule (rq->lock)
++ * - running, and we're holding off de-schedule (rq->lock)
++ *
++ * The called function (@func) can use: task_curr(), p->on_rq and
++ * p->__state to differentiate between these states.
++ */
++ ret = func(p, arg);
++
++ if (rq)
++ __task_rq_unlock(rq, &rf);
++
++ raw_spin_unlock_irqrestore(&p->pi_lock, rf.flags);
++ return ret;
++}
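++
++/*
++ * Usage sketch (illustrative only; read_state() is a made-up callback
++ * matching the task_call_f signature):
++ *
++ *	static int read_state(struct task_struct *p, void *arg)
++ *	{
++ *		*(unsigned int *)arg = READ_ONCE(p->__state);
++ *		return 0;
++ *	}
++ *
++ *	unsigned int state;
++ *
++ *	task_call_func(p, read_state, &state);
++ */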
++
++/**
++ * cpu_curr_snapshot - Return a snapshot of the currently running task
++ * @cpu: The CPU on which to snapshot the task.
++ *
++ * Returns the task_struct pointer of the task "currently" running on
++ * the specified CPU. If the same task is running on that CPU throughout,
++ * the return value will be a pointer to that task's task_struct structure.
++ * If the CPU did any context switches even vaguely concurrently with the
++ * execution of this function, the return value will be a pointer to the
++ * task_struct structure of a randomly chosen task that was running on
++ * that CPU somewhere around the time that this function was executing.
++ *
++ * If the specified CPU was offline, the return value is whatever it
++ * is, perhaps a pointer to the task_struct structure of that CPU's idle
++ * task, but there is no guarantee. Callers wishing a useful return
++ * value must take some action to ensure that the specified CPU remains
++ * online throughout.
++ *
++ * This function executes full memory barriers before and after fetching
++ * the pointer, which permits the caller to confine this function's fetch
++ * with respect to the caller's accesses to other shared variables.
++ */
++struct task_struct *cpu_curr_snapshot(int cpu)
++{
++ struct task_struct *t;
++
++ smp_mb(); /* Pairing determined by caller's synchronization design. */
++ t = rcu_dereference(cpu_curr(cpu));
++ smp_mb(); /* Pairing determined by caller's synchronization design. */
++ return t;
++}
++
++/**
++ * wake_up_process - Wake up a specific process
++ * @p: The process to be woken up.
++ *
++ * Attempt to wake up the nominated process and move it to the set of runnable
++ * processes.
++ *
++ * Return: 1 if the process was woken up, 0 if it was already running.
++ *
++ * This function executes a full memory barrier before accessing the task state.
++ */
++int wake_up_process(struct task_struct *p)
++{
++ return try_to_wake_up(p, TASK_NORMAL, 0);
++}
++EXPORT_SYMBOL(wake_up_process);
++
++int wake_up_state(struct task_struct *p, unsigned int state)
++{
++ return try_to_wake_up(p, state, 0);
++}
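++
++/*
++ * Usage sketch (illustrative): wake @p only if it still sleeps in one of
++ * the given states, otherwise nudge it into the kernel, roughly as the
++ * signal delivery path does:
++ *
++ *	if (!wake_up_state(p, TASK_INTERRUPTIBLE))
++ *		kick_process(p);
++ */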
++
++/*
++ * Perform scheduler related setup for a newly forked process p.
++ * p is forked by current.
++ *
++ * __sched_fork() is basic setup which is also used by sched_init() to
++ * initialize the boot CPU's idle task.
++ */
++static inline void __sched_fork(unsigned long clone_flags, struct task_struct *p)
++{
++ p->on_rq = 0;
++ p->on_cpu = 0;
++ p->utime = 0;
++ p->stime = 0;
++ p->sched_time = 0;
++
++#ifdef CONFIG_SCHEDSTATS
++ /* Even if schedstat is disabled, there should not be garbage */
++ memset(&p->stats, 0, sizeof(p->stats));
++#endif
++
++#ifdef CONFIG_PREEMPT_NOTIFIERS
++ INIT_HLIST_HEAD(&p->preempt_notifiers);
++#endif
++
++#ifdef CONFIG_COMPACTION
++ p->capture_control = NULL;
++#endif
++ p->wake_entry.u_flags = CSD_TYPE_TTWU;
++ init_sched_mm_cid(p);
++}
++
++/*
++ * fork()/clone()-time setup:
++ */
++int sched_fork(unsigned long clone_flags, struct task_struct *p)
++{
++ __sched_fork(clone_flags, p);
++ /*
++ * We mark the process as NEW here. This guarantees that
++ * nobody will actually run it, and a signal or other external
++ * event cannot wake it up and insert it on the runqueue either.
++ */
++ p->__state = TASK_NEW;
++
++ /*
++ * Make sure we do not leak PI boosting priority to the child.
++ */
++ p->prio = current->normal_prio;
++
++ /*
++ * Revert to default priority/policy on fork if requested.
++ */
++ if (unlikely(p->sched_reset_on_fork)) {
++ if (task_has_rt_policy(p)) {
++ p->policy = SCHED_NORMAL;
++ p->static_prio = NICE_TO_PRIO(0);
++ p->rt_priority = 0;
++ } else if (PRIO_TO_NICE(p->static_prio) < 0)
++ p->static_prio = NICE_TO_PRIO(0);
++
++ p->prio = p->normal_prio = p->static_prio;
++
++ /*
++ * We don't need the reset flag anymore after the fork. It has
++ * fulfilled its duty:
++ */
++ p->sched_reset_on_fork = 0;
++ }
++
++#ifdef CONFIG_SCHED_INFO
++ if (unlikely(sched_info_on()))
++ memset(&p->sched_info, 0, sizeof(p->sched_info));
++#endif
++ init_task_preempt_count(p);
++
++ return 0;
++}
++
++int sched_cgroup_fork(struct task_struct *p, struct kernel_clone_args *kargs)
++{
++ unsigned long flags;
++ struct rq *rq;
++
++ /*
++ * Because we're not yet on the pid-hash, p->pi_lock isn't strictly
++ * required yet, but lockdep gets upset if rules are violated.
++ */
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++ /*
++ * Share the timeslice between parent and child, thus the
++ * total amount of pending timeslices in the system doesn't change,
++ * resulting in more scheduling fairness.
++ */
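++ /*
++ * Worked example (editorial, illustrative only): if the parent has 4ms
++ * of slice left, the division below leaves 2ms for the parent and the
++ * child inherits the same 2ms, so the total amount of pending timeslice
++ * stays at 4ms.
++ */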
++ rq = this_rq();
++ raw_spin_lock(&rq->lock);
++
++ rq->curr->time_slice /= 2;
++ p->time_slice = rq->curr->time_slice;
++#ifdef CONFIG_SCHED_HRTICK
++ hrtick_start(rq, rq->curr->time_slice);
++#endif
++
++ if (p->time_slice < RESCHED_NS) {
++ p->time_slice = sysctl_sched_base_slice;
++ resched_curr(rq);
++ }
++ sched_task_fork(p, rq);
++ raw_spin_unlock(&rq->lock);
++
++ rseq_migrate(p);
++ /*
++ * We're setting the CPU for the first time, we don't migrate,
++ * so use __set_task_cpu().
++ */
++ __set_task_cpu(p, smp_processor_id());
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++
++ return 0;
++}
++
++void sched_cancel_fork(struct task_struct *p)
++{
++}
++
++void sched_post_fork(struct task_struct *p)
++{
++}
++
++#ifdef CONFIG_SCHEDSTATS
++
++DEFINE_STATIC_KEY_FALSE(sched_schedstats);
++
++static void set_schedstats(bool enabled)
++{
++ if (enabled)
++ static_branch_enable(&sched_schedstats);
++ else
++ static_branch_disable(&sched_schedstats);
++}
++
++void force_schedstat_enabled(void)
++{
++ if (!schedstat_enabled()) {
++ pr_info("kernel profiling enabled schedstats, disable via kernel.sched_schedstats.\n");
++ static_branch_enable(&sched_schedstats);
++ }
++}
++
++static int __init setup_schedstats(char *str)
++{
++ int ret = 0;
++ if (!str)
++ goto out;
++
++ if (!strcmp(str, "enable")) {
++ set_schedstats(true);
++ ret = 1;
++ } else if (!strcmp(str, "disable")) {
++ set_schedstats(false);
++ ret = 1;
++ }
++out:
++ if (!ret)
++ pr_warn("Unable to parse schedstats=\n");
++
++ return ret;
++}
++__setup("schedstats=", setup_schedstats);
++
++#ifdef CONFIG_PROC_SYSCTL
++static int sysctl_schedstats(const struct ctl_table *table, int write, void *buffer,
++ size_t *lenp, loff_t *ppos)
++{
++ struct ctl_table t;
++ int err;
++ int state = static_branch_likely(&sched_schedstats);
++
++ if (write && !capable(CAP_SYS_ADMIN))
++ return -EPERM;
++
++ t = *table;
++ t.data = &state;
++ err = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
++ if (err < 0)
++ return err;
++ if (write)
++ set_schedstats(state);
++ return err;
++}
++#endif /* CONFIG_PROC_SYSCTL */
++#endif /* CONFIG_SCHEDSTATS */
++
++#ifdef CONFIG_SYSCTL
++static const struct ctl_table sched_core_sysctls[] = {
++#ifdef CONFIG_SCHEDSTATS
++ {
++ .procname = "sched_schedstats",
++ .data = NULL,
++ .maxlen = sizeof(unsigned int),
++ .mode = 0644,
++ .proc_handler = sysctl_schedstats,
++ .extra1 = SYSCTL_ZERO,
++ .extra2 = SYSCTL_ONE,
++ },
++#endif /* CONFIG_SCHEDSTATS */
++};
++static int __init sched_core_sysctl_init(void)
++{
++ register_sysctl_init("kernel", sched_core_sysctls);
++ return 0;
++}
++late_initcall(sched_core_sysctl_init);
++#endif /* CONFIG_SYSCTL */
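++
++/*
++ * Usage note (editorial, illustrative): schedstats can be toggled at boot
++ * with "schedstats=enable" / "schedstats=disable", or at runtime via
++ * "sysctl kernel.sched_schedstats=1" (writing requires CAP_SYS_ADMIN).
++ */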
++
++/*
++ * wake_up_new_task - wake up a newly created task for the first time.
++ *
++ * This function will do some initial scheduler statistics housekeeping
++ * that must be done for every newly created context, then puts the task
++ * on the runqueue and wakes it.
++ */
++void wake_up_new_task(struct task_struct *p)
++{
++ unsigned long flags;
++ struct rq *rq;
++
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++ WRITE_ONCE(p->__state, TASK_RUNNING);
++ rq = cpu_rq(select_task_rq(p));
++ rseq_migrate(p);
++ /*
++ * Fork balancing, do it here and not earlier because:
++ * - cpus_ptr can change in the fork path
++ * - any previously selected CPU might disappear through hotplug
++ *
++ * Use __set_task_cpu() to avoid calling sched_class::migrate_task_rq,
++ * as we're not fully set-up yet.
++ */
++ __set_task_cpu(p, cpu_of(rq));
++
++ raw_spin_lock(&rq->lock);
++ update_rq_clock(rq);
++
++ activate_task(p, rq);
++ trace_sched_wakeup_new(p);
++ wakeup_preempt(rq);
++
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++}
++
++#ifdef CONFIG_PREEMPT_NOTIFIERS
++
++static DEFINE_STATIC_KEY_FALSE(preempt_notifier_key);
++
++void preempt_notifier_inc(void)
++{
++ static_branch_inc(&preempt_notifier_key);
++}
++EXPORT_SYMBOL_GPL(preempt_notifier_inc);
++
++void preempt_notifier_dec(void)
++{
++ static_branch_dec(&preempt_notifier_key);
++}
++EXPORT_SYMBOL_GPL(preempt_notifier_dec);
++
++/**
++ * preempt_notifier_register - tell me when current is being preempted & rescheduled
++ * @notifier: notifier struct to register
++ */
++void preempt_notifier_register(struct preempt_notifier *notifier)
++{
++ if (!static_branch_unlikely(&preempt_notifier_key))
++ WARN(1, "registering preempt_notifier while notifiers disabled\n");
++
++ hlist_add_head(¬ifier->link, ¤t->preempt_notifiers);
++}
++EXPORT_SYMBOL_GPL(preempt_notifier_register);
++
++/**
++ * preempt_notifier_unregister - no longer interested in preemption notifications
++ * @notifier: notifier struct to unregister
++ *
++ * This is *not* safe to call from within a preemption notifier.
++ */
++void preempt_notifier_unregister(struct preempt_notifier *notifier)
++{
++ hlist_del(¬ifier->link);
++}
++EXPORT_SYMBOL_GPL(preempt_notifier_unregister);
++
++static void __fire_sched_in_preempt_notifiers(struct task_struct *curr)
++{
++ struct preempt_notifier *notifier;
++
++ hlist_for_each_entry(notifier, &curr->preempt_notifiers, link)
++ notifier->ops->sched_in(notifier, raw_smp_processor_id());
++}
++
++static __always_inline void fire_sched_in_preempt_notifiers(struct task_struct *curr)
++{
++ if (static_branch_unlikely(&preempt_notifier_key))
++ __fire_sched_in_preempt_notifiers(curr);
++}
++
++static void
++__fire_sched_out_preempt_notifiers(struct task_struct *curr,
++ struct task_struct *next)
++{
++ struct preempt_notifier *notifier;
++
++ hlist_for_each_entry(notifier, &curr->preempt_notifiers, link)
++ notifier->ops->sched_out(notifier, next);
++}
++
++static __always_inline void
++fire_sched_out_preempt_notifiers(struct task_struct *curr,
++ struct task_struct *next)
++{
++ if (static_branch_unlikely(&preempt_notifier_key))
++ __fire_sched_out_preempt_notifiers(curr, next);
++}
++
++#else /* !CONFIG_PREEMPT_NOTIFIERS: */
++
++static inline void fire_sched_in_preempt_notifiers(struct task_struct *curr)
++{
++}
++
++static inline void
++fire_sched_out_preempt_notifiers(struct task_struct *curr,
++ struct task_struct *next)
++{
++}
++
++#endif /* !CONFIG_PREEMPT_NOTIFIERS */
++
++static inline void prepare_task(struct task_struct *next)
++{
++ /*
++ * Claim the task as running, we do this before switching to it
++ * such that any running task will have this set.
++ *
++ * See the smp_load_acquire(&p->on_cpu) case in ttwu() and
++ * its ordering comment.
++ */
++ WRITE_ONCE(next->on_cpu, 1);
++}
++
++static inline void finish_task(struct task_struct *prev)
++{
++ /*
++ * This must be the very last reference to @prev from this CPU. After
++ * p->on_cpu is cleared, the task can be moved to a different CPU. We
++ * must ensure this doesn't happen until the switch is completely
++ * finished.
++ *
++ * In particular, the load of prev->state in finish_task_switch() must
++ * happen before this.
++ *
++ * Pairs with the smp_cond_load_acquire() in try_to_wake_up().
++ */
++ smp_store_release(&prev->on_cpu, 0);
++}
++
++static void do_balance_callbacks(struct rq *rq, struct balance_callback *head)
++{
++ void (*func)(struct rq *rq);
++ struct balance_callback *next;
++
++ lockdep_assert_held(&rq->lock);
++
++ while (head) {
++ func = (void (*)(struct rq *))head->func;
++ next = head->next;
++ head->next = NULL;
++ head = next;
++
++ func(rq);
++ }
++}
++
++static void balance_push(struct rq *rq);
++
++/*
++ * balance_push_callback is a right abuse of the callback interface and plays
++ * by significantly different rules.
++ *
++ * Where the normal balance_callback's purpose is to be run in the same context
++ * that queued it (only later, when it's safe to drop rq->lock again),
++ * balance_push_callback is specifically targeted at __schedule().
++ *
++ * This abuse is tolerated because it places all the unlikely/odd cases behind
++ * a single test, namely: rq->balance_callback == NULL.
++ */
++struct balance_callback balance_push_callback = {
++ .next = NULL,
++ .func = balance_push,
++};
++
++static inline struct balance_callback *
++__splice_balance_callbacks(struct rq *rq, bool split)
++{
++ struct balance_callback *head = rq->balance_callback;
++
++ if (likely(!head))
++ return NULL;
++
++ lockdep_assert_rq_held(rq);
++ /*
++ * Must not take balance_push_callback off the list when
++ * splice_balance_callbacks() and balance_callbacks() are not
++ * in the same rq->lock section.
++ *
++ * In that case it would be possible for __schedule() to interleave
++ * and observe the list empty.
++ */
++ if (split && head == &balance_push_callback)
++ head = NULL;
++ else
++ rq->balance_callback = NULL;
++
++ return head;
++}
++
++struct balance_callback *splice_balance_callbacks(struct rq *rq)
++{
++ return __splice_balance_callbacks(rq, true);
++}
++
++static void __balance_callbacks(struct rq *rq)
++{
++ do_balance_callbacks(rq, __splice_balance_callbacks(rq, false));
++}
++
++void balance_callbacks(struct rq *rq, struct balance_callback *head)
++{
++ unsigned long flags;
++
++ if (unlikely(head)) {
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ do_balance_callbacks(rq, head);
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++ }
++}
++
++static inline void
++prepare_lock_switch(struct rq *rq, struct task_struct *next)
++{
++ /*
++ * Since the runqueue lock will be released by the next
++ * task (which is an invalid locking op but in the case
++ * of the scheduler it's an obvious special-case), we
++ * do an early lockdep release here:
++ */
++ spin_release(&rq->lock.dep_map, _THIS_IP_);
++#ifdef CONFIG_DEBUG_SPINLOCK
++ /* this is a valid case when another task releases the spinlock */
++ rq->lock.owner = next;
++#endif
++}
++
++static inline void finish_lock_switch(struct rq *rq)
++{
++ /*
++ * If we are tracking spinlock dependencies then we have to
++ * fix up the runqueue lock - which gets 'carried over' from
++ * prev into current:
++ */
++ spin_acquire(&rq->lock.dep_map, 0, 0, _THIS_IP_);
++ __balance_callbacks(rq);
++ raw_spin_unlock_irq(&rq->lock);
++}
++
++/*
++ * NOP if the arch has not defined these:
++ */
++
++#ifndef prepare_arch_switch
++# define prepare_arch_switch(next) do { } while (0)
++#endif
++
++#ifndef finish_arch_post_lock_switch
++# define finish_arch_post_lock_switch() do { } while (0)
++#endif
++
++static inline void kmap_local_sched_out(void)
++{
++#ifdef CONFIG_KMAP_LOCAL
++ if (unlikely(current->kmap_ctrl.idx))
++ __kmap_local_sched_out();
++#endif
++}
++
++static inline void kmap_local_sched_in(void)
++{
++#ifdef CONFIG_KMAP_LOCAL
++ if (unlikely(current->kmap_ctrl.idx))
++ __kmap_local_sched_in();
++#endif
++}
++
++/**
++ * prepare_task_switch - prepare to switch tasks
++ * @rq: the runqueue preparing to switch
++ * @next: the task we are going to switch to.
++ *
++ * This is called with the rq lock held and interrupts off. It must
++ * be paired with a subsequent finish_task_switch after the context
++ * switch.
++ *
++ * prepare_task_switch sets up locking and calls architecture specific
++ * hooks.
++ */
++static inline void
++prepare_task_switch(struct rq *rq, struct task_struct *prev,
++ struct task_struct *next)
++{
++ kcov_prepare_switch(prev);
++ sched_info_switch(rq, prev, next);
++ perf_event_task_sched_out(prev, next);
++ rseq_preempt(prev);
++ fire_sched_out_preempt_notifiers(prev, next);
++ kmap_local_sched_out();
++ prepare_task(next);
++ prepare_arch_switch(next);
++}
++
++/**
++ * finish_task_switch - clean up after a task-switch
++ * @rq: runqueue associated with task-switch
++ * @prev: the thread we just switched away from.
++ *
++ * finish_task_switch must be called after the context switch, paired
++ * with a prepare_task_switch call before the context switch.
++ * finish_task_switch will reconcile locking set up by prepare_task_switch,
++ * and do any other architecture-specific cleanup actions.
++ *
++ * Note that we may have delayed dropping an mm in context_switch(). If
++ * so, we finish that here outside of the runqueue lock. (Doing it
++ * with the lock held can cause deadlocks; see schedule() for
++ * details.)
++ *
++ * The context switch has flipped the stack from under us and restored the
++ * local variables which were saved when this task called schedule() in the
++ * past. 'prev == current' is still correct but we need to recalculate this_rq
++ * because prev may have moved to another CPU.
++ */
++static struct rq *finish_task_switch(struct task_struct *prev)
++ __releases(rq->lock)
++{
++ struct rq *rq = this_rq();
++ struct mm_struct *mm = rq->prev_mm;
++ unsigned int prev_state;
++
++ /*
++ * The previous task will have left us with a preempt_count of 2
++ * because it left us after:
++ *
++ * schedule()
++ * preempt_disable(); // 1
++ * __schedule()
++ * raw_spin_lock_irq(&rq->lock) // 2
++ *
++ * Also, see FORK_PREEMPT_COUNT.
++ */
++ if (WARN_ONCE(preempt_count() != 2*PREEMPT_DISABLE_OFFSET,
++ "corrupted preempt_count: %s/%d/0x%x\n",
++ current->comm, current->pid, preempt_count()))
++ preempt_count_set(FORK_PREEMPT_COUNT);
++
++ rq->prev_mm = NULL;
++
++ /*
++ * A task struct has one reference for the use as "current".
++ * If a task dies, then it sets TASK_DEAD in tsk->state and calls
++ * schedule one last time. The schedule call will never return, and
++ * the scheduled task must drop that reference.
++ *
++ * We must observe prev->state before clearing prev->on_cpu (in
++ * finish_task), otherwise a concurrent wakeup can get prev
++ * running on another CPU and we could race with its RUNNING -> DEAD
++ * transition, resulting in a double drop.
++ */
++ prev_state = READ_ONCE(prev->__state);
++ vtime_task_switch(prev);
++ perf_event_task_sched_in(prev, current);
++ finish_task(prev);
++ tick_nohz_task_switch();
++ finish_lock_switch(rq);
++ finish_arch_post_lock_switch();
++ kcov_finish_switch(current);
++ /*
++ * kmap_local_sched_out() is invoked with rq::lock held and
++ * interrupts disabled. There is no requirement for that, but the
++ * sched out code does not have an interrupt enabled section.
++ * Restoring the maps on sched in does not require interrupts being
++ * disabled either.
++ */
++ kmap_local_sched_in();
++
++ fire_sched_in_preempt_notifiers(current);
++ /*
++ * When switching through a kernel thread, the loop in
++ * membarrier_{private,global}_expedited() may have observed that
++ * kernel thread and not issued an IPI. It is therefore possible to
++ * schedule between user->kernel->user threads without passing through
++ * switch_mm(). Membarrier requires a barrier after storing to
++ * rq->curr, before returning to userspace, so provide them here:
++ *
++ * - a full memory barrier for {PRIVATE,GLOBAL}_EXPEDITED, implicitly
++ * provided by mmdrop_lazy_tlb(),
++ * - a sync_core for SYNC_CORE.
++ */
++ if (mm) {
++ membarrier_mm_sync_core_before_usermode(mm);
++ mmdrop_lazy_tlb_sched(mm);
++ }
++ if (unlikely(prev_state == TASK_DEAD)) {
++ /* Task is done with its stack. */
++ put_task_stack(prev);
++
++ put_task_struct_rcu_user(prev);
++ }
++
++ return rq;
++}
++
++/**
++ * schedule_tail - first thing a freshly forked thread must call.
++ * @prev: the thread we just switched away from.
++ */
++asmlinkage __visible void schedule_tail(struct task_struct *prev)
++ __releases(rq->lock)
++{
++ /*
++ * New tasks start with FORK_PREEMPT_COUNT, see there and
++ * finish_task_switch() for details.
++ *
++ * finish_task_switch() will drop rq->lock() and lower preempt_count
++ * and the preempt_enable() will end up enabling preemption (on
++ * PREEMPT_COUNT kernels).
++ */
++
++ finish_task_switch(prev);
++ /*
++ * This is a special case: the newly created task has just
++ * switched the context for the first time. It is returning from
++ * schedule for the first time in this path.
++ */
++ trace_sched_exit_tp(true);
++ preempt_enable();
++
++ if (current->set_child_tid)
++ put_user(task_pid_vnr(current), current->set_child_tid);
++
++ calculate_sigpending();
++}
++
++/*
++ * context_switch - switch to the new MM and the new thread's register state.
++ */
++static __always_inline struct rq *
++context_switch(struct rq *rq, struct task_struct *prev,
++ struct task_struct *next)
++{
++ prepare_task_switch(rq, prev, next);
++
++ /*
++ * For paravirt, this is coupled with an exit in switch_to to
++ * combine the page table reload and the switch backend into
++ * one hypercall.
++ */
++ arch_start_context_switch(prev);
++
++ /*
++ * kernel -> kernel lazy + transfer active
++ * user -> kernel lazy + mmgrab_lazy_tlb() active
++ *
++ * kernel -> user switch + mmdrop_lazy_tlb() active
++ * user -> user switch
++ *
++ * switch_mm_cid() needs to be updated if the barriers provided
++ * by context_switch() are modified.
++ */
++ if (!next->mm) { // to kernel
++ enter_lazy_tlb(prev->active_mm, next);
++
++ next->active_mm = prev->active_mm;
++ if (prev->mm) // from user
++ mmgrab_lazy_tlb(prev->active_mm);
++ else
++ prev->active_mm = NULL;
++ } else { // to user
++ membarrier_switch_mm(rq, prev->active_mm, next->mm);
++ /*
++ * sys_membarrier() requires an smp_mb() between setting
++ * rq->curr / membarrier_switch_mm() and returning to userspace.
++ *
++ * The below provides this either through switch_mm(), or in
++ * case 'prev->active_mm == next->mm' through
++ * finish_task_switch()'s mmdrop().
++ */
++ switch_mm_irqs_off(prev->active_mm, next->mm, next);
++ lru_gen_use_mm(next->mm);
++
++ if (!prev->mm) { // from kernel
++ /* will mmdrop_lazy_tlb() in finish_task_switch(). */
++ rq->prev_mm = prev->active_mm;
++ prev->active_mm = NULL;
++ }
++ }
++
++ /* switch_mm_cid() requires the memory barriers above. */
++ switch_mm_cid(rq, prev, next);
++
++ prepare_lock_switch(rq, next);
++
++ /* Here we just switch the register state and the stack. */
++ switch_to(prev, next, prev);
++ barrier();
++
++ return finish_task_switch(prev);
++}
++
++/*
++ * nr_running, nr_uninterruptible and nr_context_switches:
++ *
++ * externally visible scheduler statistics: current number of runnable
++ * threads, total number of context switches performed since bootup.
++ */
++unsigned int nr_running(void)
++{
++ unsigned int i, sum = 0;
++
++ for_each_online_cpu(i)
++ sum += cpu_rq(i)->nr_running;
++
++ return sum;
++}
++
++/*
++ * Check if only the current task is running on the CPU.
++ *
++ * Caution: this function does not check that the caller has disabled
++ * preemption, thus the result might have a time-of-check-to-time-of-use
++ * race. The caller is responsible to use it correctly, for example:
++ *
++ * - from a non-preemptible section (of course)
++ *
++ * - from a thread that is bound to a single CPU
++ *
++ * - in a loop with very short iterations (e.g. a polling loop)
++ */
++bool single_task_running(void)
++{
++ return raw_rq()->nr_running == 1;
++}
++EXPORT_SYMBOL(single_task_running);
++
++unsigned long long nr_context_switches_cpu(int cpu)
++{
++ return cpu_rq(cpu)->nr_switches;
++}
++
++unsigned long long nr_context_switches(void)
++{
++ int i;
++ unsigned long long sum = 0;
++
++ for_each_possible_cpu(i)
++ sum += cpu_rq(i)->nr_switches;
++
++ return sum;
++}
++
++/*
++ * Consumers of these two interfaces, like for example the cpuidle menu
++ * governor, are using nonsensical data: preferring shallow idle state
++ * selection for a CPU that has IO-wait, even though that CPU might not even
++ * end up running the task when it does become runnable.
++ */
++
++unsigned int nr_iowait_cpu(int cpu)
++{
++ return atomic_read(&cpu_rq(cpu)->nr_iowait);
++}
++
++/*
++ * IO-wait accounting, and how it's mostly bollocks (on SMP).
++ *
++ * The idea behind IO-wait accounting is to account the idle time that we could
++ * have spent running if it were not for IO. That is, if we were to improve the
++ * storage performance, we'd have a proportional reduction in IO-wait time.
++ *
++ * This all works nicely on UP, where, when a task blocks on IO, we account
++ * idle time as IO-wait, because if the storage were faster, it could've been
++ * running and we'd not be idle.
++ *
++ * This has been extended to SMP, by doing the same for each CPU. This however
++ * is broken.
++ *
++ * Imagine for instance the case where two tasks block on one CPU: only that
++ * CPU will have IO-wait accounted, while the other has regular idle. Even
++ * though, if the storage were faster, both could've run at the same time,
++ * utilising both CPUs.
++ *
++ * This means that, when looking globally, the current IO-wait accounting on
++ * SMP is a lower bound, by reason of under-accounting.
++ *
++ * Worse, since the numbers are provided per CPU, they are sometimes
++ * interpreted per CPU, and that is nonsensical. A blocked task isn't strictly
++ * associated with any one particular CPU, it can wake to another CPU than it
++ * blocked on. This means the per CPU IO-wait number is meaningless.
++ *
++ * Task CPU affinities can make all that even more 'interesting'.
++ */
++
++unsigned int nr_iowait(void)
++{
++ unsigned int i, sum = 0;
++
++ for_each_possible_cpu(i)
++ sum += nr_iowait_cpu(i);
++
++ return sum;
++}
++
++/*
++ * sched_exec - execve() is a valuable balancing opportunity, because at
++ * this point the task has the smallest effective memory and cache
++ * footprint.
++ */
++void sched_exec(void)
++{
++}
++
++DEFINE_PER_CPU(struct kernel_stat, kstat);
++DEFINE_PER_CPU(struct kernel_cpustat, kernel_cpustat);
++
++EXPORT_PER_CPU_SYMBOL(kstat);
++EXPORT_PER_CPU_SYMBOL(kernel_cpustat);
++
++static inline void update_curr(struct rq *rq, struct task_struct *p)
++{
++ s64 ns = rq->clock_task - p->last_ran;
++
++ p->sched_time += ns;
++ cgroup_account_cputime(p, ns);
++ account_group_exec_runtime(p, ns);
++
++ p->time_slice -= ns;
++ p->last_ran = rq->clock_task;
++}
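++
++/*
++ * Worked example (editorial, illustrative): if a task ran for 1.2ms since
++ * its last update, update_curr() adds 1.2ms to p->sched_time and subtracts
++ * 1.2ms from p->time_slice; once time_slice drops below RESCHED_NS, the
++ * next tick marks the task for rescheduling (see scheduler_task_tick()).
++ */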
++
++/*
++ * Return accounted runtime for the task.
++ * If the task is currently running, its pending runtime that has not been
++ * accounted yet is included as well.
++ */
++unsigned long long task_sched_runtime(struct task_struct *p)
++{
++ unsigned long flags;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++ u64 ns;
++
++#ifdef CONFIG_64BIT
++ /*
++ * 64-bit doesn't need locks to atomically read a 64-bit value.
++ * So we have an optimization chance when the task's delta_exec is 0.
++ * Reading ->on_cpu is racy, but this is OK.
++ *
++ * If we race with it leaving CPU, we'll take a lock. So we're correct.
++ * If we race with it entering CPU, unaccounted time is 0. This is
++ * indistinguishable from the read occurring a few cycles earlier.
++ * If we see ->on_cpu without ->on_rq, the task is leaving, and has
++ * been accounted, so we're correct here as well.
++ */
++ if (!p->on_cpu || !task_on_rq_queued(p))
++ return tsk_seruntime(p);
++#endif
++
++ rq = task_access_lock_irqsave(p, &lock, &flags);
++ /*
++ * Must be ->curr _and_ ->on_rq. If dequeued, we would
++ * project cycles that may never be accounted to this
++ * thread, breaking clock_gettime().
++ */
++ if (p == rq->curr && task_on_rq_queued(p)) {
++ update_rq_clock(rq);
++ update_curr(rq, p);
++ }
++ ns = tsk_seruntime(p);
++ task_access_unlock_irqrestore(p, lock, &flags);
++
++ return ns;
++}
++
++/* This manages tasks that have run out of timeslice during a scheduler_tick */
++static inline void scheduler_task_tick(struct rq *rq)
++{
++ struct task_struct *p = rq->curr;
++
++ if (is_idle_task(p))
++ return;
++
++ update_curr(rq, p);
++ cpufreq_update_util(rq, 0);
++
++ /*
++ * Tasks that have less than RESCHED_NS of time slice left will be
++ * rescheduled.
++ */
++ if (p->time_slice >= RESCHED_NS)
++ return;
++ set_tsk_need_resched(p);
++ set_preempt_need_resched();
++}
++
++static u64 cpu_resched_latency(struct rq *rq)
++{
++ int latency_warn_ms = READ_ONCE(sysctl_resched_latency_warn_ms);
++ u64 resched_latency, now = rq_clock(rq);
++ static bool warned_once;
++
++ if (sysctl_resched_latency_warn_once && warned_once)
++ return 0;
++
++ if (!need_resched() || !latency_warn_ms)
++ return 0;
++
++ if (system_state == SYSTEM_BOOTING)
++ return 0;
++
++ if (!rq->last_seen_need_resched_ns) {
++ rq->last_seen_need_resched_ns = now;
++ rq->ticks_without_resched = 0;
++ return 0;
++ }
++
++ rq->ticks_without_resched++;
++ resched_latency = now - rq->last_seen_need_resched_ns;
++ if (resched_latency <= latency_warn_ms * NSEC_PER_MSEC)
++ return 0;
++
++ warned_once = true;
++
++ return resched_latency;
++}
++
++static int __init setup_resched_latency_warn_ms(char *str)
++{
++ long val;
++
++ if ((kstrtol(str, 0, &val))) {
++ pr_warn("Unable to set resched_latency_warn_ms\n");
++ return 1;
++ }
++
++ sysctl_resched_latency_warn_ms = val;
++ return 1;
++}
++__setup("resched_latency_warn_ms=", setup_resched_latency_warn_ms);
++
++/*
++ * This function gets called by the timer code, with HZ frequency.
++ * We call it with interrupts disabled.
++ */
++void sched_tick(void)
++{
++ int cpu __maybe_unused = smp_processor_id();
++ struct rq *rq = cpu_rq(cpu);
++ struct task_struct *curr = rq->curr;
++ u64 resched_latency;
++
++ if (housekeeping_cpu(cpu, HK_TYPE_KERNEL_NOISE))
++ arch_scale_freq_tick();
++
++ sched_clock_tick();
++
++ raw_spin_lock(&rq->lock);
++ update_rq_clock(rq);
++
++ if (dynamic_preempt_lazy() && tif_test_bit(TIF_NEED_RESCHED_LAZY))
++ resched_curr(rq);
++
++ scheduler_task_tick(rq);
++ if (sched_feat(LATENCY_WARN))
++ resched_latency = cpu_resched_latency(rq);
++ calc_global_load_tick(rq);
++
++ task_tick_mm_cid(rq, rq->curr);
++
++ raw_spin_unlock(&rq->lock);
++
++ if (sched_feat(LATENCY_WARN) && resched_latency)
++ resched_latency_warn(cpu, resched_latency);
++
++ perf_event_task_tick();
++
++ if (curr->flags & PF_WQ_WORKER)
++ wq_worker_tick(curr);
++}
++
++#ifdef CONFIG_NO_HZ_FULL
++
++struct tick_work {
++ int cpu;
++ atomic_t state;
++ struct delayed_work work;
++};
++/* Values for ->state, see diagram below. */
++#define TICK_SCHED_REMOTE_OFFLINE 0
++#define TICK_SCHED_REMOTE_OFFLINING 1
++#define TICK_SCHED_REMOTE_RUNNING 2
++
++/*
++ * State diagram for ->state:
++ *
++ *
++ * TICK_SCHED_REMOTE_OFFLINE
++ * | ^
++ * | |
++ * | | sched_tick_remote()
++ * | |
++ * | |
++ * +--TICK_SCHED_REMOTE_OFFLINING
++ * | ^
++ * | |
++ * sched_tick_start() | | sched_tick_stop()
++ * | |
++ * V |
++ * TICK_SCHED_REMOTE_RUNNING
++ *
++ *
++ * Other transitions get WARN_ON_ONCE(), except that sched_tick_remote()
++ * and sched_tick_start() are happy to leave the state in RUNNING.
++ */
++
++static struct tick_work __percpu *tick_work_cpu;
++
++static void sched_tick_remote(struct work_struct *work)
++{
++ struct delayed_work *dwork = to_delayed_work(work);
++ struct tick_work *twork = container_of(dwork, struct tick_work, work);
++ int cpu = twork->cpu;
++ struct rq *rq = cpu_rq(cpu);
++ int os;
++
++ /*
++ * Handle the tick only if it appears the remote CPU is running in full
++ * dynticks mode. The check is racy by nature, but missing a tick or
++ * having one too many is no big deal because the scheduler tick updates
++ * statistics and checks timeslices in a time-independent way, regardless
++ * of when exactly it is running.
++ */
++ if (tick_nohz_tick_stopped_cpu(cpu)) {
++ guard(raw_spinlock_irqsave)(&rq->lock);
++ struct task_struct *curr = rq->curr;
++
++ if (cpu_online(cpu)) {
++ update_rq_clock(rq);
++
++ if (!is_idle_task(curr)) {
++ /*
++ * Make sure the next tick runs within a
++ * reasonable amount of time.
++ */
++ u64 delta = rq_clock_task(rq) - curr->last_ran;
++ WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
++ }
++ scheduler_task_tick(rq);
++
++ calc_load_nohz_remote(rq);
++ }
++ }
++
++ /*
++ * Run the remote tick once per second (1Hz). This arbitrary
++ * frequency is large enough to avoid overload but short enough
++ * to keep scheduler internal stats reasonably up to date. But
++ * first update state to reflect hotplug activity if required.
++ */
++ os = atomic_fetch_add_unless(&twork->state, -1, TICK_SCHED_REMOTE_RUNNING);
++ WARN_ON_ONCE(os == TICK_SCHED_REMOTE_OFFLINE);
++ if (os == TICK_SCHED_REMOTE_RUNNING)
++ queue_delayed_work(system_unbound_wq, dwork, HZ);
++}
++
++static void sched_tick_start(int cpu)
++{
++ int os;
++ struct tick_work *twork;
++
++ if (housekeeping_cpu(cpu, HK_TYPE_KERNEL_NOISE))
++ return;
++
++ WARN_ON_ONCE(!tick_work_cpu);
++
++ twork = per_cpu_ptr(tick_work_cpu, cpu);
++ os = atomic_xchg(&twork->state, TICK_SCHED_REMOTE_RUNNING);
++ WARN_ON_ONCE(os == TICK_SCHED_REMOTE_RUNNING);
++ if (os == TICK_SCHED_REMOTE_OFFLINE) {
++ twork->cpu = cpu;
++ INIT_DELAYED_WORK(&twork->work, sched_tick_remote);
++ queue_delayed_work(system_unbound_wq, &twork->work, HZ);
++ }
++}
++
++#ifdef CONFIG_HOTPLUG_CPU
++static void sched_tick_stop(int cpu)
++{
++ struct tick_work *twork;
++ int os;
++
++ if (housekeeping_cpu(cpu, HK_TYPE_KERNEL_NOISE))
++ return;
++
++ WARN_ON_ONCE(!tick_work_cpu);
++
++ twork = per_cpu_ptr(tick_work_cpu, cpu);
++ /* There cannot be competing actions, but don't rely on stop-machine. */
++ os = atomic_xchg(&twork->state, TICK_SCHED_REMOTE_OFFLINING);
++ WARN_ON_ONCE(os != TICK_SCHED_REMOTE_RUNNING);
++ /* Don't cancel, as this would mess up the state machine. */
++}
++#endif /* CONFIG_HOTPLUG_CPU */
++
++int __init sched_tick_offload_init(void)
++{
++ tick_work_cpu = alloc_percpu(struct tick_work);
++ BUG_ON(!tick_work_cpu);
++ return 0;
++}
++
++#else /* !CONFIG_NO_HZ_FULL: */
++static inline void sched_tick_start(int cpu) { }
++static inline void sched_tick_stop(int cpu) { }
++#endif /* !CONFIG_NO_HZ_FULL */
++
++#if defined(CONFIG_PREEMPTION) && (defined(CONFIG_DEBUG_PREEMPT) || \
++ defined(CONFIG_PREEMPT_TRACER))
++/*
++ * If the value passed in is equal to the current preempt count
++ * then we just disabled preemption. Start timing the latency.
++ */
++static inline void preempt_latency_start(int val)
++{
++ if (preempt_count() == val) {
++ unsigned long ip = get_lock_parent_ip();
++#ifdef CONFIG_DEBUG_PREEMPT
++ current->preempt_disable_ip = ip;
++#endif
++ trace_preempt_off(CALLER_ADDR0, ip);
++ }
++}
++
++void preempt_count_add(int val)
++{
++#ifdef CONFIG_DEBUG_PREEMPT
++ /*
++ * Underflow?
++ */
++ if (DEBUG_LOCKS_WARN_ON((preempt_count() < 0)))
++ return;
++#endif
++ __preempt_count_add(val);
++#ifdef CONFIG_DEBUG_PREEMPT
++ /*
++ * Spinlock count overflowing soon?
++ */
++ DEBUG_LOCKS_WARN_ON((preempt_count() & PREEMPT_MASK) >=
++ PREEMPT_MASK - 10);
++#endif
++ preempt_latency_start(val);
++}
++EXPORT_SYMBOL(preempt_count_add);
++NOKPROBE_SYMBOL(preempt_count_add);
++
++/*
++ * If the value passed in equals the current preempt count
++ * then we just enabled preemption. Stop timing the latency.
++ */
++static inline void preempt_latency_stop(int val)
++{
++ if (preempt_count() == val)
++ trace_preempt_on(CALLER_ADDR0, get_lock_parent_ip());
++}
++
++void preempt_count_sub(int val)
++{
++#ifdef CONFIG_DEBUG_PREEMPT
++ /*
++ * Underflow?
++ */
++ if (DEBUG_LOCKS_WARN_ON(val > preempt_count()))
++ return;
++ /*
++ * Is the spinlock portion underflowing?
++ */
++ if (DEBUG_LOCKS_WARN_ON((val < PREEMPT_MASK) &&
++ !(preempt_count() & PREEMPT_MASK)))
++ return;
++#endif
++
++ preempt_latency_stop(val);
++ __preempt_count_sub(val);
++}
++EXPORT_SYMBOL(preempt_count_sub);
++NOKPROBE_SYMBOL(preempt_count_sub);
++
++#else
++static inline void preempt_latency_start(int val) { }
++static inline void preempt_latency_stop(int val) { }
++#endif
++
++static inline unsigned long get_preempt_disable_ip(struct task_struct *p)
++{
++#ifdef CONFIG_DEBUG_PREEMPT
++ return p->preempt_disable_ip;
++#else
++ return 0;
++#endif
++}
++
++/*
++ * Print scheduling while atomic bug:
++ */
++static noinline void __schedule_bug(struct task_struct *prev)
++{
++ /* Save this before calling printk(), since that will clobber it */
++ unsigned long preempt_disable_ip = get_preempt_disable_ip(current);
++
++ if (oops_in_progress)
++ return;
++
++ printk(KERN_ERR "BUG: scheduling while atomic: %s/%d/0x%08x\n",
++ prev->comm, prev->pid, preempt_count());
++
++ debug_show_held_locks(prev);
++ print_modules();
++ if (irqs_disabled())
++ print_irqtrace_events(prev);
++ if (IS_ENABLED(CONFIG_DEBUG_PREEMPT)) {
++ pr_err("Preemption disabled at:");
++ print_ip_sym(KERN_ERR, preempt_disable_ip);
++ }
++ check_panic_on_warn("scheduling while atomic");
++
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++}
++
++/*
++ * Various schedule()-time debugging checks and statistics:
++ */
++static inline void schedule_debug(struct task_struct *prev, bool preempt)
++{
++#ifdef CONFIG_SCHED_STACK_END_CHECK
++ if (task_stack_end_corrupted(prev))
++ panic("corrupted stack end detected inside scheduler\n");
++
++ if (task_scs_end_corrupted(prev))
++ panic("corrupted shadow stack detected inside scheduler\n");
++#endif
++
++#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
++ if (!preempt && READ_ONCE(prev->__state) && prev->non_block_count) {
++ printk(KERN_ERR "BUG: scheduling in a non-blocking section: %s/%d/%i\n",
++ prev->comm, prev->pid, prev->non_block_count);
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++ }
++#endif
++
++ if (unlikely(in_atomic_preempt_off())) {
++ __schedule_bug(prev);
++ preempt_count_set(PREEMPT_DISABLED);
++ }
++ rcu_sleep_check();
++ WARN_ON_ONCE(ct_state() == CT_STATE_USER);
++
++ profile_hit(SCHED_PROFILING, __builtin_return_address(0));
++
++ schedstat_inc(this_rq()->sched_count);
++}
++
++#ifdef ALT_SCHED_DEBUG
++void alt_sched_debug(void)
++{
++ printk(KERN_INFO "sched: pending: 0x%04lx, idle: 0x%04lx, sg_idle: 0x%04lx,"
++ " ecore_idle: 0x%04lx\n",
++ sched_rq_pending_mask.bits[0],
++ sched_idle_mask->bits[0],
++ sched_pcore_idle_mask->bits[0],
++ sched_ecore_idle_mask->bits[0]);
++}
++#endif
++
++
++#ifdef CONFIG_PREEMPT_RT
++#define SCHED_NR_MIGRATE_BREAK 8
++#else /* !CONFIG_PREEMPT_RT: */
++#define SCHED_NR_MIGRATE_BREAK 32
++#endif /* !CONFIG_PREEMPT_RT */
++
++__read_mostly unsigned int sysctl_sched_nr_migrate = SCHED_NR_MIGRATE_BREAK;
++
++/*
++ * Migrate pending tasks in @rq to @dest_cpu
++ */
++static inline int
++migrate_pending_tasks(struct rq *rq, struct rq *dest_rq, const int dest_cpu)
++{
++ struct task_struct *p, *skip = rq->curr;
++ int nr_migrated = 0;
++ int nr_tries = min(rq->nr_running / 2, sysctl_sched_nr_migrate);
++
++ /* Workaround to check that rq->curr is still on the rq */
++ if (!task_on_rq_queued(skip))
++ return 0;
++
++ while (skip != rq->idle && nr_tries &&
++ (p = sched_rq_next_task(skip, rq)) != rq->idle) {
++ skip = sched_rq_next_task(p, rq);
++ if (cpumask_test_cpu(dest_cpu, p->cpus_ptr)) {
++ __SCHED_DEQUEUE_TASK(p, rq, 0, );
++ set_task_cpu(p, dest_cpu);
++ sched_task_sanity_check(p, dest_rq);
++ sched_mm_cid_migrate_to(dest_rq, p);
++ __SCHED_ENQUEUE_TASK(p, dest_rq, 0, );
++ nr_migrated++;
++ }
++ nr_tries--;
++ }
++
++ return nr_migrated;
++}
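++
++/*
++ * Worked example (editorial, illustrative): with 8 runnable tasks on @rq
++ * and the default sysctl_sched_nr_migrate of 32, nr_tries is
++ * min(8 / 2, 32) = 4, so at most four queued tasks after rq->curr are
++ * examined and only those whose cpus_ptr allows @dest_cpu are migrated.
++ */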
++
++static inline int take_other_rq_tasks(struct rq *rq, int cpu)
++{
++ cpumask_t *topo_mask, *end_mask, chk;
++
++ if (unlikely(!rq->online))
++ return 0;
++
++ if (cpumask_empty(&sched_rq_pending_mask))
++ return 0;
++
++ topo_mask = per_cpu(sched_cpu_topo_masks, cpu);
++ end_mask = per_cpu(sched_cpu_topo_end_mask, cpu);
++ do {
++ int i;
++
++ if (!cpumask_and(&chk, &sched_rq_pending_mask, topo_mask))
++ continue;
++
++ for_each_cpu_wrap(i, &chk, cpu) {
++ int nr_migrated;
++ struct rq *src_rq;
++
++ src_rq = cpu_rq(i);
++ if (!do_raw_spin_trylock(&src_rq->lock))
++ continue;
++ spin_acquire(&src_rq->lock.dep_map,
++ SINGLE_DEPTH_NESTING, 1, _RET_IP_);
++
++ if ((nr_migrated = migrate_pending_tasks(src_rq, rq, cpu))) {
++ sub_nr_running(src_rq, nr_migrated);
++
++ spin_release(&src_rq->lock.dep_map, _RET_IP_);
++ do_raw_spin_unlock(&src_rq->lock);
++
++ add_nr_running(rq, nr_migrated);
++
++ update_sched_preempt_mask(rq);
++ cpufreq_update_util(rq, 0);
++
++ return 1;
++ }
++
++ spin_release(&src_rq->lock.dep_map, _RET_IP_);
++ do_raw_spin_unlock(&src_rq->lock);
++ }
++ } while (++topo_mask < end_mask);
++
++ return 0;
++}
++
++static inline void time_slice_expired(struct task_struct *p, struct rq *rq)
++{
++ p->time_slice = sysctl_sched_base_slice;
++
++ sched_task_renew(p, rq);
++
++ if (SCHED_FIFO != p->policy && task_on_rq_queued(p))
++ requeue_task(p, rq);
++}
++
++static inline int balance_select_task_rq(struct task_struct *p, cpumask_t *avail_mask)
++{
++ cpumask_t mask;
++
++ if (!preempt_mask_check(&mask, avail_mask, task_sched_prio(p)))
++ return -1;
++
++ if (cpumask_and(&mask, &mask, p->cpus_ptr))
++ return best_mask_cpu(task_cpu(p), &mask);
++
++ return task_cpu(p);
++}
++
++static inline void
++__move_queued_task(struct rq *rq, struct task_struct *p, struct rq *dest_rq, int dest_cpu)
++{
++ WRITE_ONCE(p->on_rq, TASK_ON_RQ_MIGRATING);
++ dequeue_task(p, rq, 0);
++ set_task_cpu(p, dest_cpu);
++
++ sched_mm_cid_migrate_to(dest_rq, p);
++
++ sched_task_sanity_check(p, dest_rq);
++ enqueue_task(p, dest_rq, 0);
++ WRITE_ONCE(p->on_rq, TASK_ON_RQ_QUEUED);
++ wakeup_preempt(dest_rq);
++}
++
++static inline void prio_balance(struct rq *rq, const int cpu)
++{
++ struct task_struct *p, *next;
++ cpumask_t mask;
++
++ if (!rq->online)
++ return;
++
++ if (!cpumask_empty(sched_idle_mask))
++ return;
++
++ if (0 == rq->prio_balance_time)
++ return;
++
++ if (rq->clock - rq->prio_balance_time < sysctl_sched_base_slice << 1)
++ return;
++
++ rq->prio_balance_time = rq->clock;
++
++ cpumask_copy(&mask, cpu_active_mask);
++ cpumask_clear_cpu(cpu, &mask);
++
++ p = sched_rq_next_task(rq->curr, rq);
++ while (p != rq->idle) {
++ next = sched_rq_next_task(p, rq);
++ if (!is_migration_disabled(p)) {
++ int dest_cpu;
++
++ dest_cpu = balance_select_task_rq(p, &mask);
++ if (dest_cpu < 0)
++ return;
++
++ if (cpu != dest_cpu) {
++ struct rq *dest_rq = cpu_rq(dest_cpu);
++
++ if (do_raw_spin_trylock(&dest_rq->lock)) {
++ cpumask_clear_cpu(dest_cpu, &mask);
++
++ spin_acquire(&dest_rq->lock.dep_map,
++ SINGLE_DEPTH_NESTING, 1, _RET_IP_);
++
++ __move_queued_task(rq, p, dest_rq, dest_cpu);
++
++ spin_release(&dest_rq->lock.dep_map, _RET_IP_);
++ do_raw_spin_unlock(&dest_rq->lock);
++ }
++ }
++ }
++ p = next;
++ }
++}
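++
++/*
++ * Note (editorial): prio_balance() rate-limits itself to roughly once per
++ * two base time slices (sysctl_sched_base_slice << 1), skips entirely
++ * while any CPU is idle, and only moves a task when the destination
++ * runqueue lock can be taken with a trylock, so it never blocks the
++ * local scheduling path.
++ */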
++
++/*
++ * Timeslices below RESCHED_NS are considered as good as expired as there's no
++ * point rescheduling when there's so little time left.
++ */
++static inline void check_curr(struct task_struct *p, struct rq *rq)
++{
++ if (unlikely(rq->idle == p))
++ return;
++
++ update_curr(rq, p);
++
++ if (p->time_slice < RESCHED_NS)
++ time_slice_expired(p, rq);
++}
++
++static inline struct task_struct *
++choose_next_task(struct rq *rq, int cpu)
++{
++ struct task_struct *next = sched_rq_first_task(rq);
++
++ if (next == rq->idle) {
++ if (!take_other_rq_tasks(rq, cpu)) {
++ if (likely(rq->balance_func && rq->online))
++ rq->balance_func(rq, cpu);
++
++ schedstat_inc(rq->sched_goidle);
++ /*printk(KERN_INFO "sched: choose_next_task(%d) idle %px\n", cpu, next);*/
++ return next;
++ }
++ next = sched_rq_first_task(rq);
++ }
++#ifdef CONFIG_SCHED_HRTICK
++ hrtick_start(rq, next->time_slice);
++#endif
++ /*printk(KERN_INFO "sched: choose_next_task(%d) next %px\n", cpu, next);*/
++ return next;
++}
++
++/*
++ * Constants for the sched_mode argument of __schedule().
++ *
++ * The mode argument allows RT enabled kernels to differentiate a
++ * preemption from blocking on an 'sleeping' spin/rwlock.
++ */
++ #define SM_IDLE (-1)
++ #define SM_NONE 0
++ #define SM_PREEMPT 1
++ #define SM_RTLOCK_WAIT 2
++
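++/*
++ * Note (editorial): SM_IDLE is negative so that "sched_mode > SM_NONE"
++ * treats both SM_PREEMPT and SM_RTLOCK_WAIT as preemptions for
++ * debugging/RCU purposes, while __schedule() later narrows task-state
++ * handling to SM_PREEMPT only.
++ */
++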
++/*
++ * Helper function for __schedule()
++ *
++ * If a task does not have signals pending, deactivate it.
++ * Otherwise mark the task's __state as RUNNING.
++ */
++static bool try_to_block_task(struct rq *rq, struct task_struct *p,
++ unsigned long *task_state_p)
++{
++ unsigned long task_state = *task_state_p;
++ if (signal_pending_state(task_state, p)) {
++ WRITE_ONCE(p->__state, TASK_RUNNING);
++ *task_state_p = TASK_RUNNING;
++ return false;
++ }
++ p->sched_contributes_to_load =
++ (task_state & TASK_UNINTERRUPTIBLE) &&
++ !(task_state & TASK_NOLOAD) &&
++ !(task_state & TASK_FROZEN);
++
++ /*
++ * __schedule() ttwu()
++ * prev_state = prev->state; if (p->on_rq && ...)
++ * if (prev_state) goto out;
++ * p->on_rq = 0; smp_acquire__after_ctrl_dep();
++ * p->state = TASK_WAKING
++ *
++ * Where __schedule() and ttwu() have matching control dependencies.
++ *
++ * After this, schedule() must not care about p->state any more.
++ */
++ sched_task_deactivate(p, rq);
++ block_task(rq, p);
++ return true;
++}
++
++/*
++ * schedule() is the main scheduler function.
++ *
++ * The main means of driving the scheduler and thus entering this function are:
++ *
++ * 1. Explicit blocking: mutex, semaphore, waitqueue, etc.
++ *
++ * 2. TIF_NEED_RESCHED flag is checked on interrupt and userspace return
++ * paths. For example, see arch/x86/entry_64.S.
++ *
++ * To drive preemption between tasks, the scheduler sets the flag in timer
++ * interrupt handler sched_tick().
++ *
++ * 3. Wakeups don't really cause entry into schedule(). They add a
++ * task to the run-queue and that's it.
++ *
++ * Now, if the new task added to the run-queue preempts the current
++ * task, then the wakeup sets TIF_NEED_RESCHED and schedule() gets
++ * called on the nearest possible occasion:
++ *
++ * - If the kernel is preemptible (CONFIG_PREEMPTION=y):
++ *
++ * - in syscall or exception context, at the next outmost
++ * preempt_enable(). (this might be as soon as the wake_up()'s
++ * spin_unlock()!)
++ *
++ * - in IRQ context, return from interrupt-handler to
++ * preemptible context
++ *
++ * - If the kernel is not preemptible (CONFIG_PREEMPTION is not set)
++ * then at the next:
++ *
++ * - cond_resched() call
++ * - explicit schedule() call
++ * - return from syscall or exception to user-space
++ * - return from interrupt-handler to user-space
++ *
++ * WARNING: must be called with preemption disabled!
++ */
++static void __sched notrace __schedule(int sched_mode)
++{
++ struct task_struct *prev, *next;
++ /*
++ * On PREEMPT_RT kernel, SM_RTLOCK_WAIT is noted
++ * as a preemption by schedule_debug() and RCU.
++ */
++ bool preempt = sched_mode > SM_NONE;
++ bool is_switch = false;
++ unsigned long *switch_count;
++ unsigned long prev_state;
++ struct rq *rq;
++ int cpu;
++
++ /* Trace preemptions consistently with task switches */
++ trace_sched_entry_tp(preempt);
++
++ cpu = smp_processor_id();
++ rq = cpu_rq(cpu);
++ prev = rq->curr;
++
++ schedule_debug(prev, preempt);
++
++ /* Bypass the sched_feat(HRTICK) check, which Alt schedule FW doesn't support */
++ hrtick_clear(rq);
++
++ klp_sched_try_switch(prev);
++
++ local_irq_disable();
++ rcu_note_context_switch(preempt);
++
++ /*
++ * Make sure that signal_pending_state()->signal_pending() below
++ * can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
++ * done by the caller to avoid the race with signal_wake_up():
++ *
++ * __set_current_state(@state) signal_wake_up()
++ * schedule() set_tsk_thread_flag(p, TIF_SIGPENDING)
++ * wake_up_state(p, state)
++ * LOCK rq->lock LOCK p->pi_state
++ * smp_mb__after_spinlock() smp_mb__after_spinlock()
++ * if (signal_pending_state()) if (p->state & @state)
++ *
++ * Also, the membarrier system call requires a full memory barrier
++ * after coming from user-space, before storing to rq->curr; this
++ * barrier matches a full barrier in the proximity of the membarrier
++ * system call exit.
++ */
++ raw_spin_lock(&rq->lock);
++ smp_mb__after_spinlock();
++
++ update_rq_clock(rq);
++
++ switch_count = &prev->nivcsw;
++
++ /* Task state changes only considers SM_PREEMPT as preemption */
++ preempt = sched_mode == SM_PREEMPT;
++
++ /*
++ * We must load prev->state once (task_struct::state is volatile), such
++ * that we form a control dependency vs deactivate_task() below.
++ */
++ prev_state = READ_ONCE(prev->__state);
++ if (sched_mode == SM_IDLE) {
++ if (!rq->nr_running) {
++ next = prev;
++ goto picked;
++ }
++ } else if (!preempt && prev_state) {
++ try_to_block_task(rq, prev, &prev_state);
++ switch_count = &prev->nvcsw;
++ }
++
++ check_curr(prev, rq);
++
++ next = choose_next_task(rq, cpu);
++picked:
++ clear_tsk_need_resched(prev);
++ clear_preempt_need_resched();
++ rq->last_seen_need_resched_ns = 0;
++
++ is_switch = prev != next;
++ if (likely(is_switch)) {
++ next->last_ran = rq->clock_task;
++
++ /*printk(KERN_INFO "sched: %px -> %px\n", prev, next);*/
++ rq->nr_switches++;
++ /*
++ * RCU users of rcu_dereference(rq->curr) may not see
++ * changes to task_struct made by pick_next_task().
++ */
++ RCU_INIT_POINTER(rq->curr, next);
++ /*
++ * The membarrier system call requires each architecture
++ * to have a full memory barrier after updating
++ * rq->curr, before returning to user-space.
++ *
++ * Here are the schemes providing that barrier on the
++ * various architectures:
++ * - mm ? switch_mm() : mmdrop() for x86, s390, sparc, PowerPC,
++ * RISC-V. switch_mm() relies on membarrier_arch_switch_mm()
++ * on PowerPC and on RISC-V.
++ * - finish_lock_switch() for weakly-ordered
++ * architectures where spin_unlock is a full barrier,
++ * - switch_to() for arm64 (weakly-ordered, spin_unlock
++ * is a RELEASE barrier),
++ *
++ * The barrier matches a full barrier in the proximity of
++ * the membarrier system call entry.
++ *
++ * On RISC-V, this barrier pairing is also needed for the
++ * SYNC_CORE command when switching between processes, cf.
++ * the inline comments in membarrier_arch_switch_mm().
++ */
++ ++*switch_count;
++
++ trace_sched_switch(preempt, prev, next, prev_state);
++
++ /* Also unlocks the rq: */
++ rq = context_switch(rq, prev, next);
++
++ cpu = cpu_of(rq);
++ } else {
++ __balance_callbacks(rq);
++ prio_balance(rq, cpu);
++ raw_spin_unlock_irq(&rq->lock);
++ }
++ trace_sched_exit_tp(is_switch);
++}
++
++void __noreturn do_task_dead(void)
++{
++ /* Causes final put_task_struct in finish_task_switch(): */
++ set_special_state(TASK_DEAD);
++
++ /* Tell freezer to ignore us: */
++ current->flags |= PF_NOFREEZE;
++
++ __schedule(SM_NONE);
++ BUG();
++
++ /* Avoid "noreturn function does return" - but don't continue if BUG() is a NOP: */
++ for (;;)
++ cpu_relax();
++}
++
++static inline void sched_submit_work(struct task_struct *tsk)
++{
++ static DEFINE_WAIT_OVERRIDE_MAP(sched_map, LD_WAIT_CONFIG);
++ unsigned int task_flags;
++
++ /*
++ * Establish LD_WAIT_CONFIG context to ensure none of the code called
++ * will use a blocking primitive -- which would lead to recursion.
++ */
++ lock_map_acquire_try(&sched_map);
++
++ task_flags = tsk->flags;
++ /*
++ * If a worker goes to sleep, notify and ask workqueue whether it
++ * wants to wake up a task to maintain concurrency.
++ */
++ if (task_flags & PF_WQ_WORKER)
++ wq_worker_sleeping(tsk);
++ else if (task_flags & PF_IO_WORKER)
++ io_wq_worker_sleeping(tsk);
++
++ /*
++ * spinlock and rwlock must not flush block requests. This will
++ * deadlock if the callback attempts to acquire a lock which is
++ * already acquired.
++ */
++ WARN_ON_ONCE(current->__state & TASK_RTLOCK_WAIT);
++
++ /*
++ * If we are going to sleep and we have plugged IO queued,
++ * make sure to submit it to avoid deadlocks.
++ */
++ blk_flush_plug(tsk->plug, true);
++
++ lock_map_release(&sched_map);
++}
++
++static void sched_update_worker(struct task_struct *tsk)
++{
++ if (tsk->flags & (PF_WQ_WORKER | PF_IO_WORKER | PF_BLOCK_TS)) {
++ if (tsk->flags & PF_BLOCK_TS)
++ blk_plug_invalidate_ts(tsk);
++ if (tsk->flags & PF_WQ_WORKER)
++ wq_worker_running(tsk);
++ else if (tsk->flags & PF_IO_WORKER)
++ io_wq_worker_running(tsk);
++ }
++}
++
++static __always_inline void __schedule_loop(int sched_mode)
++{
++ do {
++ preempt_disable();
++ __schedule(sched_mode);
++ sched_preempt_enable_no_resched();
++ } while (need_resched());
++}
++
++asmlinkage __visible void __sched schedule(void)
++{
++ struct task_struct *tsk = current;
++
++#ifdef CONFIG_RT_MUTEXES
++ lockdep_assert(!tsk->sched_rt_mutex);
++#endif
++
++ if (!task_is_running(tsk))
++ sched_submit_work(tsk);
++ __schedule_loop(SM_NONE);
++ sched_update_worker(tsk);
++}
++EXPORT_SYMBOL(schedule);
++
++/*
++ * synchronize_rcu_tasks() makes sure that no task is stuck in preempted
++ * state (have scheduled out non-voluntarily) by making sure that all
++ * tasks have either left the run queue or have gone into user space.
++ * As idle tasks do not do either, they must not ever be preempted
++ * (schedule out non-voluntarily).
++ *
++ * schedule_idle() is similar to schedule_preempt_disabled() except that it
++ * never enables preemption because it does not call sched_submit_work().
++ */
++void __sched schedule_idle(void)
++{
++ /*
++ * As this skips calling sched_submit_work(), which the idle task does
++ * regardless because that function is a NOP when the task is in a
++ * TASK_RUNNING state, make sure this isn't used someplace that the
++ * current task can be in any other state. Note, idle is always in the
++ * TASK_RUNNING state.
++ */
++ WARN_ON_ONCE(current->__state);
++ do {
++ __schedule(SM_IDLE);
++ } while (need_resched());
++}
++
++#if defined(CONFIG_CONTEXT_TRACKING_USER) && !defined(CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK)
++asmlinkage __visible void __sched schedule_user(void)
++{
++ /*
++ * If we come here after a random call to set_need_resched(),
++ * or we have been woken up remotely but the IPI has not yet arrived,
++ * we haven't yet exited the RCU idle mode. Do it here manually until
++ * we find a better solution.
++ *
++ * NB: There are buggy callers of this function. Ideally we
++ * should warn if prev_state != CT_STATE_USER, but that will trigger
++ * too frequently to make sense yet.
++ */
++ enum ctx_state prev_state = exception_enter();
++ schedule();
++ exception_exit(prev_state);
++}
++#endif
++
++/**
++ * schedule_preempt_disabled - called with preemption disabled
++ *
++ * Returns with preemption disabled. Note: preempt_count must be 1
++ */
++void __sched schedule_preempt_disabled(void)
++{
++ sched_preempt_enable_no_resched();
++ schedule();
++ preempt_disable();
++}
++
++#ifdef CONFIG_PREEMPT_RT
++void __sched notrace schedule_rtlock(void)
++{
++ __schedule_loop(SM_RTLOCK_WAIT);
++}
++NOKPROBE_SYMBOL(schedule_rtlock);
++#endif
++
++static void __sched notrace preempt_schedule_common(void)
++{
++ do {
++ /*
++ * Because the function tracer can trace preempt_count_sub()
++ * and it also uses preempt_enable/disable_notrace(), if
++ * NEED_RESCHED is set, the preempt_enable_notrace() called
++ * by the function tracer will call this function again and
++ * cause infinite recursion.
++ *
++ * Preemption must be disabled here before the function
++ * tracer can trace. Break up preempt_disable() into two
++ * calls. One to disable preemption without fear of being
++ * traced. The other to still record the preemption latency,
++ * which can also be traced by the function tracer.
++ */
++ preempt_disable_notrace();
++ preempt_latency_start(1);
++ __schedule(SM_PREEMPT);
++ preempt_latency_stop(1);
++ preempt_enable_no_resched_notrace();
++
++ /*
++ * Check again in case we missed a preemption opportunity
++ * between schedule and now.
++ */
++ } while (need_resched());
++}
++
++#ifdef CONFIG_PREEMPTION
++/*
++ * This is the entry point to schedule() from in-kernel preemption
++ * off of preempt_enable.
++ */
++asmlinkage __visible void __sched notrace preempt_schedule(void)
++{
++ /*
++ * If there is a non-zero preempt_count or interrupts are disabled,
++ * we do not want to preempt the current task. Just return..
++ */
++ if (likely(!preemptible()))
++ return;
++
++ preempt_schedule_common();
++}
++NOKPROBE_SYMBOL(preempt_schedule);
++EXPORT_SYMBOL(preempt_schedule);
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++# ifdef CONFIG_HAVE_PREEMPT_DYNAMIC_CALL
++# ifndef preempt_schedule_dynamic_enabled
++# define preempt_schedule_dynamic_enabled preempt_schedule
++# define preempt_schedule_dynamic_disabled NULL
++# endif
++DEFINE_STATIC_CALL(preempt_schedule, preempt_schedule_dynamic_enabled);
++EXPORT_STATIC_CALL_TRAMP(preempt_schedule);
++# elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
++static DEFINE_STATIC_KEY_TRUE(sk_dynamic_preempt_schedule);
++void __sched notrace dynamic_preempt_schedule(void)
++{
++ if (!static_branch_unlikely(&sk_dynamic_preempt_schedule))
++ return;
++ preempt_schedule();
++}
++NOKPROBE_SYMBOL(dynamic_preempt_schedule);
++EXPORT_SYMBOL(dynamic_preempt_schedule);
++# endif
++#endif /* CONFIG_PREEMPT_DYNAMIC */
++
++/**
++ * preempt_schedule_notrace - preempt_schedule called by tracing
++ *
++ * The tracing infrastructure uses preempt_enable_notrace to prevent
++ * recursion and tracing preempt enabling caused by the tracing
++ * infrastructure itself. But as tracing can happen in areas coming
++ * from userspace or just about to enter userspace, a preempt enable
++ * can occur before user_exit() is called. This will cause the scheduler
++ * to be called when the system is still in usermode.
++ *
++ * To prevent this, the preempt_enable_notrace will use this function
++ * instead of preempt_schedule() to exit user context if needed before
++ * calling the scheduler.
++ */
++asmlinkage __visible void __sched notrace preempt_schedule_notrace(void)
++{
++ enum ctx_state prev_ctx;
++
++ if (likely(!preemptible()))
++ return;
++
++ do {
++ /*
++ * Because the function tracer can trace preempt_count_sub()
++ * and it also uses preempt_enable/disable_notrace(), if
++ * NEED_RESCHED is set, the preempt_enable_notrace() called
++ * by the function tracer will call this function again and
++ * cause infinite recursion.
++ *
++ * Preemption must be disabled here before the function
++ * tracer can trace. Break up preempt_disable() into two
++ * calls. One to disable preemption without fear of being
++ * traced. The other to still record the preemption latency,
++ * which can also be traced by the function tracer.
++ */
++ preempt_disable_notrace();
++ preempt_latency_start(1);
++ /*
++ * Needs preempt disabled in case user_exit() is traced
++ * and the tracer calls preempt_enable_notrace() causing
++ * an infinite recursion.
++ */
++ prev_ctx = exception_enter();
++ __schedule(SM_PREEMPT);
++ exception_exit(prev_ctx);
++
++ preempt_latency_stop(1);
++ preempt_enable_no_resched_notrace();
++ } while (need_resched());
++}
++EXPORT_SYMBOL_GPL(preempt_schedule_notrace);
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++# ifdef CONFIG_HAVE_PREEMPT_DYNAMIC_CALL
++# ifndef preempt_schedule_notrace_dynamic_enabled
++# define preempt_schedule_notrace_dynamic_enabled preempt_schedule_notrace
++# define preempt_schedule_notrace_dynamic_disabled NULL
++# endif
++DEFINE_STATIC_CALL(preempt_schedule_notrace, preempt_schedule_notrace_dynamic_enabled);
++EXPORT_STATIC_CALL_TRAMP(preempt_schedule_notrace);
++# elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
++static DEFINE_STATIC_KEY_TRUE(sk_dynamic_preempt_schedule_notrace);
++void __sched notrace dynamic_preempt_schedule_notrace(void)
++{
++ if (!static_branch_unlikely(&sk_dynamic_preempt_schedule_notrace))
++ return;
++ preempt_schedule_notrace();
++}
++NOKPROBE_SYMBOL(dynamic_preempt_schedule_notrace);
++EXPORT_SYMBOL(dynamic_preempt_schedule_notrace);
++# endif
++#endif /* CONFIG_PREEMPT_DYNAMIC */
++
++#endif /* CONFIG_PREEMPTION */
++
++/*
++ * This is the entry point to schedule() from kernel preemption
++ * off of IRQ context.
++ * Note that this is called and returns with IRQs disabled. This will
++ * protect us against recursive calling from IRQ contexts.
++ */
++asmlinkage __visible void __sched preempt_schedule_irq(void)
++{
++ enum ctx_state prev_state;
++
++ /* Catch callers which need to be fixed */
++ BUG_ON(preempt_count() || !irqs_disabled());
++
++ prev_state = exception_enter();
++
++ do {
++ preempt_disable();
++ local_irq_enable();
++ __schedule(SM_PREEMPT);
++ local_irq_disable();
++ sched_preempt_enable_no_resched();
++ } while (need_resched());
++
++ exception_exit(prev_state);
++}
++
++int default_wake_function(wait_queue_entry_t *curr, unsigned mode, int wake_flags,
++ void *key)
++{
++ WARN_ON_ONCE(wake_flags & ~(WF_SYNC|WF_CURRENT_CPU));
++ return try_to_wake_up(curr->private, mode, wake_flags);
++}
++EXPORT_SYMBOL(default_wake_function);
++
++void check_task_changed(struct task_struct *p, struct rq *rq)
++{
++ /* Trigger resched if task sched_prio has been modified. */
++ if (task_on_rq_queued(p)) {
++ update_rq_clock(rq);
++ requeue_task(p, rq);
++ wakeup_preempt(rq);
++ }
++}
++
++void __setscheduler_prio(struct task_struct *p, int prio)
++{
++ p->prio = prio;
++}
++
++#ifdef CONFIG_RT_MUTEXES
++
++/*
++ * Would be more useful with typeof()/auto_type but they don't mix with
++ * bit-fields. Since it's a local thing, use int. Keep the generic sounding
++ * name such that if someone were to implement this function we get to compare
++ * notes.
++ */
++#define fetch_and_set(x, v) ({ int _x = (x); (x) = (v); _x; })
++
++void rt_mutex_pre_schedule(void)
++{
++ lockdep_assert(!fetch_and_set(current->sched_rt_mutex, 1));
++ sched_submit_work(current);
++}
++
++void rt_mutex_schedule(void)
++{
++ lockdep_assert(current->sched_rt_mutex);
++ __schedule_loop(SM_NONE);
++}
++
++void rt_mutex_post_schedule(void)
++{
++ sched_update_worker(current);
++ lockdep_assert(fetch_and_set(current->sched_rt_mutex, 0));
++}
++
++/*
++ * rt_mutex_setprio - set the current priority of a task
++ * @p: task to boost
++ * @pi_task: donor task
++ *
++ * This function changes the 'effective' priority of a task. It does
++ * not touch ->normal_prio like __setscheduler().
++ *
++ * Used by the rt_mutex code to implement priority inheritance
++ * logic. Call site only calls if the priority of the task changed.
++ */
++void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task)
++{
++ int prio;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++
++ /* XXX used to be waiter->prio, not waiter->task->prio */
++ prio = __rt_effective_prio(pi_task, p->normal_prio);
++
++ /*
++ * If nothing changed; bail early.
++ */
++ if (p->pi_top_task == pi_task && prio == p->prio)
++ return;
++
++ rq = __task_access_lock(p, &lock);
++ /*
++ * Set under pi_lock && rq->lock, such that the value can be used under
++ * either lock.
++ *
++ * Note that there is plenty of trickiness in making this pointer cache work
++ * right. rt_mutex_slowunlock()+rt_mutex_postunlock() work together to
++ * ensure a task is de-boosted (pi_task is set to NULL) before the
++ * task is allowed to run again (and can exit). This ensures the pointer
++ * points to a blocked task -- which guarantees the task is present.
++ */
++ p->pi_top_task = pi_task;
++
++ /*
++ * For FIFO/RR we only need to set prio, if that matches we're done.
++ */
++ if (prio == p->prio)
++ goto out_unlock;
++
++ /*
++ * Idle task boosting is a no-no in general. There is one
++ * exception, when PREEMPT_RT and NOHZ is active:
++ *
++ * The idle task calls get_next_timer_interrupt() and holds
++ * the timer wheel base->lock on the CPU and another CPU wants
++ * to access the timer (probably to cancel it). We can safely
++ * ignore the boosting request, as the idle CPU runs this code
++ * with interrupts disabled and will complete the lock
++ * protected section without being interrupted. So there is no
++ * real need to boost.
++ */
++ if (unlikely(p == rq->idle)) {
++ WARN_ON(p != rq->curr);
++ WARN_ON(p->pi_blocked_on);
++ goto out_unlock;
++ }
++
++ trace_sched_pi_setprio(p, pi_task);
++
++ __setscheduler_prio(p, prio);
++
++ check_task_changed(p, rq);
++out_unlock:
++ /* Avoid rq from going away on us: */
++ preempt_disable();
++
++ if (task_on_rq_queued(p))
++ __balance_callbacks(rq);
++ __task_access_unlock(p, lock);
++
++ preempt_enable();
++}
++#endif /* CONFIG_RT_MUTEXES */
++
++#if !defined(CONFIG_PREEMPTION) || defined(CONFIG_PREEMPT_DYNAMIC)
++int __sched __cond_resched(void)
++{
++ if (should_resched(0) && !irqs_disabled()) {
++ preempt_schedule_common();
++ return 1;
++ }
++ /*
++ * In PREEMPT_RCU kernels, ->rcu_read_lock_nesting tells the tick
++ * whether the current CPU is in an RCU read-side critical section,
++ * so the tick can report quiescent states even for CPUs looping
++ * in kernel context. In contrast, in non-preemptible kernels,
++ * RCU readers leave no in-memory hints, which means that CPU-bound
++ * processes executing in kernel context might never report an
++ * RCU quiescent state. Therefore, the following code causes
++ * cond_resched() to report a quiescent state, but only when RCU
++ * is in urgent need of one.
++ * A third case, preemptible, but non-PREEMPT_RCU provides for
++ * urgently needed quiescent states via rcu_flavor_sched_clock_irq().
++ */
++#ifndef CONFIG_PREEMPT_RCU
++ rcu_all_qs();
++#endif
++ return 0;
++}
++EXPORT_SYMBOL(__cond_resched);
++#endif
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++# ifdef CONFIG_HAVE_PREEMPT_DYNAMIC_CALL
++# define cond_resched_dynamic_enabled __cond_resched
++# define cond_resched_dynamic_disabled ((void *)&__static_call_return0)
++DEFINE_STATIC_CALL_RET0(cond_resched, __cond_resched);
++EXPORT_STATIC_CALL_TRAMP(cond_resched);
++
++# define might_resched_dynamic_enabled __cond_resched
++# define might_resched_dynamic_disabled ((void *)&__static_call_return0)
++DEFINE_STATIC_CALL_RET0(might_resched, __cond_resched);
++EXPORT_STATIC_CALL_TRAMP(might_resched);
++# elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
++static DEFINE_STATIC_KEY_FALSE(sk_dynamic_cond_resched);
++int __sched dynamic_cond_resched(void)
++{
++ if (!static_branch_unlikely(&sk_dynamic_cond_resched))
++ return 0;
++ return __cond_resched();
++}
++EXPORT_SYMBOL(dynamic_cond_resched);
++
++static DEFINE_STATIC_KEY_FALSE(sk_dynamic_might_resched);
++int __sched dynamic_might_resched(void)
++{
++ if (!static_branch_unlikely(&sk_dynamic_might_resched))
++ return 0;
++ return __cond_resched();
++}
++EXPORT_SYMBOL(dynamic_might_resched);
++# endif
++#endif /* CONFIG_PREEMPT_DYNAMIC */
++
++/*
++ * __cond_resched_lock() - if a reschedule is pending, drop the given lock,
++ * call schedule, and on return reacquire the lock.
++ *
++ * This works OK both with and without CONFIG_PREEMPTION. We do strange low-level
++ * operations here to prevent schedule() from being called twice (once via
++ * spin_unlock(), once by hand).
++ */
++int __cond_resched_lock(spinlock_t *lock)
++{
++ int resched = should_resched(PREEMPT_LOCK_OFFSET);
++ int ret = 0;
++
++ lockdep_assert_held(lock);
++
++ if (spin_needbreak(lock) || resched) {
++ spin_unlock(lock);
++ if (!_cond_resched())
++ cpu_relax();
++ ret = 1;
++ spin_lock(lock);
++ }
++ return ret;
++}
++EXPORT_SYMBOL(__cond_resched_lock);
++
++int __cond_resched_rwlock_read(rwlock_t *lock)
++{
++ int resched = should_resched(PREEMPT_LOCK_OFFSET);
++ int ret = 0;
++
++ lockdep_assert_held_read(lock);
++
++ if (rwlock_needbreak(lock) || resched) {
++ read_unlock(lock);
++ if (!_cond_resched())
++ cpu_relax();
++ ret = 1;
++ read_lock(lock);
++ }
++ return ret;
++}
++EXPORT_SYMBOL(__cond_resched_rwlock_read);
++
++int __cond_resched_rwlock_write(rwlock_t *lock)
++{
++ int resched = should_resched(PREEMPT_LOCK_OFFSET);
++ int ret = 0;
++
++ lockdep_assert_held_write(lock);
++
++ if (rwlock_needbreak(lock) || resched) {
++ write_unlock(lock);
++ if (!_cond_resched())
++ cpu_relax();
++ ret = 1;
++ write_lock(lock);
++ }
++ return ret;
++}
++EXPORT_SYMBOL(__cond_resched_rwlock_write);
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++
++# ifdef CONFIG_GENERIC_ENTRY
++# include <linux/entry-common.h>
++# endif
++
++/*
++ * SC:cond_resched
++ * SC:might_resched
++ * SC:preempt_schedule
++ * SC:preempt_schedule_notrace
++ * SC:irqentry_exit_cond_resched
++ *
++ *
++ * NONE:
++ * cond_resched <- __cond_resched
++ * might_resched <- RET0
++ * preempt_schedule <- NOP
++ * preempt_schedule_notrace <- NOP
++ * irqentry_exit_cond_resched <- NOP
++ * dynamic_preempt_lazy <- false
++ *
++ * VOLUNTARY:
++ * cond_resched <- __cond_resched
++ * might_resched <- __cond_resched
++ * preempt_schedule <- NOP
++ * preempt_schedule_notrace <- NOP
++ * irqentry_exit_cond_resched <- NOP
++ * dynamic_preempt_lazy <- false
++ *
++ * FULL:
++ * cond_resched <- RET0
++ * might_resched <- RET0
++ * preempt_schedule <- preempt_schedule
++ * preempt_schedule_notrace <- preempt_schedule_notrace
++ * irqentry_exit_cond_resched <- irqentry_exit_cond_resched
++ * dynamic_preempt_lazy <- false
++ *
++ * LAZY:
++ * cond_resched <- RET0
++ * might_resched <- RET0
++ * preempt_schedule <- preempt_schedule
++ * preempt_schedule_notrace <- preempt_schedule_notrace
++ * irqentry_exit_cond_resched <- irqentry_exit_cond_resched
++ * dynamic_preempt_lazy <- true
++ */
++
++enum {
++ preempt_dynamic_undefined = -1,
++ preempt_dynamic_none,
++ preempt_dynamic_voluntary,
++ preempt_dynamic_full,
++ preempt_dynamic_lazy,
++};
++
++int preempt_dynamic_mode = preempt_dynamic_undefined;
++
++int sched_dynamic_mode(const char *str)
++{
++# ifndef CONFIG_PREEMPT_RT
++ if (!strcmp(str, "none"))
++ return preempt_dynamic_none;
++
++ if (!strcmp(str, "voluntary"))
++ return preempt_dynamic_voluntary;
++# endif
++
++ if (!strcmp(str, "full"))
++ return preempt_dynamic_full;
++
++# ifdef CONFIG_ARCH_HAS_PREEMPT_LAZY
++ if (!strcmp(str, "lazy"))
++ return preempt_dynamic_lazy;
++# endif
++
++ return -EINVAL;
++}
++
++# define preempt_dynamic_key_enable(f) static_key_enable(&sk_dynamic_##f.key)
++# define preempt_dynamic_key_disable(f) static_key_disable(&sk_dynamic_##f.key)
++
++# if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
++# define preempt_dynamic_enable(f) static_call_update(f, f##_dynamic_enabled)
++# define preempt_dynamic_disable(f) static_call_update(f, f##_dynamic_disabled)
++# elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
++# define preempt_dynamic_enable(f) preempt_dynamic_key_enable(f)
++# define preempt_dynamic_disable(f) preempt_dynamic_key_disable(f)
++# else
++# error "Unsupported PREEMPT_DYNAMIC mechanism"
++# endif
++
++static DEFINE_MUTEX(sched_dynamic_mutex);
++
++static void __sched_dynamic_update(int mode)
++{
++ /*
++ * Avoid {NONE,VOLUNTARY} -> FULL transitions from ever ending up in
++ * the ZERO state, which is invalid.
++ */
++ preempt_dynamic_enable(cond_resched);
++ preempt_dynamic_enable(might_resched);
++ preempt_dynamic_enable(preempt_schedule);
++ preempt_dynamic_enable(preempt_schedule_notrace);
++ preempt_dynamic_enable(irqentry_exit_cond_resched);
++ preempt_dynamic_key_disable(preempt_lazy);
++
++ switch (mode) {
++ case preempt_dynamic_none:
++ preempt_dynamic_enable(cond_resched);
++ preempt_dynamic_disable(might_resched);
++ preempt_dynamic_disable(preempt_schedule);
++ preempt_dynamic_disable(preempt_schedule_notrace);
++ preempt_dynamic_disable(irqentry_exit_cond_resched);
++ preempt_dynamic_key_disable(preempt_lazy);
++ if (mode != preempt_dynamic_mode)
++ pr_info("Dynamic Preempt: none\n");
++ break;
++
++ case preempt_dynamic_voluntary:
++ preempt_dynamic_enable(cond_resched);
++ preempt_dynamic_enable(might_resched);
++ preempt_dynamic_disable(preempt_schedule);
++ preempt_dynamic_disable(preempt_schedule_notrace);
++ preempt_dynamic_disable(irqentry_exit_cond_resched);
++ preempt_dynamic_key_disable(preempt_lazy);
++ if (mode != preempt_dynamic_mode)
++ pr_info("Dynamic Preempt: voluntary\n");
++ break;
++
++ case preempt_dynamic_full:
++ preempt_dynamic_enable(cond_resched);
++ preempt_dynamic_disable(might_resched);
++ preempt_dynamic_enable(preempt_schedule);
++ preempt_dynamic_enable(preempt_schedule_notrace);
++ preempt_dynamic_enable(irqentry_exit_cond_resched);
++ preempt_dynamic_key_disable(preempt_lazy);
++ if (mode != preempt_dynamic_mode)
++ pr_info("Dynamic Preempt: full\n");
++ break;
++
++ case preempt_dynamic_lazy:
++ preempt_dynamic_disable(cond_resched);
++ preempt_dynamic_disable(might_resched);
++ preempt_dynamic_enable(preempt_schedule);
++ preempt_dynamic_enable(preempt_schedule_notrace);
++ preempt_dynamic_enable(irqentry_exit_cond_resched);
++ preempt_dynamic_key_enable(preempt_lazy);
++ if (mode != preempt_dynamic_mode)
++ pr_info("Dynamic Preempt: lazy\n");
++ break;
++ }
++
++ preempt_dynamic_mode = mode;
++}
++
++void sched_dynamic_update(int mode)
++{
++ mutex_lock(&sched_dynamic_mutex);
++ __sched_dynamic_update(mode);
++ mutex_unlock(&sched_dynamic_mutex);
++}
++
++static int __init setup_preempt_mode(char *str)
++{
++ int mode = sched_dynamic_mode(str);
++ if (mode < 0) {
++ pr_warn("Dynamic Preempt: unsupported mode: %s\n", str);
++ return 0;
++ }
++
++ sched_dynamic_update(mode);
++ return 1;
++}
++__setup("preempt=", setup_preempt_mode);
++
++static void __init preempt_dynamic_init(void)
++{
++ if (preempt_dynamic_mode == preempt_dynamic_undefined) {
++ if (IS_ENABLED(CONFIG_PREEMPT_NONE)) {
++ sched_dynamic_update(preempt_dynamic_none);
++ } else if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY)) {
++ sched_dynamic_update(preempt_dynamic_voluntary);
++ } else if (IS_ENABLED(CONFIG_PREEMPT_LAZY)) {
++ sched_dynamic_update(preempt_dynamic_lazy);
++ } else {
++ /* Default static call setting, nothing to do */
++ WARN_ON_ONCE(!IS_ENABLED(CONFIG_PREEMPT));
++ preempt_dynamic_mode = preempt_dynamic_full;
++ pr_info("Dynamic Preempt: full\n");
++ }
++ }
++}
++
++# define PREEMPT_MODEL_ACCESSOR(mode) \
++ bool preempt_model_##mode(void) \
++ { \
++ WARN_ON_ONCE(preempt_dynamic_mode == preempt_dynamic_undefined); \
++ return preempt_dynamic_mode == preempt_dynamic_##mode; \
++ } \
++ EXPORT_SYMBOL_GPL(preempt_model_##mode)
++
++PREEMPT_MODEL_ACCESSOR(none);
++PREEMPT_MODEL_ACCESSOR(voluntary);
++PREEMPT_MODEL_ACCESSOR(full);
++PREEMPT_MODEL_ACCESSOR(lazy);
++
++#else /* !CONFIG_PREEMPT_DYNAMIC: */
++
++#define preempt_dynamic_mode -1
++
++static inline void preempt_dynamic_init(void) { }
++
++#endif /* CONFIG_PREEMPT_DYNAMIC */
++
++const char *preempt_modes[] = {
++ "none", "voluntary", "full", "lazy", NULL,
++};
++
++const char *preempt_model_str(void)
++{
++ bool brace = IS_ENABLED(CONFIG_PREEMPT_RT) &&
++ (IS_ENABLED(CONFIG_PREEMPT_DYNAMIC) ||
++ IS_ENABLED(CONFIG_PREEMPT_LAZY));
++ static char buf[128];
++
++ if (IS_ENABLED(CONFIG_PREEMPT_BUILD)) {
++ struct seq_buf s;
++
++ seq_buf_init(&s, buf, sizeof(buf));
++ seq_buf_puts(&s, "PREEMPT");
++
++ if (IS_ENABLED(CONFIG_PREEMPT_RT))
++ seq_buf_printf(&s, "%sRT%s",
++ brace ? "_{" : "_",
++ brace ? "," : "");
++
++ if (IS_ENABLED(CONFIG_PREEMPT_DYNAMIC)) {
++ seq_buf_printf(&s, "(%s)%s",
++ preempt_dynamic_mode >= 0 ?
++ preempt_modes[preempt_dynamic_mode] : "undef",
++ brace ? "}" : "");
++ return seq_buf_str(&s);
++ }
++
++ if (IS_ENABLED(CONFIG_PREEMPT_LAZY)) {
++ seq_buf_printf(&s, "LAZY%s",
++ brace ? "}" : "");
++ return seq_buf_str(&s);
++ }
++
++ return seq_buf_str(&s);
++ }
++
++ if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY_BUILD))
++ return "VOLUNTARY";
++
++ return "NONE";
++}
++
++int io_schedule_prepare(void)
++{
++ int old_iowait = current->in_iowait;
++
++ current->in_iowait = 1;
++ blk_flush_plug(current->plug, true);
++ return old_iowait;
++}
++
++void io_schedule_finish(int token)
++{
++ current->in_iowait = token;
++}
++
++/*
++ * This task is about to go to sleep on IO. Increment rq->nr_iowait so
++ * that process accounting knows that this is a task in IO wait state.
++ *
++ * But don't do that if it is a deliberate, throttling IO wait (this task
++ * has set its backing_dev_info: the queue against which it should throttle)
++ */
++
++long __sched io_schedule_timeout(long timeout)
++{
++ int token;
++ long ret;
++
++ token = io_schedule_prepare();
++ ret = schedule_timeout(timeout);
++ io_schedule_finish(token);
++
++ return ret;
++}
++EXPORT_SYMBOL(io_schedule_timeout);
++
++void __sched io_schedule(void)
++{
++ int token;
++
++ token = io_schedule_prepare();
++ schedule();
++ io_schedule_finish(token);
++}
++EXPORT_SYMBOL(io_schedule);
++
++void sched_show_task(struct task_struct *p)
++{
++ unsigned long free;
++ int ppid;
++
++ if (!try_get_task_stack(p))
++ return;
++
++ pr_info("task:%-15.15s state:%c", p->comm, task_state_to_char(p));
++
++ if (task_is_running(p))
++ pr_cont(" running task ");
++ free = stack_not_used(p);
++ ppid = 0;
++ rcu_read_lock();
++ if (pid_alive(p))
++ ppid = task_pid_nr(rcu_dereference(p->real_parent));
++ rcu_read_unlock();
++ pr_cont(" stack:%-5lu pid:%-5d tgid:%-5d ppid:%-6d task_flags:0x%04x flags:0x%08lx\n",
++ free, task_pid_nr(p), task_tgid_nr(p),
++ ppid, p->flags, read_task_thread_flags(p));
++
++ print_worker_info(KERN_INFO, p);
++ print_stop_info(KERN_INFO, p);
++ show_stack(p, NULL, KERN_INFO);
++ put_task_stack(p);
++}
++EXPORT_SYMBOL_GPL(sched_show_task);
++
++static inline bool
++state_filter_match(unsigned long state_filter, struct task_struct *p)
++{
++ unsigned int state = READ_ONCE(p->__state);
++
++ /* no filter, everything matches */
++ if (!state_filter)
++ return true;
++
++ /* filter, but doesn't match */
++ if (!(state & state_filter))
++ return false;
++
++ /*
++ * When looking for TASK_UNINTERRUPTIBLE skip TASK_IDLE (allows
++ * TASK_KILLABLE).
++ */
++ if (state_filter == TASK_UNINTERRUPTIBLE && (state & TASK_NOLOAD))
++ return false;
++
++ return true;
++}
++
++
++void show_state_filter(unsigned int state_filter)
++{
++ struct task_struct *g, *p;
++
++ rcu_read_lock();
++ for_each_process_thread(g, p) {
++ /*
++ * reset the NMI-timeout, listing all files on a slow
++ * console might take a lot of time:
++ * Also, reset softlockup watchdogs on all CPUs, because
++ * another CPU might be blocked waiting for us to process
++ * an IPI.
++ */
++ touch_nmi_watchdog();
++ touch_all_softlockup_watchdogs();
++ if (state_filter_match(state_filter, p))
++ sched_show_task(p);
++ }
++
++ /* TODO: Alt schedule FW should support this
++ if (!state_filter)
++ sysrq_sched_debug_show();
++ */
++ rcu_read_unlock();
++ /*
++ * Only show locks if all tasks are dumped:
++ */
++ if (!state_filter)
++ debug_show_all_locks();
++}
++
++void dump_cpu_task(int cpu)
++{
++ if (in_hardirq() && cpu == smp_processor_id()) {
++ struct pt_regs *regs;
++
++ regs = get_irq_regs();
++ if (regs) {
++ show_regs(regs);
++ return;
++ }
++ }
++
++ if (trigger_single_cpu_backtrace(cpu))
++ return;
++
++ pr_info("Task dump for CPU %d:\n", cpu);
++ sched_show_task(cpu_curr(cpu));
++}
++
++/**
++ * init_idle - set up an idle thread for a given CPU
++ * @idle: task in question
++ * @cpu: CPU the idle task belongs to
++ *
++ * NOTE: this function does not set the idle thread's NEED_RESCHED
++ * flag, to make booting more robust.
++ */
++void __init init_idle(struct task_struct *idle, int cpu)
++{
++ struct affinity_context ac = (struct affinity_context) {
++ .new_mask = cpumask_of(cpu),
++ .flags = 0,
++ };
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++
++ raw_spin_lock_irqsave(&idle->pi_lock, flags);
++ raw_spin_lock(&rq->lock);
++
++ idle->last_ran = rq->clock_task;
++ idle->__state = TASK_RUNNING;
++ /*
++ * PF_KTHREAD should already be set at this point; regardless, make it
++ * look like a proper per-CPU kthread.
++ */
++ idle->flags |= PF_KTHREAD | PF_NO_SETAFFINITY;
++ kthread_set_per_cpu(idle, cpu);
++
++ sched_queue_init_idle(&rq->queue, idle);
++
++ /*
++ * No validation and serialization required at boot time and for
++ * setting up the idle tasks of not yet online CPUs.
++ */
++ set_cpus_allowed_common(idle, &ac);
++
++ /* Silence PROVE_RCU */
++ rcu_read_lock();
++ __set_task_cpu(idle, cpu);
++ rcu_read_unlock();
++
++ rq->idle = idle;
++ rcu_assign_pointer(rq->curr, idle);
++ idle->on_cpu = 1;
++
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&idle->pi_lock, flags);
++
++ /* Set the preempt count _outside_ the spinlocks! */
++ init_idle_preempt_count(idle, cpu);
++
++ ftrace_graph_init_idle_task(idle, cpu);
++ vtime_init_idle(idle, cpu);
++ sprintf(idle->comm, "%s/%d", INIT_TASK_COMM, cpu);
++}
++
++int cpuset_cpumask_can_shrink(const struct cpumask __maybe_unused *cur,
++ const struct cpumask __maybe_unused *trial)
++{
++ return 1;
++}
++
++int task_can_attach(struct task_struct *p)
++{
++ int ret = 0;
++
++ /*
++ * Kthreads which disallow setaffinity shouldn't be moved
++ * to a new cpuset; we don't want to change their CPU
++ * affinity and isolating such threads by their set of
++ * allowed nodes is unnecessary. Thus, cpusets are not
++ * applicable for such threads. This prevents checking for
++ * success of set_cpus_allowed_ptr() on all attached tasks
++ * before cpus_mask may be changed.
++ */
++ if (p->flags & PF_NO_SETAFFINITY)
++ ret = -EINVAL;
++
++ return ret;
++}
++
++bool sched_smp_initialized __read_mostly;
++
++#ifdef CONFIG_HOTPLUG_CPU
++/*
++ * Invoked on the outgoing CPU in context of the CPU hotplug thread
++ * after ensuring that there are no user space tasks left on the CPU.
++ *
++ * If there is a lazy mm in use on the hotplug thread, drop it and
++ * switch to init_mm.
++ *
++ * The reference count on init_mm is dropped in finish_cpu().
++ */
++static void sched_force_init_mm(void)
++{
++ struct mm_struct *mm = current->active_mm;
++
++ if (mm != &init_mm) {
++ mmgrab_lazy_tlb(&init_mm);
++ local_irq_disable();
++ current->active_mm = &init_mm;
++ switch_mm_irqs_off(mm, &init_mm, current);
++ local_irq_enable();
++ finish_arch_post_lock_switch();
++ mmdrop_lazy_tlb(mm);
++ }
++
++ /* finish_cpu(), as ran on the BP, will clean up the active_mm state */
++}
++
++static int __balance_push_cpu_stop(void *arg)
++{
++ struct task_struct *p = arg;
++ struct rq *rq = this_rq();
++ struct rq_flags rf;
++ int cpu;
++
++ raw_spin_lock_irq(&p->pi_lock);
++ rq_lock(rq, &rf);
++
++ update_rq_clock(rq);
++
++ if (task_rq(p) == rq && task_on_rq_queued(p)) {
++ cpu = select_fallback_rq(rq->cpu, p);
++ rq = __migrate_task(rq, p, cpu);
++ }
++
++ rq_unlock(rq, &rf);
++ raw_spin_unlock_irq(&p->pi_lock);
++
++ put_task_struct(p);
++
++ return 0;
++}
++
++static DEFINE_PER_CPU(struct cpu_stop_work, push_work);
++
++/*
++ * This is enabled below SCHED_AP_ACTIVE (i.e. when !cpu_active()), but it
++ * only takes effect while the CPU is being taken down.
++ */
++static void balance_push(struct rq *rq)
++{
++ struct task_struct *push_task = rq->curr;
++
++ lockdep_assert_held(&rq->lock);
++
++ /*
++ * Ensure the thing is persistent until balance_push_set(.on = false);
++ */
++ rq->balance_callback = &balance_push_callback;
++
++ /*
++ * Only active while going offline and when invoked on the outgoing
++ * CPU.
++ */
++ if (!cpu_dying(rq->cpu) || rq != this_rq())
++ return;
++
++ /*
++ * Both the cpu-hotplug and stop task are in this case and are
++ * required to complete the hotplug process.
++ */
++ if (kthread_is_per_cpu(push_task) ||
++ is_migration_disabled(push_task)) {
++
++ /*
++ * If this is the idle task on the outgoing CPU try to wake
++ * up the hotplug control thread which might wait for the
++ * last task to vanish. The rcuwait_active() check is
++ * accurate here because the waiter is pinned on this CPU
++	 * and obviously can't be running in parallel.
++ *
++ * On RT kernels this also has to check whether there are
++ * pinned and scheduled out tasks on the runqueue. They
++ * need to leave the migrate disabled section first.
++ */
++ if (!rq->nr_running && !rq_has_pinned_tasks(rq) &&
++ rcuwait_active(&rq->hotplug_wait)) {
++ raw_spin_unlock(&rq->lock);
++ rcuwait_wake_up(&rq->hotplug_wait);
++ raw_spin_lock(&rq->lock);
++ }
++ return;
++ }
++
++ get_task_struct(push_task);
++ /*
++ * Temporarily drop rq->lock such that we can wake-up the stop task.
++ * Both preemption and IRQs are still disabled.
++ */
++ preempt_disable();
++ raw_spin_unlock(&rq->lock);
++ stop_one_cpu_nowait(rq->cpu, __balance_push_cpu_stop, push_task,
++ this_cpu_ptr(&push_work));
++ preempt_enable();
++ /*
++ * At this point need_resched() is true and we'll take the loop in
++ * schedule(). The next pick is obviously going to be the stop task
++ * which kthread_is_per_cpu() and will push this task away.
++ */
++ raw_spin_lock(&rq->lock);
++}
++
++static void balance_push_set(int cpu, bool on)
++{
++ struct rq *rq = cpu_rq(cpu);
++ struct rq_flags rf;
++
++ rq_lock_irqsave(rq, &rf);
++ if (on) {
++ WARN_ON_ONCE(rq->balance_callback);
++ rq->balance_callback = &balance_push_callback;
++ } else if (rq->balance_callback == &balance_push_callback) {
++ rq->balance_callback = NULL;
++ }
++ rq_unlock_irqrestore(rq, &rf);
++}
++
++/*
++ * Invoked from a CPU's hotplug control thread after the CPU has been marked
++ * inactive. All tasks which are not per CPU kernel threads are either
++ * pushed off this CPU now via balance_push() or placed on a different CPU
++ * during wakeup. Wait until the CPU is quiescent.
++ */
++static void balance_hotplug_wait(void)
++{
++ struct rq *rq = this_rq();
++
++ rcuwait_wait_event(&rq->hotplug_wait,
++ rq->nr_running == 1 && !rq_has_pinned_tasks(rq),
++ TASK_UNINTERRUPTIBLE);
++}
++
++#else /* !CONFIG_HOTPLUG_CPU: */
++
++static void balance_push(struct rq *rq)
++{
++}
++
++static void balance_push_set(int cpu, bool on)
++{
++}
++
++static inline void balance_hotplug_wait(void)
++{
++}
++#endif /* !CONFIG_HOTPLUG_CPU */
++
++static void set_rq_offline(struct rq *rq)
++{
++ if (rq->online) {
++ update_rq_clock(rq);
++ rq->online = false;
++ }
++}
++
++static void set_rq_online(struct rq *rq)
++{
++ if (!rq->online)
++ rq->online = true;
++}
++
++static inline void sched_set_rq_online(struct rq *rq, int cpu)
++{
++ unsigned long flags;
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ set_rq_online(rq);
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++}
++
++static inline void sched_set_rq_offline(struct rq *rq, int cpu)
++{
++ unsigned long flags;
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ set_rq_offline(rq);
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++}
++
++/*
++ * used to mark begin/end of suspend/resume:
++ */
++static int num_cpus_frozen;
++
++/*
++ * Update cpusets according to cpu_active mask. If cpusets are
++ * disabled, cpuset_update_active_cpus() becomes a simple wrapper
++ * around partition_sched_domains().
++ *
++ * If we come here as part of a suspend/resume, don't touch cpusets because we
++ * want to restore it back to its original state upon resume anyway.
++ */
++static void cpuset_cpu_active(void)
++{
++ if (cpuhp_tasks_frozen) {
++ /*
++		 * num_cpus_frozen tracks how many CPUs are involved in the
++		 * suspend/resume sequence. As long as this is not the last online
++ * operation in the resume sequence, just build a single sched
++ * domain, ignoring cpusets.
++ */
++ cpuset_reset_sched_domains();
++ if (--num_cpus_frozen)
++ return;
++ /*
++ * This is the last CPU online operation. So fall through and
++ * restore the original sched domains by considering the
++ * cpuset configurations.
++ */
++ cpuset_force_rebuild();
++ }
++
++ cpuset_update_active_cpus();
++}
++
++static void cpuset_cpu_inactive(unsigned int cpu)
++{
++ if (!cpuhp_tasks_frozen) {
++ cpuset_update_active_cpus();
++ } else {
++ num_cpus_frozen++;
++ cpuset_reset_sched_domains();
++ }
++}
++
++static inline void sched_smt_present_inc(int cpu)
++{
++#ifdef CONFIG_SCHED_SMT
++ if (cpumask_weight(cpu_smt_mask(cpu)) == 2) {
++ static_branch_inc_cpuslocked(&sched_smt_present);
++ cpumask_or(&sched_smt_mask, &sched_smt_mask, cpu_smt_mask(cpu));
++ }
++#endif /* CONFIG_SCHED_SMT */
++}
++
++static inline void sched_smt_present_dec(int cpu)
++{
++#ifdef CONFIG_SCHED_SMT
++ if (cpumask_weight(cpu_smt_mask(cpu)) == 2) {
++ static_branch_dec_cpuslocked(&sched_smt_present);
++ if (!static_branch_likely(&sched_smt_present))
++ cpumask_clear(sched_pcore_idle_mask);
++ cpumask_andnot(&sched_smt_mask, &sched_smt_mask, cpu_smt_mask(cpu));
++ }
++#endif /* CONFIG_SCHED_SMT */
++}
++
++int sched_cpu_activate(unsigned int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ /*
++ * Clear the balance_push callback and prepare to schedule
++ * regular tasks.
++ */
++ balance_push_set(cpu, false);
++
++ set_cpu_active(cpu, true);
++
++ if (sched_smp_initialized)
++ cpuset_cpu_active();
++
++ /*
++ * Put the rq online, if not already. This happens:
++ *
++ * 1) In the early boot process, because we build the real domains
++ * after all cpus have been brought up.
++ *
++ * 2) At runtime, if cpuset_cpu_active() fails to rebuild the
++ * domains.
++ */
++ sched_set_rq_online(rq, cpu);
++
++ /*
++ * When going up, increment the number of cores with SMT present.
++ */
++ sched_smt_present_inc(cpu);
++
++ return 0;
++}
++
++int sched_cpu_deactivate(unsigned int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ set_cpu_active(cpu, false);
++
++ /*
++ * From this point forward, this CPU will refuse to run any task that
++ * is not: migrate_disable() or KTHREAD_IS_PER_CPU, and will actively
++ * push those tasks away until this gets cleared, see
++ * sched_cpu_dying().
++ */
++ balance_push_set(cpu, true);
++
++ /*
++ * We've cleared cpu_active_mask, wait for all preempt-disabled and RCU
++ * users of this state to go away such that all new such users will
++ * observe it.
++ *
++ * Specifically, we rely on ttwu to no longer target this CPU, see
++ * ttwu_queue_cond() and is_cpu_allowed().
++ *
++	 * Do the sync before parking the smpboot threads to take care of the
++	 * RCU boost case.
++ */
++ synchronize_rcu();
++
++ sched_set_rq_offline(rq, cpu);
++
++ /*
++ * When going down, decrement the number of cores with SMT present.
++ */
++ sched_smt_present_dec(cpu);
++
++ if (!sched_smp_initialized)
++ return 0;
++
++ cpuset_cpu_inactive(cpu);
++
++ return 0;
++}
++
++static void sched_rq_cpu_starting(unsigned int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ rq->calc_load_update = calc_load_update;
++}
++
++int sched_cpu_starting(unsigned int cpu)
++{
++ sched_rq_cpu_starting(cpu);
++ sched_tick_start(cpu);
++ return 0;
++}
++
++#ifdef CONFIG_HOTPLUG_CPU
++
++/*
++ * Invoked immediately before the stopper thread is invoked to bring the
++ * CPU down completely. At this point all per CPU kthreads except the
++ * hotplug thread (current) and the stopper thread (inactive) have been
++ * either parked or have been unbound from the outgoing CPU. Ensure that
++ * any of those which might be on the way out are gone.
++ *
++ * If after this point a bound task is being woken on this CPU then the
++ * responsible hotplug callback has failed to do its job.
++ * sched_cpu_dying() will catch it with the appropriate fireworks.
++ */
++int sched_cpu_wait_empty(unsigned int cpu)
++{
++ balance_hotplug_wait();
++ sched_force_init_mm();
++ return 0;
++}
++
++/*
++ * Since this CPU is going 'away' for a while, fold any nr_active delta we
++ * might have. Called from the CPU stopper task after ensuring that the
++ * stopper is the last running task on the CPU, so nr_active count is
++ * stable. We need to take the tear-down thread which is calling this into
++ * account, so we hand in adjust = 1 to the load calculation.
++ *
++ * Also see the comment "Global load-average calculations".
++ */
++static void calc_load_migrate(struct rq *rq)
++{
++ long delta = calc_load_fold_active(rq, 1);
++
++ if (delta)
++ atomic_long_add(delta, &calc_load_tasks);
++}
++
++static void dump_rq_tasks(struct rq *rq, const char *loglvl)
++{
++ struct task_struct *g, *p;
++ int cpu = cpu_of(rq);
++
++ lockdep_assert_held(&rq->lock);
++
++ printk("%sCPU%d enqueued tasks (%u total):\n", loglvl, cpu, rq->nr_running);
++ for_each_process_thread(g, p) {
++ if (task_cpu(p) != cpu)
++ continue;
++
++ if (!task_on_rq_queued(p))
++ continue;
++
++ printk("%s\tpid: %d, name: %s\n", loglvl, p->pid, p->comm);
++ }
++}
++
++int sched_cpu_dying(unsigned int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++
++ /* Handle pending wakeups and then migrate everything off */
++ sched_tick_stop(cpu);
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ if (rq->nr_running != 1 || rq_has_pinned_tasks(rq)) {
++ WARN(true, "Dying CPU not properly vacated!");
++ dump_rq_tasks(rq, KERN_WARNING);
++ }
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++ calc_load_migrate(rq);
++ hrtick_clear(rq);
++ return 0;
++}
++#endif /* CONFIG_HOTPLUG_CPU */
++
++static void sched_init_topology_cpumask_early(void)
++{
++ int cpu;
++ cpumask_t *tmp;
++
++ for_each_possible_cpu(cpu) {
++ /* init topo masks */
++ tmp = per_cpu(sched_cpu_topo_masks, cpu);
++
++ cpumask_copy(tmp, cpu_possible_mask);
++ per_cpu(sched_cpu_llc_mask, cpu) = tmp;
++ per_cpu(sched_cpu_topo_end_mask, cpu) = ++tmp;
++ }
++}
++
++#define TOPOLOGY_CPUMASK(name, mask, last)\
++ if (cpumask_and(topo, topo, mask)) { \
++ cpumask_copy(topo, mask); \
++ printk(KERN_INFO "sched: cpu#%02d topo: 0x%08lx - "#name, \
++ cpu, (topo++)->bits[0]); \
++ } \
++ if (!last) \
++ bitmap_complement(cpumask_bits(topo), cpumask_bits(mask), \
++ nr_cpumask_bits);
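++/*
++ * As a reading aid, a rough sketch of what, e.g.,
++ * TOPOLOGY_CPUMASK(smt, topology_sibling_cpumask(cpu), false) expands to:
++ *
++ *	if (cpumask_and(topo, topo, topology_sibling_cpumask(cpu))) {
++ *		cpumask_copy(topo, topology_sibling_cpumask(cpu));
++ *		printk(KERN_INFO "sched: cpu#%02d topo: 0x%08lx - smt",
++ *		       cpu, (topo++)->bits[0]);
++ *	}
++ *	bitmap_complement(cpumask_bits(topo), cpumask_bits(topology_sibling_cpumask(cpu)),
++ *			  nr_cpumask_bits);
++ *
++ * i.e. topo advances past every level that intersects, and the current slot
++ * is then overwritten with the complement of this level's mask so the next
++ * level only keeps CPUs not already covered here.
++ */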
++
++static void sched_init_topology_cpumask(void)
++{
++ int cpu;
++ cpumask_t *topo;
++
++ for_each_online_cpu(cpu) {
++ topo = per_cpu(sched_cpu_topo_masks, cpu);
++
++ bitmap_complement(cpumask_bits(topo), cpumask_bits(cpumask_of(cpu)),
++ nr_cpumask_bits);
++#ifdef CONFIG_SCHED_SMT
++ TOPOLOGY_CPUMASK(smt, topology_sibling_cpumask(cpu), false);
++#endif /* CONFIG_SCHED_SMT */
++ TOPOLOGY_CPUMASK(cluster, topology_cluster_cpumask(cpu), false);
++
++ per_cpu(sd_llc_id, cpu) = cpumask_first(cpu_coregroup_mask(cpu));
++ per_cpu(sched_cpu_llc_mask, cpu) = topo;
++ TOPOLOGY_CPUMASK(coregroup, cpu_coregroup_mask(cpu), false);
++
++ TOPOLOGY_CPUMASK(core, topology_core_cpumask(cpu), false);
++
++ TOPOLOGY_CPUMASK(others, cpu_online_mask, true);
++
++ per_cpu(sched_cpu_topo_end_mask, cpu) = topo;
++ printk(KERN_INFO "sched: cpu#%02d llc_id = %d, llc_mask idx = %d\n",
++ cpu, per_cpu(sd_llc_id, cpu),
++ (int) (per_cpu(sched_cpu_llc_mask, cpu) -
++ per_cpu(sched_cpu_topo_masks, cpu)));
++ }
++}
++
++void __init sched_init_smp(void)
++{
++ /* Move init over to a non-isolated CPU */
++ if (set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_TYPE_DOMAIN)) < 0)
++ BUG();
++ current->flags &= ~PF_NO_SETAFFINITY;
++
++ sched_init_topology();
++ sched_init_topology_cpumask();
++
++ sched_smp_initialized = true;
++}
++
++static int __init migration_init(void)
++{
++ sched_cpu_starting(smp_processor_id());
++ return 0;
++}
++early_initcall(migration_init);
++
++int in_sched_functions(unsigned long addr)
++{
++ return in_lock_functions(addr) ||
++ (addr >= (unsigned long)__sched_text_start
++ && addr < (unsigned long)__sched_text_end);
++}
++
++#ifdef CONFIG_CGROUP_SCHED
++/*
++ * Default task group.
++ * Every task in system belongs to this group at bootup.
++ */
++struct task_group root_task_group;
++LIST_HEAD(task_groups);
++
++/* Cacheline aligned slab cache for task_group */
++static struct kmem_cache *task_group_cache __ro_after_init;
++#endif /* CONFIG_CGROUP_SCHED */
++
++void __init sched_init(void)
++{
++ int i;
++ struct rq *rq;
++
++ printk(KERN_INFO "sched/alt: "ALT_SCHED_NAME" CPU Scheduler "ALT_SCHED_VERSION\
++ " by Alfred Chen.\n");
++
++ wait_bit_init();
++
++ for (i = 0; i < SCHED_QUEUE_BITS; i++)
++ cpumask_copy(sched_preempt_mask + i, cpu_present_mask);
++
++#ifdef CONFIG_CGROUP_SCHED
++ task_group_cache = KMEM_CACHE(task_group, 0);
++
++ list_add(&root_task_group.list, &task_groups);
++ INIT_LIST_HEAD(&root_task_group.children);
++ INIT_LIST_HEAD(&root_task_group.siblings);
++#endif /* CONFIG_CGROUP_SCHED */
++ for_each_possible_cpu(i) {
++ rq = cpu_rq(i);
++
++ sched_queue_init(&rq->queue);
++ rq->prio = IDLE_TASK_SCHED_PRIO;
++ rq->prio_balance_time = 0;
++#ifdef CONFIG_SCHED_PDS
++ rq->prio_idx = rq->prio;
++#endif
++
++ raw_spin_lock_init(&rq->lock);
++ rq->nr_running = rq->nr_uninterruptible = 0;
++ rq->calc_load_active = 0;
++ rq->calc_load_update = jiffies + LOAD_FREQ;
++ rq->online = false;
++ rq->cpu = i;
++
++ rq->clear_idle_mask_func = cpumask_clear_cpu;
++ rq->set_idle_mask_func = cpumask_set_cpu;
++ rq->balance_func = NULL;
++ rq->active_balance_arg.active = 0;
++
++#ifdef CONFIG_NO_HZ_COMMON
++ INIT_CSD(&rq->nohz_csd, nohz_csd_func, rq);
++#endif
++ rq->balance_callback = &balance_push_callback;
++#ifdef CONFIG_HOTPLUG_CPU
++ rcuwait_init(&rq->hotplug_wait);
++#endif
++ rq->nr_switches = 0;
++
++ hrtick_rq_init(rq);
++ atomic_set(&rq->nr_iowait, 0);
++
++ zalloc_cpumask_var_node(&rq->scratch_mask, GFP_KERNEL, cpu_to_node(i));
++ }
++ /* Set rq->online for cpu 0 */
++ cpu_rq(0)->online = true;
++ /*
++ * The boot idle thread does lazy MMU switching as well:
++ */
++ mmgrab_lazy_tlb(&init_mm);
++ enter_lazy_tlb(&init_mm, current);
++
++ /*
++ * The idle task doesn't need the kthread struct to function, but it
++ * is dressed up as a per-CPU kthread and thus needs to play the part
++ * if we want to avoid special-casing it in code that deals with per-CPU
++ * kthreads.
++ */
++ WARN_ON(!set_kthread_struct(current));
++
++ /*
++ * Make us the idle thread. Technically, schedule() should not be
++ * called from this thread, however somewhere below it might be,
++ * but because we are the idle thread, we just pick up running again
++ * when this runqueue becomes "idle".
++ */
++ __sched_fork(0, current);
++ init_idle(current, smp_processor_id());
++
++ calc_load_update = jiffies + LOAD_FREQ;
++
++ idle_thread_set_boot_cpu();
++ balance_push_set(smp_processor_id(), false);
++
++ sched_init_topology_cpumask_early();
++
++ preempt_dynamic_init();
++}
++
++#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
++
++void __might_sleep(const char *file, int line)
++{
++ unsigned int state = get_current_state();
++ /*
++ * Blocking primitives will set (and therefore destroy) current->state,
++ * since we will exit with TASK_RUNNING make sure we enter with it,
++ * otherwise we will destroy state.
++ */
++ WARN_ONCE(state != TASK_RUNNING && current->task_state_change,
++ "do not call blocking ops when !TASK_RUNNING; "
++ "state=%x set at [<%p>] %pS\n", state,
++ (void *)current->task_state_change,
++ (void *)current->task_state_change);
++
++ __might_resched(file, line, 0);
++}
++EXPORT_SYMBOL(__might_sleep);
++
++static void print_preempt_disable_ip(int preempt_offset, unsigned long ip)
++{
++ if (!IS_ENABLED(CONFIG_DEBUG_PREEMPT))
++ return;
++
++ if (preempt_count() == preempt_offset)
++ return;
++
++ pr_err("Preemption disabled at:");
++ print_ip_sym(KERN_ERR, ip);
++}
++
++static inline bool resched_offsets_ok(unsigned int offsets)
++{
++ unsigned int nested = preempt_count();
++
++ nested += rcu_preempt_depth() << MIGHT_RESCHED_RCU_SHIFT;
++
++ return nested == offsets;
++}
++
++void __might_resched(const char *file, int line, unsigned int offsets)
++{
++ /* Ratelimiting timestamp: */
++ static unsigned long prev_jiffy;
++
++ unsigned long preempt_disable_ip;
++
++ /* WARN_ON_ONCE() by default, no rate limit required: */
++ rcu_sleep_check();
++
++ if ((resched_offsets_ok(offsets) && !irqs_disabled() &&
++ !is_idle_task(current) && !current->non_block_count) ||
++ system_state == SYSTEM_BOOTING || system_state > SYSTEM_RUNNING ||
++ oops_in_progress)
++ return;
++ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
++ return;
++ prev_jiffy = jiffies;
++
++ /* Save this before calling printk(), since that will clobber it: */
++ preempt_disable_ip = get_preempt_disable_ip(current);
++
++ pr_err("BUG: sleeping function called from invalid context at %s:%d\n",
++ file, line);
++ pr_err("in_atomic(): %d, irqs_disabled(): %d, non_block: %d, pid: %d, name: %s\n",
++ in_atomic(), irqs_disabled(), current->non_block_count,
++ current->pid, current->comm);
++ pr_err("preempt_count: %x, expected: %x\n", preempt_count(),
++ offsets & MIGHT_RESCHED_PREEMPT_MASK);
++
++ if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
++ pr_err("RCU nest depth: %d, expected: %u\n",
++ rcu_preempt_depth(), offsets >> MIGHT_RESCHED_RCU_SHIFT);
++ }
++
++ if (task_stack_end_corrupted(current))
++ pr_emerg("Thread overran stack, or stack corrupted\n");
++
++ debug_show_held_locks(current);
++ if (irqs_disabled())
++ print_irqtrace_events(current);
++
++ print_preempt_disable_ip(offsets & MIGHT_RESCHED_PREEMPT_MASK,
++ preempt_disable_ip);
++
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++}
++EXPORT_SYMBOL(__might_resched);
++
++void __cant_sleep(const char *file, int line, int preempt_offset)
++{
++ static unsigned long prev_jiffy;
++
++ if (irqs_disabled())
++ return;
++
++ if (!IS_ENABLED(CONFIG_PREEMPT_COUNT))
++ return;
++
++ if (preempt_count() > preempt_offset)
++ return;
++
++ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
++ return;
++ prev_jiffy = jiffies;
++
++ printk(KERN_ERR "BUG: assuming atomic context at %s:%d\n", file, line);
++ printk(KERN_ERR "in_atomic(): %d, irqs_disabled(): %d, pid: %d, name: %s\n",
++ in_atomic(), irqs_disabled(),
++ current->pid, current->comm);
++
++ debug_show_held_locks(current);
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++}
++EXPORT_SYMBOL_GPL(__cant_sleep);
++
++void __cant_migrate(const char *file, int line)
++{
++ static unsigned long prev_jiffy;
++
++ if (irqs_disabled())
++ return;
++
++ if (is_migration_disabled(current))
++ return;
++
++ if (!IS_ENABLED(CONFIG_PREEMPT_COUNT))
++ return;
++
++ if (preempt_count() > 0)
++ return;
++
++ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
++ return;
++ prev_jiffy = jiffies;
++
++ pr_err("BUG: assuming non migratable context at %s:%d\n", file, line);
++ pr_err("in_atomic(): %d, irqs_disabled(): %d, migration_disabled() %u pid: %d, name: %s\n",
++ in_atomic(), irqs_disabled(), is_migration_disabled(current),
++ current->pid, current->comm);
++
++ debug_show_held_locks(current);
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++}
++EXPORT_SYMBOL_GPL(__cant_migrate);
++#endif /* CONFIG_DEBUG_ATOMIC_SLEEP */
++
++#ifdef CONFIG_MAGIC_SYSRQ
++void normalize_rt_tasks(void)
++{
++ struct task_struct *g, *p;
++ struct sched_attr attr = {
++ .sched_policy = SCHED_NORMAL,
++ };
++
++ read_lock(&tasklist_lock);
++ for_each_process_thread(g, p) {
++ /*
++ * Only normalize user tasks:
++ */
++ if (p->flags & PF_KTHREAD)
++ continue;
++
++ schedstat_set(p->stats.wait_start, 0);
++ schedstat_set(p->stats.sleep_start, 0);
++ schedstat_set(p->stats.block_start, 0);
++
++ if (!rt_or_dl_task(p)) {
++ /*
++ * Renice negative nice level userspace
++ * tasks back to 0:
++ */
++ if (task_nice(p) < 0)
++ set_user_nice(p, 0);
++ continue;
++ }
++
++ __sched_setscheduler(p, &attr, false, false);
++ }
++ read_unlock(&tasklist_lock);
++}
++#endif /* CONFIG_MAGIC_SYSRQ */
++
++#ifdef CONFIG_KGDB_KDB
++/*
++ * These functions are only useful for KDB.
++ *
++ * They can only be called when the whole system has been
++ * stopped - every CPU needs to be quiescent, and no scheduling
++ * activity can take place. Using them for anything else would
++ * be a serious bug, and as a result, they aren't even visible
++ * under any other configuration.
++ */
++
++/**
++ * curr_task - return the current task for a given CPU.
++ * @cpu: the processor in question.
++ *
++ * ONLY VALID WHEN THE WHOLE SYSTEM IS STOPPED!
++ *
++ * Return: The current task for @cpu.
++ */
++struct task_struct *curr_task(int cpu)
++{
++ return cpu_curr(cpu);
++}
++
++#endif /* CONFIG_KGDB_KDB */
++
++#ifdef CONFIG_CGROUP_SCHED
++static void sched_free_group(struct task_group *tg)
++{
++ kmem_cache_free(task_group_cache, tg);
++}
++
++static void sched_free_group_rcu(struct rcu_head *rhp)
++{
++ sched_free_group(container_of(rhp, struct task_group, rcu));
++}
++
++static void sched_unregister_group(struct task_group *tg)
++{
++ /*
++ * We have to wait for yet another RCU grace period to expire, as
++ * print_cfs_stats() might run concurrently.
++ */
++ call_rcu(&tg->rcu, sched_free_group_rcu);
++}
++
++/* allocate runqueue etc for a new task group */
++struct task_group *sched_create_group(struct task_group *parent)
++{
++ struct task_group *tg;
++
++ tg = kmem_cache_alloc(task_group_cache, GFP_KERNEL | __GFP_ZERO);
++ if (!tg)
++ return ERR_PTR(-ENOMEM);
++
++ return tg;
++}
++
++void sched_online_group(struct task_group *tg, struct task_group *parent)
++{
++}
++
++/* RCU callback to free various structures associated with a task group */
++static void sched_unregister_group_rcu(struct rcu_head *rhp)
++{
++ /* Now it should be safe to free those cfs_rqs: */
++ sched_unregister_group(container_of(rhp, struct task_group, rcu));
++}
++
++void sched_destroy_group(struct task_group *tg)
++{
++ /* Wait for possible concurrent references to cfs_rqs complete: */
++ call_rcu(&tg->rcu, sched_unregister_group_rcu);
++}
++
++void sched_release_group(struct task_group *tg)
++{
++}
++
++static inline struct task_group *css_tg(struct cgroup_subsys_state *css)
++{
++ return css ? container_of(css, struct task_group, css) : NULL;
++}
++
++static struct cgroup_subsys_state *
++cpu_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
++{
++ struct task_group *parent = css_tg(parent_css);
++ struct task_group *tg;
++
++ if (!parent) {
++ /* This is early initialization for the top cgroup */
++ return &root_task_group.css;
++ }
++
++ tg = sched_create_group(parent);
++ if (IS_ERR(tg))
++ return ERR_PTR(-ENOMEM);
++ return &tg->css;
++}
++
++/* Expose task group only after completing cgroup initialization */
++static int cpu_cgroup_css_online(struct cgroup_subsys_state *css)
++{
++ struct task_group *tg = css_tg(css);
++ struct task_group *parent = css_tg(css->parent);
++
++ if (parent)
++ sched_online_group(tg, parent);
++ return 0;
++}
++
++static void cpu_cgroup_css_released(struct cgroup_subsys_state *css)
++{
++ struct task_group *tg = css_tg(css);
++
++ sched_release_group(tg);
++}
++
++static void cpu_cgroup_css_free(struct cgroup_subsys_state *css)
++{
++ struct task_group *tg = css_tg(css);
++
++ /*
++ * Relies on the RCU grace period between css_released() and this.
++ */
++ sched_unregister_group(tg);
++}
++
++#ifdef CONFIG_RT_GROUP_SCHED
++static int cpu_cgroup_can_attach(struct cgroup_taskset *tset)
++{
++ return 0;
++}
++#endif /* CONFIG_RT_GROUP_SCHED */
++
++static void cpu_cgroup_attach(struct cgroup_taskset *tset)
++{
++}
++
++#ifdef CONFIG_GROUP_SCHED_WEIGHT
++static int sched_group_set_shares(struct task_group *tg, unsigned long shares)
++{
++ return 0;
++}
++
++static int sched_group_set_idle(struct task_group *tg, long idle)
++{
++ return 0;
++}
++
++static int cpu_shares_write_u64(struct cgroup_subsys_state *css,
++ struct cftype *cftype, u64 shareval)
++{
++ return sched_group_set_shares(css_tg(css), shareval);
++}
++
++static u64 cpu_shares_read_u64(struct cgroup_subsys_state *css,
++ struct cftype *cft)
++{
++ return 0;
++}
++
++static s64 cpu_idle_read_s64(struct cgroup_subsys_state *css,
++ struct cftype *cft)
++{
++ return 0;
++}
++
++static int cpu_idle_write_s64(struct cgroup_subsys_state *css,
++ struct cftype *cft, s64 idle)
++{
++ return sched_group_set_idle(css_tg(css), idle);
++}
++#endif /* CONFIG_GROUP_SCHED_WEIGHT */
++
++#ifdef CONFIG_CFS_BANDWIDTH
++static s64 cpu_cfs_quota_read_s64(struct cgroup_subsys_state *css,
++ struct cftype *cft)
++{
++ return 0;
++}
++
++static int cpu_cfs_quota_write_s64(struct cgroup_subsys_state *css,
++ struct cftype *cftype, s64 cfs_quota_us)
++{
++ return 0;
++}
++
++static u64 cpu_cfs_period_read_u64(struct cgroup_subsys_state *css,
++ struct cftype *cft)
++{
++ return 0;
++}
++
++static int cpu_cfs_period_write_u64(struct cgroup_subsys_state *css,
++ struct cftype *cftype, u64 cfs_period_us)
++{
++ return 0;
++}
++
++static u64 cpu_cfs_burst_read_u64(struct cgroup_subsys_state *css,
++ struct cftype *cft)
++{
++ return 0;
++}
++
++static int cpu_cfs_burst_write_u64(struct cgroup_subsys_state *css,
++ struct cftype *cftype, u64 cfs_burst_us)
++{
++ return 0;
++}
++
++static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
++{
++ return 0;
++}
++
++static int cpu_cfs_local_stat_show(struct seq_file *sf, void *v)
++{
++ return 0;
++}
++#endif /* CONFIG_CFS_BANDWIDTH */
++
++#ifdef CONFIG_RT_GROUP_SCHED
++static int cpu_rt_runtime_write(struct cgroup_subsys_state *css,
++ struct cftype *cft, s64 val)
++{
++ return 0;
++}
++
++static s64 cpu_rt_runtime_read(struct cgroup_subsys_state *css,
++ struct cftype *cft)
++{
++ return 0;
++}
++
++static int cpu_rt_period_write_uint(struct cgroup_subsys_state *css,
++ struct cftype *cftype, u64 rt_period_us)
++{
++ return 0;
++}
++
++static u64 cpu_rt_period_read_uint(struct cgroup_subsys_state *css,
++ struct cftype *cft)
++{
++ return 0;
++}
++#endif /* CONFIG_RT_GROUP_SCHED */
++
++#ifdef CONFIG_UCLAMP_TASK_GROUP
++static int cpu_uclamp_min_show(struct seq_file *sf, void *v)
++{
++ return 0;
++}
++
++static int cpu_uclamp_max_show(struct seq_file *sf, void *v)
++{
++ return 0;
++}
++
++static ssize_t cpu_uclamp_min_write(struct kernfs_open_file *of,
++ char *buf, size_t nbytes,
++ loff_t off)
++{
++ return nbytes;
++}
++
++static ssize_t cpu_uclamp_max_write(struct kernfs_open_file *of,
++ char *buf, size_t nbytes,
++ loff_t off)
++{
++ return nbytes;
++}
++#endif /* CONFIG_UCLAMP_TASK_GROUP */
++
++static struct cftype cpu_legacy_files[] = {
++#ifdef CONFIG_GROUP_SCHED_WEIGHT
++ {
++ .name = "shares",
++ .read_u64 = cpu_shares_read_u64,
++ .write_u64 = cpu_shares_write_u64,
++ },
++ {
++ .name = "idle",
++ .read_s64 = cpu_idle_read_s64,
++ .write_s64 = cpu_idle_write_s64,
++ },
++#endif /* CONFIG_GROUP_SCHED_WEIGHT */
++#ifdef CONFIG_CFS_BANDWIDTH
++ {
++ .name = "cfs_quota_us",
++ .read_s64 = cpu_cfs_quota_read_s64,
++ .write_s64 = cpu_cfs_quota_write_s64,
++ },
++ {
++ .name = "cfs_period_us",
++ .read_u64 = cpu_cfs_period_read_u64,
++ .write_u64 = cpu_cfs_period_write_u64,
++ },
++ {
++ .name = "cfs_burst_us",
++ .read_u64 = cpu_cfs_burst_read_u64,
++ .write_u64 = cpu_cfs_burst_write_u64,
++ },
++ {
++ .name = "stat",
++ .seq_show = cpu_cfs_stat_show,
++ },
++ {
++ .name = "stat.local",
++ .seq_show = cpu_cfs_local_stat_show,
++ },
++#endif /* CONFIG_CFS_BANDWIDTH */
++#ifdef CONFIG_RT_GROUP_SCHED
++ {
++ .name = "rt_runtime_us",
++ .read_s64 = cpu_rt_runtime_read,
++ .write_s64 = cpu_rt_runtime_write,
++ },
++ {
++ .name = "rt_period_us",
++ .read_u64 = cpu_rt_period_read_uint,
++ .write_u64 = cpu_rt_period_write_uint,
++ },
++#endif /* CONFIG_RT_GROUP_SCHED */
++#ifdef CONFIG_UCLAMP_TASK_GROUP
++ {
++ .name = "uclamp.min",
++ .flags = CFTYPE_NOT_ON_ROOT,
++ .seq_show = cpu_uclamp_min_show,
++ .write = cpu_uclamp_min_write,
++ },
++ {
++ .name = "uclamp.max",
++ .flags = CFTYPE_NOT_ON_ROOT,
++ .seq_show = cpu_uclamp_max_show,
++ .write = cpu_uclamp_max_write,
++ },
++#endif /* CONFIG_UCLAMP_TASK_GROUP */
++ { } /* Terminate */
++};
++
++#ifdef CONFIG_GROUP_SCHED_WEIGHT
++static u64 cpu_weight_read_u64(struct cgroup_subsys_state *css,
++ struct cftype *cft)
++{
++ return 0;
++}
++
++static int cpu_weight_write_u64(struct cgroup_subsys_state *css,
++ struct cftype *cft, u64 weight)
++{
++ return 0;
++}
++
++static s64 cpu_weight_nice_read_s64(struct cgroup_subsys_state *css,
++ struct cftype *cft)
++{
++ return 0;
++}
++
++static int cpu_weight_nice_write_s64(struct cgroup_subsys_state *css,
++ struct cftype *cft, s64 nice)
++{
++ return 0;
++}
++#endif /* CONFIG_GROUP_SCHED_WEIGHT */
++
++#ifdef CONFIG_CFS_BANDWIDTH
++static int cpu_max_show(struct seq_file *sf, void *v)
++{
++ return 0;
++}
++
++static ssize_t cpu_max_write(struct kernfs_open_file *of,
++ char *buf, size_t nbytes, loff_t off)
++{
++ return nbytes;
++}
++#endif /* CONFIG_CFS_BANDWIDTH */
++
++static struct cftype cpu_files[] = {
++#ifdef CONFIG_GROUP_SCHED_WEIGHT
++ {
++ .name = "weight",
++ .flags = CFTYPE_NOT_ON_ROOT,
++ .read_u64 = cpu_weight_read_u64,
++ .write_u64 = cpu_weight_write_u64,
++ },
++ {
++ .name = "weight.nice",
++ .flags = CFTYPE_NOT_ON_ROOT,
++ .read_s64 = cpu_weight_nice_read_s64,
++ .write_s64 = cpu_weight_nice_write_s64,
++ },
++ {
++ .name = "idle",
++ .flags = CFTYPE_NOT_ON_ROOT,
++ .read_s64 = cpu_idle_read_s64,
++ .write_s64 = cpu_idle_write_s64,
++ },
++#endif /* CONFIG_GROUP_SCHED_WEIGHT */
++#ifdef CONFIG_CFS_BANDWIDTH
++ {
++ .name = "max",
++ .flags = CFTYPE_NOT_ON_ROOT,
++ .seq_show = cpu_max_show,
++ .write = cpu_max_write,
++ },
++ {
++ .name = "max.burst",
++ .flags = CFTYPE_NOT_ON_ROOT,
++ .read_u64 = cpu_cfs_burst_read_u64,
++ .write_u64 = cpu_cfs_burst_write_u64,
++ },
++#endif /* CONFIG_CFS_BANDWIDTH */
++#ifdef CONFIG_UCLAMP_TASK_GROUP
++ {
++ .name = "uclamp.min",
++ .flags = CFTYPE_NOT_ON_ROOT,
++ .seq_show = cpu_uclamp_min_show,
++ .write = cpu_uclamp_min_write,
++ },
++ {
++ .name = "uclamp.max",
++ .flags = CFTYPE_NOT_ON_ROOT,
++ .seq_show = cpu_uclamp_max_show,
++ .write = cpu_uclamp_max_write,
++ },
++#endif /* CONFIG_UCLAMP_TASK_GROUP */
++ { } /* terminate */
++};
++
++static int cpu_extra_stat_show(struct seq_file *sf,
++ struct cgroup_subsys_state *css)
++{
++ return 0;
++}
++
++static int cpu_local_stat_show(struct seq_file *sf,
++ struct cgroup_subsys_state *css)
++{
++ return 0;
++}
++
++struct cgroup_subsys cpu_cgrp_subsys = {
++ .css_alloc = cpu_cgroup_css_alloc,
++ .css_online = cpu_cgroup_css_online,
++ .css_released = cpu_cgroup_css_released,
++ .css_free = cpu_cgroup_css_free,
++ .css_extra_stat_show = cpu_extra_stat_show,
++ .css_local_stat_show = cpu_local_stat_show,
++#ifdef CONFIG_RT_GROUP_SCHED
++ .can_attach = cpu_cgroup_can_attach,
++#endif /* CONFIG_RT_GROUP_SCHED */
++ .attach = cpu_cgroup_attach,
++ .legacy_cftypes = cpu_legacy_files,
++ .dfl_cftypes = cpu_files,
++ .early_init = true,
++ .threaded = true,
++};
++#endif /* CONFIG_CGROUP_SCHED */
++
++#undef CREATE_TRACE_POINTS
++
++#ifdef CONFIG_SCHED_MM_CID
++
++/*
++ * @cid_lock: Guarantee forward-progress of cid allocation.
++ *
++ * Concurrency ID allocation within a bitmap is mostly lock-free. The cid_lock
++ * is only used when contention is detected by the lock-free allocation so
++ * forward progress can be guaranteed.
++ */
++DEFINE_RAW_SPINLOCK(cid_lock);
++
++/*
++ * @use_cid_lock: Select cid allocation behavior: lock-free vs spinlock.
++ *
++ * When @use_cid_lock is 0, the cid allocation is lock-free. When contention is
++ * detected, it is set to 1 to ensure that all newly coming allocations are
++ * serialized by @cid_lock until the allocation which detected contention
++ * completes and sets @use_cid_lock back to 0. This guarantees forward progress
++ * of a cid allocation.
++ */
++int use_cid_lock;
++
++/*
++ * mm_cid remote-clear implements a lock-free algorithm to clear per-mm/cpu cid
++ * concurrently with respect to the execution of the source runqueue context
++ * switch.
++ *
++ * There is one basic property we want to guarantee here:
++ *
++ * (1) Remote-clear should _never_ mark a per-cpu cid UNSET when it is actively
++ * used by a task. That would lead to concurrent allocation of the cid and
++ * userspace corruption.
++ *
++ * Provide this guarantee by introducing a Dekker memory ordering to guarantee
++ * that a pair of loads observe at least one of a pair of stores, which can be
++ * shown as:
++ *
++ * X = Y = 0
++ *
++ * w[X]=1 w[Y]=1
++ * MB MB
++ * r[Y]=y r[X]=x
++ *
++ * Which guarantees that x==0 && y==0 is impossible. But rather than using
++ * values 0 and 1, this algorithm cares about specific state transitions of the
++ * runqueue current task (as updated by the scheduler context switch), and the
++ * per-mm/cpu cid value.
++ *
++ * Let's introduce task (Y) which has task->mm == mm and task (N) which has
++ * task->mm != mm for the rest of the discussion. There are two scheduler state
++ * transitions on context switch we care about:
++ *
++ * (TSA) Store to rq->curr with transition from (N) to (Y)
++ *
++ * (TSB) Store to rq->curr with transition from (Y) to (N)
++ *
++ * On the remote-clear side, there is one transition we care about:
++ *
++ * (TMA) cmpxchg to *pcpu_cid to set the LAZY flag
++ *
++ * There is also a transition to UNSET state which can be performed from all
++ * sides (scheduler, remote-clear). It is always performed with a cmpxchg which
++ * guarantees that only a single thread will succeed:
++ *
++ * (TMB) cmpxchg to *pcpu_cid to mark UNSET
++ *
++ * Just to be clear, what we do _not_ want to happen is a transition to UNSET
++ * when a thread is actively using the cid (property (1)).
++ *
++ * Let's look at the relevant combinations of TSA/TSB and TMA transitions.
++ *
++ * Scenario A) (TSA)+(TMA) (from next task perspective)
++ *
++ * CPU0 CPU1
++ *
++ * Context switch CS-1 Remote-clear
++ * - store to rq->curr: (N)->(Y) (TSA) - cmpxchg to *pcpu_id to LAZY (TMA)
++ * (implied barrier after cmpxchg)
++ * - switch_mm_cid()
++ * - memory barrier (see switch_mm_cid()
++ * comment explaining how this barrier
++ * is combined with other scheduler
++ * barriers)
++ * - mm_cid_get (next)
++ * - READ_ONCE(*pcpu_cid) - rcu_dereference(src_rq->curr)
++ *
++ * This Dekker ensures that either task (Y) is observed by the
++ * rcu_dereference() or the LAZY flag is observed by READ_ONCE(), or both are
++ * observed.
++ *
++ * If task (Y) store is observed by rcu_dereference(), it means that there is
++ * still an active task on the cpu. Remote-clear will therefore not transition
++ * to UNSET, which fulfills property (1).
++ *
++ * If task (Y) is not observed, but the lazy flag is observed by READ_ONCE(),
++ * it will move its state to UNSET, which clears the percpu cid perhaps
++ * uselessly (which is not an issue for correctness). Because task (Y) is not
++ * observed, CPU1 can move ahead to set the state to UNSET. Because moving
++ * state to UNSET is done with a cmpxchg expecting that the old state has the
++ * LAZY flag set, only one thread will successfully UNSET.
++ *
++ * If both states (LAZY flag and task (Y)) are observed, the thread on CPU0
++ * will observe the LAZY flag and transition to UNSET (perhaps uselessly), and
++ * CPU1 will observe task (Y) and do nothing more, which is fine.
++ *
++ * What we are effectively preventing with this Dekker is a scenario where
++ * neither LAZY flag nor store (Y) are observed, which would fail property (1)
++ * because this would UNSET a cid which is actively used.
++ */
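As an aside, the Dekker pairing described in the comment above can be reproduced in a few lines of ordinary C11. The sketch below is purely illustrative and not part of the patch; it only demonstrates that, with a full barrier between each thread's store and load, both loads returning 0 is impossible.

/* Standalone illustration of the Dekker ordering described above:
 * two threads each store to their own flag, issue a full barrier,
 * then load the other flag. The ordering guarantees that at least
 * one thread observes the other's store, i.e. x == 0 && y == 0 is
 * impossible. Hypothetical example, not part of the patch.
 */
#include <stdatomic.h>
#include <pthread.h>
#include <stdio.h>

static atomic_int X, Y;
static int x_seen, y_seen;

static void *thread_a(void *arg)
{
	atomic_store_explicit(&X, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* MB */
	y_seen = atomic_load_explicit(&Y, memory_order_relaxed);
	return NULL;
}

static void *thread_b(void *arg)
{
	atomic_store_explicit(&Y, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* MB */
	x_seen = atomic_load_explicit(&X, memory_order_relaxed);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, thread_a, NULL);
	pthread_create(&b, NULL, thread_b, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);

	/* With seq_cst fences, both loads returning 0 cannot happen. */
	printf("x_seen=%d y_seen=%d\n", x_seen, y_seen);
	return 0;
}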
++
++void sched_mm_cid_migrate_from(struct task_struct *t)
++{
++ t->migrate_from_cpu = task_cpu(t);
++}
++
++static
++int __sched_mm_cid_migrate_from_fetch_cid(struct rq *src_rq,
++ struct task_struct *t,
++ struct mm_cid *src_pcpu_cid)
++{
++ struct mm_struct *mm = t->mm;
++ struct task_struct *src_task;
++ int src_cid, last_mm_cid;
++
++ if (!mm)
++ return -1;
++
++ last_mm_cid = t->last_mm_cid;
++ /*
++ * If the migrated task has no last cid, or if the current
++ * task on src rq uses the cid, it means the source cid does not need
++ * to be moved to the destination cpu.
++ */
++ if (last_mm_cid == -1)
++ return -1;
++ src_cid = READ_ONCE(src_pcpu_cid->cid);
++ if (!mm_cid_is_valid(src_cid) || last_mm_cid != src_cid)
++ return -1;
++
++ /*
++ * If we observe an active task using the mm on this rq, it means we
++ * are not the last task to be migrated from this cpu for this mm, so
++ * there is no need to move src_cid to the destination cpu.
++ */
++ guard(rcu)();
++ src_task = rcu_dereference(src_rq->curr);
++ if (READ_ONCE(src_task->mm_cid_active) && src_task->mm == mm) {
++ t->last_mm_cid = -1;
++ return -1;
++ }
++
++ return src_cid;
++}
++
++static
++int __sched_mm_cid_migrate_from_try_steal_cid(struct rq *src_rq,
++ struct task_struct *t,
++ struct mm_cid *src_pcpu_cid,
++ int src_cid)
++{
++ struct task_struct *src_task;
++ struct mm_struct *mm = t->mm;
++ int lazy_cid;
++
++ if (src_cid == -1)
++ return -1;
++
++ /*
++ * Attempt to clear the source cpu cid to move it to the destination
++ * cpu.
++ */
++ lazy_cid = mm_cid_set_lazy_put(src_cid);
++ if (!try_cmpxchg(&src_pcpu_cid->cid, &src_cid, lazy_cid))
++ return -1;
++
++ /*
++ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
++ * rq->curr->mm matches the scheduler barrier in context_switch()
++ * between store to rq->curr and load of prev and next task's
++ * per-mm/cpu cid.
++ *
++ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
++ * rq->curr->mm_cid_active matches the barrier in
++ * sched_mm_cid_exit_signals(), sched_mm_cid_before_execve(), and
++ * sched_mm_cid_after_execve() between store to t->mm_cid_active and
++ * load of per-mm/cpu cid.
++ */
++
++ /*
++ * If we observe an active task using the mm on this rq after setting
++ * the lazy-put flag, this task will be responsible for transitioning
++ * from lazy-put flag set to MM_CID_UNSET.
++ */
++ scoped_guard (rcu) {
++ src_task = rcu_dereference(src_rq->curr);
++ if (READ_ONCE(src_task->mm_cid_active) && src_task->mm == mm) {
++ /*
++ * We observed an active task for this mm, there is therefore
++ * no point in moving this cid to the destination cpu.
++ */
++ t->last_mm_cid = -1;
++ return -1;
++ }
++ }
++
++ /*
++ * The src_cid is unused, so it can be unset.
++ */
++ if (!try_cmpxchg(&src_pcpu_cid->cid, &lazy_cid, MM_CID_UNSET))
++ return -1;
++ WRITE_ONCE(src_pcpu_cid->recent_cid, MM_CID_UNSET);
++ return src_cid;
++}
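The two try_cmpxchg() steps above (valid cid -> lazy-put, then lazy-put -> MM_CID_UNSET) form a small state machine in which each transition has exactly one winner. A simplified, standalone sketch of the same two-phase protocol is shown below; the flag encoding and helper names are hypothetical stand-ins for the mm_cid_* helpers.

/* Hypothetical simplification of the per-cpu cid two-phase clear:
 * phase 1 tags the slot with a LAZY flag, phase 2 moves it to UNSET.
 * Both phases use compare-and-swap so only one contender wins each
 * transition; an owner that observes the LAZY flag may instead take
 * over and perform the UNSET transition itself.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define CID_UNSET	(-1)
#define CID_LAZY_FLAG	0x40000000

static inline bool cid_is_valid(int cid)
{
	return cid >= 0 && !(cid & CID_LAZY_FLAG);
}

/* Phase 1: valid cid -> cid | LAZY. Returns the cid on success. */
static int cid_set_lazy(atomic_int *slot)
{
	int cid = atomic_load(slot);

	if (!cid_is_valid(cid))
		return CID_UNSET;
	if (!atomic_compare_exchange_strong(slot, &cid, cid | CID_LAZY_FLAG))
		return CID_UNSET;
	return cid;
}

/* Phase 2: cid | LAZY -> UNSET. Only one caller can succeed. */
static bool cid_lazy_to_unset(atomic_int *slot, int cid)
{
	int lazy = cid | CID_LAZY_FLAG;

	return atomic_compare_exchange_strong(slot, &lazy, CID_UNSET);
}

int main(void)
{
	atomic_int slot = 3;
	int cid = cid_set_lazy(&slot);

	if (cid != CID_UNSET && cid_lazy_to_unset(&slot, cid))
		printf("cid %d released\n", cid);
	return 0;
}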
++
++/*
++ * Migration to dst cpu. Called with dst_rq lock held.
++ * Interrupts are disabled, which keeps the window of cid ownership without the
++ * source rq lock held small.
++ */
++void sched_mm_cid_migrate_to(struct rq *dst_rq, struct task_struct *t)
++{
++ struct mm_cid *src_pcpu_cid, *dst_pcpu_cid;
++ struct mm_struct *mm = t->mm;
++ int src_cid, src_cpu;
++ bool dst_cid_is_set;
++ struct rq *src_rq;
++
++ lockdep_assert_rq_held(dst_rq);
++
++ if (!mm)
++ return;
++ src_cpu = t->migrate_from_cpu;
++ if (src_cpu == -1) {
++ t->last_mm_cid = -1;
++ return;
++ }
++ /*
++ * Move the src cid if the dst cid is unset. This keeps id
++ * allocation closest to 0 in cases where few threads migrate around
++ * many CPUs.
++ *
++ * If destination cid or recent cid is already set, we may have
++ * to just clear the src cid to ensure compactness in frequent
++ * migrations scenarios.
++ *
++ * It is not useful to clear the src cid when the number of threads is
++ * greater or equal to the number of allowed CPUs, because user-space
++ * can expect that the number of allowed cids can reach the number of
++ * allowed CPUs.
++ */
++ dst_pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu_of(dst_rq));
++ dst_cid_is_set = !mm_cid_is_unset(READ_ONCE(dst_pcpu_cid->cid)) ||
++ !mm_cid_is_unset(READ_ONCE(dst_pcpu_cid->recent_cid));
++ if (dst_cid_is_set && atomic_read(&mm->mm_users) >= READ_ONCE(mm->nr_cpus_allowed))
++ return;
++ src_pcpu_cid = per_cpu_ptr(mm->pcpu_cid, src_cpu);
++ src_rq = cpu_rq(src_cpu);
++ src_cid = __sched_mm_cid_migrate_from_fetch_cid(src_rq, t, src_pcpu_cid);
++ if (src_cid == -1)
++ return;
++ src_cid = __sched_mm_cid_migrate_from_try_steal_cid(src_rq, t, src_pcpu_cid,
++ src_cid);
++ if (src_cid == -1)
++ return;
++ if (dst_cid_is_set) {
++ __mm_cid_put(mm, src_cid);
++ return;
++ }
++ /* Move src_cid to dst cpu. */
++ mm_cid_snapshot_time(dst_rq, mm);
++ WRITE_ONCE(dst_pcpu_cid->cid, src_cid);
++ WRITE_ONCE(dst_pcpu_cid->recent_cid, src_cid);
++}
++
++static void sched_mm_cid_remote_clear(struct mm_struct *mm, struct mm_cid *pcpu_cid,
++ int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ struct task_struct *t;
++ int cid, lazy_cid;
++
++ cid = READ_ONCE(pcpu_cid->cid);
++ if (!mm_cid_is_valid(cid))
++ return;
++
++ /*
++ * Clear the cpu cid if it is set to keep cid allocation compact. If
++ * there happens to be other tasks left on the source cpu using this
++ * mm, the next task using this mm will reallocate its cid on context
++ * switch.
++ */
++ lazy_cid = mm_cid_set_lazy_put(cid);
++ if (!try_cmpxchg(&pcpu_cid->cid, &cid, lazy_cid))
++ return;
++
++ /*
++ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
++ * rq->curr->mm matches the scheduler barrier in context_switch()
++ * between store to rq->curr and load of prev and next task's
++ * per-mm/cpu cid.
++ *
++ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
++ * rq->curr->mm_cid_active matches the barrier in
++ * sched_mm_cid_exit_signals(), sched_mm_cid_before_execve(), and
++ * sched_mm_cid_after_execve() between store to t->mm_cid_active and
++ * load of per-mm/cpu cid.
++ */
++
++ /*
++ * If we observe an active task using the mm on this rq after setting
++ * the lazy-put flag, that task will be responsible for transitioning
++ * from lazy-put flag set to MM_CID_UNSET.
++ */
++ scoped_guard (rcu) {
++ t = rcu_dereference(rq->curr);
++ if (READ_ONCE(t->mm_cid_active) && t->mm == mm)
++ return;
++ }
++
++ /*
++ * The cid is unused, so it can be unset.
++ * Disable interrupts to keep the window of cid ownership without rq
++ * lock small.
++ */
++ scoped_guard (irqsave) {
++ if (try_cmpxchg(&pcpu_cid->cid, &lazy_cid, MM_CID_UNSET))
++ __mm_cid_put(mm, cid);
++ }
++}
++
++static void sched_mm_cid_remote_clear_old(struct mm_struct *mm, int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ struct mm_cid *pcpu_cid;
++ struct task_struct *curr;
++ u64 rq_clock;
++
++ /*
++ * rq->clock load is racy on 32-bit but one spurious clear once in a
++ * while is irrelevant.
++ */
++ rq_clock = READ_ONCE(rq->clock);
++ pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu);
++
++ /*
++ * In order to take care of infrequently scheduled tasks, bump the time
++ * snapshot associated with this cid if an active task using the mm is
++ * observed on this rq.
++ */
++ scoped_guard (rcu) {
++ curr = rcu_dereference(rq->curr);
++ if (READ_ONCE(curr->mm_cid_active) && curr->mm == mm) {
++ WRITE_ONCE(pcpu_cid->time, rq_clock);
++ return;
++ }
++ }
++
++ if (rq_clock < pcpu_cid->time + SCHED_MM_CID_PERIOD_NS)
++ return;
++ sched_mm_cid_remote_clear(mm, pcpu_cid, cpu);
++}
++
++static void sched_mm_cid_remote_clear_weight(struct mm_struct *mm, int cpu,
++ int weight)
++{
++ struct mm_cid *pcpu_cid;
++ int cid;
++
++ pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu);
++ cid = READ_ONCE(pcpu_cid->cid);
++ if (!mm_cid_is_valid(cid) || cid < weight)
++ return;
++ sched_mm_cid_remote_clear(mm, pcpu_cid, cpu);
++}
++
++static void task_mm_cid_work(struct callback_head *work)
++{
++ unsigned long now = jiffies, old_scan, next_scan;
++ struct task_struct *t = current;
++ struct cpumask *cidmask;
++ struct mm_struct *mm;
++ int weight, cpu;
++
++ WARN_ON_ONCE(t != container_of(work, struct task_struct, cid_work));
++
++ work->next = work; /* Prevent double-add */
++ if (t->flags & PF_EXITING)
++ return;
++ mm = t->mm;
++ if (!mm)
++ return;
++ old_scan = READ_ONCE(mm->mm_cid_next_scan);
++ next_scan = now + msecs_to_jiffies(MM_CID_SCAN_DELAY);
++ if (!old_scan) {
++ unsigned long res;
++
++ res = cmpxchg(&mm->mm_cid_next_scan, old_scan, next_scan);
++ if (res != old_scan)
++ old_scan = res;
++ else
++ old_scan = next_scan;
++ }
++ if (time_before(now, old_scan))
++ return;
++ if (!try_cmpxchg(&mm->mm_cid_next_scan, &old_scan, next_scan))
++ return;
++ cidmask = mm_cidmask(mm);
++ /* Clear cids that were not recently used. */
++ for_each_possible_cpu(cpu)
++ sched_mm_cid_remote_clear_old(mm, cpu);
++ weight = cpumask_weight(cidmask);
++ /*
++ * Clear cids that are greater or equal to the cidmask weight to
++ * recompact it.
++ */
++ for_each_possible_cpu(cpu)
++ sched_mm_cid_remote_clear_weight(mm, cpu, weight);
++}
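The effect of the weight-based pass above is that, after a scan, every in-use concurrency id is below the number of ids in use, keeping the id space packed near 0. A small standalone illustration of that invariant, using a plain bitmap instead of the kernel cpumask, is shown below.

/* Illustration of the compaction invariant enforced above: after the
 * scan, every in-use concurrency id is smaller than the number of ids
 * in use (the cidmask weight), i.e. ids stay packed near 0. Sketch only.
 */
#include <stdio.h>

int main(void)
{
	unsigned long cidmask = 0x0000000000000115UL;	/* cids 0, 2, 4, 8 in use */
	int weight = __builtin_popcountl(cidmask);	/* 4 ids in use */

	/* cids >= weight (here cids 4 and 8) are candidates for clearing;
	 * their owners will re-allocate a lower cid on next context switch. */
	for (int cid = 0; cid < 64; cid++)
		if ((cidmask >> cid) & 1 && cid >= weight)
			printf("cid %d would be cleared (weight %d)\n", cid, weight);
	return 0;
}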
++
++void init_sched_mm_cid(struct task_struct *t)
++{
++ struct mm_struct *mm = t->mm;
++ int mm_users = 0;
++
++ if (mm) {
++ mm_users = atomic_read(&mm->mm_users);
++ if (mm_users == 1)
++ mm->mm_cid_next_scan = jiffies + msecs_to_jiffies(MM_CID_SCAN_DELAY);
++ }
++ t->cid_work.next = &t->cid_work; /* Protect against double add */
++ init_task_work(&t->cid_work, task_mm_cid_work);
++}
++
++void task_tick_mm_cid(struct rq *rq, struct task_struct *curr)
++{
++ struct callback_head *work = &curr->cid_work;
++ unsigned long now = jiffies;
++
++ if (!curr->mm || (curr->flags & (PF_EXITING | PF_KTHREAD)) ||
++ work->next != work)
++ return;
++ if (time_before(now, READ_ONCE(curr->mm->mm_cid_next_scan)))
++ return;
++
++ /* No page allocation under rq lock */
++ task_work_add(curr, work, TWA_RESUME);
++}
++
++void sched_mm_cid_exit_signals(struct task_struct *t)
++{
++ struct mm_struct *mm = t->mm;
++ struct rq *rq;
++
++ if (!mm)
++ return;
++
++ preempt_disable();
++ rq = this_rq();
++ guard(rq_lock_irqsave)(rq);
++ preempt_enable_no_resched(); /* holding spinlock */
++ WRITE_ONCE(t->mm_cid_active, 0);
++ /*
++ * Store t->mm_cid_active before loading per-mm/cpu cid.
++ * Matches barrier in sched_mm_cid_remote_clear_old().
++ */
++ smp_mb();
++ mm_cid_put(mm);
++ t->last_mm_cid = t->mm_cid = -1;
++}
++
++void sched_mm_cid_before_execve(struct task_struct *t)
++{
++ struct mm_struct *mm = t->mm;
++ struct rq *rq;
++
++ if (!mm)
++ return;
++
++ preempt_disable();
++ rq = this_rq();
++ guard(rq_lock_irqsave)(rq);
++ preempt_enable_no_resched(); /* holding spinlock */
++ WRITE_ONCE(t->mm_cid_active, 0);
++ /*
++ * Store t->mm_cid_active before loading per-mm/cpu cid.
++ * Matches barrier in sched_mm_cid_remote_clear_old().
++ */
++ smp_mb();
++ mm_cid_put(mm);
++ t->last_mm_cid = t->mm_cid = -1;
++}
++
++void sched_mm_cid_after_execve(struct task_struct *t)
++{
++ struct mm_struct *mm = t->mm;
++ struct rq *rq;
++
++ if (!mm)
++ return;
++
++ preempt_disable();
++ rq = this_rq();
++ scoped_guard (rq_lock_irqsave, rq) {
++ preempt_enable_no_resched(); /* holding spinlock */
++ WRITE_ONCE(t->mm_cid_active, 1);
++ /*
++ * Store t->mm_cid_active before loading per-mm/cpu cid.
++ * Matches barrier in sched_mm_cid_remote_clear_old().
++ */
++ smp_mb();
++ t->last_mm_cid = t->mm_cid = mm_cid_get(rq, t, mm);
++ }
++ rseq_set_notify_resume(t);
++}
++
++void sched_mm_cid_fork(struct task_struct *t)
++{
++ WARN_ON_ONCE(!t->mm || t->mm_cid != -1);
++ t->mm_cid_active = 1;
++}
++#endif /* CONFIG_SCHED_MM_CID */
+diff --git a/kernel/sched/alt_core.h b/kernel/sched/alt_core.h
+new file mode 100644
+index 000000000000..bb9512c76566
+--- /dev/null
++++ b/kernel/sched/alt_core.h
+@@ -0,0 +1,177 @@
++#ifndef _KERNEL_SCHED_ALT_CORE_H
++#define _KERNEL_SCHED_ALT_CORE_H
++
++/*
++ * Compile time debug macro
++ * #define ALT_SCHED_DEBUG
++ */
++
++/*
++ * Task related inlined functions
++ */
++static inline bool is_migration_disabled(struct task_struct *p)
++{
++ return p->migration_disabled;
++}
++
++/* rt_prio(prio) defined in include/linux/sched/rt.h */
++#define rt_task(p) rt_prio((p)->prio)
++#define rt_policy(policy) ((policy) == SCHED_FIFO || (policy) == SCHED_RR)
++#define task_has_rt_policy(p) (rt_policy((p)->policy))
++
++struct affinity_context {
++ const struct cpumask *new_mask;
++ struct cpumask *user_mask;
++ unsigned int flags;
++};
++
++/* CONFIG_SCHED_CLASS_EXT is not supported */
++#define scx_switched_all() false
++
++#define SCA_CHECK 0x01
++#define SCA_MIGRATE_DISABLE 0x02
++#define SCA_MIGRATE_ENABLE 0x04
++#define SCA_USER 0x08
++
++extern int __set_cpus_allowed_ptr(struct task_struct *p, struct affinity_context *ctx);
++
++static inline cpumask_t *alloc_user_cpus_ptr(int node)
++{
++ /*
++ * See do_set_cpus_allowed() above for the rcu_head usage.
++ */
++ int size = max_t(int, cpumask_size(), sizeof(struct rcu_head));
++
++ return kmalloc_node(size, GFP_KERNEL, node);
++}
++
++#ifdef CONFIG_RT_MUTEXES
++
++static inline int __rt_effective_prio(struct task_struct *pi_task, int prio)
++{
++ if (pi_task)
++ prio = min(prio, pi_task->prio);
++
++ return prio;
++}
++
++static inline int rt_effective_prio(struct task_struct *p, int prio)
++{
++ struct task_struct *pi_task = rt_mutex_get_top_task(p);
++
++ return __rt_effective_prio(pi_task, prio);
++}
++
++#else /* !CONFIG_RT_MUTEXES: */
++
++static inline int rt_effective_prio(struct task_struct *p, int prio)
++{
++ return prio;
++}
++
++#endif /* !CONFIG_RT_MUTEXES */
++
++extern int __sched_setscheduler(struct task_struct *p, const struct sched_attr *attr, bool user, bool pi);
++extern int __sched_setaffinity(struct task_struct *p, struct affinity_context *ctx);
++extern void __setscheduler_prio(struct task_struct *p, int prio);
++
++/*
++ * Context API
++ */
++static inline struct rq *__task_access_lock(struct task_struct *p, raw_spinlock_t **plock)
++{
++ struct rq *rq;
++ for (;;) {
++ rq = task_rq(p);
++ if (p->on_cpu || task_on_rq_queued(p)) {
++ raw_spin_lock(&rq->lock);
++ if (likely((p->on_cpu || task_on_rq_queued(p)) && rq == task_rq(p))) {
++ *plock = &rq->lock;
++ return rq;
++ }
++ raw_spin_unlock(&rq->lock);
++ } else if (task_on_rq_migrating(p)) {
++ do {
++ cpu_relax();
++ } while (unlikely(task_on_rq_migrating(p)));
++ } else {
++ *plock = NULL;
++ return rq;
++ }
++ }
++}
++
++static inline void __task_access_unlock(struct task_struct *p, raw_spinlock_t *lock)
++{
++ if (NULL != lock)
++ raw_spin_unlock(lock);
++}
++
++void check_task_changed(struct task_struct *p, struct rq *rq);
++
++/*
++ * RQ related inlined functions
++ */
++
++/*
++ * This routine assumes that the idle task is always in the queue
++ */
++static inline struct task_struct *sched_rq_first_task(struct rq *rq)
++{
++ const struct list_head *head = &rq->queue.heads[sched_rq_prio_idx(rq)];
++
++ return list_first_entry(head, struct task_struct, sq_node);
++}
++
++static inline struct task_struct * sched_rq_next_task(struct task_struct *p, struct rq *rq)
++{
++ struct list_head *next = p->sq_node.next;
++
++ if (&rq->queue.heads[0] <= next && next < &rq->queue.heads[SCHED_LEVELS]) {
++ struct list_head *head;
++ unsigned long idx = next - &rq->queue.heads[0];
++
++ idx = find_next_bit(rq->queue.bitmap, SCHED_QUEUE_BITS,
++ sched_idx2prio(idx, rq) + 1);
++ head = &rq->queue.heads[sched_prio2idx(idx, rq)];
++
++ return list_first_entry(head, struct task_struct, sq_node);
++ }
++
++ return list_next_entry(p, sq_node);
++}
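The two lookup helpers above rely on one bit per priority level being set whenever that level's list is non-empty, so picking the next task reduces to a find-first-bit operation. The following userspace sketch shows the same idea with a plain unsigned long bitmap and __builtin_ctzl() in place of find_next_bit(); names and layout are hypothetical simplifications.

/* Minimal sketch of a bitmap-indexed priority queue lookup, in the
 * spirit of sched_rq_first_task() above: one bit per priority level,
 * set when that level's list is non-empty, so the highest-priority
 * runnable level is a find-first-set-bit operation. Hypothetical
 * simplification; the kernel uses find_next_bit() over a
 * DECLARE_BITMAP() and per-level list_heads.
 */
#include <stdio.h>

#define LEVELS 64

struct prio_queue {
	unsigned long bitmap;		/* bit i set => level i non-empty */
	int count[LEVELS];		/* tasks queued per level (stand-in) */
};

static int first_nonempty_level(const struct prio_queue *q)
{
	if (!q->bitmap)
		return -1;
	return __builtin_ctzl(q->bitmap);	/* lowest set bit = highest prio */
}

static void enqueue(struct prio_queue *q, int level)
{
	q->count[level]++;
	q->bitmap |= 1UL << level;
}

int main(void)
{
	struct prio_queue q = { 0 };

	enqueue(&q, 40);	/* a "normal" level */
	enqueue(&q, 12);	/* a higher-priority level */
	printf("next level to run: %d\n", first_nonempty_level(&q));	/* 12 */
	return 0;
}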
++
++extern void requeue_task(struct task_struct *p, struct rq *rq);
++
++#ifdef ALT_SCHED_DEBUG
++extern void alt_sched_debug(void);
++#else
++static inline void alt_sched_debug(void) {}
++#endif
++
++extern int sched_yield_type;
++
++extern cpumask_t sched_rq_pending_mask ____cacheline_aligned_in_smp;
++
++DECLARE_STATIC_KEY_FALSE(sched_smt_present);
++DECLARE_PER_CPU_ALIGNED(cpumask_t *, sched_cpu_llc_mask);
++
++extern cpumask_t sched_smt_mask ____cacheline_aligned_in_smp;
++
++extern cpumask_t *const sched_idle_mask;
++extern cpumask_t *const sched_sg_idle_mask;
++extern cpumask_t *const sched_pcore_idle_mask;
++extern cpumask_t *const sched_ecore_idle_mask;
++
++extern struct rq *move_queued_task(struct rq *rq, struct task_struct *p, int new_cpu);
++
++typedef bool (*idle_select_func_t)(struct cpumask *dstp, const struct cpumask *src1p,
++ const struct cpumask *src2p);
++
++extern idle_select_func_t idle_select_func;
++
++/* balance callback */
++extern struct balance_callback *splice_balance_callbacks(struct rq *rq);
++extern void balance_callbacks(struct rq *rq, struct balance_callback *head);
++
++#endif /* _KERNEL_SCHED_ALT_CORE_H */
+diff --git a/kernel/sched/alt_debug.c b/kernel/sched/alt_debug.c
+new file mode 100644
+index 000000000000..1dbd7eb6a434
+--- /dev/null
++++ b/kernel/sched/alt_debug.c
+@@ -0,0 +1,32 @@
++/*
++ * kernel/sched/alt_debug.c
++ *
++ * Print the alt scheduler debugging details
++ *
++ * Author: Alfred Chen
++ * Date : 2020
++ */
++#include "sched.h"
++#include "linux/sched/debug.h"
++
++/*
++ * This allows printing both to /proc/sched_debug and
++ * to the console
++ */
++#define SEQ_printf(m, x...) \
++ do { \
++ if (m) \
++ seq_printf(m, x); \
++ else \
++ pr_cont(x); \
++ } while (0)
++
++void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
++ struct seq_file *m)
++{
++ SEQ_printf(m, "%s (%d, #threads: %d)\n", p->comm, task_pid_nr_ns(p, ns),
++ get_nr_threads(p));
++}
++
++void proc_sched_set_task(struct task_struct *p)
++{}
+diff --git a/kernel/sched/alt_sched.h b/kernel/sched/alt_sched.h
+new file mode 100644
+index 000000000000..5b9a53c669f5
+--- /dev/null
++++ b/kernel/sched/alt_sched.h
+@@ -0,0 +1,1018 @@
++#ifndef _KERNEL_SCHED_ALT_SCHED_H
++#define _KERNEL_SCHED_ALT_SCHED_H
++
++#include <linux/context_tracking.h>
++#include <linux/profile.h>
++#include <linux/stop_machine.h>
++#include <linux/syscalls.h>
++#include <linux/tick.h>
++
++#include <trace/events/power.h>
++#include <trace/events/sched.h>
++
++#include "../workqueue_internal.h"
++
++#include "cpupri.h"
++
++#ifdef CONFIG_CGROUP_SCHED
++/* task group related information */
++struct task_group {
++ struct cgroup_subsys_state css;
++
++ struct rcu_head rcu;
++ struct list_head list;
++
++ struct task_group *parent;
++ struct list_head siblings;
++ struct list_head children;
++};
++
++extern struct task_group *sched_create_group(struct task_group *parent);
++extern void sched_online_group(struct task_group *tg,
++ struct task_group *parent);
++extern void sched_destroy_group(struct task_group *tg);
++extern void sched_release_group(struct task_group *tg);
++#endif /* CONFIG_CGROUP_SCHED */
++
++#define MIN_SCHED_NORMAL_PRIO (32)
++/*
++ * levels: RT(0-24), reserved(25-31), NORMAL(32-63), cpu idle task(64)
++ *
++ * -- BMQ --
++ * NORMAL: (lower boost range 12, NICE_WIDTH 40, higher boost range 12) / 2
++ * -- PDS --
++ * NORMAL: SCHED_EDGE_DELTA + ((NICE_WIDTH 40) / 2)
++ */
++#define SCHED_LEVELS (64 + 1)
++
++#define IDLE_TASK_SCHED_PRIO (SCHED_LEVELS - 1)
++
++/*
++ * Increase resolution of nice-level calculations for 64-bit architectures.
++ * The extra resolution improves shares distribution and load balancing of
++ * low-weight task groups (eg. nice +19 on an autogroup), deeper taskgroup
++ * hierarchies, especially on larger systems. This is not a user-visible change
++ * and does not change the user-interface for setting shares/weights.
++ *
++ * We increase resolution only if we have enough bits to allow this increased
++ * resolution (i.e. 64-bit). The costs for increasing resolution when 32-bit
++ * are pretty high and the returns do not justify the increased costs.
++ *
++ * Really only required when CONFIG_FAIR_GROUP_SCHED=y is also set, but to
++ * increase coverage and consistency always enable it on 64-bit platforms.
++ */
++#ifdef CONFIG_64BIT
++# define NICE_0_LOAD_SHIFT (SCHED_FIXEDPOINT_SHIFT + SCHED_FIXEDPOINT_SHIFT)
++# define scale_load(w) ((w) << SCHED_FIXEDPOINT_SHIFT)
++# define scale_load_down(w) \
++({ \
++ unsigned long __w = (w); \
++ if (__w) \
++ __w = max(2UL, __w >> SCHED_FIXEDPOINT_SHIFT); \
++ __w; \
++})
++#else
++# define NICE_0_LOAD_SHIFT (SCHED_FIXEDPOINT_SHIFT)
++# define scale_load(w) (w)
++# define scale_load_down(w) (w)
++#endif
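As a quick illustration of the fixed-point scaling above, the standalone snippet below shows what scale_load()/scale_load_down() do to a weight of 1024 on a 64-bit build. It assumes SCHED_FIXEDPOINT_SHIFT is 10, the mainline value, and re-implements the helpers for illustration only.

/* Standalone illustration of the 64-bit load scaling above.
 * Assumes SCHED_FIXEDPOINT_SHIFT == 10 (mainline value); this is a
 * sketch, not the kernel macros themselves.
 */
#include <stdio.h>

#define SCHED_FIXEDPOINT_SHIFT	10
#define scale_load(w)		((w) << SCHED_FIXEDPOINT_SHIFT)

static unsigned long scale_load_down(unsigned long w)
{
	if (!w)
		return 0;
	w >>= SCHED_FIXEDPOINT_SHIFT;
	return w > 2 ? w : 2;	/* max(2UL, w >> SCHED_FIXEDPOINT_SHIFT) */
}

int main(void)
{
	unsigned long nice_0_weight = 1024;	/* nice-0 weight before scaling */
	unsigned long scaled = scale_load(nice_0_weight);

	/* 1024 << 10 = 1048576; scaling back down recovers 1024. */
	printf("scaled=%lu down=%lu\n", scaled, scale_load_down(scaled));
	return 0;
}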
++
++/* task_struct::on_rq states: */
++#define TASK_ON_RQ_QUEUED 1
++#define TASK_ON_RQ_MIGRATING 2
++
++static inline int task_on_rq_queued(struct task_struct *p)
++{
++ return READ_ONCE(p->on_rq) == TASK_ON_RQ_QUEUED;
++}
++
++static inline int task_on_rq_migrating(struct task_struct *p)
++{
++ return READ_ONCE(p->on_rq) == TASK_ON_RQ_MIGRATING;
++}
++
++/* Wake flags. The first three directly map to some SD flag value */
++#define WF_EXEC 0x02 /* Wakeup after exec; maps to SD_BALANCE_EXEC */
++#define WF_FORK 0x04 /* Wakeup after fork; maps to SD_BALANCE_FORK */
++#define WF_TTWU 0x08 /* Wakeup; maps to SD_BALANCE_WAKE */
++
++#define WF_SYNC 0x10 /* Waker goes to sleep after wakeup */
++#define WF_MIGRATED 0x20 /* Internal use, task got migrated */
++#define WF_CURRENT_CPU 0x40 /* Prefer to move the wakee to the current CPU. */
++
++static_assert(WF_EXEC == SD_BALANCE_EXEC);
++static_assert(WF_FORK == SD_BALANCE_FORK);
++static_assert(WF_TTWU == SD_BALANCE_WAKE);
++
++#define SCHED_QUEUE_BITS (SCHED_LEVELS - 1)
++
++struct sched_queue {
++ DECLARE_BITMAP(bitmap, SCHED_QUEUE_BITS);
++ struct list_head heads[SCHED_LEVELS];
++};
++
++struct rq;
++struct cpuidle_state;
++
++struct balance_callback {
++ struct balance_callback *next;
++ void (*func)(struct rq *rq);
++};
++
++typedef void (*balance_func_t)(struct rq *rq, int cpu);
++typedef void (*set_idle_mask_func_t)(unsigned int cpu, struct cpumask *dstp);
++typedef void (*clear_idle_mask_func_t)(int cpu, struct cpumask *dstp);
++
++struct balance_arg {
++ struct task_struct *task;
++ int active;
++ cpumask_t *cpumask;
++};
++
++/*
++ * This is the main, per-CPU runqueue data structure.
++ * This data should only be modified by the local cpu.
++ */
++struct rq {
++ /* runqueue lock: */
++ raw_spinlock_t lock;
++
++ struct task_struct __rcu *curr;
++ struct task_struct *idle;
++ struct task_struct *stop;
++ struct mm_struct *prev_mm;
++
++ struct sched_queue queue ____cacheline_aligned;
++
++ int prio;
++#ifdef CONFIG_SCHED_PDS
++ int prio_idx;
++ u64 time_edge;
++#endif
++
++ /* switch count */
++ u64 nr_switches;
++
++ atomic_t nr_iowait;
++
++ u64 last_seen_need_resched_ns;
++ int ticks_without_resched;
++
++#ifdef CONFIG_MEMBARRIER
++ int membarrier_state;
++#endif
++
++ set_idle_mask_func_t set_idle_mask_func;
++ clear_idle_mask_func_t clear_idle_mask_func;
++
++ int cpu; /* cpu of this runqueue */
++ bool online;
++
++ unsigned int ttwu_pending;
++ unsigned char nohz_idle_balance;
++ unsigned char idle_balance;
++
++#ifdef CONFIG_HAVE_SCHED_AVG_IRQ
++ struct sched_avg avg_irq;
++#endif
++
++ balance_func_t balance_func;
++ struct balance_arg active_balance_arg ____cacheline_aligned;
++ struct cpu_stop_work active_balance_work;
++
++ struct balance_callback *balance_callback;
++
++#ifdef CONFIG_HOTPLUG_CPU
++ struct rcuwait hotplug_wait;
++#endif
++ unsigned int nr_pinned;
++
++#ifdef CONFIG_IRQ_TIME_ACCOUNTING
++ u64 prev_irq_time;
++#endif /* CONFIG_IRQ_TIME_ACCOUNTING */
++#ifdef CONFIG_PARAVIRT
++ u64 prev_steal_time;
++#endif /* CONFIG_PARAVIRT */
++#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
++ u64 prev_steal_time_rq;
++#endif /* CONFIG_PARAVIRT_TIME_ACCOUNTING */
++
++ /* For general cpu load util */
++ s32 load_history;
++ u64 load_block;
++ u64 load_stamp;
++
++ /* calc_load related fields */
++ unsigned long calc_load_update;
++ long calc_load_active;
++
++ /* Ensure that all clocks are in the same cache line */
++ u64 clock ____cacheline_aligned;
++ u64 clock_task;
++ u64 prio_balance_time;
++
++ unsigned int nr_running;
++ unsigned long nr_uninterruptible;
++
++#ifdef CONFIG_SCHED_HRTICK
++ call_single_data_t hrtick_csd;
++ struct hrtimer hrtick_timer;
++ ktime_t hrtick_time;
++#endif
++
++#ifdef CONFIG_SCHEDSTATS
++
++ /* latency stats */
++ struct sched_info rq_sched_info;
++ unsigned long long rq_cpu_time;
++ /* could above be rq->cfs_rq.exec_clock + rq->rt_rq.rt_runtime ? */
++
++ /* sys_sched_yield() stats */
++ unsigned int yld_count;
++
++ /* schedule() stats */
++ unsigned int sched_switch;
++ unsigned int sched_count;
++ unsigned int sched_goidle;
++
++ /* try_to_wake_up() stats */
++ unsigned int ttwu_count;
++ unsigned int ttwu_local;
++#endif /* CONFIG_SCHEDSTATS */
++
++#ifdef CONFIG_CPU_IDLE
++ /* Must be inspected within a rcu lock section */
++ struct cpuidle_state *idle_state;
++#endif
++
++#ifdef CONFIG_NO_HZ_COMMON
++ call_single_data_t nohz_csd;
++ atomic_t nohz_flags;
++#endif /* CONFIG_NO_HZ_COMMON */
++
++ /* Scratch cpumask to be temporarily used under rq_lock */
++ cpumask_var_t scratch_mask;
++};
++
++extern unsigned int sysctl_sched_base_slice;
++
++extern unsigned long rq_load_util(struct rq *rq, unsigned long max);
++
++extern unsigned long calc_load_update;
++extern atomic_long_t calc_load_tasks;
++
++extern void calc_global_load_tick(struct rq *this_rq);
++extern long calc_load_fold_active(struct rq *this_rq, long adjust);
++
++DECLARE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
++#define cpu_rq(cpu) (&per_cpu(runqueues, (cpu)))
++#define this_rq() this_cpu_ptr(&runqueues)
++#define task_rq(p) cpu_rq(task_cpu(p))
++#define cpu_curr(cpu) (cpu_rq(cpu)->curr)
++#define raw_rq() raw_cpu_ptr(&runqueues)
++
++#ifdef CONFIG_SYSCTL
++void register_sched_domain_sysctl(void);
++void unregister_sched_domain_sysctl(void);
++#else
++static inline void register_sched_domain_sysctl(void)
++{
++}
++static inline void unregister_sched_domain_sysctl(void)
++{
++}
++#endif
++
++extern bool sched_smp_initialized;
++
++enum {
++#ifdef CONFIG_SCHED_SMT
++ SMT_LEVEL_SPACE_HOLDER,
++#endif
++ COREGROUP_LEVEL_SPACE_HOLDER,
++ CORE_LEVEL_SPACE_HOLDER,
++ OTHER_LEVEL_SPACE_HOLDER,
++ NR_CPU_AFFINITY_LEVELS
++};
++
++DECLARE_PER_CPU_ALIGNED(cpumask_t [NR_CPU_AFFINITY_LEVELS], sched_cpu_topo_masks);
++
++static inline int
++__best_mask_cpu(const cpumask_t *cpumask, const cpumask_t *mask)
++{
++ int cpu;
++
++ while ((cpu = cpumask_any_and(cpumask, mask)) >= nr_cpu_ids)
++ mask++;
++
++ return cpu;
++}
++
++static inline int best_mask_cpu(int cpu, const cpumask_t *mask)
++{
++ return __best_mask_cpu(mask, per_cpu(sched_cpu_topo_masks, cpu));
++}
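__best_mask_cpu() walks an array of per-CPU affinity masks ordered from nearest to widest and returns the first hit. The sketch below reproduces that walk with plain bitmasks; the mask values and level count are made up for illustration, and unlike the kernel loop (which relies on the widest level always matching) this sketch uses a bounded loop for simplicity.

/* Sketch of the progressively-widening topology mask walk above:
 * per-CPU array of affinity masks ordered from nearest (SMT siblings)
 * to widest (all CPUs); the first level that intersects the candidate
 * mask wins. Uses plain bitmasks instead of cpumask_t; values are
 * hypothetical.
 */
#include <stdio.h>

#define NR_LEVELS 3

int main(void)
{
	/* topo_masks for some CPU: SMT siblings, same LLC, everything. */
	unsigned long topo_masks[NR_LEVELS] = { 0x3UL, 0xfUL, 0xffUL };
	unsigned long candidates = 0x30UL;	/* only CPUs 4 and 5 available */

	for (int level = 0; level < NR_LEVELS; level++) {
		unsigned long hit = topo_masks[level] & candidates;

		if (hit) {
			printf("pick CPU %d at level %d\n", __builtin_ctzl(hit), level);
			break;
		}
	}
	return 0;
}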
++
++extern void resched_latency_warn(int cpu, u64 latency);
++
++#ifndef arch_scale_freq_tick
++static __always_inline
++void arch_scale_freq_tick(void)
++{
++}
++#endif
++
++#ifndef arch_scale_freq_capacity
++static __always_inline
++unsigned long arch_scale_freq_capacity(int cpu)
++{
++ return SCHED_CAPACITY_SCALE;
++}
++#endif
++
++static inline u64 __rq_clock_broken(struct rq *rq)
++{
++ return READ_ONCE(rq->clock);
++}
++
++static inline u64 rq_clock(struct rq *rq)
++{
++ /*
++ * Relax lockdep_assert_held() checking as in VRQ, a call to
++ * sched_info_xxxx() may not hold rq->lock
++ * lockdep_assert_held(&rq->lock);
++ */
++ return rq->clock;
++}
++
++static inline u64 rq_clock_task(struct rq *rq)
++{
++ /*
++ * Relax lockdep_assert_held() checking as in VRQ, a call to
++ * sched_info_xxxx() may not hold rq->lock
++ * lockdep_assert_held(&rq->lock);
++ */
++ return rq->clock_task;
++}
++
++/*
++ * {de,en}queue flags:
++ *
++ * DEQUEUE_SLEEP - task is no longer runnable
++ * ENQUEUE_WAKEUP - task just became runnable
++ *
++ */
++
++#define DEQUEUE_SLEEP 0x01
++
++#define ENQUEUE_WAKEUP 0x01
++
++
++/*
++ * Below are scheduler APIs used in other kernel code.
++ * They use a dummy rq_flags.
++ * TODO: BMQ needs to support these APIs for compatibility with mainline
++ * scheduler code.
++ */
++struct rq_flags {
++ unsigned long flags;
++};
++
++struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
++ __acquires(rq->lock);
++
++struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
++ __acquires(p->pi_lock)
++ __acquires(rq->lock);
++
++static inline void __task_rq_unlock(struct rq *rq, struct rq_flags *rf)
++ __releases(rq->lock)
++{
++ raw_spin_unlock(&rq->lock);
++}
++
++static inline void
++task_rq_unlock(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
++ __releases(rq->lock)
++ __releases(p->pi_lock)
++{
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, rf->flags);
++}
++
++static inline void
++rq_lock(struct rq *rq, struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ raw_spin_lock(&rq->lock);
++}
++
++static inline void
++rq_unlock(struct rq *rq, struct rq_flags *rf)
++ __releases(rq->lock)
++{
++ raw_spin_unlock(&rq->lock);
++}
++
++static inline void
++rq_lock_irq(struct rq *rq, struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ raw_spin_lock_irq(&rq->lock);
++}
++
++static inline void
++rq_unlock_irq(struct rq *rq, struct rq_flags *rf)
++ __releases(rq->lock)
++{
++ raw_spin_unlock_irq(&rq->lock);
++}
++
++static inline struct rq *
++this_rq_lock_irq(struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ struct rq *rq;
++
++ local_irq_disable();
++ rq = this_rq();
++ raw_spin_lock(&rq->lock);
++
++ return rq;
++}
++
++static inline raw_spinlock_t *__rq_lockp(struct rq *rq)
++{
++ return &rq->lock;
++}
++
++static inline raw_spinlock_t *rq_lockp(struct rq *rq)
++{
++ return __rq_lockp(rq);
++}
++
++static inline void lockdep_assert_rq_held(struct rq *rq)
++{
++ lockdep_assert_held(__rq_lockp(rq));
++}
++
++extern void raw_spin_rq_lock_nested(struct rq *rq, int subclass);
++extern void raw_spin_rq_unlock(struct rq *rq);
++
++static inline void raw_spin_rq_lock(struct rq *rq)
++{
++ raw_spin_rq_lock_nested(rq, 0);
++}
++
++static inline void raw_spin_rq_lock_irq(struct rq *rq)
++{
++ local_irq_disable();
++ raw_spin_rq_lock(rq);
++}
++
++static inline void raw_spin_rq_unlock_irq(struct rq *rq)
++{
++ raw_spin_rq_unlock(rq);
++ local_irq_enable();
++}
++
++static inline int task_current(struct rq *rq, struct task_struct *p)
++{
++ return rq->curr == p;
++}
++
++static inline bool task_on_cpu(struct task_struct *p)
++{
++ return p->on_cpu;
++}
++
++extern struct static_key_false sched_schedstats;
++
++#ifdef CONFIG_CPU_IDLE
++static inline void idle_set_state(struct rq *rq,
++ struct cpuidle_state *idle_state)
++{
++ rq->idle_state = idle_state;
++}
++
++static inline struct cpuidle_state *idle_get_state(struct rq *rq)
++{
++ WARN_ON(!rcu_read_lock_held());
++ return rq->idle_state;
++}
++#else
++static inline void idle_set_state(struct rq *rq,
++ struct cpuidle_state *idle_state)
++{
++}
++
++static inline struct cpuidle_state *idle_get_state(struct rq *rq)
++{
++ return NULL;
++}
++#endif
++
++static inline int cpu_of(const struct rq *rq)
++{
++ return rq->cpu;
++}
++
++extern void resched_cpu(int cpu);
++
++#include "stats.h"
++
++#ifdef CONFIG_NO_HZ_COMMON
++#define NOHZ_BALANCE_KICK_BIT 0
++#define NOHZ_STATS_KICK_BIT 1
++
++#define NOHZ_BALANCE_KICK BIT(NOHZ_BALANCE_KICK_BIT)
++#define NOHZ_STATS_KICK BIT(NOHZ_STATS_KICK_BIT)
++
++#define NOHZ_KICK_MASK (NOHZ_BALANCE_KICK | NOHZ_STATS_KICK)
++
++#define nohz_flags(cpu) (&cpu_rq(cpu)->nohz_flags)
++
++/* TODO: needed?
++extern void nohz_balance_exit_idle(struct rq *rq);
++#else
++static inline void nohz_balance_exit_idle(struct rq *rq) { }
++*/
++#endif
++
++#ifdef CONFIG_IRQ_TIME_ACCOUNTING
++struct irqtime {
++ u64 total;
++ u64 tick_delta;
++ u64 irq_start_time;
++ struct u64_stats_sync sync;
++};
++
++DECLARE_PER_CPU(struct irqtime, cpu_irqtime);
++extern int sched_clock_irqtime;
++
++static inline int irqtime_enabled(void)
++{
++ return sched_clock_irqtime;
++}
++
++/*
++ * Returns the irqtime minus the softirq time computed by ksoftirqd.
++ * Otherwise ksoftirqd's sum_exec_runtime would have its own runtime
++ * subtracted and never move forward.
++ */
++static inline u64 irq_time_read(int cpu)
++{
++ struct irqtime *irqtime = &per_cpu(cpu_irqtime, cpu);
++ unsigned int seq;
++ u64 total;
++
++ do {
++ seq = __u64_stats_fetch_begin(&irqtime->sync);
++ total = irqtime->total;
++ } while (__u64_stats_fetch_retry(&irqtime->sync, seq));
++
++ return total;
++}
++#else
++
++static inline int irqtime_enabled(void)
++{
++ return 0;
++}
++
++#endif /* CONFIG_IRQ_TIME_ACCOUNTING */
++
++#ifdef CONFIG_CPU_FREQ
++DECLARE_PER_CPU(struct update_util_data __rcu *, cpufreq_update_util_data);
++#endif /* CONFIG_CPU_FREQ */
++
++#ifdef CONFIG_NO_HZ_FULL
++extern int __init sched_tick_offload_init(void);
++#else
++static inline int sched_tick_offload_init(void) { return 0; }
++#endif
++
++#ifdef arch_scale_freq_capacity
++#ifndef arch_scale_freq_invariant
++#define arch_scale_freq_invariant() (true)
++#endif
++#else /* arch_scale_freq_capacity */
++#define arch_scale_freq_invariant() (false)
++#endif
++
++unsigned long sugov_effective_cpu_perf(int cpu, unsigned long actual,
++ unsigned long min,
++ unsigned long max);
++
++extern void schedule_idle(void);
++
++#define cap_scale(v, s) ((v)*(s) >> SCHED_CAPACITY_SHIFT)
++
++/*
++ * !! For sched_setattr_nocheck() (kernel) only !!
++ *
++ * This is actually gross. :(
++ *
++ * It is used to make schedutil kworker(s) higher priority than SCHED_DEADLINE
++ * tasks, but still be able to sleep. We need this on platforms that cannot
++ * atomically change clock frequency. Remove once fast switching will be
++ * available on such platforms.
++ *
++ * SUGOV stands for SchedUtil GOVernor.
++ */
++#define SCHED_FLAG_SUGOV 0x10000000
++
++#ifdef CONFIG_MEMBARRIER
++/*
++ * The scheduler provides memory barriers required by membarrier between:
++ * - prior user-space memory accesses and store to rq->membarrier_state,
++ * - store to rq->membarrier_state and following user-space memory accesses.
++ * In the same way it provides those guarantees around store to rq->curr.
++ */
++static inline void membarrier_switch_mm(struct rq *rq,
++ struct mm_struct *prev_mm,
++ struct mm_struct *next_mm)
++{
++ int membarrier_state;
++
++ if (prev_mm == next_mm)
++ return;
++
++ membarrier_state = atomic_read(&next_mm->membarrier_state);
++ if (READ_ONCE(rq->membarrier_state) == membarrier_state)
++ return;
++
++ WRITE_ONCE(rq->membarrier_state, membarrier_state);
++}
++#else
++static inline void membarrier_switch_mm(struct rq *rq,
++ struct mm_struct *prev_mm,
++ struct mm_struct *next_mm)
++{
++}
++#endif
++
++#ifdef CONFIG_NUMA
++extern int sched_numa_find_closest(const struct cpumask *cpus, int cpu);
++#else
++static inline int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
++{
++ return nr_cpu_ids;
++}
++#endif
++
++extern void swake_up_all_locked(struct swait_queue_head *q);
++extern void __prepare_to_swait(struct swait_queue_head *q, struct swait_queue *wait);
++
++extern int try_to_wake_up(struct task_struct *tsk, unsigned int state, int wake_flags);
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++extern int preempt_dynamic_mode;
++extern int sched_dynamic_mode(const char *str);
++extern void sched_dynamic_update(int mode);
++#endif
++extern const char *preempt_modes[];
++
++static inline void nohz_run_idle_balance(int cpu) { }
++
++static inline unsigned long
++uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id)
++{
++ if (clamp_id == UCLAMP_MIN)
++ return 0;
++
++ return SCHED_CAPACITY_SCALE;
++}
++
++static inline bool uclamp_rq_is_capped(struct rq *rq) { return false; }
++
++static inline bool uclamp_is_used(void)
++{
++ return false;
++}
++
++static inline unsigned long
++uclamp_rq_get(struct rq *rq, enum uclamp_id clamp_id)
++{
++ if (clamp_id == UCLAMP_MIN)
++ return 0;
++
++ return SCHED_CAPACITY_SCALE;
++}
++
++static inline void
++uclamp_rq_set(struct rq *rq, enum uclamp_id clamp_id, unsigned int value)
++{
++}
++
++static inline bool uclamp_rq_is_idle(struct rq *rq)
++{
++ return false;
++}
++
++#ifdef CONFIG_SCHED_MM_CID
++
++#define SCHED_MM_CID_PERIOD_NS (100ULL * 1000000) /* 100ms */
++#define MM_CID_SCAN_DELAY 100 /* 100ms */
++
++extern raw_spinlock_t cid_lock;
++extern int use_cid_lock;
++
++extern void sched_mm_cid_migrate_from(struct task_struct *t);
++extern void sched_mm_cid_migrate_to(struct rq *dst_rq, struct task_struct *t);
++extern void task_tick_mm_cid(struct rq *rq, struct task_struct *curr);
++extern void init_sched_mm_cid(struct task_struct *t);
++
++static inline void __mm_cid_put(struct mm_struct *mm, int cid)
++{
++ if (cid < 0)
++ return;
++ cpumask_clear_cpu(cid, mm_cidmask(mm));
++}
++
++/*
++ * The per-mm/cpu cid can have the MM_CID_LAZY_PUT flag set or transition to
++ * the MM_CID_UNSET state without holding the rq lock, but the rq lock needs to
++ * be held to transition to other states.
++ *
++ * State transitions synchronized with cmpxchg or try_cmpxchg need to be
++ * consistent across cpus, which prevents use of this_cpu_cmpxchg.
++ */
++static inline void mm_cid_put_lazy(struct task_struct *t)
++{
++ struct mm_struct *mm = t->mm;
++ struct mm_cid __percpu *pcpu_cid = mm->pcpu_cid;
++ int cid;
++
++ lockdep_assert_irqs_disabled();
++ cid = __this_cpu_read(pcpu_cid->cid);
++ if (!mm_cid_is_lazy_put(cid) ||
++ !try_cmpxchg(&this_cpu_ptr(pcpu_cid)->cid, &cid, MM_CID_UNSET))
++ return;
++ __mm_cid_put(mm, mm_cid_clear_lazy_put(cid));
++}
++
++static inline int mm_cid_pcpu_unset(struct mm_struct *mm)
++{
++ struct mm_cid __percpu *pcpu_cid = mm->pcpu_cid;
++ int cid, res;
++
++ lockdep_assert_irqs_disabled();
++ cid = __this_cpu_read(pcpu_cid->cid);
++ for (;;) {
++ if (mm_cid_is_unset(cid))
++ return MM_CID_UNSET;
++ /*
++ * Attempt transition from valid or lazy-put to unset.
++ */
++ res = cmpxchg(&this_cpu_ptr(pcpu_cid)->cid, cid, MM_CID_UNSET);
++ if (res == cid)
++ break;
++ cid = res;
++ }
++ return cid;
++}
++
++static inline void mm_cid_put(struct mm_struct *mm)
++{
++ int cid;
++
++ lockdep_assert_irqs_disabled();
++ cid = mm_cid_pcpu_unset(mm);
++ if (cid == MM_CID_UNSET)
++ return;
++ __mm_cid_put(mm, mm_cid_clear_lazy_put(cid));
++}
++
++static inline int __mm_cid_try_get(struct task_struct *t, struct mm_struct *mm)
++{
++ struct cpumask *cidmask = mm_cidmask(mm);
++ struct mm_cid __percpu *pcpu_cid = mm->pcpu_cid;
++ int cid, max_nr_cid, allowed_max_nr_cid;
++
++ /*
++ * After shrinking the number of threads or reducing the number
++ * of allowed cpus, reduce the value of max_nr_cid so expansion
++ * of cid allocation will preserve cache locality if the number
++ * of threads or allowed cpus increase again.
++ */
++ max_nr_cid = atomic_read(&mm->max_nr_cid);
++ while ((allowed_max_nr_cid = min_t(int, READ_ONCE(mm->nr_cpus_allowed),
++ atomic_read(&mm->mm_users))),
++ max_nr_cid > allowed_max_nr_cid) {
++ /* atomic_try_cmpxchg loads previous mm->max_nr_cid into max_nr_cid. */
++ if (atomic_try_cmpxchg(&mm->max_nr_cid, &max_nr_cid, allowed_max_nr_cid)) {
++ max_nr_cid = allowed_max_nr_cid;
++ break;
++ }
++ }
++ /* Try to re-use recent cid. This improves cache locality. */
++ cid = __this_cpu_read(pcpu_cid->recent_cid);
++ if (!mm_cid_is_unset(cid) && cid < max_nr_cid &&
++ !cpumask_test_and_set_cpu(cid, cidmask))
++ return cid;
++ /*
++ * Expand cid allocation if the maximum number of concurrency
++ * IDs allocated (max_nr_cid) is below the number of allowed cpus
++ * and the number of threads. Expanding cid allocation as much as
++ * possible improves cache locality.
++ */
++ cid = max_nr_cid;
++ while (cid < READ_ONCE(mm->nr_cpus_allowed) && cid < atomic_read(&mm->mm_users)) {
++ /* atomic_try_cmpxchg loads previous mm->max_nr_cid into cid. */
++ if (!atomic_try_cmpxchg(&mm->max_nr_cid, &cid, cid + 1))
++ continue;
++ if (!cpumask_test_and_set_cpu(cid, cidmask))
++ return cid;
++ }
++ /*
++ * Find the first available concurrency id.
++ * Retry finding first zero bit if the mask is temporarily
++ * filled. This only happens during concurrent remote-clear
++ * which owns a cid without holding a rq lock.
++ */
++ for (;;) {
++ cid = cpumask_first_zero(cidmask);
++ if (cid < READ_ONCE(mm->nr_cpus_allowed))
++ break;
++ cpu_relax();
++ }
++ if (cpumask_test_and_set_cpu(cid, cidmask))
++ return -1;
++
++ return cid;
++}
++
++/*
++ * Save a snapshot of the current runqueue time of this cpu
++ * with the per-cpu cid value, which allows estimating how recently it was used.
++ */
++static inline void mm_cid_snapshot_time(struct rq *rq, struct mm_struct *mm)
++{
++ struct mm_cid *pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu_of(rq));
++
++ lockdep_assert_rq_held(rq);
++ WRITE_ONCE(pcpu_cid->time, rq->clock);
++}
++
++static inline int __mm_cid_get(struct rq *rq, struct task_struct *t,
++ struct mm_struct *mm)
++{
++ int cid;
++
++ /*
++ * All allocations (even those using the cid_lock) are lock-free. If
++ * use_cid_lock is set, hold the cid_lock to perform cid allocation to
++ * guarantee forward progress.
++ */
++ if (!READ_ONCE(use_cid_lock)) {
++ cid = __mm_cid_try_get(t, mm);
++ if (cid >= 0)
++ goto end;
++ raw_spin_lock(&cid_lock);
++ } else {
++ raw_spin_lock(&cid_lock);
++ cid = __mm_cid_try_get(t, mm);
++ if (cid >= 0)
++ goto unlock;
++ }
++
++ /*
++ * cid concurrently allocated. Retry while forcing following
++ * allocations to use the cid_lock to ensure forward progress.
++ */
++ WRITE_ONCE(use_cid_lock, 1);
++ /*
++ * Set use_cid_lock before allocation. Only care about program order
++ * because this is only required for forward progress.
++ */
++ barrier();
++ /*
++ * Retry until it succeeds. It is guaranteed to eventually succeed once
++ * all newly arriving allocations observe the use_cid_lock flag set.
++ */
++ do {
++ cid = __mm_cid_try_get(t, mm);
++ cpu_relax();
++ } while (cid < 0);
++ /*
++ * Allocate before clearing use_cid_lock. Only care about
++ * program order because this is for forward progress.
++ */
++ barrier();
++ WRITE_ONCE(use_cid_lock, 0);
++unlock:
++ raw_spin_unlock(&cid_lock);
++end:
++ mm_cid_snapshot_time(rq, mm);
++ return cid;
++}
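The use_cid_lock handling above is a general forward-progress pattern: attempt lock-free allocation, and on contention force every subsequent attempt through a lock until the contended attempt completes. A simplified userspace sketch of that pattern follows; it uses a pthread mutex instead of the raw spinlock and a toy bitmap allocator in place of __mm_cid_try_get(), so all names below are hypothetical.

/* Sketch of the forward-progress pattern used by __mm_cid_get() above:
 * lock-free attempts by default; the first caller that fails flips a
 * flag so that everyone serializes on a lock until the failing caller
 * finally succeeds and clears the flag. Names and the try_alloc()
 * stand-in are hypothetical.
 */
#include <stdatomic.h>
#include <pthread.h>
#include <stdio.h>

#define NR_IDS 8

static pthread_mutex_t alloc_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int use_lock;		/* mirrors use_cid_lock */
static atomic_ulong id_mask;		/* bit i set => id i in use */

/* Stand-in for __mm_cid_try_get(): grab the first free id, -1 if none. */
static int try_alloc(void)
{
	unsigned long mask = atomic_load(&id_mask);

	for (int id = 0; id < NR_IDS; id++) {
		if (mask & (1UL << id))
			continue;
		if (!(atomic_fetch_or(&id_mask, 1UL << id) & (1UL << id)))
			return id;
		mask = atomic_load(&id_mask);
	}
	return -1;
}

/* Mirrors the shape of __mm_cid_get(): lock-free fast path, then a
 * locked slow path that retries until the allocation succeeds. */
static int alloc_id(void)
{
	int id;

	if (!atomic_load_explicit(&use_lock, memory_order_relaxed)) {
		id = try_alloc();
		if (id >= 0)
			return id;
		pthread_mutex_lock(&alloc_lock);
	} else {
		pthread_mutex_lock(&alloc_lock);
		id = try_alloc();
		if (id >= 0)
			goto unlock;
	}

	/* Contention detected: force newcomers through the lock until the
	 * retry loop below succeeds (relies on ids being released). */
	atomic_store_explicit(&use_lock, 1, memory_order_relaxed);
	do {
		id = try_alloc();
	} while (id < 0);
	atomic_store_explicit(&use_lock, 0, memory_order_relaxed);
unlock:
	pthread_mutex_unlock(&alloc_lock);
	return id;
}

int main(void)
{
	printf("got id %d, then id %d\n", alloc_id(), alloc_id());
	return 0;
}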
++
++static inline int mm_cid_get(struct rq *rq, struct task_struct *t,
++ struct mm_struct *mm)
++{
++ struct mm_cid __percpu *pcpu_cid = mm->pcpu_cid;
++ struct cpumask *cpumask;
++ int cid;
++
++ lockdep_assert_rq_held(rq);
++ cpumask = mm_cidmask(mm);
++ cid = __this_cpu_read(pcpu_cid->cid);
++ if (mm_cid_is_valid(cid)) {
++ mm_cid_snapshot_time(rq, mm);
++ return cid;
++ }
++ if (mm_cid_is_lazy_put(cid)) {
++ if (try_cmpxchg(&this_cpu_ptr(pcpu_cid)->cid, &cid, MM_CID_UNSET))
++ __mm_cid_put(mm, mm_cid_clear_lazy_put(cid));
++ }
++ cid = __mm_cid_get(rq, t, mm);
++ __this_cpu_write(pcpu_cid->cid, cid);
++ __this_cpu_write(pcpu_cid->recent_cid, cid);
++
++ return cid;
++}
++
++static inline void switch_mm_cid(struct rq *rq,
++ struct task_struct *prev,
++ struct task_struct *next)
++{
++ /*
++ * Provide a memory barrier between rq->curr store and load of
++ * {prev,next}->mm->pcpu_cid[cpu] on rq->curr->mm transition.
++ *
++ * Should be adapted if context_switch() is modified.
++ */
++ if (!next->mm) { // to kernel
++ /*
++ * user -> kernel transition does not guarantee a barrier, but
++ * we can use the fact that it performs an atomic operation in
++ * mmgrab().
++ */
++ if (prev->mm) // from user
++ smp_mb__after_mmgrab();
++ /*
++ * kernel -> kernel transition does not change rq->curr->mm
++ * state. It stays NULL.
++ */
++ } else { // to user
++ /*
++ * kernel -> user transition does not provide a barrier
++ * between rq->curr store and load of {prev,next}->mm->pcpu_cid[cpu].
++ * Provide it here.
++ */
++ if (!prev->mm) // from kernel
++ smp_mb();
++ /*
++ * user -> user transition guarantees a memory barrier through
++ * switch_mm() when current->mm changes. If current->mm is
++ * unchanged, no barrier is needed.
++ */
++ }
++ if (prev->mm_cid_active) {
++ mm_cid_snapshot_time(rq, prev->mm);
++ mm_cid_put_lazy(prev);
++ prev->mm_cid = -1;
++ }
++ if (next->mm_cid_active)
++ next->last_mm_cid = next->mm_cid = mm_cid_get(rq, next, next->mm);
++}
++
++#else
++static inline void switch_mm_cid(struct rq *rq, struct task_struct *prev, struct task_struct *next) { }
++static inline void sched_mm_cid_migrate_from(struct task_struct *t) { }
++static inline void sched_mm_cid_migrate_to(struct rq *dst_rq, struct task_struct *t) { }
++static inline void task_tick_mm_cid(struct rq *rq, struct task_struct *curr) { }
++static inline void init_sched_mm_cid(struct task_struct *t) { }
++#endif
++
++extern struct balance_callback balance_push_callback;
++
++static inline void
++queue_balance_callback(struct rq *rq,
++ struct balance_callback *head,
++ void (*func)(struct rq *rq))
++{
++ lockdep_assert_rq_held(rq);
++
++ /*
++ * Don't (re)queue an already queued item; nor queue anything when
++ * balance_push() is active, see the comment with
++ * balance_push_callback.
++ */
++ if (unlikely(head->next || rq->balance_callback == &balance_push_callback))
++ return;
++
++ head->func = func;
++ head->next = rq->balance_callback;
++ rq->balance_callback = head;
++}
++
++#ifdef CONFIG_SCHED_BMQ
++#include "bmq.h"
++#endif
++#ifdef CONFIG_SCHED_PDS
++#include "pds.h"
++#endif
++
++#endif /* _KERNEL_SCHED_ALT_SCHED_H */
+diff --git a/kernel/sched/alt_topology.c b/kernel/sched/alt_topology.c
+new file mode 100644
+index 000000000000..376a08a5afda
+--- /dev/null
++++ b/kernel/sched/alt_topology.c
+@@ -0,0 +1,347 @@
++#include "alt_core.h"
++#include "alt_topology.h"
++
++static cpumask_t sched_pcore_mask ____cacheline_aligned_in_smp;
++
++static int __init sched_pcore_mask_setup(char *str)
++{
++ if (cpulist_parse(str, &sched_pcore_mask))
++ pr_warn("sched/alt: pcore_cpus= incorrect CPU range\n");
++
++ return 0;
++}
++__setup("pcore_cpus=", sched_pcore_mask_setup);
++
++/*
++ * set/clear idle mask functions
++ */
++#ifdef CONFIG_SCHED_SMT
++static void set_idle_mask_smt(unsigned int cpu, struct cpumask *dstp)
++{
++ cpumask_set_cpu(cpu, dstp);
++ if (cpumask_subset(cpu_smt_mask(cpu), sched_idle_mask))
++ cpumask_or(sched_sg_idle_mask, sched_sg_idle_mask, cpu_smt_mask(cpu));
++}
++
++static void clear_idle_mask_smt(int cpu, struct cpumask *dstp)
++{
++ cpumask_clear_cpu(cpu, dstp);
++ cpumask_andnot(sched_sg_idle_mask, sched_sg_idle_mask, cpu_smt_mask(cpu));
++}
++#endif
++
++static void set_idle_mask_pcore(unsigned int cpu, struct cpumask *dstp)
++{
++ cpumask_set_cpu(cpu, dstp);
++ cpumask_set_cpu(cpu, sched_pcore_idle_mask);
++}
++
++static void clear_idle_mask_pcore(int cpu, struct cpumask *dstp)
++{
++ cpumask_clear_cpu(cpu, dstp);
++ cpumask_clear_cpu(cpu, sched_pcore_idle_mask);
++}
++
++static void set_idle_mask_ecore(unsigned int cpu, struct cpumask *dstp)
++{
++ cpumask_set_cpu(cpu, dstp);
++ cpumask_set_cpu(cpu, sched_ecore_idle_mask);
++}
++
++static void clear_idle_mask_ecore(int cpu, struct cpumask *dstp)
++{
++ cpumask_clear_cpu(cpu, dstp);
++ cpumask_clear_cpu(cpu, sched_ecore_idle_mask);
++}
++
++/*
++ * Idle cpu/rq selection functions
++ */
++#ifdef CONFIG_SCHED_SMT
++static bool p1_idle_select_func(struct cpumask *dstp, const struct cpumask *src1p,
++ const struct cpumask *src2p)
++{
++ return cpumask_and(dstp, src1p, src2p + 1) ||
++ cpumask_and(dstp, src1p, src2p);
++}
++#endif
++
++static bool p1p2_idle_select_func(struct cpumask *dstp, const struct cpumask *src1p,
++ const struct cpumask *src2p)
++{
++ return cpumask_and(dstp, src1p, src2p + 1) ||
++ cpumask_and(dstp, src1p, src2p + 2) ||
++ cpumask_and(dstp, src1p, src2p);
++}
++
++/* common balance functions */
++static int active_balance_cpu_stop(void *data)
++{
++ struct balance_arg *arg = data;
++ struct task_struct *p = arg->task;
++ struct rq *rq = this_rq();
++ unsigned long flags;
++ cpumask_t tmp;
++
++ local_irq_save(flags);
++
++ raw_spin_lock(&p->pi_lock);
++ raw_spin_lock(&rq->lock);
++
++ arg->active = 0;
++
++ if (task_on_rq_queued(p) && task_rq(p) == rq &&
++ cpumask_and(&tmp, p->cpus_ptr, arg->cpumask) &&
++ !is_migration_disabled(p)) {
++ int dcpu = __best_mask_cpu(&tmp, per_cpu(sched_cpu_llc_mask, cpu_of(rq)));
++ rq = move_queued_task(rq, p, dcpu);
++ }
++
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++
++ return 0;
++}
++
++/* trigger_active_balance - for @rq */
++static inline int
++trigger_active_balance(struct rq *src_rq, struct rq *rq, cpumask_t *target_mask)
++{
++ struct balance_arg *arg;
++ unsigned long flags;
++ struct task_struct *p;
++ int res;
++
++ if (!raw_spin_trylock_irqsave(&rq->lock, flags))
++ return 0;
++
++ arg = &rq->active_balance_arg;
++ res = (1 == rq->nr_running) && \
++ !is_migration_disabled((p = sched_rq_first_task(rq))) && \
++ cpumask_intersects(p->cpus_ptr, target_mask) && \
++ !arg->active;
++ if (res) {
++ arg->task = p;
++ arg->cpumask = target_mask;
++
++ arg->active = 1;
++ }
++
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++ if (res) {
++ preempt_disable();
++ raw_spin_unlock(&src_rq->lock);
++
++ stop_one_cpu_nowait(cpu_of(rq), active_balance_cpu_stop, arg,
++ &rq->active_balance_work);
++
++ preempt_enable();
++ raw_spin_lock(&src_rq->lock);
++ }
++
++ return res;
++}
++
++static inline int
++ecore_source_balance(struct rq *rq, cpumask_t *single_task_mask, cpumask_t *target_mask)
++{
++ if (cpumask_andnot(single_task_mask, single_task_mask, &sched_pcore_mask)) {
++ int i, cpu = cpu_of(rq);
++
++ for_each_cpu_wrap(i, single_task_mask, cpu)
++ if (trigger_active_balance(rq, cpu_rq(i), target_mask))
++ return 1;
++ }
++
++ return 0;
++}
++
++static DEFINE_PER_CPU(struct balance_callback, active_balance_head);
++
++#ifdef CONFIG_SCHED_SMT
++static inline int
++smt_pcore_source_balance(struct rq *rq, cpumask_t *single_task_mask, cpumask_t *target_mask)
++{
++ cpumask_t smt_single_mask;
++
++ if (cpumask_and(&smt_single_mask, single_task_mask, &sched_smt_mask)) {
++ int i, cpu = cpu_of(rq);
++
++ for_each_cpu_wrap(i, &smt_single_mask, cpu) {
++ if (cpumask_subset(cpu_smt_mask(i), &smt_single_mask) &&
++ trigger_active_balance(rq, cpu_rq(i), target_mask))
++ return 1;
++ }
++ }
++
++ return 0;
++}
++
++/* smt p core balance functions */
++static inline void smt_pcore_balance(struct rq *rq)
++{
++ cpumask_t single_task_mask;
++
++ if (cpumask_andnot(&single_task_mask, cpu_active_mask, sched_idle_mask) &&
++ cpumask_andnot(&single_task_mask, &single_task_mask, &sched_rq_pending_mask) &&
++ (/* smt core group balance */
++ (static_key_count(&sched_smt_present.key) > 1 &&
++ smt_pcore_source_balance(rq, &single_task_mask, sched_sg_idle_mask)
++ ) ||
++ /* e core to idle smt core balance */
++ ecore_source_balance(rq, &single_task_mask, sched_sg_idle_mask)))
++ return;
++}
++
++static void smt_pcore_balance_func(struct rq *rq, const int cpu)
++{
++ if (cpumask_test_cpu(cpu, sched_sg_idle_mask))
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), smt_pcore_balance);
++}
++
++/* smt balance functions */
++static inline void smt_balance(struct rq *rq)
++{
++ cpumask_t single_task_mask;
++
++ if (cpumask_andnot(&single_task_mask, cpu_active_mask, sched_idle_mask) &&
++ cpumask_andnot(&single_task_mask, &single_task_mask, &sched_rq_pending_mask) &&
++ static_key_count(&sched_smt_present.key) > 1 &&
++ smt_pcore_source_balance(rq, &single_task_mask, sched_sg_idle_mask))
++ return;
++}
++
++static void smt_balance_func(struct rq *rq, const int cpu)
++{
++ if (cpumask_test_cpu(cpu, sched_sg_idle_mask))
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), smt_balance);
++}
++
++/* e core balance functions */
++static inline void ecore_balance(struct rq *rq)
++{
++ cpumask_t single_task_mask;
++
++ if (cpumask_andnot(&single_task_mask, cpu_active_mask, sched_idle_mask) &&
++ cpumask_andnot(&single_task_mask, &single_task_mask, &sched_rq_pending_mask) &&
++ /* smt occupied p core to idle e core balance */
++ smt_pcore_source_balance(rq, &single_task_mask, sched_ecore_idle_mask))
++ return;
++}
++
++static void ecore_balance_func(struct rq *rq, const int cpu)
++{
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), ecore_balance);
++}
++#endif /* CONFIG_SCHED_SMT */
++
++/* p core balance functions */
++static inline void pcore_balance(struct rq *rq)
++{
++ cpumask_t single_task_mask;
++
++ if (cpumask_andnot(&single_task_mask, cpu_active_mask, sched_idle_mask) &&
++ cpumask_andnot(&single_task_mask, &single_task_mask, &sched_rq_pending_mask) &&
++ /* idle e core to p core balance */
++ ecore_source_balance(rq, &single_task_mask, sched_pcore_idle_mask))
++ return;
++}
++
++static void pcore_balance_func(struct rq *rq, const int cpu)
++{
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), pcore_balance);
++}
++
++#ifdef ALT_SCHED_DEBUG
++#define SCHED_DEBUG_INFO(...) printk(KERN_INFO __VA_ARGS__)
++#else
++#define SCHED_DEBUG_INFO(...) do { } while(0)
++#endif
++
++#define SET_IDLE_SELECT_FUNC(func) \
++{ \
++ idle_select_func = func; \
++ printk(KERN_INFO "sched: "#func); \
++}
++
++#define SET_RQ_BALANCE_FUNC(rq, cpu, func) \
++{ \
++ rq->balance_func = func; \
++ SCHED_DEBUG_INFO("sched: cpu#%02d -> "#func, cpu); \
++}
++
++#define SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_func, clear_func) \
++{ \
++ rq->set_idle_mask_func = set_func; \
++ rq->clear_idle_mask_func = clear_func; \
++ SCHED_DEBUG_INFO("sched: cpu#%02d -> "#set_func" "#clear_func, cpu); \
++}
++
++void sched_init_topology(void)
++{
++ int cpu;
++ struct rq *rq;
++ cpumask_t sched_ecore_mask = { CPU_BITS_NONE };
++ int ecore_present = 0;
++
++#ifdef CONFIG_SCHED_SMT
++ if (!cpumask_empty(&sched_smt_mask))
++ printk(KERN_INFO "sched: smt mask: 0x%08lx\n", sched_smt_mask.bits[0]);
++#endif
++
++ if (!cpumask_empty(&sched_pcore_mask)) {
++ cpumask_andnot(&sched_ecore_mask, cpu_online_mask, &sched_pcore_mask);
++ printk(KERN_INFO "sched: pcore mask: 0x%08lx, ecore mask: 0x%08lx\n",
++ sched_pcore_mask.bits[0], sched_ecore_mask.bits[0]);
++
++ ecore_present = !cpumask_empty(&sched_ecore_mask);
++ }
++
++#ifdef CONFIG_SCHED_SMT
++ /* idle select function */
++ if (cpumask_equal(&sched_smt_mask, cpu_online_mask)) {
++ SET_IDLE_SELECT_FUNC(p1_idle_select_func);
++ } else
++#endif
++ if (!cpumask_empty(&sched_pcore_mask)) {
++ SET_IDLE_SELECT_FUNC(p1p2_idle_select_func);
++ }
++
++ for_each_online_cpu(cpu) {
++ rq = cpu_rq(cpu);
++ /* take the chance to reset the time slice for idle tasks */
++ rq->idle->time_slice = sysctl_sched_base_slice;
++
++#ifdef CONFIG_SCHED_SMT
++ if (cpumask_weight(cpu_smt_mask(cpu)) > 1) {
++ SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_idle_mask_smt, clear_idle_mask_smt);
++
++ if (cpumask_test_cpu(cpu, &sched_pcore_mask) &&
++ !cpumask_intersects(&sched_ecore_mask, &sched_smt_mask)) {
++ SET_RQ_BALANCE_FUNC(rq, cpu, smt_pcore_balance_func);
++ } else {
++ SET_RQ_BALANCE_FUNC(rq, cpu, smt_balance_func);
++ }
++
++ continue;
++ }
++#endif
++ /* !SMT or only one cpu in sg */
++ if (cpumask_test_cpu(cpu, &sched_pcore_mask)) {
++ SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_idle_mask_pcore, clear_idle_mask_pcore);
++
++ if (ecore_present)
++ SET_RQ_BALANCE_FUNC(rq, cpu, pcore_balance_func);
++
++ continue;
++ }
++ if (cpumask_test_cpu(cpu, &sched_ecore_mask)) {
++ SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_idle_mask_ecore, clear_idle_mask_ecore);
++#ifdef CONFIG_SCHED_SMT
++ if (cpumask_intersects(&sched_pcore_mask, &sched_smt_mask))
++ SET_RQ_BALANCE_FUNC(rq, cpu, ecore_balance_func);
++#endif
++ }
++ }
++}
+diff --git a/kernel/sched/alt_topology.h b/kernel/sched/alt_topology.h
+new file mode 100644
+index 000000000000..076174cd2bc6
+--- /dev/null
++++ b/kernel/sched/alt_topology.h
+@@ -0,0 +1,6 @@
++#ifndef _KERNEL_SCHED_ALT_TOPOLOGY_H
++#define _KERNEL_SCHED_ALT_TOPOLOGY_H
++
++extern void sched_init_topology(void);
++
++#endif /* _KERNEL_SCHED_ALT_TOPOLOGY_H */
+diff --git a/kernel/sched/bmq.h b/kernel/sched/bmq.h
+new file mode 100644
+index 000000000000..5a7835246ec3
+--- /dev/null
++++ b/kernel/sched/bmq.h
+@@ -0,0 +1,103 @@
++#ifndef _KERNEL_SCHED_BMQ_H
++#define _KERNEL_SCHED_BMQ_H
++
++#define ALT_SCHED_NAME "BMQ"
++
++/*
++ * BMQ only routines
++ */
++static inline void boost_task(struct task_struct *p, int n)
++{
++ int limit;
++
++ switch (p->policy) {
++ case SCHED_NORMAL:
++ limit = -MAX_PRIORITY_ADJ;
++ break;
++ case SCHED_BATCH:
++ limit = 0;
++ break;
++ default:
++ return;
++ }
++
++ p->boost_prio = max(limit, p->boost_prio - n);
++}
++
++static inline void deboost_task(struct task_struct *p)
++{
++ if (p->boost_prio < MAX_PRIORITY_ADJ)
++ p->boost_prio++;
++}
++
++/*
++ * Common interfaces
++ */
++static inline void sched_timeslice_imp(const int timeslice_ms) {}
++
++/* This API is used in task_prio(); the return value is read by human users */
++static inline int
++task_sched_prio_normal(const struct task_struct *p, const struct rq *rq)
++{
++ return p->prio + p->boost_prio - MIN_NORMAL_PRIO;
++}
++
++static inline int task_sched_prio(const struct task_struct *p)
++{
++ return (p->prio < MIN_NORMAL_PRIO)? (p->prio >> 2) :
++ MIN_SCHED_NORMAL_PRIO + (p->prio + p->boost_prio - MIN_NORMAL_PRIO) / 2;
++}
++
++#define TASK_SCHED_PRIO_IDX(p, rq, idx, prio) \
++ prio = task_sched_prio(p); \
++ idx = prio;
++
++static inline int sched_prio2idx(int prio, struct rq *rq)
++{
++ return prio;
++}
++
++static inline int sched_idx2prio(int idx, struct rq *rq)
++{
++ return idx;
++}
++
++static inline int sched_rq_prio_idx(struct rq *rq)
++{
++ return rq->prio;
++}
++
++static inline int task_running_nice(struct task_struct *p)
++{
++ return (p->prio + p->boost_prio > DEFAULT_PRIO);
++}
++
++static inline void sched_update_rq_clock(struct rq *rq) {}
++
++static inline void sched_task_renew(struct task_struct *p, const struct rq *rq)
++{
++ deboost_task(p);
++}
++
++static inline void sched_task_sanity_check(struct task_struct *p, struct rq *rq) {}
++static inline void sched_task_fork(struct task_struct *p, struct rq *rq) {}
++
++static inline void do_sched_yield_type_1(struct task_struct *p, struct rq *rq)
++{
++ p->boost_prio = MAX_PRIORITY_ADJ;
++}
++
++static inline void sched_task_ttwu(struct task_struct *p)
++{
++ s64 delta = this_rq()->clock_task - p->last_ran;
++
++ if (likely(delta > 0))
++ boost_task(p, delta >> 22);
++}
++
++static inline void sched_task_deactivate(struct task_struct *p, struct rq *rq)
++{
++ boost_task(p, 1);
++}
++
++#endif /* _KERNEL_SCHED_BMQ_H */
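As a rough reading aid for task_sched_prio() above, here is a small stand-alone sketch (not part of the patch) of how a SCHED_NORMAL task's nice level and boost_prio combine into a BMQ run-queue level. The priority constants are copied from this patch's include/linux/sched/prio.h changes; MIN_SCHED_NORMAL_PRIO is an assumed value, since its definition lives in alt_sched.h and is not visible in this excerpt.

#include <stdio.h>

#define MAX_PRIORITY_ADJ      12
#define MIN_NORMAL_PRIO       128
#define NORMAL_PRIO_NUM       64
#define NICE_WIDTH            40
#define MAX_PRIO              (MIN_NORMAL_PRIO + NORMAL_PRIO_NUM)
#define DEFAULT_PRIO          (MAX_PRIO - MAX_PRIORITY_ADJ - NICE_WIDTH / 2)
#define MIN_SCHED_NORMAL_PRIO 32   /* assumed, defined in alt_sched.h */

/* mirror of task_sched_prio() for a SCHED_NORMAL task */
static int bmq_queue_level(int nice, int boost_prio)
{
	int prio = DEFAULT_PRIO + nice;   /* static == normal prio under SCHED_ALT */

	return MIN_SCHED_NORMAL_PRIO + (prio + boost_prio - MIN_NORMAL_PRIO) / 2;
}

int main(void)
{
	/* a freshly boosted interactive task vs. one that keeps burning its slice */
	printf("nice 0, boost %+d -> level %d\n", -MAX_PRIORITY_ADJ,
	       bmq_queue_level(0, -MAX_PRIORITY_ADJ));
	printf("nice 0, boost %+d -> level %d\n", 0, bmq_queue_level(0, 0));
	printf("nice 0, boost %+d -> level %d\n", MAX_PRIORITY_ADJ,
	       bmq_queue_level(0, MAX_PRIORITY_ADJ));
	return 0;
}

Lower levels are scanned first, so the boosted task is picked ahead of the deboosted one even though both are nice 0.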
+diff --git a/kernel/sched/build_policy.c b/kernel/sched/build_policy.c
+index c4a488e67aa7..42575d419e28 100644
+--- a/kernel/sched/build_policy.c
++++ b/kernel/sched/build_policy.c
+@@ -49,13 +49,17 @@
+
+ #include "idle.c"
+
+-#include "rt.c"
+-#include "cpudeadline.c"
++#ifndef CONFIG_SCHED_ALT
++# include "rt.c"
++# include "cpudeadline.c"
++#endif
+
+ #include "pelt.c"
+
+ #include "cputime.c"
++#ifndef CONFIG_SCHED_ALT
+ #include "deadline.c"
++#endif
+
+ #ifdef CONFIG_SCHED_CLASS_EXT
+ # include "ext.c"
+diff --git a/kernel/sched/build_utility.c b/kernel/sched/build_utility.c
+index e2cf3b08d4e9..a64bf71a6c69 100644
+--- a/kernel/sched/build_utility.c
++++ b/kernel/sched/build_utility.c
+@@ -56,6 +56,10 @@
+
+ #include "clock.c"
+
++#ifdef CONFIG_SCHED_ALT
++# include "alt_topology.c"
++#endif
++
+ #ifdef CONFIG_CGROUP_CPUACCT
+ # include "cpuacct.c"
+ #endif
+@@ -68,7 +72,7 @@
+ # include "cpufreq_schedutil.c"
+ #endif
+
+-#include "debug.c"
++# include "debug.c"
+
+ #ifdef CONFIG_SCHEDSTATS
+ # include "stats.c"
+@@ -81,7 +85,9 @@
+ #include "wait.c"
+
+ #include "cpupri.c"
+-#include "stop_task.c"
++#ifndef CONFIG_SCHED_ALT
++# include "stop_task.c"
++#endif
+
+ #include "topology.c"
+
+diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
+index 0ab5f9d4bc59..60f374ffa96d 100644
+--- a/kernel/sched/cpufreq_schedutil.c
++++ b/kernel/sched/cpufreq_schedutil.c
+@@ -225,6 +225,7 @@ unsigned long sugov_effective_cpu_perf(int cpu, unsigned long actual,
+
+ static void sugov_get_util(struct sugov_cpu *sg_cpu, unsigned long boost)
+ {
++#ifndef CONFIG_SCHED_ALT
+ unsigned long min, max, util = scx_cpuperf_target(sg_cpu->cpu);
+
+ if (!scx_switched_all())
+@@ -233,6 +234,10 @@ static void sugov_get_util(struct sugov_cpu *sg_cpu, unsigned long boost)
+ util = max(util, boost);
+ sg_cpu->bw_min = min;
+ sg_cpu->util = sugov_effective_cpu_perf(sg_cpu->cpu, util, min, max);
++#else /* CONFIG_SCHED_ALT */
++ sg_cpu->bw_min = 0;
++ sg_cpu->util = rq_load_util(cpu_rq(sg_cpu->cpu), arch_scale_cpu_capacity(sg_cpu->cpu));
++#endif /* CONFIG_SCHED_ALT */
+ }
+
+ /**
+@@ -392,8 +397,10 @@ static inline bool sugov_hold_freq(struct sugov_cpu *sg_cpu) { return false; }
+ */
+ static inline void ignore_dl_rate_limit(struct sugov_cpu *sg_cpu)
+ {
++#ifndef CONFIG_SCHED_ALT
+ if (cpu_bw_dl(cpu_rq(sg_cpu->cpu)) > sg_cpu->bw_min)
+- sg_cpu->sg_policy->need_freq_update = true;
++ sg_cpu->sg_policy->limits_changed = true;
++#endif
+ }
+
+ static inline bool sugov_update_single_common(struct sugov_cpu *sg_cpu,
+@@ -687,6 +694,7 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
+ }
+
+ ret = sched_setattr_nocheck(thread, &attr);
++
+ if (ret) {
+ kthread_stop(thread);
+ pr_warn("%s: failed to set SCHED_DEADLINE\n", __func__);
+diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
+index 7097de2c8cda..52b5626ce7b6 100644
+--- a/kernel/sched/cputime.c
++++ b/kernel/sched/cputime.c
+@@ -127,7 +127,7 @@ void account_user_time(struct task_struct *p, u64 cputime)
+ p->utime += cputime;
+ account_group_user_time(p, cputime);
+
+- index = (task_nice(p) > 0) ? CPUTIME_NICE : CPUTIME_USER;
++ index = task_running_nice(p) ? CPUTIME_NICE : CPUTIME_USER;
+
+ /* Add user time to cpustat. */
+ task_group_account_field(p, index, cputime);
+@@ -151,7 +151,7 @@ void account_guest_time(struct task_struct *p, u64 cputime)
+ p->gtime += cputime;
+
+ /* Add guest time to cpustat. */
+- if (task_nice(p) > 0) {
++ if (task_running_nice(p)) {
+ task_group_account_field(p, CPUTIME_NICE, cputime);
+ cpustat[CPUTIME_GUEST_NICE] += cputime;
+ } else {
+@@ -289,7 +289,7 @@ static inline u64 account_other_time(u64 max)
+ #ifdef CONFIG_64BIT
+ static inline u64 read_sum_exec_runtime(struct task_struct *t)
+ {
+- return t->se.sum_exec_runtime;
++ return tsk_seruntime(t);
+ }
+ #else /* !CONFIG_64BIT: */
+ static u64 read_sum_exec_runtime(struct task_struct *t)
+@@ -299,7 +299,7 @@ static u64 read_sum_exec_runtime(struct task_struct *t)
+ struct rq *rq;
+
+ rq = task_rq_lock(t, &rf);
+- ns = t->se.sum_exec_runtime;
++ ns = tsk_seruntime(t);
+ task_rq_unlock(rq, t, &rf);
+
+ return ns;
+@@ -624,7 +624,7 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
+ void task_cputime_adjusted(struct task_struct *p, u64 *ut, u64 *st)
+ {
+ struct task_cputime cputime = {
+- .sum_exec_runtime = p->se.sum_exec_runtime,
++ .sum_exec_runtime = tsk_seruntime(p),
+ };
+
+ if (task_cputime(p, &cputime.utime, &cputime.stime))
+diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
+index 02e16b70a790..2687421bc524 100644
+--- a/kernel/sched/debug.c
++++ b/kernel/sched/debug.c
+@@ -10,6 +10,7 @@
+ #include <linux/nmi.h>
+ #include "sched.h"
+
++#ifndef CONFIG_SCHED_ALT
+ /*
+ * This allows printing both to /sys/kernel/debug/sched/debug and
+ * to the console
+@@ -215,6 +216,8 @@ static const struct file_operations sched_scaling_fops = {
+ .release = single_release,
+ };
+
++#endif /* !CONFIG_SCHED_ALT */
++
+ #ifdef CONFIG_PREEMPT_DYNAMIC
+
+ static ssize_t sched_dynamic_write(struct file *filp, const char __user *ubuf,
+@@ -280,6 +283,7 @@ static const struct file_operations sched_dynamic_fops = {
+
+ #endif /* CONFIG_PREEMPT_DYNAMIC */
+
++#ifndef CONFIG_SCHED_ALT
+ __read_mostly bool sched_debug_verbose;
+
+ static struct dentry *sd_dentry;
+@@ -464,9 +468,11 @@ static const struct file_operations fair_server_period_fops = {
+ .llseek = seq_lseek,
+ .release = single_release,
+ };
++#endif /* !CONFIG_SCHED_ALT */
+
+ static struct dentry *debugfs_sched;
+
++#ifndef CONFIG_SCHED_ALT
+ static void debugfs_fair_server_init(void)
+ {
+ struct dentry *d_fair;
+@@ -487,6 +493,7 @@ static void debugfs_fair_server_init(void)
+ debugfs_create_file("period", 0644, d_cpu, (void *) cpu, &fair_server_period_fops);
+ }
+ }
++#endif /* !CONFIG_SCHED_ALT */
+
+ static __init int sched_init_debug(void)
+ {
+@@ -494,14 +501,17 @@ static __init int sched_init_debug(void)
+
+ debugfs_sched = debugfs_create_dir("sched", NULL);
+
++#ifndef CONFIG_SCHED_ALT
+ debugfs_create_file("features", 0644, debugfs_sched, NULL, &sched_feat_fops);
+ debugfs_create_file_unsafe("verbose", 0644, debugfs_sched, &sched_debug_verbose, &sched_verbose_fops);
++#endif /* !CONFIG_SCHED_ALT */
+ #ifdef CONFIG_PREEMPT_DYNAMIC
+ debugfs_create_file("preempt", 0644, debugfs_sched, NULL, &sched_dynamic_fops);
+ #endif
+
+ debugfs_create_u32("base_slice_ns", 0644, debugfs_sched, &sysctl_sched_base_slice);
+
++#ifndef CONFIG_SCHED_ALT
+ debugfs_create_u32("latency_warn_ms", 0644, debugfs_sched, &sysctl_resched_latency_warn_ms);
+ debugfs_create_u32("latency_warn_once", 0644, debugfs_sched, &sysctl_resched_latency_warn_once);
+
+@@ -524,13 +534,18 @@ static __init int sched_init_debug(void)
+ #endif /* CONFIG_NUMA_BALANCING */
+
+ debugfs_create_file("debug", 0444, debugfs_sched, NULL, &sched_debug_fops);
++#endif /* !CONFIG_SCHED_ALT */
+
++#ifndef CONFIG_SCHED_ALT
+ debugfs_fair_server_init();
++#endif /* !CONFIG_SCHED_ALT */
+
+ return 0;
+ }
+ late_initcall(sched_init_debug);
+
++#ifndef CONFIG_SCHED_ALT
++
+ static cpumask_var_t sd_sysctl_cpus;
+
+ static int sd_flags_show(struct seq_file *m, void *v)
+@@ -1263,6 +1278,11 @@ void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
+
+ sched_show_numa(p, m);
+ }
++#else
++void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
++ struct seq_file *m)
++{ }
++#endif /* !CONFIG_SCHED_ALT */
+
+ void proc_sched_set_task(struct task_struct *p)
+ {
+diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
+index c39b089d4f09..9903f1d96dc3 100644
+--- a/kernel/sched/idle.c
++++ b/kernel/sched/idle.c
+@@ -428,6 +428,7 @@ void cpu_startup_entry(enum cpuhp_state state)
+ do_idle();
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ /*
+ * idle-task scheduling class.
+ */
+@@ -539,3 +540,4 @@ DEFINE_SCHED_CLASS(idle) = {
+ .switched_to = switched_to_idle,
+ .update_curr = update_curr_idle,
+ };
++#endif
+diff --git a/kernel/sched/pds.h b/kernel/sched/pds.h
+new file mode 100644
+index 000000000000..fe3099071eb7
+--- /dev/null
++++ b/kernel/sched/pds.h
+@@ -0,0 +1,139 @@
++#ifndef _KERNEL_SCHED_PDS_H
++#define _KERNEL_SCHED_PDS_H
++
++#define ALT_SCHED_NAME "PDS"
++
++static const u64 RT_MASK = ((1ULL << MIN_SCHED_NORMAL_PRIO) - 1);
++
++#define SCHED_NORMAL_PRIO_NUM (32)
++#define SCHED_EDGE_DELTA (SCHED_NORMAL_PRIO_NUM - NICE_WIDTH / 2)
++
++/* PDS assumes SCHED_NORMAL_PRIO_NUM is a power of 2 */
++#define SCHED_NORMAL_PRIO_MOD(x) ((x) & (SCHED_NORMAL_PRIO_NUM - 1))
++
++/* default time slice 4ms -> shift 22, 2 time slice slots -> shift 23 */
++static __read_mostly int sched_timeslice_shift = 23;
++
++/*
++ * Common interfaces
++ */
++static inline int
++task_sched_prio_normal(const struct task_struct *p, const struct rq *rq)
++{
++ u64 sched_dl = max(p->deadline, rq->time_edge);
++
++#ifdef ALT_SCHED_DEBUG
++ if (WARN_ONCE(sched_dl - rq->time_edge > NORMAL_PRIO_NUM - 1,
++ "pds: task_sched_prio_normal() delta %lld\n", sched_dl - rq->time_edge))
++ return SCHED_NORMAL_PRIO_NUM - 1;
++#endif
++
++ return sched_dl - rq->time_edge;
++}
++
++static inline int task_sched_prio(const struct task_struct *p)
++{
++ return (p->prio < MIN_NORMAL_PRIO) ? (p->prio >> 2) :
++ MIN_SCHED_NORMAL_PRIO + task_sched_prio_normal(p, task_rq(p));
++}
++
++#define TASK_SCHED_PRIO_IDX(p, rq, idx, prio) \
++ if (p->prio < MIN_NORMAL_PRIO) { \
++ prio = p->prio >> 2; \
++ idx = prio; \
++ } else { \
++ u64 sched_dl = max(p->deadline, rq->time_edge); \
++ prio = MIN_SCHED_NORMAL_PRIO + sched_dl - rq->time_edge; \
++ idx = MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(sched_dl); \
++ }
++
++static inline int sched_prio2idx(int sched_prio, struct rq *rq)
++{
++ return (IDLE_TASK_SCHED_PRIO == sched_prio || sched_prio < MIN_SCHED_NORMAL_PRIO) ?
++ sched_prio :
++ MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(sched_prio + rq->time_edge);
++}
++
++static inline int sched_idx2prio(int sched_idx, struct rq *rq)
++{
++ return (sched_idx < MIN_SCHED_NORMAL_PRIO) ?
++ sched_idx :
++ MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(sched_idx - rq->time_edge);
++}
++
++static inline int sched_rq_prio_idx(struct rq *rq)
++{
++ return rq->prio_idx;
++}
++
++static inline int task_running_nice(struct task_struct *p)
++{
++ return (p->prio > DEFAULT_PRIO);
++}
++
++static inline void sched_update_rq_clock(struct rq *rq)
++{
++ struct list_head head;
++ u64 old = rq->time_edge;
++ u64 now = rq->clock >> sched_timeslice_shift;
++ u64 prio, delta;
++ DECLARE_BITMAP(normal, SCHED_QUEUE_BITS);
++
++ if (now == old)
++ return;
++
++ rq->time_edge = now;
++ delta = min_t(u64, SCHED_NORMAL_PRIO_NUM, now - old);
++ INIT_LIST_HEAD(&head);
++
++ prio = MIN_SCHED_NORMAL_PRIO;
++ for_each_set_bit_from(prio, rq->queue.bitmap, MIN_SCHED_NORMAL_PRIO + delta)
++ list_splice_tail_init(rq->queue.heads + MIN_SCHED_NORMAL_PRIO +
++ SCHED_NORMAL_PRIO_MOD(prio + old), &head);
++
++ bitmap_shift_right(normal, rq->queue.bitmap, delta, SCHED_QUEUE_BITS);
++ if (!list_empty(&head)) {
++ u64 idx = MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(now);
++
++ __list_splice(&head, rq->queue.heads + idx, rq->queue.heads[idx].next);
++ set_bit(MIN_SCHED_NORMAL_PRIO, normal);
++ }
++ bitmap_replace(rq->queue.bitmap, normal, rq->queue.bitmap,
++ (const unsigned long *)&RT_MASK, SCHED_QUEUE_BITS);
++
++ if (rq->prio < MIN_SCHED_NORMAL_PRIO || IDLE_TASK_SCHED_PRIO == rq->prio)
++ return;
++
++ rq->prio = max_t(u64, MIN_SCHED_NORMAL_PRIO, rq->prio - delta);
++ rq->prio_idx = sched_prio2idx(rq->prio, rq);
++}
++
++static inline void sched_task_renew(struct task_struct *p, const struct rq *rq)
++{
++ if (p->prio >= MIN_NORMAL_PRIO)
++ p->deadline = rq->time_edge + SCHED_EDGE_DELTA +
++ (p->static_prio - (MAX_PRIO - NICE_WIDTH)) / 2;
++}
++
++static inline void sched_task_sanity_check(struct task_struct *p, struct rq *rq)
++{
++ u64 max_dl = rq->time_edge + SCHED_EDGE_DELTA + NICE_WIDTH / 2 - 1;
++ if (unlikely(p->deadline > max_dl))
++ p->deadline = max_dl;
++}
++
++static inline void sched_task_fork(struct task_struct *p, struct rq *rq)
++{
++ sched_task_renew(p, rq);
++}
++
++static inline void do_sched_yield_type_1(struct task_struct *p, struct rq *rq)
++{
++ p->time_slice = sysctl_sched_base_slice;
++ sched_task_renew(p, rq);
++}
++
++static inline void sched_task_ttwu(struct task_struct *p) {}
++static inline void sched_task_deactivate(struct task_struct *p, struct rq *rq) {}
++
++#endif /* _KERNEL_SCHED_PDS_H */
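The circular prio/idx mapping above is the core trick of PDS's virtual deadline queue: the slot a priority maps to rotates with rq->time_edge, so tasks age without being moved. A small user-space sketch (not part of the patch) mirrors sched_prio2idx()/sched_idx2prio(); MIN_SCHED_NORMAL_PRIO and IDLE_TASK_SCHED_PRIO are assumed values here, their real definitions live in alt_sched.h.

#include <stdio.h>

#define IDLE_TASK_SCHED_PRIO  64   /* assumed */
#define MIN_SCHED_NORMAL_PRIO 32   /* assumed */
#define SCHED_NORMAL_PRIO_NUM 32
#define SCHED_NORMAL_PRIO_MOD(x) ((x) & (SCHED_NORMAL_PRIO_NUM - 1))

static int prio2idx(int sched_prio, unsigned long long time_edge)
{
	return (sched_prio == IDLE_TASK_SCHED_PRIO || sched_prio < MIN_SCHED_NORMAL_PRIO) ?
		sched_prio :
		MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(sched_prio + time_edge);
}

static int idx2prio(int sched_idx, unsigned long long time_edge)
{
	return (sched_idx < MIN_SCHED_NORMAL_PRIO) ?
		sched_idx :
		MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(sched_idx - time_edge);
}

int main(void)
{
	int prio = MIN_SCHED_NORMAL_PRIO + 5;   /* a normal task 5 levels above the edge */

	for (unsigned long long edge = 0; edge < 4; edge++) {
		int idx = prio2idx(prio, edge);

		/* the physical slot rotates with time_edge, the logical prio does not */
		printf("time_edge=%llu prio=%d -> idx=%d -> prio=%d\n",
		       edge, prio, idx, idx2prio(idx, edge));
	}
	return 0;
}

The round trip always recovers the original priority, while a task left sitting in its slot is reinterpreted as closer to the edge as time_edge advances, which is how queued tasks age toward the head without list manipulation.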
+diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c
+index fa83bbaf4f3e..e5a8e94e6a8e 100644
+--- a/kernel/sched/pelt.c
++++ b/kernel/sched/pelt.c
+@@ -267,6 +267,7 @@ ___update_load_avg(struct sched_avg *sa, unsigned long load)
+ WRITE_ONCE(sa->util_avg, sa->util_sum / divider);
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ /*
+ * sched_entity:
+ *
+@@ -384,8 +385,9 @@ int update_dl_rq_load_avg(u64 now, struct rq *rq, int running)
+
+ return 0;
+ }
++#endif
+
+-#ifdef CONFIG_SCHED_HW_PRESSURE
++#if defined(CONFIG_SCHED_HW_PRESSURE) && !defined(CONFIG_SCHED_ALT)
+ /*
+ * hardware:
+ *
+@@ -469,6 +471,7 @@ int update_irq_load_avg(struct rq *rq, u64 running)
+ }
+ #endif /* CONFIG_HAVE_SCHED_AVG_IRQ */
+
++#ifndef CONFIG_SCHED_ALT
+ /*
+ * Load avg and utiliztion metrics need to be updated periodically and before
+ * consumption. This function updates the metrics for all subsystems except for
+@@ -488,3 +491,4 @@ bool update_other_load_avgs(struct rq *rq)
+ update_hw_load_avg(rq_clock_task(rq), rq, hw_pressure) |
+ update_irq_load_avg(rq, 0);
+ }
++#endif /* !CONFIG_SCHED_ALT */
+diff --git a/kernel/sched/pelt.h b/kernel/sched/pelt.h
+index 62c3fa543c0f..de93412fd3ad 100644
+--- a/kernel/sched/pelt.h
++++ b/kernel/sched/pelt.h
+@@ -5,14 +5,16 @@
+
+ #include "sched-pelt.h"
+
++#ifndef CONFIG_SCHED_ALT
+ int __update_load_avg_blocked_se(u64 now, struct sched_entity *se);
+ int __update_load_avg_se(u64 now, struct cfs_rq *cfs_rq, struct sched_entity *se);
+ int __update_load_avg_cfs_rq(u64 now, struct cfs_rq *cfs_rq);
+ int update_rt_rq_load_avg(u64 now, struct rq *rq, int running);
+ int update_dl_rq_load_avg(u64 now, struct rq *rq, int running);
+ bool update_other_load_avgs(struct rq *rq);
++#endif
+
+-#ifdef CONFIG_SCHED_HW_PRESSURE
++#if defined(CONFIG_SCHED_HW_PRESSURE) && !defined(CONFIG_SCHED_ALT)
+ int update_hw_load_avg(u64 now, struct rq *rq, u64 capacity);
+
+ static inline u64 hw_load_avg(struct rq *rq)
+@@ -49,6 +51,7 @@ static inline u32 get_pelt_divider(struct sched_avg *avg)
+ return PELT_MIN_DIVIDER + avg->period_contrib;
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ static inline void cfs_se_util_change(struct sched_avg *avg)
+ {
+ unsigned int enqueued;
+@@ -185,5 +188,6 @@ static inline u64 cfs_rq_clock_pelt(struct cfs_rq *cfs_rq)
+ return rq_clock_pelt(rq_of(cfs_rq));
+ }
+ #endif /* !CONFIG_CFS_BANDWIDTH */
++#endif /* CONFIG_SCHED_ALT */
+
+ #endif /* _KERNEL_SCHED_PELT_H */
+diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
+index cf2109b67f9a..a800b6b50264 100644
+--- a/kernel/sched/sched.h
++++ b/kernel/sched/sched.h
+@@ -5,6 +5,10 @@
+ #ifndef _KERNEL_SCHED_SCHED_H
+ #define _KERNEL_SCHED_SCHED_H
+
++#ifdef CONFIG_SCHED_ALT
++#include "alt_sched.h"
++#else
++
+ #include <linux/sched/affinity.h>
+ #include <linux/sched/autogroup.h>
+ #include <linux/sched/cpufreq.h>
+@@ -3900,4 +3904,9 @@ void sched_enq_and_set_task(struct sched_enq_and_set_ctx *ctx);
+
+ #include "ext.h"
+
++static inline int task_running_nice(struct task_struct *p)
++{
++ return (task_nice(p) > 0);
++}
++#endif /* !CONFIG_SCHED_ALT */
+ #endif /* _KERNEL_SCHED_SCHED_H */
+diff --git a/kernel/sched/stats.c b/kernel/sched/stats.c
+index d1c9429a4ac5..cc3764073dd3 100644
+--- a/kernel/sched/stats.c
++++ b/kernel/sched/stats.c
+@@ -115,8 +115,10 @@ static int show_schedstat(struct seq_file *seq, void *v)
+ seq_printf(seq, "timestamp %lu\n", jiffies);
+ } else {
+ struct rq *rq;
++#ifndef CONFIG_SCHED_ALT
+ struct sched_domain *sd;
+ int dcount = 0;
++#endif
+ cpu = (unsigned long)(v - 2);
+ rq = cpu_rq(cpu);
+
+@@ -131,6 +133,7 @@ static int show_schedstat(struct seq_file *seq, void *v)
+
+ seq_printf(seq, "\n");
+
++#ifndef CONFIG_SCHED_ALT
+ /* domain-specific stats */
+ rcu_read_lock();
+ for_each_domain(cpu, sd) {
+@@ -161,6 +164,7 @@ static int show_schedstat(struct seq_file *seq, void *v)
+ sd->ttwu_move_balance);
+ }
+ rcu_read_unlock();
++#endif
+ }
+ return 0;
+ }
+diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
+index 26f3fd4d34ce..a3e389997138 100644
+--- a/kernel/sched/stats.h
++++ b/kernel/sched/stats.h
+@@ -89,6 +89,7 @@ static inline void rq_sched_info_depart (struct rq *rq, unsigned long long delt
+
+ #endif /* CONFIG_SCHEDSTATS */
+
++#ifndef CONFIG_SCHED_ALT
+ #ifdef CONFIG_FAIR_GROUP_SCHED
+ struct sched_entity_stats {
+ struct sched_entity se;
+@@ -105,6 +106,7 @@ __schedstats_from_se(struct sched_entity *se)
+ #endif
+ return &task_of(se)->stats;
+ }
++#endif /* CONFIG_SCHED_ALT */
+
+ #ifdef CONFIG_PSI
+ void psi_task_change(struct task_struct *task, int clear, int set);
+diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
+index 77ae87f36e84..a2b06eba44e7 100644
+--- a/kernel/sched/syscalls.c
++++ b/kernel/sched/syscalls.c
+@@ -16,6 +16,14 @@
+ #include "sched.h"
+ #include "autogroup.h"
+
++#ifdef CONFIG_SCHED_ALT
++#include "alt_core.h"
++
++static inline int __normal_prio(int policy, int rt_prio, int static_prio)
++{
++ return rt_policy(policy) ? (MAX_RT_PRIO - 1 - rt_prio) : static_prio;
++}
++#else /* !CONFIG_SCHED_ALT */
+ static inline int __normal_prio(int policy, int rt_prio, int nice)
+ {
+ int prio;
+@@ -29,6 +37,7 @@ static inline int __normal_prio(int policy, int rt_prio, int nice)
+
+ return prio;
+ }
++#endif /* !CONFIG_SCHED_ALT */
+
+ /*
+ * Calculate the expected normal priority: i.e. priority
+@@ -39,7 +48,11 @@ static inline int __normal_prio(int policy, int rt_prio, int nice)
+ */
+ static inline int normal_prio(struct task_struct *p)
+ {
++#ifdef CONFIG_SCHED_ALT
++ return __normal_prio(p->policy, p->rt_priority, p->static_prio);
++#else /* !CONFIG_SCHED_ALT */
+ return __normal_prio(p->policy, p->rt_priority, PRIO_TO_NICE(p->static_prio));
++#endif /* !CONFIG_SCHED_ALT */
+ }
+
+ /*
+@@ -64,6 +77,37 @@ static int effective_prio(struct task_struct *p)
+
+ void set_user_nice(struct task_struct *p, long nice)
+ {
++#ifdef CONFIG_SCHED_ALT
++ unsigned long flags;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++
++ if (task_nice(p) == nice || nice < MIN_NICE || nice > MAX_NICE)
++ return;
++ /*
++ * We have to be careful, if called from sys_setpriority(),
++ * the task might be in the middle of scheduling on another CPU.
++ */
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++ rq = __task_access_lock(p, &lock);
++
++ p->static_prio = NICE_TO_PRIO(nice);
++ /*
++ * The RT priorities are set via sched_setscheduler(), but we still
++ * allow the 'normal' nice value to be set - but as expected
++ * it won't have any effect on scheduling until the task is
++ * not SCHED_NORMAL/SCHED_BATCH:
++ */
++ if (task_has_rt_policy(p))
++ goto out_unlock;
++
++ p->prio = effective_prio(p);
++
++ check_task_changed(p, rq);
++out_unlock:
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++#else
+ bool queued, running;
+ struct rq *rq;
+ int old_prio;
+@@ -112,6 +156,7 @@ void set_user_nice(struct task_struct *p, long nice)
+ * lowered its priority, then reschedule its CPU:
+ */
+ p->sched_class->prio_changed(rq, p, old_prio);
++#endif /* !CONFIG_SCHED_ALT */
+ }
+ EXPORT_SYMBOL(set_user_nice);
+
+@@ -190,7 +235,19 @@ SYSCALL_DEFINE1(nice, int, increment)
+ */
+ int task_prio(const struct task_struct *p)
+ {
++#ifdef CONFIG_SCHED_ALT
++/*
++ * sched policy return value kernel prio user prio/nice
++ *
++ * (BMQ)normal, batch, idle[0 ... 53] [100 ... 139] 0/[-20 ... 19]/[-7 ... 7]
++ * (PDS)normal, batch, idle[0 ... 39] 100 0/[-20 ... 19]
++ * fifo, rr [-1 ... -100] [99 ... 0] [0 ... 99]
++ */
++ return (p->prio < MAX_RT_PRIO) ? p->prio - MAX_RT_PRIO :
++ task_sched_prio_normal(p, task_rq(p));
++#else
+ return p->prio - MAX_RT_PRIO;
++#endif /* !CONFIG_SCHED_ALT */
+ }
+
+ /**
+@@ -297,11 +354,16 @@ static void __setscheduler_params(struct task_struct *p,
+
+ p->policy = policy;
+
++#ifndef CONFIG_SCHED_ALT
+ if (dl_policy(policy))
+ __setparam_dl(p, attr);
+ else if (fair_policy(policy))
+ __setparam_fair(p, attr);
++#else /* !CONFIG_SCHED_ALT */
++ p->static_prio = NICE_TO_PRIO(attr->sched_nice);
++#endif /* CONFIG_SCHED_ALT */
+
++#ifndef CONFIG_SCHED_ALT
+ /* rt-policy tasks do not have a timerslack */
+ if (rt_or_dl_task_policy(p)) {
+ p->timer_slack_ns = 0;
+@@ -309,6 +371,7 @@ static void __setscheduler_params(struct task_struct *p,
+ /* when switching back to non-rt policy, restore timerslack */
+ p->timer_slack_ns = p->default_timer_slack_ns;
+ }
++#endif /* !CONFIG_SCHED_ALT */
+
+ /*
+ * __sched_setscheduler() ensures attr->sched_priority == 0 when
+@@ -317,7 +380,9 @@ static void __setscheduler_params(struct task_struct *p,
+ */
+ p->rt_priority = attr->sched_priority;
+ p->normal_prio = normal_prio(p);
++#ifndef CONFIG_SCHED_ALT
+ set_load_weight(p, true);
++#endif /* !CONFIG_SCHED_ALT */
+ }
+
+ /*
+@@ -333,6 +398,8 @@ static bool check_same_owner(struct task_struct *p)
+ uid_eq(cred->euid, pcred->uid));
+ }
+
++#ifndef CONFIG_SCHED_ALT
++
+ #ifdef CONFIG_UCLAMP_TASK
+
+ static int uclamp_validate(struct task_struct *p,
+@@ -446,6 +513,7 @@ static inline int uclamp_validate(struct task_struct *p,
+ static void __setscheduler_uclamp(struct task_struct *p,
+ const struct sched_attr *attr) { }
+ #endif /* !CONFIG_UCLAMP_TASK */
++#endif /* !CONFIG_SCHED_ALT */
+
+ /*
+ * Allow unprivileged RT tasks to decrease priority.
+@@ -456,11 +524,13 @@ static int user_check_sched_setscheduler(struct task_struct *p,
+ const struct sched_attr *attr,
+ int policy, int reset_on_fork)
+ {
++#ifndef CONFIG_SCHED_ALT
+ if (fair_policy(policy)) {
+ if (attr->sched_nice < task_nice(p) &&
+ !is_nice_reduction(p, attr->sched_nice))
+ goto req_priv;
+ }
++#endif /* !CONFIG_SCHED_ALT */
+
+ if (rt_policy(policy)) {
+ unsigned long rlim_rtprio = task_rlimit(p, RLIMIT_RTPRIO);
+@@ -475,6 +545,7 @@ static int user_check_sched_setscheduler(struct task_struct *p,
+ goto req_priv;
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ /*
+ * Can't set/change SCHED_DEADLINE policy at all for now
+ * (safest behavior); in the future we would like to allow
+@@ -492,6 +563,7 @@ static int user_check_sched_setscheduler(struct task_struct *p,
+ if (!is_nice_reduction(p, task_nice(p)))
+ goto req_priv;
+ }
++#endif /* !CONFIG_SCHED_ALT */
+
+ /* Can't change other user's priorities: */
+ if (!check_same_owner(p))
+@@ -514,6 +586,158 @@ int __sched_setscheduler(struct task_struct *p,
+ const struct sched_attr *attr,
+ bool user, bool pi)
+ {
++#ifdef CONFIG_SCHED_ALT
++ const struct sched_attr dl_squash_attr = {
++ .size = sizeof(struct sched_attr),
++ .sched_policy = SCHED_FIFO,
++ .sched_nice = 0,
++ .sched_priority = 99,
++ };
++ int oldpolicy = -1, policy = attr->sched_policy;
++ int retval, newprio;
++ struct balance_callback *head;
++ unsigned long flags;
++ struct rq *rq;
++ int reset_on_fork;
++ raw_spinlock_t *lock;
++
++ /* The pi code expects interrupts enabled */
++ BUG_ON(pi && in_interrupt());
++
++ /*
++ * Alt schedule FW supports SCHED_DEADLINE by squashing it as a prio 0 SCHED_FIFO task
++ */
++ if (unlikely(SCHED_DEADLINE == policy)) {
++ attr = &dl_squash_attr;
++ policy = attr->sched_policy;
++ }
++recheck:
++ /* Double check policy once rq lock held */
++ if (policy < 0) {
++ reset_on_fork = p->sched_reset_on_fork;
++ policy = oldpolicy = p->policy;
++ } else {
++ reset_on_fork = !!(attr->sched_flags & SCHED_RESET_ON_FORK);
++
++ if (policy > SCHED_IDLE)
++ return -EINVAL;
++ }
++
++ if (attr->sched_flags & ~(SCHED_FLAG_ALL))
++ return -EINVAL;
++
++ /*
++ * Valid priorities for SCHED_FIFO and SCHED_RR are
++ * 1..MAX_RT_PRIO-1, valid priority for SCHED_NORMAL and
++ * SCHED_BATCH and SCHED_IDLE is 0.
++ */
++ if (attr->sched_priority < 0 ||
++ (p->mm && attr->sched_priority > MAX_RT_PRIO - 1) ||
++ (!p->mm && attr->sched_priority > MAX_RT_PRIO - 1))
++ return -EINVAL;
++ if ((SCHED_RR == policy || SCHED_FIFO == policy) !=
++ (attr->sched_priority != 0))
++ return -EINVAL;
++
++ if (user) {
++ retval = user_check_sched_setscheduler(p, attr, policy, reset_on_fork);
++ if (retval)
++ return retval;
++
++ retval = security_task_setscheduler(p);
++ if (retval)
++ return retval;
++ }
++
++ /*
++ * Make sure no PI-waiters arrive (or leave) while we are
++ * changing the priority of the task:
++ */
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++
++ /*
++ * To be able to change p->policy safely, task_access_lock()
++ * must be called.
++ * If task_access_lock() is used here:
++ * For a task p which is not running, reading rq->stop is
++ * racy but acceptable, as ->stop doesn't change much.
++ * An enhancement can be made to read rq->stop safely.
++ */
++ rq = __task_access_lock(p, &lock);
++
++ /*
++ * Changing the policy of the stop thread is a very bad idea.
++ */
++ if (p == rq->stop) {
++ retval = -EINVAL;
++ goto unlock;
++ }
++
++ /*
++ * If not changing anything there's no need to proceed further:
++ */
++ if (unlikely(policy == p->policy)) {
++ if (rt_policy(policy) && attr->sched_priority != p->rt_priority)
++ goto change;
++ if (!rt_policy(policy) &&
++ NICE_TO_PRIO(attr->sched_nice) != p->static_prio)
++ goto change;
++
++ p->sched_reset_on_fork = reset_on_fork;
++ retval = 0;
++ goto unlock;
++ }
++change:
++
++ /* Re-check policy now with rq lock held */
++ if (unlikely(oldpolicy != -1 && oldpolicy != p->policy)) {
++ policy = oldpolicy = -1;
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++ goto recheck;
++ }
++
++ p->sched_reset_on_fork = reset_on_fork;
++
++ newprio = __normal_prio(policy, attr->sched_priority, NICE_TO_PRIO(attr->sched_nice));
++ if (pi) {
++ /*
++ * Take priority boosted tasks into account. If the new
++ * effective priority is unchanged, we just store the new
++ * normal parameters and do not touch the scheduler class and
++ * the runqueue. This will be done when the task deboost
++ * itself.
++ */
++ newprio = rt_effective_prio(p, newprio);
++ }
++
++ if (!(attr->sched_flags & SCHED_FLAG_KEEP_PARAMS)) {
++ __setscheduler_params(p, attr);
++ __setscheduler_prio(p, newprio);
++ }
++
++ check_task_changed(p, rq);
++
++ /* Avoid rq from going away on us: */
++ preempt_disable();
++ head = splice_balance_callbacks(rq);
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++
++ if (pi)
++ rt_mutex_adjust_pi(p);
++
++ /* Run balance callbacks after we've adjusted the PI chain: */
++ balance_callbacks(rq, head);
++ preempt_enable();
++
++ return 0;
++
++unlock:
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++ return retval;
++#else /* !CONFIG_SCHED_ALT */
+ int oldpolicy = -1, policy = attr->sched_policy;
+ int retval, oldprio, newprio, queued, running;
+ const struct sched_class *prev_class, *next_class;
+@@ -750,6 +974,7 @@ int __sched_setscheduler(struct task_struct *p,
+ if (cpuset_locked)
+ cpuset_unlock();
+ return retval;
++#endif /* !CONFIG_SCHED_ALT */
+ }
+
+ static int _sched_setscheduler(struct task_struct *p, int policy,
+@@ -761,8 +986,10 @@ static int _sched_setscheduler(struct task_struct *p, int policy,
+ .sched_nice = PRIO_TO_NICE(p->static_prio),
+ };
+
++#ifndef CONFIG_SCHED_ALT
+ if (p->se.custom_slice)
+ attr.sched_runtime = p->se.slice;
++#endif /* !CONFIG_SCHED_ALT */
+
+ /* Fixup the legacy SCHED_RESET_ON_FORK hack. */
+ if ((policy != SETPARAM_POLICY) && (policy & SCHED_RESET_ON_FORK)) {
+@@ -930,13 +1157,18 @@ static int sched_copy_attr(struct sched_attr __user *uattr, struct sched_attr *a
+
+ static void get_params(struct task_struct *p, struct sched_attr *attr)
+ {
+- if (task_has_dl_policy(p)) {
++#ifndef CONFIG_SCHED_ALT
++ if (task_has_dl_policy(p))
+ __getparam_dl(p, attr);
+- } else if (task_has_rt_policy(p)) {
++ else
++#endif
++ if (task_has_rt_policy(p)) {
+ attr->sched_priority = p->rt_priority;
+ } else {
+ attr->sched_nice = task_nice(p);
++#ifndef CONFIG_SCHED_ALT
+ attr->sched_runtime = p->se.slice;
++#endif
+ }
+ }
+
+@@ -1117,6 +1349,7 @@ SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
+
+ int dl_task_check_affinity(struct task_struct *p, const struct cpumask *mask)
+ {
++#ifndef CONFIG_SCHED_ALT
+ /*
+ * If the task isn't a deadline task or admission control is
+ * disabled then we don't care about affinity changes.
+@@ -1140,6 +1373,7 @@ int dl_task_check_affinity(struct task_struct *p, const struct cpumask *mask)
+ guard(rcu)();
+ if (!cpumask_subset(task_rq(p)->rd->span, mask))
+ return -EBUSY;
++#endif
+
+ return 0;
+ }
+@@ -1163,9 +1397,11 @@ int __sched_setaffinity(struct task_struct *p, struct affinity_context *ctx)
+ ctx->new_mask = new_mask;
+ ctx->flags |= SCA_CHECK;
+
++#ifndef CONFIG_SCHED_ALT
+ retval = dl_task_check_affinity(p, new_mask);
+ if (retval)
+ goto out_free_new_mask;
++#endif
+
+ retval = __set_cpus_allowed_ptr(p, ctx);
+ if (retval)
+@@ -1345,13 +1581,34 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
+
+ static void do_sched_yield(void)
+ {
+- struct rq_flags rf;
+ struct rq *rq;
++ struct rq_flags rf;
++
++#ifdef CONFIG_SCHED_ALT
++ struct task_struct *p;
++
++ if (!sched_yield_type)
++ return;
+
+ rq = this_rq_lock_irq(&rf);
+
++ schedstat_inc(rq->yld_count);
++
++ p = current;
++ if (rt_task(p)) {
++ if (task_on_rq_queued(p))
++ requeue_task(p, rq);
++ } else if (rq->nr_running > 1) {
++ do_sched_yield_type_1(p, rq);
++ if (task_on_rq_queued(p))
++ requeue_task(p, rq);
++ }
++#else /* !CONFIG_SCHED_ALT */
++ rq = this_rq_lock_irq(&rf);
++
+ schedstat_inc(rq->yld_count);
+ current->sched_class->yield_task(rq);
++#endif /* !CONFIG_SCHED_ALT */
+
+ preempt_disable();
+ rq_unlock_irq(rq, &rf);
+@@ -1420,6 +1677,9 @@ EXPORT_SYMBOL(yield);
+ */
+ int __sched yield_to(struct task_struct *p, bool preempt)
+ {
++#ifdef CONFIG_SCHED_ALT
++ return 0;
++#else /* !CONFIG_SCHED_ALT */
+ struct task_struct *curr = current;
+ struct rq *rq, *p_rq;
+ int yielded = 0;
+@@ -1465,6 +1725,7 @@ int __sched yield_to(struct task_struct *p, bool preempt)
+ schedule();
+
+ return yielded;
++#endif /* !CONFIG_SCHED_ALT */
+ }
+ EXPORT_SYMBOL_GPL(yield_to);
+
+@@ -1485,7 +1746,9 @@ SYSCALL_DEFINE1(sched_get_priority_max, int, policy)
+ case SCHED_RR:
+ ret = MAX_RT_PRIO-1;
+ break;
++#ifndef CONFIG_SCHED_ALT
+ case SCHED_DEADLINE:
++#endif
+ case SCHED_NORMAL:
+ case SCHED_BATCH:
+ case SCHED_IDLE:
+@@ -1513,7 +1776,9 @@ SYSCALL_DEFINE1(sched_get_priority_min, int, policy)
+ case SCHED_RR:
+ ret = 1;
+ break;
++#ifndef CONFIG_SCHED_ALT
+ case SCHED_DEADLINE:
++#endif
+ case SCHED_NORMAL:
+ case SCHED_BATCH:
+ case SCHED_IDLE:
+@@ -1525,7 +1790,9 @@ SYSCALL_DEFINE1(sched_get_priority_min, int, policy)
+
+ static int sched_rr_get_interval(pid_t pid, struct timespec64 *t)
+ {
++#ifndef CONFIG_SCHED_ALT
+ unsigned int time_slice = 0;
++#endif
+ int retval;
+
+ if (pid < 0)
+@@ -1540,6 +1807,7 @@ static int sched_rr_get_interval(pid_t pid, struct timespec64 *t)
+ if (retval)
+ return retval;
+
++#ifndef CONFIG_SCHED_ALT
+ scoped_guard (task_rq_lock, p) {
+ struct rq *rq = scope.rq;
+ if (p->sched_class->get_rr_interval)
+@@ -1548,6 +1816,13 @@ static int sched_rr_get_interval(pid_t pid, struct timespec64 *t)
+ }
+
+ jiffies_to_timespec64(time_slice, t);
++#else
++ }
++
++ alt_sched_debug();
++
++ *t = ns_to_timespec64(sysctl_sched_base_slice);
++#endif /* !CONFIG_SCHED_ALT */
+ return 0;
+ }
+
+diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
+index 6e2f54169e66..5a5031761477 100644
+--- a/kernel/sched/topology.c
++++ b/kernel/sched/topology.c
+@@ -3,6 +3,7 @@
+ * Scheduler topology setup/handling methods
+ */
+
++#ifndef CONFIG_SCHED_ALT
+ #include <linux/sched/isolation.h>
+ #include <linux/bsearch.h>
+ #include "sched.h"
+@@ -1497,8 +1498,10 @@ static void asym_cpu_capacity_scan(void)
+ */
+
+ static int default_relax_domain_level = -1;
++#endif /* CONFIG_SCHED_ALT */
+ int sched_domain_level_max;
+
++#ifndef CONFIG_SCHED_ALT
+ static int __init setup_relax_domain_level(char *str)
+ {
+ if (kstrtoint(str, 0, &default_relax_domain_level))
+@@ -1731,6 +1734,7 @@ sd_init(struct sched_domain_topology_level *tl,
+
+ return sd;
+ }
++#endif /* CONFIG_SCHED_ALT */
+
+ /*
+ * Topology list, bottom-up.
+@@ -1767,6 +1771,7 @@ void __init set_sched_topology(struct sched_domain_topology_level *tl)
+ sched_domain_topology_saved = NULL;
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ #ifdef CONFIG_NUMA
+
+ static const struct cpumask *sd_numa_mask(int cpu)
+@@ -2833,3 +2838,31 @@ void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
+ partition_sched_domains_locked(ndoms_new, doms_new, dattr_new);
+ sched_domains_mutex_unlock();
+ }
++#else /* CONFIG_SCHED_ALT */
++DEFINE_STATIC_KEY_FALSE(sched_asym_cpucapacity);
++
++void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
++ struct sched_domain_attr *dattr_new)
++{}
++
++#ifdef CONFIG_NUMA
++int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
++{
++ return best_mask_cpu(cpu, cpus);
++}
++
++int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
++{
++ return cpumask_nth(cpu, cpus);
++}
++
++const struct cpumask *sched_numa_hop_mask(unsigned int node, unsigned int hops)
++{
++ return ERR_PTR(-EOPNOTSUPP);
++}
++EXPORT_SYMBOL_GPL(sched_numa_hop_mask);
++#endif /* CONFIG_NUMA */
++
++void sched_update_asym_prefer_cpu(int cpu, int old_prio, int new_prio)
++{}
++#endif
+diff --git a/kernel/sysctl.c b/kernel/sysctl.c
+index cb6196e3fa99..d0446e53fd64 100644
+--- a/kernel/sysctl.c
++++ b/kernel/sysctl.c
+@@ -36,6 +36,10 @@ EXPORT_SYMBOL_GPL(sysctl_long_vals);
+ static const int ngroups_max = NGROUPS_MAX;
+ static const int cap_last_cap = CAP_LAST_CAP;
+
++#ifdef CONFIG_SCHED_ALT
++extern int sched_yield_type;
++#endif
++
+ #ifdef CONFIG_PROC_SYSCTL
+
+ /**
+@@ -1489,6 +1493,17 @@ static const struct ctl_table sysctl_subsys_table[] = {
+ .proc_handler = proc_dointvec,
+ },
+ #endif
++#ifdef CONFIG_SCHED_ALT
++ {
++ .procname = "yield_type",
++ .data = &sched_yield_type,
++ .maxlen = sizeof (int),
++ .mode = 0644,
++ .proc_handler = &proc_dointvec_minmax,
++ .extra1 = SYSCTL_ZERO,
++ .extra2 = SYSCTL_TWO,
++ },
++#endif
+ #ifdef CONFIG_SYSCTL_ARCH_UNALIGN_NO_WARN
+ {
+ .procname = "ignore-unaligned-usertrap",
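Once a CONFIG_SCHED_ALT kernel is booted, the kernel.yield_type knob added above behaves like any other sysctl. A minimal user-space sketch, assuming the usual /proc/sys/kernel path and root privileges for the write:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	const char *path = "/proc/sys/kernel/yield_type";
	char buf[16] = { 0 };
	int fd = open(path, O_RDWR);

	if (fd < 0) {
		perror(path);          /* likely not a BMQ/PDS kernel */
		return 1;
	}
	if (read(fd, buf, sizeof(buf) - 1) > 0)
		printf("current yield_type: %s", buf);

	/* 0 = no yield, 1 = requeue task (default) */
	if (pwrite(fd, "0", 1, 0) != 1)
		perror("write");

	close(fd);
	return 0;
}

The same toggle is reachable from the shell with sysctl -w kernel.yield_type=0.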
+diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
+index 2e5b89d7d866..38c4526f5bc7 100644
+--- a/kernel/time/posix-cpu-timers.c
++++ b/kernel/time/posix-cpu-timers.c
+@@ -223,7 +223,7 @@ static void task_sample_cputime(struct task_struct *p, u64 *samples)
+ u64 stime, utime;
+
+ task_cputime(p, &utime, &stime);
+- store_samples(samples, stime, utime, p->se.sum_exec_runtime);
++ store_samples(samples, stime, utime, tsk_seruntime(p));
+ }
+
+ static void proc_sample_cputime_atomic(struct task_cputime_atomic *at,
+@@ -835,6 +835,7 @@ static void collect_posix_cputimers(struct posix_cputimers *pct, u64 *samples,
+ }
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ static inline void check_dl_overrun(struct task_struct *tsk)
+ {
+ if (tsk->dl.dl_overrun) {
+@@ -842,6 +843,7 @@ static inline void check_dl_overrun(struct task_struct *tsk)
+ send_signal_locked(SIGXCPU, SEND_SIG_PRIV, tsk, PIDTYPE_TGID);
+ }
+ }
++#endif
+
+ static bool check_rlimit(u64 time, u64 limit, int signo, bool rt, bool hard)
+ {
+@@ -869,8 +871,10 @@ static void check_thread_timers(struct task_struct *tsk,
+ u64 samples[CPUCLOCK_MAX];
+ unsigned long soft;
+
++#ifndef CONFIG_SCHED_ALT
+ if (dl_task(tsk))
+ check_dl_overrun(tsk);
++#endif
+
+ if (expiry_cache_is_inactive(pct))
+ return;
+@@ -884,7 +888,7 @@ static void check_thread_timers(struct task_struct *tsk,
+ soft = task_rlimit(tsk, RLIMIT_RTTIME);
+ if (soft != RLIM_INFINITY) {
+ /* Task RT timeout is accounted in jiffies. RTTIME is usec */
+- unsigned long rttime = tsk->rt.timeout * (USEC_PER_SEC / HZ);
++ unsigned long rttime = tsk_rttimeout(tsk) * (USEC_PER_SEC / HZ);
+ unsigned long hard = task_rlimit_max(tsk, RLIMIT_RTTIME);
+
+ /* At the hard limit, send SIGKILL. No further action. */
+@@ -1120,8 +1124,10 @@ static inline bool fastpath_timer_check(struct task_struct *tsk)
+ return true;
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ if (dl_task(tsk) && tsk->dl.dl_overrun)
+ return true;
++#endif
+
+ return false;
+ }
+diff --git a/kernel/time/timer.c b/kernel/time/timer.c
+index 553fa469d7cc..d7c90f6ff009 100644
+--- a/kernel/time/timer.c
++++ b/kernel/time/timer.c
+@@ -2453,7 +2453,11 @@ static void run_local_timers(void)
+ */
+ if (time_after_eq(jiffies, READ_ONCE(base->next_expiry)) ||
+ (i == BASE_DEF && tmigr_requires_handle_remote())) {
++#ifdef CONFIG_SCHED_BMQ
++ __raise_softirq_irqoff(TIMER_SOFTIRQ);
++#else
+ raise_timer_softirq(TIMER_SOFTIRQ);
++#endif
+ return;
+ }
+ }
+diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
+index dc734867f0fc..9ce22f6282eb 100644
+--- a/kernel/trace/trace_osnoise.c
++++ b/kernel/trace/trace_osnoise.c
+@@ -1645,6 +1645,9 @@ static void osnoise_sleep(bool skip_period)
+ */
+ static inline int osnoise_migration_pending(void)
+ {
++#ifdef CONFIG_SCHED_ALT
++ return 0;
++#else
+ if (!current->migration_pending)
+ return 0;
+
+@@ -1666,6 +1669,7 @@ static inline int osnoise_migration_pending(void)
+ mutex_unlock(&interface_lock);
+
+ return 1;
++#endif
+ }
+
+ /*
+diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
+index d88c44f1dfa5..4af3cbbdcccb 100644
+--- a/kernel/trace/trace_selftest.c
++++ b/kernel/trace/trace_selftest.c
+@@ -1423,10 +1423,15 @@ static int trace_wakeup_test_thread(void *data)
+ {
+ /* Make this a -deadline thread */
+ static const struct sched_attr attr = {
++#ifdef CONFIG_SCHED_ALT
++ /* No deadline on BMQ/PDS, use RR */
++ .sched_policy = SCHED_RR,
++#else
+ .sched_policy = SCHED_DEADLINE,
+ .sched_runtime = 100000ULL,
+ .sched_deadline = 10000000ULL,
+ .sched_period = 10000000ULL
++#endif
+ };
+ struct wakeup_test_data *x = data;
+
+diff --git a/kernel/workqueue.c b/kernel/workqueue.c
+index c6b79b3675c3..2872234d8620 100644
+--- a/kernel/workqueue.c
++++ b/kernel/workqueue.c
+@@ -1251,6 +1251,7 @@ static bool kick_pool(struct worker_pool *pool)
+
+ p = worker->task;
+
++#ifndef CONFIG_SCHED_ALT
+ #ifdef CONFIG_SMP
+ /*
+ * Idle @worker is about to execute @work and waking up provides an
+@@ -1280,6 +1281,8 @@ static bool kick_pool(struct worker_pool *pool)
+ }
+ }
+ #endif
++#endif /* !CONFIG_SCHED_ALT */
++
+ wake_up_process(p);
+ return true;
+ }
+@@ -1408,7 +1411,11 @@ void wq_worker_running(struct task_struct *task)
+ * CPU intensive auto-detection cares about how long a work item hogged
+ * CPU without sleeping. Reset the starting timestamp on wakeup.
+ */
++#ifdef CONFIG_SCHED_ALT
++ worker->current_at = worker->task->sched_time;
++#else
+ worker->current_at = worker->task->se.sum_exec_runtime;
++#endif
+
+ WRITE_ONCE(worker->sleeping, 0);
+ }
+@@ -1493,7 +1500,11 @@ void wq_worker_tick(struct task_struct *task)
+ * We probably want to make this prettier in the future.
+ */
+ if ((worker->flags & WORKER_NOT_RUNNING) || READ_ONCE(worker->sleeping) ||
++#ifdef CONFIG_SCHED_ALT
++ worker->task->sched_time - worker->current_at <
++#else
+ worker->task->se.sum_exec_runtime - worker->current_at <
++#endif
+ wq_cpu_intensive_thresh_us * NSEC_PER_USEC)
+ return;
+
+@@ -3164,7 +3175,11 @@ __acquires(&pool->lock)
+ worker->current_func = work->func;
+ worker->current_pwq = pwq;
+ if (worker->task)
++#ifdef CONFIG_SCHED_ALT
++ worker->current_at = worker->task->sched_time;
++#else
+ worker->current_at = worker->task->se.sum_exec_runtime;
++#endif
+ work_data = *work_data_bits(work);
+ worker->current_color = get_work_color(work_data);
+
diff --git a/5021_BMQ-and-PDS-gentoo-defaults.patch b/5021_BMQ-and-PDS-gentoo-defaults.patch
new file mode 100644
index 00000000..6dc48eec
--- /dev/null
+++ b/5021_BMQ-and-PDS-gentoo-defaults.patch
@@ -0,0 +1,13 @@
+--- a/init/Kconfig 2023-02-13 08:16:09.534315265 -0500
++++ b/init/Kconfig 2023-02-13 08:17:24.130237204 -0500
+@@ -867,8 +867,9 @@ config UCLAMP_BUCKETS_COUNT
+ If in doubt, use the default value.
+
+ menuconfig SCHED_ALT
++ depends on X86_64
+ bool "Alternative CPU Schedulers"
+- default y
++ default n
+ help
+ This feature enables the alternative CPU schedulers.
+
* [gentoo-commits] proj/linux-patches:6.17 commit in: /
@ 2025-09-29 12:16 Arisu Tachibana
0 siblings, 0 replies; 20+ messages in thread
From: Arisu Tachibana @ 2025-09-29 12:16 UTC (permalink / raw
To: gentoo-commits
commit: 2f01814d76fef6a4cc8d8ae106c6ea1bcc362237
Author: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
AuthorDate: Mon Sep 29 12:05:41 2025 +0000
Commit: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
CommitDate: Mon Sep 29 12:10:55 2025 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=2f01814d
BMQ(BitMap Queue) Scheduler v6.17-r0
Signed-off-by: Arisu Tachibana <alicef <AT> gentoo.org>
0000_README | 7 +
5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch | 11455 +++++++++++++++++++++++++
5021_BMQ-and-PDS-gentoo-defaults.patch | 13 +
3 files changed, 11475 insertions(+)
diff --git a/0000_README b/0000_README
index 2e9aa3cc..7dc67ad1 100644
--- a/0000_README
+++ b/0000_README
@@ -78,3 +78,10 @@ Patch: 4567_distro-Gentoo-Kconfig.patch
From: Tom Wijsman <TomWij@gentoo.org>
Desc: Add Gentoo Linux support config settings and defaults.
+Patch: 5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch
+From: https://gitlab.com/alfredchen/projectc
+Desc: BMQ(BitMap Queue) Scheduler. A new CPU scheduler developed from PDS(incld). Inspired by the scheduler in zircon.
+
+Patch: 5021_BMQ-and-PDS-gentoo-defaults.patch
+From: https://gitweb.gentoo.org/proj/linux-patches.git/
+Desc: Set defaults for BMQ. Add archs as people test, default to N
diff --git a/5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch b/5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch
new file mode 100644
index 00000000..6b5e3269
--- /dev/null
+++ b/5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch
@@ -0,0 +1,11455 @@
+diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
+index 8b49eab937d0..c5d4901a9608 100644
+--- a/Documentation/admin-guide/sysctl/kernel.rst
++++ b/Documentation/admin-guide/sysctl/kernel.rst
+@@ -1716,3 +1716,12 @@ is 10 seconds.
+
+ The softlockup threshold is (``2 * watchdog_thresh``). Setting this
+ tunable to zero will disable lockup detection altogether.
++
++yield_type:
++===========
++
++BMQ/PDS CPU scheduler only. This determines what type of yield calls
++to sched_yield() will be performed.
++
++ 0 - No yield.
++ 1 - Requeue task. (default)
+diff --git a/Documentation/scheduler/sched-BMQ.txt b/Documentation/scheduler/sched-BMQ.txt
+new file mode 100644
+index 000000000000..05c84eec0f31
+--- /dev/null
++++ b/Documentation/scheduler/sched-BMQ.txt
+@@ -0,0 +1,110 @@
++ BitMap queue CPU Scheduler
++ --------------------------
++
++CONTENT
++========
++
++ Background
++ Design
++ Overview
++ Task policy
++ Priority management
++ BitMap Queue
++ CPU Assignment and Migration
++
++
++Background
++==========
++
++BitMap Queue CPU scheduler, referred to as BMQ from here on, is an evolution
++of previous Priority and Deadline based Skiplist multiple queue scheduler(PDS),
++and inspired by Zircon scheduler. The goal of it is to keep the scheduler code
++simple, while efficiency and scalable for interactive tasks, such as desktop,
++movie playback and gaming etc.
++
++Design
++======
++
++Overview
++--------
++
++BMQ use per CPU run queue design, each CPU(logical) has it's own run queue,
++each CPU is responsible for scheduling the tasks that are putting into it's
++run queue.
++
++The run queue is a set of priority queues. Note that these queues are fifo
++queue for non-rt tasks or priority queue for rt tasks in data structure. See
++BitMap Queue below for details. BMQ is optimized for non-rt tasks in the fact
++that most applications are non-rt tasks. Whether a queue is a fifo queue or a
++priority queue, each one is an ordered list of runnable tasks awaiting
++execution, and the data structures are the same. When it is time for a new
++task to run, the scheduler simply looks for the lowest numbered queue that
++contains a task and runs the first task from the head of that queue. The
++per-CPU idle task is also in the run queue, so the scheduler can always find
++a task to run from its run queue.
++
++Each task is assigned the same timeslice (default 4ms) when it is picked to
++start running. A task is reinserted at the end of the appropriate priority
++queue when it uses up its whole timeslice. When the scheduler selects a new
++task from the priority queue, it sets the CPU's preemption timer for the
++remainder of the previous timeslice. When that timer fires, the scheduler
++stops execution of that task, selects another task and starts over again.
++
++If a task blocks waiting for a shared resource then it's taken out of its
++priority queue and is placed in a wait queue for the shared resource. When it
++is unblocked it will be reinserted in the appropriate priority queue of an
++eligible CPU.
++
++Task policy
++-----------
++
++BMQ supports DEADLINE, FIFO, RR, NORMAL, BATCH and IDLE task policies, like
++the mainline CFS scheduler, but it is heavily optimized for non-rt tasks,
++that is, NORMAL/BATCH/IDLE policy tasks. Below are the implementation details
++of each policy.
++
++DEADLINE
++ It is squashed into a priority 0 FIFO task.
++
++FIFO/RR
++ All RT tasks share one single priority queue in the BMQ run queue design.
++The complexity of the insert operation is O(n). BMQ is not designed for
++systems that run mostly rt policy tasks.
++
++NORMAL/BATCH/IDLE
++ BATCH and IDLE tasks are treated as the same policy. They compete for CPU
++with NORMAL policy tasks, but they just don't get boosted. To control the
++priority of NORMAL/BATCH/IDLE tasks, simply use nice levels.
++
++ISO
++ ISO policy is not supported in BMQ. Please use a nice level -20 NORMAL
++policy task instead.
++
++Priority management
++-------------------
++
++RT tasks have priorities from 0-99. For non-rt tasks, there are two different
++factors used to determine the effective priority of a task; the effective
++priority is what determines which queue the task will be in.
++
++The first factor is simply the task's static priority, which is assigned from
++the task's nice level: [-20, 19] from userland's point of view and [0, 39]
++internally.
++
++The second factor is the priority boost. This is a value bounded within
++[-MAX_PRIORITY_ADJ, MAX_PRIORITY_ADJ] that is used to offset the base
++priority; it is modified in the following cases:
++
++*When a thread has used up its entire timeslice, its boost is always
++deboosted by increasing it by one.
++*When a thread gives up cpu control (voluntarily or involuntarily) to
++reschedule, and its switch-in time (time after last switch and run) is below
++the threshold based on its priority boost, its boost is boosted by decreasing
++it by one, but it is capped at 0 (it won't go negative).
++
++The intent in this system is to ensure that interactive threads are serviced
++quickly. These are usually the threads that interact directly with the user
++and cause user-perceivable latency. These threads usually do little work and
++spend most of their time blocked awaiting another user event. So they get the
++priority boost from unblocking while background threads that do most of the
++processing receive the priority penalty for using their entire timeslice.
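To make the two bullet rules above concrete, here is a tiny user-space sketch (not part of the patch). It mirrors the bound used by boost_task()/deboost_task() in kernel/sched/bmq.h from this patch, where the offset moves within [limit, MAX_PRIORITY_ADJ] and limit is -MAX_PRIORITY_ADJ for SCHED_NORMAL and 0 for SCHED_BATCH; a lower offset means a higher effective priority.

#include <stdio.h>

#define MAX_PRIORITY_ADJ 12

/* rule 1: a thread that used up its whole timeslice is deboosted by one */
static int on_timeslice_expired(int boost)
{
	return (boost < MAX_PRIORITY_ADJ) ? boost + 1 : boost;
}

/* rule 2: a thread that reschedules quickly after switch-in is boosted by one */
static int on_quick_reschedule(int boost, int limit)
{
	return (boost > limit) ? boost - 1 : boost;
}

int main(void)
{
	int boost = 0;

	boost = on_timeslice_expired(boost);   /* CPU hog: drifts toward +12 */
	printf("after burning a slice: %+d\n", boost);

	boost = on_quick_reschedule(boost, -MAX_PRIORITY_ADJ);   /* SCHED_NORMAL */
	boost = on_quick_reschedule(boost, -MAX_PRIORITY_ADJ);
	printf("after two quick wakeups: %+d\n", boost);
	return 0;
}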
+diff --git a/fs/bcachefs/io_write.c b/fs/bcachefs/io_write.c
+index 88b1eec8eff3..4619aa57cd9f 100644
+--- a/fs/bcachefs/io_write.c
++++ b/fs/bcachefs/io_write.c
+@@ -640,8 +640,14 @@ static inline void __wp_update_state(struct write_point *wp, enum write_point_st
+ if (state != wp->state) {
+ struct task_struct *p = current;
+ u64 now = ktime_get_ns();
++
++#ifdef CONFIG_SCHED_ALT
++ u64 runtime = tsk_seruntime(p) +
++ (now - p->last_ran);
++#else
+ u64 runtime = p->se.sum_exec_runtime +
+ (now - p->se.exec_start);
++#endif
+
+ if (state == WRITE_POINT_runnable)
+ wp->last_runtime = runtime;
+diff --git a/fs/proc/base.c b/fs/proc/base.c
+index 62d35631ba8c..eb1a57209822 100644
+--- a/fs/proc/base.c
++++ b/fs/proc/base.c
+@@ -515,7 +515,7 @@ static int proc_pid_schedstat(struct seq_file *m, struct pid_namespace *ns,
+ seq_puts(m, "0 0 0\n");
+ else
+ seq_printf(m, "%llu %llu %lu\n",
+- (unsigned long long)task->se.sum_exec_runtime,
++ (unsigned long long)tsk_seruntime(task),
+ (unsigned long long)task->sched_info.run_delay,
+ task->sched_info.pcount);
+
+diff --git a/include/asm-generic/resource.h b/include/asm-generic/resource.h
+index 8874f681b056..59eb72bf7d5f 100644
+--- a/include/asm-generic/resource.h
++++ b/include/asm-generic/resource.h
+@@ -23,7 +23,7 @@
+ [RLIMIT_LOCKS] = { RLIM_INFINITY, RLIM_INFINITY }, \
+ [RLIMIT_SIGPENDING] = { 0, 0 }, \
+ [RLIMIT_MSGQUEUE] = { MQ_BYTES_MAX, MQ_BYTES_MAX }, \
+- [RLIMIT_NICE] = { 0, 0 }, \
++ [RLIMIT_NICE] = { 30, 30 }, \
+ [RLIMIT_RTPRIO] = { 0, 0 }, \
+ [RLIMIT_RTTIME] = { RLIM_INFINITY, RLIM_INFINITY }, \
+ }
+diff --git a/include/linux/sched.h b/include/linux/sched.h
+index e4ce0a76831e..7414ebd6267c 100644
+--- a/include/linux/sched.h
++++ b/include/linux/sched.h
+@@ -843,7 +843,9 @@ struct task_struct {
+ #endif
+
+ int on_cpu;
++
+ struct __call_single_node wake_entry;
++#ifndef CONFIG_SCHED_ALT
+ unsigned int wakee_flips;
+ unsigned long wakee_flip_decay_ts;
+ struct task_struct *last_wakee;
+@@ -857,6 +859,7 @@ struct task_struct {
+ */
+ int recent_used_cpu;
+ int wake_cpu;
++#endif /* !CONFIG_SCHED_ALT */
+ int on_rq;
+
+ int prio;
+@@ -864,6 +867,19 @@ struct task_struct {
+ int normal_prio;
+ unsigned int rt_priority;
+
++#ifdef CONFIG_SCHED_ALT
++ u64 last_ran;
++ s64 time_slice;
++ struct list_head sq_node;
++#ifdef CONFIG_SCHED_BMQ
++ int boost_prio;
++#endif /* CONFIG_SCHED_BMQ */
++#ifdef CONFIG_SCHED_PDS
++ u64 deadline;
++#endif /* CONFIG_SCHED_PDS */
++ /* sched_clock time spent running */
++ u64 sched_time;
++#else /* !CONFIG_SCHED_ALT */
+ struct sched_entity se;
+ struct sched_rt_entity rt;
+ struct sched_dl_entity dl;
+@@ -878,6 +894,7 @@ struct task_struct {
+ unsigned long core_cookie;
+ unsigned int core_occupation;
+ #endif
++#endif /* !CONFIG_SCHED_ALT */
+
+ #ifdef CONFIG_CGROUP_SCHED
+ struct task_group *sched_task_group;
+@@ -914,9 +931,13 @@ struct task_struct {
+ const cpumask_t *cpus_ptr;
+ cpumask_t *user_cpus_ptr;
+ cpumask_t cpus_mask;
++#ifndef CONFIG_SCHED_ALT
+ void *migration_pending;
++#endif
+ unsigned short migration_disabled;
++#ifndef CONFIG_SCHED_ALT
+ unsigned short migration_flags;
++#endif
+
+ #ifdef CONFIG_PREEMPT_RCU
+ int rcu_read_lock_nesting;
+@@ -947,8 +968,10 @@ struct task_struct {
+ struct sched_info sched_info;
+
+ struct list_head tasks;
++#ifndef CONFIG_SCHED_ALT
+ struct plist_node pushable_tasks;
+ struct rb_node pushable_dl_tasks;
++#endif
+
+ struct mm_struct *mm;
+ struct mm_struct *active_mm;
+@@ -1672,6 +1695,15 @@ static inline bool sched_proxy_exec(void)
+ }
+ #endif
+
++#ifdef CONFIG_SCHED_ALT
++#define tsk_seruntime(t) ((t)->sched_time)
++/* replace the uncertain rt_timeout with 0UL */
++#define tsk_rttimeout(t) (0UL)
++#else /* !CONFIG_SCHED_ALT: */
++#define tsk_seruntime(t) ((t)->se.sum_exec_runtime)
++#define tsk_rttimeout(t) ((t)->rt.timeout)
++#endif /* !CONFIG_SCHED_ALT */
++
+ #define TASK_REPORT_IDLE (TASK_REPORT + 1)
+ #define TASK_REPORT_MAX (TASK_REPORT_IDLE << 1)
+
+@@ -2236,7 +2268,11 @@ static inline void set_task_cpu(struct task_struct *p, unsigned int cpu)
+
+ static inline bool task_is_runnable(struct task_struct *p)
+ {
++#ifdef CONFIG_SCHED_ALT
++ return p->on_rq;
++#else
+ return p->on_rq && !p->se.sched_delayed;
++#endif /* !CONFIG_SCHED_ALT */
+ }
+
+ extern bool sched_task_on_rq(struct task_struct *p);
+diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h
+index c40115d4e34d..ddc97ddeed47 100644
+--- a/include/linux/sched/deadline.h
++++ b/include/linux/sched/deadline.h
+@@ -2,6 +2,25 @@
+ #ifndef _LINUX_SCHED_DEADLINE_H
+ #define _LINUX_SCHED_DEADLINE_H
+
++#ifdef CONFIG_SCHED_ALT
++
++static inline int dl_task(struct task_struct *p)
++{
++ return 0;
++}
++
++#ifdef CONFIG_SCHED_BMQ
++#define __tsk_deadline(p) (0UL)
++#endif
++
++#ifdef CONFIG_SCHED_PDS
++#define __tsk_deadline(p) ((((u64) ((p)->prio))<<56) | (p)->deadline)
++#endif
++
++#else
++
++#define __tsk_deadline(p) ((p)->dl.deadline)
++
+ /*
+ * SCHED_DEADLINE tasks has negative priorities, reflecting
+ * the fact that any of them has higher prio than RT and
+@@ -23,6 +42,7 @@ static inline bool dl_task(struct task_struct *p)
+ {
+ return dl_prio(p->prio);
+ }
++#endif /* CONFIG_SCHED_ALT */
+
+ static inline bool dl_time_before(u64 a, u64 b)
+ {
+diff --git a/include/linux/sched/prio.h b/include/linux/sched/prio.h
+index 6ab43b4f72f9..ef1cff556c5e 100644
+--- a/include/linux/sched/prio.h
++++ b/include/linux/sched/prio.h
+@@ -19,6 +19,28 @@
+ #define MAX_PRIO (MAX_RT_PRIO + NICE_WIDTH)
+ #define DEFAULT_PRIO (MAX_RT_PRIO + NICE_WIDTH / 2)
+
++#ifdef CONFIG_SCHED_ALT
++
++/* Undefine MAX_PRIO and DEFAULT_PRIO */
++#undef MAX_PRIO
++#undef DEFAULT_PRIO
++
++/* +/- priority levels from the base priority */
++#ifdef CONFIG_SCHED_BMQ
++#define MAX_PRIORITY_ADJ (12)
++#endif
++
++#ifdef CONFIG_SCHED_PDS
++#define MAX_PRIORITY_ADJ (0)
++#endif
++
++#define MIN_NORMAL_PRIO (128)
++#define NORMAL_PRIO_NUM (64)
++#define MAX_PRIO (MIN_NORMAL_PRIO + NORMAL_PRIO_NUM)
++#define DEFAULT_PRIO (MAX_PRIO - MAX_PRIORITY_ADJ - NICE_WIDTH / 2)
++
++#endif /* CONFIG_SCHED_ALT */
++
+ /*
+ * Convert user-nice values [ -20 ... 0 ... 19 ]
+ * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ],
+diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
+index 4e3338103654..6dfef878fe3b 100644
+--- a/include/linux/sched/rt.h
++++ b/include/linux/sched/rt.h
+@@ -45,8 +45,10 @@ static inline bool rt_or_dl_task_policy(struct task_struct *tsk)
+
+ if (policy == SCHED_FIFO || policy == SCHED_RR)
+ return true;
++#ifndef CONFIG_SCHED_ALT
+ if (policy == SCHED_DEADLINE)
+ return true;
++#endif
+ return false;
+ }
+
+diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
+index 5263746b63e8..693e5e3b6b26 100644
+--- a/include/linux/sched/topology.h
++++ b/include/linux/sched/topology.h
+@@ -196,7 +196,8 @@ extern void sched_update_asym_prefer_cpu(int cpu, int old_prio, int new_prio);
+ #define SDTL_INIT(maskfn, flagsfn, dname) ((struct sched_domain_topology_level) \
+ { .mask = maskfn, .sd_flags = flagsfn, .name = #dname })
+
+-#if defined(CONFIG_ENERGY_MODEL) && defined(CONFIG_CPU_FREQ_GOV_SCHEDUTIL)
++#if defined(CONFIG_ENERGY_MODEL) && defined(CONFIG_CPU_FREQ_GOV_SCHEDUTIL) && \
++ !defined(CONFIG_SCHED_ALT)
+ extern void rebuild_sched_domains_energy(void);
+ #else
+ static inline void rebuild_sched_domains_energy(void)
+diff --git a/init/Kconfig b/init/Kconfig
+index ecddb94db8dc..a0afff9dbf4c 100644
+--- a/init/Kconfig
++++ b/init/Kconfig
+@@ -678,6 +678,7 @@ config TASK_IO_ACCOUNTING
+
+ config PSI
+ bool "Pressure stall information tracking"
++ depends on !SCHED_ALT
+ select KERNFS
+ help
+ Collect metrics that indicate how overcommitted the CPU, memory,
+@@ -901,6 +902,35 @@ config SCHED_PROXY_EXEC
+ This option enables proxy execution, a mechanism for mutex-owning
+ tasks to inherit the scheduling context of higher priority waiters.
+
++menuconfig SCHED_ALT
++ bool "Alternative CPU Schedulers"
++ default y
++ help
++	  This feature enables the alternative CPU schedulers.
++
++if SCHED_ALT
++
++choice
++ prompt "Alternative CPU Scheduler"
++ default SCHED_BMQ
++
++config SCHED_BMQ
++ bool "BMQ CPU scheduler"
++ help
++ The BitMap Queue CPU scheduler for excellent interactivity and
++ responsiveness on the desktop and solid scalability on normal
++ hardware and commodity servers.
++
++config SCHED_PDS
++ bool "PDS CPU scheduler"
++ help
++ The Priority and Deadline based Skip list multiple queue CPU
++ Scheduler.
++
++endchoice
++
++endif
++
+ endmenu
+
+ #
+@@ -966,6 +996,7 @@ config NUMA_BALANCING
+ depends on ARCH_SUPPORTS_NUMA_BALANCING
+ depends on !ARCH_WANT_NUMA_VARIABLE_LOCALITY
+ depends on SMP && NUMA && MIGRATION && !PREEMPT_RT
++ depends on !SCHED_ALT
+ help
+ This option adds support for automatic NUMA aware memory/task placement.
+ The mechanism is quite primitive and is based on migrating memory when
+@@ -1415,6 +1446,7 @@ config CHECKPOINT_RESTORE
+
+ config SCHED_AUTOGROUP
+ bool "Automatic process group scheduling"
++ depends on !SCHED_ALT
+ select CGROUPS
+ select CGROUP_SCHED
+ select FAIR_GROUP_SCHED
+diff --git a/init/init_task.c b/init/init_task.c
+index e557f622bd90..99e59c2082e0 100644
+--- a/init/init_task.c
++++ b/init/init_task.c
+@@ -72,9 +72,16 @@ struct task_struct init_task __aligned(L1_CACHE_BYTES) = {
+ .stack = init_stack,
+ .usage = REFCOUNT_INIT(2),
+ .flags = PF_KTHREAD,
++#ifdef CONFIG_SCHED_ALT
++ .on_cpu = 1,
++ .prio = DEFAULT_PRIO,
++ .static_prio = DEFAULT_PRIO,
++ .normal_prio = DEFAULT_PRIO,
++#else
+ .prio = MAX_PRIO - 20,
+ .static_prio = MAX_PRIO - 20,
+ .normal_prio = MAX_PRIO - 20,
++#endif
+ .policy = SCHED_NORMAL,
+ .cpus_ptr = &init_task.cpus_mask,
+ .user_cpus_ptr = NULL,
+@@ -87,6 +94,16 @@ struct task_struct init_task __aligned(L1_CACHE_BYTES) = {
+ .restart_block = {
+ .fn = do_no_restart_syscall,
+ },
++#ifdef CONFIG_SCHED_ALT
++ .sq_node = LIST_HEAD_INIT(init_task.sq_node),
++#ifdef CONFIG_SCHED_BMQ
++ .boost_prio = 0,
++#endif
++#ifdef CONFIG_SCHED_PDS
++ .deadline = 0,
++#endif
++ .time_slice = HZ,
++#else
+ .se = {
+ .group_node = LIST_HEAD_INIT(init_task.se.group_node),
+ },
+@@ -94,10 +111,13 @@ struct task_struct init_task __aligned(L1_CACHE_BYTES) = {
+ .run_list = LIST_HEAD_INIT(init_task.rt.run_list),
+ .time_slice = RR_TIMESLICE,
+ },
++#endif
+ .tasks = LIST_HEAD_INIT(init_task.tasks),
++#ifndef CONFIG_SCHED_ALT
+ #ifdef CONFIG_SMP
+ .pushable_tasks = PLIST_NODE_INIT(init_task.pushable_tasks, MAX_PRIO),
+ #endif
++#endif
+ #ifdef CONFIG_CGROUP_SCHED
+ .sched_task_group = &root_task_group,
+ #endif
+diff --git a/kernel/Kconfig.preempt b/kernel/Kconfig.preempt
+index 54ea59ff8fbe..a6d3560cef75 100644
+--- a/kernel/Kconfig.preempt
++++ b/kernel/Kconfig.preempt
+@@ -134,7 +134,7 @@ config PREEMPT_DYNAMIC
+
+ config SCHED_CORE
+ bool "Core Scheduling for SMT"
+- depends on SCHED_SMT
++ depends on SCHED_SMT && !SCHED_ALT
+ help
+ This option permits Core Scheduling, a means of coordinated task
+ selection across SMT siblings. When enabled -- see
+@@ -152,7 +152,7 @@ config SCHED_CORE
+
+ config SCHED_CLASS_EXT
+ bool "Extensible Scheduling Class"
+- depends on BPF_SYSCALL && BPF_JIT && DEBUG_INFO_BTF
++ depends on BPF_SYSCALL && BPF_JIT && DEBUG_INFO_BTF && !SCHED_ALT
+ select STACKTRACE if STACKTRACE_SUPPORT
+ help
+ This option enables a new scheduler class sched_ext (SCX), which
+diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
+index 27adb04df675..b88d31c14417 100644
+--- a/kernel/cgroup/cpuset.c
++++ b/kernel/cgroup/cpuset.c
+@@ -662,7 +662,7 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
+ return ret;
+ }
+
+-#ifdef CONFIG_SMP
++#if defined(CONFIG_SMP) && !defined(CONFIG_SCHED_ALT)
+ /*
+ * Helper routine for generate_sched_domains().
+ * Do cpusets a, b have overlapping effective cpus_allowed masks?
+@@ -1075,7 +1075,7 @@ void rebuild_sched_domains_locked(void)
+ /* Have scheduler rebuild the domains */
+ partition_sched_domains(ndoms, doms, attr);
+ }
+-#else /* !CONFIG_SMP */
++#else /* !CONFIG_SMP || CONFIG_SCHED_ALT */
+ void rebuild_sched_domains_locked(void)
+ {
+ }
+@@ -3049,12 +3049,15 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
+ goto out_unlock;
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ if (dl_task(task)) {
+ cs->nr_migrate_dl_tasks++;
+ cs->sum_migrate_dl_bw += task->dl.dl_bw;
+ }
++#endif
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ if (!cs->nr_migrate_dl_tasks)
+ goto out_success;
+
+@@ -3075,6 +3078,7 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
+ }
+
+ out_success:
++#endif
+ /*
+ * Mark attach is in progress. This makes validate_change() fail
+ * changes which zero cpus/mems_allowed.
+@@ -3096,12 +3100,14 @@ static void cpuset_cancel_attach(struct cgroup_taskset *tset)
+ mutex_lock(&cpuset_mutex);
+ dec_attach_in_progress_locked(cs);
+
++#ifndef CONFIG_SCHED_ALT
+ if (cs->nr_migrate_dl_tasks) {
+ int cpu = cpumask_any(cs->effective_cpus);
+
+ dl_bw_free(cpu, cs->sum_migrate_dl_bw);
+ reset_migrate_dl_data(cs);
+ }
++#endif
+
+ mutex_unlock(&cpuset_mutex);
+ }
+diff --git a/kernel/delayacct.c b/kernel/delayacct.c
+index 30e7912ebb0d..f6b7e29d2018 100644
+--- a/kernel/delayacct.c
++++ b/kernel/delayacct.c
+@@ -164,7 +164,7 @@ int delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
+ */
+ t1 = tsk->sched_info.pcount;
+ t2 = tsk->sched_info.run_delay;
+- t3 = tsk->se.sum_exec_runtime;
++ t3 = tsk_seruntime(tsk);
+
+ d->cpu_count += t1;
+
+diff --git a/kernel/exit.c b/kernel/exit.c
+index 343eb97543d5..bd34de061dff 100644
+--- a/kernel/exit.c
++++ b/kernel/exit.c
+@@ -207,7 +207,7 @@ static void __exit_signal(struct release_task_post *post, struct task_struct *ts
+ sig->inblock += task_io_get_inblock(tsk);
+ sig->oublock += task_io_get_oublock(tsk);
+ task_io_accounting_add(&sig->ioac, &tsk->ioac);
+- sig->sum_sched_runtime += tsk->se.sum_exec_runtime;
++ sig->sum_sched_runtime += tsk_seruntime(tsk);
+ sig->nr_threads--;
+ __unhash_process(post, tsk, group_dead);
+ write_sequnlock(&sig->stats_lock);
+@@ -291,8 +291,8 @@ void release_task(struct task_struct *p)
+ write_unlock_irq(&tasklist_lock);
+ /* @thread_pid can't go away until free_pids() below */
+ proc_flush_pid(thread_pid);
+- add_device_randomness(&p->se.sum_exec_runtime,
+- sizeof(p->se.sum_exec_runtime));
++ add_device_randomness((const void*) &tsk_seruntime(p),
++ sizeof(unsigned long long));
+ free_pids(post.pids);
+ release_thread(p);
+ /*
+diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
+index c80902eacd79..b1d388145968 100644
+--- a/kernel/locking/rtmutex.c
++++ b/kernel/locking/rtmutex.c
+@@ -366,7 +366,7 @@ waiter_update_prio(struct rt_mutex_waiter *waiter, struct task_struct *task)
+ lockdep_assert(RB_EMPTY_NODE(&waiter->tree.entry));
+
+ waiter->tree.prio = __waiter_prio(task);
+- waiter->tree.deadline = task->dl.deadline;
++ waiter->tree.deadline = __tsk_deadline(task);
+ }
+
+ /*
+@@ -387,16 +387,20 @@ waiter_clone_prio(struct rt_mutex_waiter *waiter, struct task_struct *task)
+ * Only use with rt_waiter_node_{less,equal}()
+ */
+ #define task_to_waiter_node(p) \
+- &(struct rt_waiter_node){ .prio = __waiter_prio(p), .deadline = (p)->dl.deadline }
++ &(struct rt_waiter_node){ .prio = __waiter_prio(p), .deadline = __tsk_deadline(p) }
+ #define task_to_waiter(p) \
+ &(struct rt_mutex_waiter){ .tree = *task_to_waiter_node(p) }
+
+ static __always_inline int rt_waiter_node_less(struct rt_waiter_node *left,
+ struct rt_waiter_node *right)
+ {
++#ifdef CONFIG_SCHED_PDS
++ return (left->deadline < right->deadline);
++#else
+ if (left->prio < right->prio)
+ return 1;
+
++#ifndef CONFIG_SCHED_BMQ
+ /*
+ * If both waiters have dl_prio(), we check the deadlines of the
+ * associated tasks.
+@@ -405,16 +409,22 @@ static __always_inline int rt_waiter_node_less(struct rt_waiter_node *left,
+ */
+ if (dl_prio(left->prio))
+ return dl_time_before(left->deadline, right->deadline);
++#endif
+
+ return 0;
++#endif
+ }
+
+ static __always_inline int rt_waiter_node_equal(struct rt_waiter_node *left,
+ struct rt_waiter_node *right)
+ {
++#ifdef CONFIG_SCHED_PDS
++ return (left->deadline == right->deadline);
++#else
+ if (left->prio != right->prio)
+ return 0;
+
++#ifndef CONFIG_SCHED_BMQ
+ /*
+ * If both waiters have dl_prio(), we check the deadlines of the
+ * associated tasks.
+@@ -423,8 +433,10 @@ static __always_inline int rt_waiter_node_equal(struct rt_waiter_node *left,
+ */
+ if (dl_prio(left->prio))
+ return left->deadline == right->deadline;
++#endif
+
+ return 1;
++#endif
+ }
+
+ static inline bool rt_mutex_steal(struct rt_mutex_waiter *waiter,
+diff --git a/kernel/locking/ww_mutex.h b/kernel/locking/ww_mutex.h
+index 31a785afee6c..0e7df1f689e0 100644
+--- a/kernel/locking/ww_mutex.h
++++ b/kernel/locking/ww_mutex.h
+@@ -247,6 +247,7 @@ __ww_ctx_less(struct ww_acquire_ctx *a, struct ww_acquire_ctx *b)
+
+ /* equal static prio */
+
++#ifndef CONFIG_SCHED_ALT
+ if (dl_prio(a_prio)) {
+ if (dl_time_before(b->task->dl.deadline,
+ a->task->dl.deadline))
+@@ -256,6 +257,7 @@ __ww_ctx_less(struct ww_acquire_ctx *a, struct ww_acquire_ctx *b)
+ b->task->dl.deadline))
+ return false;
+ }
++#endif
+
+ /* equal prio */
+ }
+diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
+index 8ae86371ddcd..a972ef1e31a7 100644
+--- a/kernel/sched/Makefile
++++ b/kernel/sched/Makefile
+@@ -33,7 +33,12 @@ endif
+ # These compilation units have roughly the same size and complexity - so their
+ # build parallelizes well and finishes roughly at once:
+ #
++ifdef CONFIG_SCHED_ALT
++obj-y += alt_core.o
++obj-$(CONFIG_SCHED_DEBUG) += alt_debug.o
++else
+ obj-y += core.o
+ obj-y += fair.o
++endif
+ obj-y += build_policy.o
+ obj-y += build_utility.o
+diff --git a/kernel/sched/alt_core.c b/kernel/sched/alt_core.c
+new file mode 100644
+index 000000000000..8f03f5312e4d
+--- /dev/null
++++ b/kernel/sched/alt_core.c
+@@ -0,0 +1,7648 @@
++/*
++ * kernel/sched/alt_core.c
++ *
++ * Core alternative kernel scheduler code and related syscalls
++ *
++ * Copyright (C) 1991-2002 Linus Torvalds
++ *
++ * 2009-08-13 Brainfuck deadline scheduling policy by Con Kolivas deletes
++ * a whole lot of those previous things.
++ * 2017-09-06 Priority and Deadline based Skip list multiple queue kernel
++ * scheduler by Alfred Chen.
++ * 2019-02-20 BMQ(BitMap Queue) kernel scheduler by Alfred Chen.
++ */
++#include <linux/sched/clock.h>
++#include <linux/sched/cputime.h>
++#include <linux/sched/debug.h>
++#include <linux/sched/hotplug.h>
++#include <linux/sched/init.h>
++#include <linux/sched/isolation.h>
++#include <linux/sched/loadavg.h>
++#include <linux/sched/mm.h>
++#include <linux/sched/nohz.h>
++#include <linux/sched/stat.h>
++#include <linux/sched/wake_q.h>
++
++#include <linux/blkdev.h>
++#include <linux/context_tracking.h>
++#include <linux/cpuset.h>
++#include <linux/delayacct.h>
++#include <linux/init_task.h>
++#include <linux/kcov.h>
++#include <linux/kprobes.h>
++#include <linux/nmi.h>
++#include <linux/rseq.h>
++#include <linux/scs.h>
++
++#include <uapi/linux/sched/types.h>
++
++#include <asm/irq_regs.h>
++#include <asm/switch_to.h>
++
++#define CREATE_TRACE_POINTS
++#include <trace/events/sched.h>
++#include <trace/events/ipi.h>
++#undef CREATE_TRACE_POINTS
++
++#include "sched.h"
++#include "smp.h"
++
++#include "pelt.h"
++
++#include "../../io_uring/io-wq.h"
++#include "../smpboot.h"
++
++EXPORT_TRACEPOINT_SYMBOL_GPL(ipi_send_cpu);
++EXPORT_TRACEPOINT_SYMBOL_GPL(ipi_send_cpumask);
++
++/*
++ * Export tracepoints that act as a bare tracehook (ie: have no trace event
++ * associated with them) to allow external modules to probe them.
++ */
++EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_irq_tp);
++
++#define sched_feat(x) (1)
++/*
++ * Print a warning if need_resched is set for the given duration (if
++ * LATENCY_WARN is enabled).
++ *
++ * If sysctl_resched_latency_warn_once is set, only one warning will be shown
++ * per boot.
++ */
++__read_mostly int sysctl_resched_latency_warn_ms = 100;
++__read_mostly int sysctl_resched_latency_warn_once = 1;
++
++#define ALT_SCHED_VERSION "v6.17-r0"
++
++#define STOP_PRIO (MAX_RT_PRIO - 1)
++
++/*
++ * Time slice
++ * (default: 4 msec, units: nanoseconds)
++ */
++unsigned int sysctl_sched_base_slice __read_mostly = (4 << 20);
++
++#include "alt_core.h"
++#include "alt_topology.h"
++
++/* Reschedule if less than this many nanoseconds (~100 us) are left */
++#define RESCHED_NS (100 << 10)
++
++/**
++ * sched_yield_type - The type of yield that sched_yield() will perform.
++ * 0: No yield.
++ * 1: Requeue task. (default)
++ */
++int sched_yield_type __read_mostly = 1;
++
++cpumask_t sched_rq_pending_mask ____cacheline_aligned_in_smp;
++
++DEFINE_PER_CPU_ALIGNED(cpumask_t [NR_CPU_AFFINITY_LEVELS], sched_cpu_topo_masks);
++DEFINE_PER_CPU_ALIGNED(cpumask_t *, sched_cpu_llc_mask);
++DEFINE_PER_CPU_ALIGNED(cpumask_t *, sched_cpu_topo_end_mask);
++
++#ifdef CONFIG_SCHED_SMT
++DEFINE_STATIC_KEY_FALSE(sched_smt_present);
++EXPORT_SYMBOL_GPL(sched_smt_present);
++
++cpumask_t sched_smt_mask ____cacheline_aligned_in_smp;
++#endif
++
++/*
++ * Keep a unique ID per domain (we use the first CPU's number in the cpumask of
++ * the domain); this allows us to quickly tell if two CPUs are in the same cache
++ * domain, see cpus_share_cache().
++ */
++DEFINE_PER_CPU(int, sd_llc_id);
++
++DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
++
++#ifndef prepare_arch_switch
++# define prepare_arch_switch(next) do { } while (0)
++#endif
++#ifndef finish_arch_post_lock_switch
++# define finish_arch_post_lock_switch() do { } while (0)
++#endif
++
++static cpumask_t sched_preempt_mask[SCHED_QUEUE_BITS + 2] ____cacheline_aligned_in_smp;
++
++cpumask_t *const sched_idle_mask = &sched_preempt_mask[SCHED_QUEUE_BITS - 1];
++cpumask_t *const sched_sg_idle_mask = &sched_preempt_mask[SCHED_QUEUE_BITS];
++cpumask_t *const sched_pcore_idle_mask = &sched_preempt_mask[SCHED_QUEUE_BITS];
++cpumask_t *const sched_ecore_idle_mask = &sched_preempt_mask[SCHED_QUEUE_BITS + 1];
++
++/* task function */
++static inline const struct cpumask *task_user_cpus(struct task_struct *p)
++{
++ if (!p->user_cpus_ptr)
++ return cpu_possible_mask; /* &init_task.cpus_mask */
++ return p->user_cpus_ptr;
++}
++
++/* sched_queue related functions */
++static inline void sched_queue_init(struct sched_queue *q)
++{
++ int i;
++
++ bitmap_zero(q->bitmap, SCHED_QUEUE_BITS);
++ for(i = 0; i < SCHED_LEVELS; i++)
++ INIT_LIST_HEAD(&q->heads[i]);
++}
++
++/*
++ * Init idle task and put into queue structure of rq
++ * IMPORTANT: may be called multiple times for a single cpu
++ */
++static inline void sched_queue_init_idle(struct sched_queue *q,
++ struct task_struct *idle)
++{
++ INIT_LIST_HEAD(&q->heads[IDLE_TASK_SCHED_PRIO]);
++ list_add_tail(&idle->sq_node, &q->heads[IDLE_TASK_SCHED_PRIO]);
++ idle->on_rq = TASK_ON_RQ_QUEUED;
++}
++
++#define CLEAR_CACHED_PREEMPT_MASK(pr, low, high, cpu) \
++ if (low < pr && pr <= high) \
++ cpumask_clear_cpu(cpu, sched_preempt_mask + pr);
++
++#define SET_CACHED_PREEMPT_MASK(pr, low, high, cpu) \
++ if (low < pr && pr <= high) \
++ cpumask_set_cpu(cpu, sched_preempt_mask + pr);
++
++static atomic_t sched_prio_record = ATOMIC_INIT(0);
++
++/* watermark related functions */
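++/*
++ * sched_preempt_mask[] caches, per priority level, the CPUs that a task of a
++ * better priority could preempt.  update_sched_preempt_mask() below refreshes
++ * the level recorded in sched_prio_record (and the idle masks) whenever a run
++ * queue's highest queued priority changes.
++ */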
++static inline void update_sched_preempt_mask(struct rq *rq)
++{
++ int prio = find_first_bit(rq->queue.bitmap, SCHED_QUEUE_BITS);
++ int last_prio = rq->prio;
++ int cpu, pr;
++
++ if (prio == last_prio)
++ return;
++
++ rq->prio = prio;
++#ifdef CONFIG_SCHED_PDS
++ rq->prio_idx = sched_prio2idx(rq->prio, rq);
++#endif
++ cpu = cpu_of(rq);
++ pr = atomic_read(&sched_prio_record);
++
++ if (prio < last_prio) {
++ if (IDLE_TASK_SCHED_PRIO == last_prio) {
++ rq->clear_idle_mask_func(cpu, sched_idle_mask);
++ last_prio -= 2;
++ }
++ CLEAR_CACHED_PREEMPT_MASK(pr, prio, last_prio, cpu);
++
++ return;
++ }
++ /* last_prio < prio */
++ if (IDLE_TASK_SCHED_PRIO == prio) {
++ rq->set_idle_mask_func(cpu, sched_idle_mask);
++ prio -= 2;
++ }
++ SET_CACHED_PREEMPT_MASK(pr, last_prio, prio, cpu);
++}
++
++/* need a wrapper since we may need to trace from modules */
++EXPORT_TRACEPOINT_SYMBOL(sched_set_state_tp);
++
++/* Call via the helper macro trace_set_current_state. */
++void __trace_set_current_state(int state_value)
++{
++ trace_sched_set_state_tp(current, state_value);
++}
++EXPORT_SYMBOL(__trace_set_current_state);
++
++/*
++ * Serialization rules:
++ *
++ * Lock order:
++ *
++ * p->pi_lock
++ * rq->lock
++ * hrtimer_cpu_base->lock (hrtimer_start() for bandwidth controls)
++ *
++ * rq1->lock
++ * rq2->lock where: rq1 < rq2
++ *
++ * Regular state:
++ *
++ * Normal scheduling state is serialized by rq->lock. __schedule() takes the
++ * local CPU's rq->lock, it optionally removes the task from the runqueue and
++ * always looks at the local rq data structures to find the most eligible task
++ * to run next.
++ *
++ * Task enqueue is also under rq->lock, possibly taken from another CPU.
++ * Wakeups from another LLC domain might use an IPI to transfer the enqueue to
++ * the local CPU to avoid bouncing the runqueue state around [ see
++ * ttwu_queue_wakelist() ]
++ *
++ * Task wakeup, specifically wakeups that involve migration, are horribly
++ * complicated to avoid having to take two rq->locks.
++ *
++ * Special state:
++ *
++ * System-calls and anything external will use task_rq_lock() which acquires
++ * both p->pi_lock and rq->lock. As a consequence the state they change is
++ * stable while holding either lock:
++ *
++ * - sched_setaffinity()/
++ * set_cpus_allowed_ptr(): p->cpus_ptr, p->nr_cpus_allowed
++ * - set_user_nice(): p->se.load, p->*prio
++ * - __sched_setscheduler(): p->sched_class, p->policy, p->*prio,
++ * p->se.load, p->rt_priority,
++ * p->dl.dl_{runtime, deadline, period, flags, bw, density}
++ * - sched_setnuma(): p->numa_preferred_nid
++ * - sched_move_task(): p->sched_task_group
++ * - uclamp_update_active() p->uclamp*
++ *
++ * p->state <- TASK_*:
++ *
++ * is changed locklessly using set_current_state(), __set_current_state() or
++ * set_special_state(), see their respective comments, or by
++ * try_to_wake_up(). This latter uses p->pi_lock to serialize against
++ * concurrent self.
++ *
++ * p->on_rq <- { 0, 1 = TASK_ON_RQ_QUEUED, 2 = TASK_ON_RQ_MIGRATING }:
++ *
++ * is set by activate_task() and cleared by deactivate_task(), under
++ * rq->lock. Non-zero indicates the task is runnable, the special
++ * ON_RQ_MIGRATING state is used for migration without holding both
++ * rq->locks. It indicates task_cpu() is not stable, see task_rq_lock().
++ *
++ * Additionally it is possible to be ->on_rq but still be considered not
++ * runnable when p->se.sched_delayed is true. These tasks are on the runqueue
++ * but will be dequeued as soon as they get picked again. See the
++ * task_is_runnable() helper.
++ *
++ * p->on_cpu <- { 0, 1 }:
++ *
++ * is set by prepare_task() and cleared by finish_task() such that it will be
++ * set before p is scheduled-in and cleared after p is scheduled-out, both
++ * under rq->lock. Non-zero indicates the task is running on its CPU.
++ *
++ * [ The astute reader will observe that it is possible for two tasks on one
++ * CPU to have ->on_cpu = 1 at the same time. ]
++ *
++ * task_cpu(p): is changed by set_task_cpu(), the rules are:
++ *
++ * - Don't call set_task_cpu() on a blocked task:
++ *
++ * We don't care what CPU we're not running on, this simplifies hotplug,
++ * the CPU assignment of blocked tasks isn't required to be valid.
++ *
++ * - for try_to_wake_up(), called under p->pi_lock:
++ *
++ * This allows try_to_wake_up() to only take one rq->lock, see its comment.
++ *
++ * - for migration called under rq->lock:
++ * [ see task_on_rq_migrating() in task_rq_lock() ]
++ *
++ * o move_queued_task()
++ * o detach_task()
++ *
++ * - for migration called under double_rq_lock():
++ *
++ * o __migrate_swap_task()
++ * o push_rt_task() / pull_rt_task()
++ * o push_dl_task() / pull_dl_task()
++ * o dl_task_offline_migration()
++ *
++ */
++
++/*
++ * Context: p->pi_lock
++ */
++static inline struct rq *
++task_access_lock_irqsave(struct task_struct *p, raw_spinlock_t **plock, unsigned long *flags)
++{
++ struct rq *rq;
++ for (;;) {
++ rq = task_rq(p);
++ if (p->on_cpu || task_on_rq_queued(p)) {
++ raw_spin_lock_irqsave(&rq->lock, *flags);
++ if (likely((p->on_cpu || task_on_rq_queued(p)) && rq == task_rq(p))) {
++ *plock = &rq->lock;
++ return rq;
++ }
++ raw_spin_unlock_irqrestore(&rq->lock, *flags);
++ } else if (task_on_rq_migrating(p)) {
++ do {
++ cpu_relax();
++ } while (unlikely(task_on_rq_migrating(p)));
++ } else {
++ raw_spin_lock_irqsave(&p->pi_lock, *flags);
++ if (likely(!p->on_cpu && !p->on_rq && rq == task_rq(p))) {
++ *plock = &p->pi_lock;
++ return rq;
++ }
++ raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
++ }
++ }
++}
++
++static inline void
++task_access_unlock_irqrestore(struct task_struct *p, raw_spinlock_t *lock, unsigned long *flags)
++{
++ raw_spin_unlock_irqrestore(lock, *flags);
++}
++
++/*
++ * __task_rq_lock - lock the rq @p resides on.
++ */
++struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ struct rq *rq;
++
++ lockdep_assert_held(&p->pi_lock);
++
++ for (;;) {
++ rq = task_rq(p);
++ raw_spin_lock(&rq->lock);
++ if (likely(rq == task_rq(p) && !task_on_rq_migrating(p)))
++ return rq;
++ raw_spin_unlock(&rq->lock);
++
++ while (unlikely(task_on_rq_migrating(p)))
++ cpu_relax();
++ }
++}
++
++/*
++ * task_rq_lock - lock p->pi_lock and lock the rq @p resides on.
++ */
++struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
++ __acquires(p->pi_lock)
++ __acquires(rq->lock)
++{
++ struct rq *rq;
++
++ for (;;) {
++ raw_spin_lock_irqsave(&p->pi_lock, rf->flags);
++ rq = task_rq(p);
++ raw_spin_lock(&rq->lock);
++ /*
++ * move_queued_task() task_rq_lock()
++ *
++ * ACQUIRE (rq->lock)
++ * [S] ->on_rq = MIGRATING [L] rq = task_rq()
++ * WMB (__set_task_cpu()) ACQUIRE (rq->lock);
++ * [S] ->cpu = new_cpu [L] task_rq()
++ * [L] ->on_rq
++ * RELEASE (rq->lock)
++ *
++ * If we observe the old CPU in task_rq_lock(), the acquire of
++ * the old rq->lock will fully serialize against the stores.
++ *
++ * If we observe the new CPU in task_rq_lock(), the address
++ * dependency headed by '[L] rq = task_rq()' and the acquire
++ * will pair with the WMB to ensure we then also see migrating.
++ */
++ if (likely(rq == task_rq(p) && !task_on_rq_migrating(p))) {
++ return rq;
++ }
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, rf->flags);
++
++ while (unlikely(task_on_rq_migrating(p)))
++ cpu_relax();
++ }
++}
++
++static inline void rq_lock_irqsave(struct rq *rq, struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ raw_spin_lock_irqsave(&rq->lock, rf->flags);
++}
++
++static inline void rq_unlock_irqrestore(struct rq *rq, struct rq_flags *rf)
++ __releases(rq->lock)
++{
++ raw_spin_unlock_irqrestore(&rq->lock, rf->flags);
++}
++
++DEFINE_LOCK_GUARD_1(rq_lock_irqsave, struct rq,
++ rq_lock_irqsave(_T->lock, &_T->rf),
++ rq_unlock_irqrestore(_T->lock, &_T->rf),
++ struct rq_flags rf)
++
++void raw_spin_rq_lock_nested(struct rq *rq, int subclass)
++{
++ raw_spinlock_t *lock;
++
++ /* Matches synchronize_rcu() in __sched_core_enable() */
++ preempt_disable();
++
++ for (;;) {
++ lock = __rq_lockp(rq);
++ raw_spin_lock_nested(lock, subclass);
++ if (likely(lock == __rq_lockp(rq))) {
++ /* preempt_count *MUST* be > 1 */
++ preempt_enable_no_resched();
++ return;
++ }
++ raw_spin_unlock(lock);
++ }
++}
++
++void raw_spin_rq_unlock(struct rq *rq)
++{
++ raw_spin_unlock(rq_lockp(rq));
++}
++
++/*
++ * RQ-clock updating methods:
++ */
++
++static void update_rq_clock_task(struct rq *rq, s64 delta)
++{
++/*
++ * In theory, the compiler should just see 0 here, and optimize out the call
++ * to sched_rt_avg_update. But I don't trust it...
++ */
++ s64 __maybe_unused steal = 0, irq_delta = 0;
++
++#ifdef CONFIG_IRQ_TIME_ACCOUNTING
++ if (irqtime_enabled()) {
++ irq_delta = irq_time_read(cpu_of(rq)) - rq->prev_irq_time;
++
++ /*
++ * Since irq_time is only updated on {soft,}irq_exit, we might run into
++ * this case when a previous update_rq_clock() happened inside a
++ * {soft,}IRQ region.
++ *
++ * When this happens, we stop ->clock_task and only update the
++ * prev_irq_time stamp to account for the part that fit, so that a next
++ * update will consume the rest. This ensures ->clock_task is
++ * monotonic.
++ *
++ * It does however cause some slight miss-attribution of {soft,}IRQ
++ * time, a more accurate solution would be to update the irq_time using
++ * the current rq->clock timestamp, except that would require using
++ * atomic ops.
++ */
++ if (irq_delta > delta)
++ irq_delta = delta;
++
++ rq->prev_irq_time += irq_delta;
++ delta -= irq_delta;
++ delayacct_irq(rq->curr, irq_delta);
++ }
++#endif
++#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
++	if (static_key_false((&paravirt_steal_rq_enabled))) {
++ u64 prev_steal;
++
++ steal = prev_steal = paravirt_steal_clock(cpu_of(rq));
++ steal -= rq->prev_steal_time_rq;
++
++ if (unlikely(steal > delta))
++ steal = delta;
++
++ rq->prev_steal_time_rq = prev_steal;
++ delta -= steal;
++ }
++#endif
++
++ rq->clock_task += delta;
++
++#ifdef CONFIG_HAVE_SCHED_AVG_IRQ
++ if ((irq_delta + steal))
++ update_irq_load_avg(rq, irq_delta + steal);
++#endif
++}
++
++static inline void update_rq_clock(struct rq *rq)
++{
++ s64 delta = sched_clock_cpu(cpu_of(rq)) - rq->clock;
++
++ if (unlikely(delta <= 0))
++ return;
++ rq->clock += delta;
++ sched_update_rq_clock(rq);
++ update_rq_clock_task(rq, delta);
++}
++
++/*
++ * RQ Load update routine
++ */
++#define RQ_LOAD_HISTORY_BITS (sizeof(s32) * 8ULL)
++#define RQ_UTIL_SHIFT (8)
++#define RQ_LOAD_HISTORY_TO_UTIL(l) (((l) >> (RQ_LOAD_HISTORY_BITS - 1 - RQ_UTIL_SHIFT)) & 0xff)
++
++#define LOAD_BLOCK(t) ((t) >> 17)
++#define LOAD_HALF_BLOCK(t) ((t) >> 16)
++#define BLOCK_MASK(t) ((t) & ((0x01 << 18) - 1))
++#define LOAD_BLOCK_BIT(b) (1UL << (RQ_LOAD_HISTORY_BITS - 1 - (b)))
++#define CURRENT_LOAD_BIT LOAD_BLOCK_BIT(0)
++
++static inline void rq_load_update(struct rq *rq)
++{
++ u64 time = rq->clock;
++ u64 delta = min(LOAD_BLOCK(time) - LOAD_BLOCK(rq->load_stamp), RQ_LOAD_HISTORY_BITS - 1);
++ u64 prev = !!(rq->load_history & CURRENT_LOAD_BIT);
++ u64 curr = !!rq->nr_running;
++
++ if (delta) {
++ rq->load_history = rq->load_history >> delta;
++
++ if (delta < RQ_UTIL_SHIFT) {
++ rq->load_block += (~BLOCK_MASK(rq->load_stamp)) * prev;
++ if (!!LOAD_HALF_BLOCK(rq->load_block) ^ curr)
++ rq->load_history ^= LOAD_BLOCK_BIT(delta);
++ }
++
++ rq->load_block = BLOCK_MASK(time) * prev;
++ } else {
++ rq->load_block += (time - rq->load_stamp) * prev;
++ }
++ if (prev ^ curr)
++ rq->load_history ^= CURRENT_LOAD_BIT;
++ rq->load_stamp = time;
++}
++
++unsigned long rq_load_util(struct rq *rq, unsigned long max)
++{
++ return RQ_LOAD_HISTORY_TO_UTIL(rq->load_history) * (max >> RQ_UTIL_SHIFT);
++}
++
++unsigned long sched_cpu_util(int cpu)
++{
++ return rq_load_util(cpu_rq(cpu), arch_scale_cpu_capacity(cpu));
++}
++
++#ifdef CONFIG_CPU_FREQ
++/**
++ * cpufreq_update_util - Take a note about CPU utilization changes.
++ * @rq: Runqueue to carry out the update for.
++ * @flags: Update reason flags.
++ *
++ * This function is called by the scheduler on the CPU whose utilization is
++ * being updated.
++ *
++ * It can only be called from RCU-sched read-side critical sections.
++ *
++ * The way cpufreq is currently arranged requires it to evaluate the CPU
++ * performance state (frequency/voltage) on a regular basis to prevent it from
++ * being stuck in a completely inadequate performance level for too long.
++ * That is not guaranteed to happen if the updates are only triggered from CFS
++ * and DL, though, because they may not be coming in if only RT tasks are
++ * active all the time (or there are RT tasks only).
++ *
++ * As a workaround for that issue, this function is called periodically by the
++ * RT sched class to trigger extra cpufreq updates to prevent it from stalling,
++ * but that really is a band-aid. Going forward it should be replaced with
++ * solutions targeted more specifically at RT tasks.
++ */
++static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
++{
++ struct update_util_data *data;
++
++ rq_load_update(rq);
++ data = rcu_dereference_sched(*per_cpu_ptr(&cpufreq_update_util_data, cpu_of(rq)));
++ if (data)
++ data->func(data, rq_clock(rq), flags);
++}
++#else /* !CONFIG_CPU_FREQ: */
++static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
++{
++ rq_load_update(rq);
++}
++#endif /* !CONFIG_CPU_FREQ */
++
++#ifdef CONFIG_NO_HZ_FULL
++/*
++ * Tick may be needed by tasks in the runqueue depending on their policy and
++ * requirements. If tick is needed, lets send the target an IPI to kick it out
++ * of nohz mode if necessary.
++ */
++static inline void sched_update_tick_dependency(struct rq *rq)
++{
++ int cpu = cpu_of(rq);
++
++ if (!tick_nohz_full_cpu(cpu))
++ return;
++
++ if (rq->nr_running < 2)
++ tick_nohz_dep_clear_cpu(cpu, TICK_DEP_BIT_SCHED);
++ else
++ tick_nohz_dep_set_cpu(cpu, TICK_DEP_BIT_SCHED);
++}
++#else /* !CONFIG_NO_HZ_FULL: */
++static inline void sched_update_tick_dependency(struct rq *rq) { }
++#endif /* !CONFIG_NO_HZ_FULL */
++
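++/*
++ * add_nr_running()/sub_nr_running() also maintain sched_rq_pending_mask, the
++ * set of CPUs whose run queue holds more than one runnable task and therefore
++ * has work that could be pulled elsewhere.
++ */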
++static inline void add_nr_running(struct rq *rq, unsigned count)
++{
++ rq->nr_running += count;
++ if (rq->nr_running > 1) {
++ cpumask_set_cpu(cpu_of(rq), &sched_rq_pending_mask);
++ rq->prio_balance_time = rq->clock;
++ }
++
++ sched_update_tick_dependency(rq);
++}
++
++static inline void sub_nr_running(struct rq *rq, unsigned count)
++{
++ rq->nr_running -= count;
++ if (rq->nr_running < 2) {
++ cpumask_clear_cpu(cpu_of(rq), &sched_rq_pending_mask);
++ rq->prio_balance_time = 0;
++ }
++
++ sched_update_tick_dependency(rq);
++}
++
++bool sched_task_on_rq(struct task_struct *p)
++{
++ return task_on_rq_queued(p);
++}
++
++unsigned long get_wchan(struct task_struct *p)
++{
++ unsigned long ip = 0;
++ unsigned int state;
++
++ if (!p || p == current)
++ return 0;
++
++ /* Only get wchan if task is blocked and we can keep it that way. */
++ raw_spin_lock_irq(&p->pi_lock);
++ state = READ_ONCE(p->__state);
++ smp_rmb(); /* see try_to_wake_up() */
++ if (state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq)
++ ip = __get_wchan(p);
++ raw_spin_unlock_irq(&p->pi_lock);
++
++ return ip;
++}
++
++/*
++ * Add/Remove/Requeue task to/from the runqueue routines
++ * Context: rq->lock
++ */
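++/*
++ * The helpers below keep rq->queue.bitmap in sync with the per-priority
++ * lists: a priority's bit is cleared when its list becomes empty and set when
++ * a task becomes the first entry, so the best runnable priority can be found
++ * with a single find_first_bit().
++ */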
++#define __SCHED_DEQUEUE_TASK(p, rq, flags, func) \
++ sched_info_dequeue(rq, p); \
++ \
++ __list_del_entry(&p->sq_node); \
++ if (p->sq_node.prev == p->sq_node.next) { \
++ clear_bit(sched_idx2prio(p->sq_node.next - &rq->queue.heads[0], rq), \
++ rq->queue.bitmap); \
++ func; \
++ }
++
++#define __SCHED_ENQUEUE_TASK(p, rq, flags, func) \
++ sched_info_enqueue(rq, p); \
++ { \
++ int idx, prio; \
++ TASK_SCHED_PRIO_IDX(p, rq, idx, prio); \
++ list_add_tail(&p->sq_node, &rq->queue.heads[idx]); \
++ if (list_is_first(&p->sq_node, &rq->queue.heads[idx])) { \
++ set_bit(prio, rq->queue.bitmap); \
++ func; \
++ } \
++ }
++
++static inline void __dequeue_task(struct task_struct *p, struct rq *rq)
++{
++#ifdef ALT_SCHED_DEBUG
++ lockdep_assert_held(&rq->lock);
++
++ /*printk(KERN_INFO "sched: dequeue(%d) %px %016llx\n", cpu_of(rq), p, p->deadline);*/
++ WARN_ONCE(task_rq(p) != rq, "sched: dequeue task reside on cpu%d from cpu%d\n",
++ task_cpu(p), cpu_of(rq));
++#endif
++
++ __SCHED_DEQUEUE_TASK(p, rq, flags, update_sched_preempt_mask(rq));
++}
++
++static inline void dequeue_task(struct task_struct *p, struct rq *rq, int flags)
++{
++ __dequeue_task(p, rq);
++ sub_nr_running(rq, 1);
++}
++
++static inline void __enqueue_task(struct task_struct *p, struct rq *rq)
++{
++#ifdef ALT_SCHED_DEBUG
++ lockdep_assert_held(&rq->lock);
++
++ /*printk(KERN_INFO "sched: enqueue(%d) %px %d\n", cpu_of(rq), p, p->prio);*/
++ WARN_ONCE(task_rq(p) != rq, "sched: enqueue task reside on cpu%d to cpu%d\n",
++ task_cpu(p), cpu_of(rq));
++#endif
++
++ __SCHED_ENQUEUE_TASK(p, rq, flags, update_sched_preempt_mask(rq));
++}
++
++static inline void enqueue_task(struct task_struct *p, struct rq *rq, int flags)
++{
++ __enqueue_task(p, rq);
++ add_nr_running(rq, 1);
++}
++
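++/*
++ * requeue_task - move @p to the tail of its (possibly re-computed) priority
++ * list, updating the queue bitmap and preempt mask as needed.  Used e.g. by
++ * sched_yield() when sched_yield_type == 1 (requeue task).
++ */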
++void requeue_task(struct task_struct *p, struct rq *rq)
++{
++ struct list_head *node = &p->sq_node;
++ int deq_idx, idx, prio;
++
++ TASK_SCHED_PRIO_IDX(p, rq, idx, prio);
++#ifdef ALT_SCHED_DEBUG
++ lockdep_assert_held(&rq->lock);
++ /*printk(KERN_INFO "sched: requeue(%d) %px %016llx\n", cpu_of(rq), p, p->deadline);*/
++ WARN_ONCE(task_rq(p) != rq, "sched: cpu[%d] requeue task reside on cpu%d\n",
++ cpu_of(rq), task_cpu(p));
++#endif
++ if (list_is_last(node, &rq->queue.heads[idx]))
++ return;
++
++ __list_del_entry(node);
++ if (node->prev == node->next && (deq_idx = node->next - &rq->queue.heads[0]) != idx)
++ clear_bit(sched_idx2prio(deq_idx, rq), rq->queue.bitmap);
++
++ list_add_tail(node, &rq->queue.heads[idx]);
++ if (list_is_first(node, &rq->queue.heads[idx]))
++ set_bit(prio, rq->queue.bitmap);
++ update_sched_preempt_mask(rq);
++}
++
++/*
++ * try_cmpxchg based fetch_or() macro so it works for different integer types:
++ */
++#define fetch_or(ptr, mask) \
++ ({ \
++ typeof(ptr) _ptr = (ptr); \
++ typeof(mask) _mask = (mask); \
++ typeof(*_ptr) _val = *_ptr; \
++ \
++ do { \
++ } while (!try_cmpxchg(_ptr, &_val, _val | _mask)); \
++ _val; \
++})
++
++#ifdef TIF_POLLING_NRFLAG
++/*
++ * Atomically set TIF_NEED_RESCHED and test for TIF_POLLING_NRFLAG,
++ * this avoids any races wrt polling state changes and thereby avoids
++ * spurious IPIs.
++ */
++static inline bool set_nr_and_not_polling(struct thread_info *ti, int tif)
++{
++ return !(fetch_or(&ti->flags, 1 << tif) & _TIF_POLLING_NRFLAG);
++}
++
++/*
++ * Atomically set TIF_NEED_RESCHED if TIF_POLLING_NRFLAG is set.
++ *
++ * If this returns true, then the idle task promises to call
++ * sched_ttwu_pending() and reschedule soon.
++ */
++static bool set_nr_if_polling(struct task_struct *p)
++{
++ struct thread_info *ti = task_thread_info(p);
++ typeof(ti->flags) val = READ_ONCE(ti->flags);
++
++ do {
++ if (!(val & _TIF_POLLING_NRFLAG))
++ return false;
++ if (val & _TIF_NEED_RESCHED)
++ return true;
++ } while (!try_cmpxchg(&ti->flags, &val, val | _TIF_NEED_RESCHED));
++
++ return true;
++}
++
++#else /* !TIF_POLLING_NRFLAG: */
++static inline bool set_nr_and_not_polling(struct thread_info *ti, int tif)
++{
++ set_ti_thread_flag(ti, tif);
++ return true;
++}
++
++static inline bool set_nr_if_polling(struct task_struct *p)
++{
++ return false;
++}
++#endif /* !TIF_POLLING_NRFLAG */
++
++static bool __wake_q_add(struct wake_q_head *head, struct task_struct *task)
++{
++ struct wake_q_node *node = &task->wake_q;
++
++ /*
++ * Atomically grab the task, if ->wake_q is !nil already it means
++ * it's already queued (either by us or someone else) and will get the
++ * wakeup due to that.
++ *
++ * In order to ensure that a pending wakeup will observe our pending
++ * state, even in the failed case, an explicit smp_mb() must be used.
++ */
++ smp_mb__before_atomic();
++ if (unlikely(cmpxchg_relaxed(&node->next, NULL, WAKE_Q_TAIL)))
++ return false;
++
++ /*
++ * The head is context local, there can be no concurrency.
++ */
++ *head->lastp = node;
++ head->lastp = &node->next;
++ return true;
++}
++
++/**
++ * wake_q_add() - queue a wakeup for 'later' waking.
++ * @head: the wake_q_head to add @task to
++ * @task: the task to queue for 'later' wakeup
++ *
++ * Queue a task for later wakeup, most likely by the wake_up_q() call in the
++ * same context, _HOWEVER_ this is not guaranteed, the wakeup can come
++ * instantly.
++ *
++ * This function must be used as-if it were wake_up_process(); IOW the task
++ * must be ready to be woken at this location.
++ */
++void wake_q_add(struct wake_q_head *head, struct task_struct *task)
++{
++ if (__wake_q_add(head, task))
++ get_task_struct(task);
++}
++
++/**
++ * wake_q_add_safe() - safely queue a wakeup for 'later' waking.
++ * @head: the wake_q_head to add @task to
++ * @task: the task to queue for 'later' wakeup
++ *
++ * Queue a task for later wakeup, most likely by the wake_up_q() call in the
++ * same context, _HOWEVER_ this is not guaranteed, the wakeup can come
++ * instantly.
++ *
++ * This function must be used as-if it were wake_up_process(); IOW the task
++ * must be ready to be woken at this location.
++ *
++ * This function is essentially a task-safe equivalent to wake_q_add(). Callers
++ * that already hold reference to @task can call the 'safe' version and trust
++ * wake_q to do the right thing depending whether or not the @task is already
++ * queued for wakeup.
++ */
++void wake_q_add_safe(struct wake_q_head *head, struct task_struct *task)
++{
++ if (!__wake_q_add(head, task))
++ put_task_struct(task);
++}
++
++void wake_up_q(struct wake_q_head *head)
++{
++ struct wake_q_node *node = head->first;
++
++ while (node != WAKE_Q_TAIL) {
++ struct task_struct *task;
++
++ task = container_of(node, struct task_struct, wake_q);
++ node = node->next;
++ /* pairs with cmpxchg_relaxed() in __wake_q_add() */
++ WRITE_ONCE(task->wake_q.next, NULL);
++ /* Task can safely be re-inserted now. */
++
++ /*
++ * wake_up_process() executes a full barrier, which pairs with
++ * the queueing in wake_q_add() so as not to miss wakeups.
++ */
++ wake_up_process(task);
++ put_task_struct(task);
++ }
++}
++
++/*
++ * resched_curr - mark rq's current task 'to be rescheduled now'.
++ *
++ * On UP this means the setting of the need_resched flag, on SMP it
++ * might also involve a cross-CPU call to trigger the scheduler on
++ * the target CPU.
++ */
++static inline void __resched_curr(struct rq *rq, int tif)
++{
++ struct task_struct *curr = rq->curr;
++ struct thread_info *cti = task_thread_info(curr);
++ int cpu;
++
++ lockdep_assert_held(&rq->lock);
++
++ /*
++ * Always immediately preempt the idle task; no point in delaying doing
++ * actual work.
++ */
++ if (is_idle_task(curr) && tif == TIF_NEED_RESCHED_LAZY)
++ tif = TIF_NEED_RESCHED;
++
++ if (cti->flags & ((1 << tif) | _TIF_NEED_RESCHED))
++ return;
++
++ cpu = cpu_of(rq);
++
++ trace_sched_set_need_resched_tp(curr, cpu, tif);
++ if (cpu == smp_processor_id()) {
++ set_ti_thread_flag(cti, tif);
++ if (tif == TIF_NEED_RESCHED)
++ set_preempt_need_resched();
++ return;
++ }
++
++ if (set_nr_and_not_polling(cti, tif)) {
++ if (tif == TIF_NEED_RESCHED)
++ smp_send_reschedule(cpu);
++ } else {
++ trace_sched_wake_idle_without_ipi(cpu);
++ }
++}
++
++void __trace_set_need_resched(struct task_struct *curr, int tif)
++{
++ trace_sched_set_need_resched_tp(curr, smp_processor_id(), tif);
++}
++
++static inline void resched_curr(struct rq *rq)
++{
++ __resched_curr(rq, TIF_NEED_RESCHED);
++}
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++static DEFINE_STATIC_KEY_FALSE(sk_dynamic_preempt_lazy);
++static __always_inline bool dynamic_preempt_lazy(void)
++{
++ return static_branch_unlikely(&sk_dynamic_preempt_lazy);
++}
++#else /* !CONFIG_PREEMPT_DYNAMIC: */
++static __always_inline bool dynamic_preempt_lazy(void)
++{
++ return IS_ENABLED(CONFIG_PREEMPT_LAZY);
++}
++#endif /* !CONFIG_PREEMPT_DYNAMIC */
++
++static __always_inline int get_lazy_tif_bit(void)
++{
++ if (dynamic_preempt_lazy())
++ return TIF_NEED_RESCHED_LAZY;
++
++ return TIF_NEED_RESCHED;
++}
++
++static inline void resched_curr_lazy(struct rq *rq)
++{
++ __resched_curr(rq, get_lazy_tif_bit());
++}
++
++void resched_cpu(int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ if (cpu_online(cpu) || cpu == smp_processor_id())
++ resched_curr(cpu_rq(cpu));
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++}
++
++#ifdef CONFIG_NO_HZ_COMMON
++/*
++ * This routine will record that the CPU is going idle with tick stopped.
++ * This info will be used in performing idle load balancing in the future.
++ */
++void nohz_balance_enter_idle(int cpu) {}
++
++/*
++ * In the semi idle case, use the nearest busy CPU for migrating timers
++ * from an idle CPU. This is good for power-savings.
++ *
++ * We don't do similar optimization for completely idle system, as
++ * selecting an idle CPU will add more delays to the timers than intended
++ * (as that CPU's timer base may not be up to date wrt jiffies etc).
++ */
++int get_nohz_timer_target(void)
++{
++ int i, cpu = smp_processor_id(), default_cpu = -1;
++ struct cpumask *mask;
++ const struct cpumask *hk_mask;
++
++ if (housekeeping_cpu(cpu, HK_TYPE_KERNEL_NOISE)) {
++ if (!idle_cpu(cpu))
++ return cpu;
++ default_cpu = cpu;
++ }
++
++ hk_mask = housekeeping_cpumask(HK_TYPE_KERNEL_NOISE);
++
++ for (mask = per_cpu(sched_cpu_topo_masks, cpu);
++ mask < per_cpu(sched_cpu_topo_end_mask, cpu); mask++)
++ for_each_cpu_and(i, mask, hk_mask)
++ if (!idle_cpu(i))
++ return i;
++
++ if (default_cpu == -1)
++ default_cpu = housekeeping_any_cpu(HK_TYPE_KERNEL_NOISE);
++ cpu = default_cpu;
++
++ return cpu;
++}
++
++/*
++ * When add_timer_on() enqueues a timer into the timer wheel of an
++ * idle CPU then this timer might expire before the next timer event
++ * which is scheduled to wake up that CPU. In case of a completely
++ * idle system the next event might even be infinite time into the
++ * future. wake_up_idle_cpu() ensures that the CPU is woken up and
++ * leaves the inner idle loop so the newly added timer is taken into
++ * account when the CPU goes back to idle and evaluates the timer
++ * wheel for the next timer event.
++ */
++static inline void wake_up_idle_cpu(int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ if (cpu == smp_processor_id())
++ return;
++
++ /*
++ * Set TIF_NEED_RESCHED and send an IPI if in the non-polling
++ * part of the idle loop. This forces an exit from the idle loop
++ * and a round trip to schedule(). Now this could be optimized
++ * because a simple new idle loop iteration is enough to
++ * re-evaluate the next tick. Provided some re-ordering of tick
++ * nohz functions that would need to follow TIF_NR_POLLING
++ * clearing:
++ *
++ * - On most architectures, a simple fetch_or on ti::flags with a
++ * "0" value would be enough to know if an IPI needs to be sent.
++ *
++ * - x86 needs to perform a last need_resched() check between
++ * monitor and mwait which doesn't take timers into account.
++ * There a dedicated TIF_TIMER flag would be required to
++ * fetch_or here and be checked along with TIF_NEED_RESCHED
++ * before mwait().
++ *
++ * However, remote timer enqueue is not such a frequent event
++ * and testing of the above solutions didn't appear to report
++ * much benefits.
++ */
++ if (set_nr_and_not_polling(task_thread_info(rq->idle), TIF_NEED_RESCHED))
++ smp_send_reschedule(cpu);
++ else
++ trace_sched_wake_idle_without_ipi(cpu);
++}
++
++static inline bool wake_up_full_nohz_cpu(int cpu)
++{
++ /*
++ * We just need the target to call irq_exit() and re-evaluate
++ * the next tick. The nohz full kick at least implies that.
++ * If needed we can still optimize that later with an
++ * empty IRQ.
++ */
++ if (cpu_is_offline(cpu))
++ return true; /* Don't try to wake offline CPUs. */
++ if (tick_nohz_full_cpu(cpu)) {
++ if (cpu != smp_processor_id() ||
++ tick_nohz_tick_stopped())
++ tick_nohz_full_kick_cpu(cpu);
++ return true;
++ }
++
++ return false;
++}
++
++void wake_up_nohz_cpu(int cpu)
++{
++ if (!wake_up_full_nohz_cpu(cpu))
++ wake_up_idle_cpu(cpu);
++}
++
++static void nohz_csd_func(void *info)
++{
++ struct rq *rq = info;
++ int cpu = cpu_of(rq);
++ unsigned int flags;
++
++ /*
++ * Release the rq::nohz_csd.
++ */
++ flags = atomic_fetch_andnot(NOHZ_KICK_MASK, nohz_flags(cpu));
++ WARN_ON(!(flags & NOHZ_KICK_MASK));
++
++ rq->idle_balance = idle_cpu(cpu);
++ if (rq->idle_balance) {
++ rq->nohz_idle_balance = flags;
++ __raise_softirq_irqoff(SCHED_SOFTIRQ);
++ }
++}
++
++#endif /* CONFIG_NO_HZ_COMMON */
++
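++/* Reschedule the current task if it is no longer the best runnable task. */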
++static inline void wakeup_preempt(struct rq *rq)
++{
++ if (sched_rq_first_task(rq) != rq->curr)
++ resched_curr(rq);
++}
++
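++/*
++ * Match @p against @state: returns 1 if p->__state matches, -1 if only
++ * p->saved_state matches, and 0 otherwise.
++ */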
++static __always_inline
++int __task_state_match(struct task_struct *p, unsigned int state)
++{
++ if (READ_ONCE(p->__state) & state)
++ return 1;
++
++ if (READ_ONCE(p->saved_state) & state)
++ return -1;
++
++ return 0;
++}
++
++static __always_inline
++int task_state_match(struct task_struct *p, unsigned int state)
++{
++ /*
++ * Serialize against current_save_and_set_rtlock_wait_state(),
++ * current_restore_rtlock_saved_state(), and __refrigerator().
++ */
++ guard(raw_spinlock_irq)(&p->pi_lock);
++
++ return __task_state_match(p, state);
++}
++
++/*
++ * wait_task_inactive - wait for a thread to unschedule.
++ *
++ * Wait for the thread to block in any of the states set in @match_state.
++ * If it changes, i.e. @p might have woken up, then return zero. When we
++ * succeed in waiting for @p to be off its CPU, we return a positive number
++ * (its total switch count). If a second call a short while later returns the
++ * same number, the caller can be sure that @p has remained unscheduled the
++ * whole time.
++ *
++ * The caller must ensure that the task *will* unschedule sometime soon,
++ * else this function might spin for a *long* time. This function can't
++ * be called with interrupts off, or it may introduce deadlock with
++ * smp_call_function() if an IPI is sent by the same process we are
++ * waiting to become inactive.
++ */
++unsigned long wait_task_inactive(struct task_struct *p, unsigned int match_state)
++{
++ unsigned long flags;
++ int running, queued, match;
++ unsigned long ncsw;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++
++ for (;;) {
++ rq = task_rq(p);
++
++ /*
++ * If the task is actively running on another CPU
++ * still, just relax and busy-wait without holding
++ * any locks.
++ *
++ * NOTE! Since we don't hold any locks, it's not
++ * even sure that "rq" stays as the right runqueue!
++ * But we don't care, since this will return false
++ * if the runqueue has changed and p is actually now
++ * running somewhere else!
++ */
++ while (task_on_cpu(p)) {
++ if (!task_state_match(p, match_state))
++ return 0;
++ cpu_relax();
++ }
++
++ /*
++ * Ok, time to look more closely! We need the rq
++ * lock now, to be *sure*. If we're wrong, we'll
++ * just go back and repeat.
++ */
++ task_access_lock_irqsave(p, &lock, &flags);
++ trace_sched_wait_task(p);
++ running = task_on_cpu(p);
++ queued = p->on_rq;
++ ncsw = 0;
++ if ((match = __task_state_match(p, match_state))) {
++ /*
++ * When matching on p->saved_state, consider this task
++ * still queued so it will wait.
++ */
++ if (match < 0)
++ queued = 1;
++ ncsw = p->nvcsw | LONG_MIN; /* sets MSB */
++ }
++ task_access_unlock_irqrestore(p, lock, &flags);
++
++ /*
++ * If it changed from the expected state, bail out now.
++ */
++ if (unlikely(!ncsw))
++ break;
++
++ /*
++ * Was it really running after all now that we
++ * checked with the proper locks actually held?
++ *
++ * Oops. Go back and try again..
++ */
++ if (unlikely(running)) {
++ cpu_relax();
++ continue;
++ }
++
++ /*
++ * It's not enough that it's not actively running,
++ * it must be off the runqueue _entirely_, and not
++ * preempted!
++ *
++ * So if it was still runnable (but just not actively
++ * running right now), it's preempted, and we should
++ * yield - it could be a while.
++ */
++ if (unlikely(queued)) {
++ ktime_t to = NSEC_PER_SEC / HZ;
++
++ set_current_state(TASK_UNINTERRUPTIBLE);
++ schedule_hrtimeout(&to, HRTIMER_MODE_REL_HARD);
++ continue;
++ }
++
++ /*
++ * Ahh, all good. It wasn't running, and it wasn't
++ * runnable, which means that it will never become
++ * running in the future either. We're all done!
++ */
++ break;
++ }
++
++ return ncsw;
++}
++
++#ifdef CONFIG_SCHED_HRTICK
++/*
++ * Use HR-timers to deliver accurate preemption points.
++ */
++
++static void hrtick_clear(struct rq *rq)
++{
++ if (hrtimer_active(&rq->hrtick_timer))
++ hrtimer_cancel(&rq->hrtick_timer);
++}
++
++/*
++ * High-resolution timer tick.
++ * Runs from hardirq context with interrupts disabled.
++ */
++static enum hrtimer_restart hrtick(struct hrtimer *timer)
++{
++ struct rq *rq = container_of(timer, struct rq, hrtick_timer);
++
++ WARN_ON_ONCE(cpu_of(rq) != smp_processor_id());
++
++ raw_spin_lock(&rq->lock);
++ resched_curr(rq);
++ raw_spin_unlock(&rq->lock);
++
++ return HRTIMER_NORESTART;
++}
++
++/*
++ * Use hrtick when:
++ * - enabled by features
++ * - hrtimer is actually high res
++ */
++static inline int hrtick_enabled(struct rq *rq)
++{
++ /**
++ * Alt schedule FW doesn't support sched_feat yet
++ if (!sched_feat(HRTICK))
++ return 0;
++ */
++ if (!cpu_active(cpu_of(rq)))
++ return 0;
++ return hrtimer_is_hres_active(&rq->hrtick_timer);
++}
++
++static void __hrtick_restart(struct rq *rq)
++{
++ struct hrtimer *timer = &rq->hrtick_timer;
++ ktime_t time = rq->hrtick_time;
++
++ hrtimer_start(timer, time, HRTIMER_MODE_ABS_PINNED_HARD);
++}
++
++/*
++ * called from hardirq (IPI) context
++ */
++static void __hrtick_start(void *arg)
++{
++ struct rq *rq = arg;
++
++ raw_spin_lock(&rq->lock);
++ __hrtick_restart(rq);
++ raw_spin_unlock(&rq->lock);
++}
++
++/*
++ * Called to set the hrtick timer state.
++ *
++ * called with rq->lock held and IRQs disabled
++ */
++static inline void hrtick_start(struct rq *rq, u64 delay)
++{
++ struct hrtimer *timer = &rq->hrtick_timer;
++ s64 delta;
++
++ /*
++ * Don't schedule slices shorter than 10000ns, that just
++ * doesn't make sense and can cause timer DoS.
++ */
++ delta = max_t(s64, delay, 10000LL);
++
++ rq->hrtick_time = ktime_add_ns(timer->base->get_time(), delta);
++
++ if (rq == this_rq())
++ __hrtick_restart(rq);
++ else
++ smp_call_function_single_async(cpu_of(rq), &rq->hrtick_csd);
++}
++
++static void hrtick_rq_init(struct rq *rq)
++{
++ INIT_CSD(&rq->hrtick_csd, __hrtick_start, rq);
++ hrtimer_setup(&rq->hrtick_timer, hrtick, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
++}
++#else /* !CONFIG_SCHED_HRTICK: */
++static inline int hrtick_enabled(struct rq *rq)
++{
++ return 0;
++}
++
++static inline void hrtick_clear(struct rq *rq)
++{
++}
++
++static inline void hrtick_rq_init(struct rq *rq)
++{
++}
++#endif /* !CONFIG_SCHED_HRTICK */
++
++/*
++ * activate_task - move a task to the runqueue.
++ *
++ * Context: rq->lock
++ */
++static void activate_task(struct task_struct *p, struct rq *rq)
++{
++ enqueue_task(p, rq, ENQUEUE_WAKEUP);
++
++ WRITE_ONCE(p->on_rq, TASK_ON_RQ_QUEUED);
++ ASSERT_EXCLUSIVE_WRITER(p->on_rq);
++
++ /*
++ * If in_iowait is set, the code below may not trigger any cpufreq
++ * utilization updates, so do it here explicitly with the IOWAIT flag
++ * passed.
++ */
++ cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT * p->in_iowait);
++}
++
++static void block_task(struct rq *rq, struct task_struct *p)
++{
++ dequeue_task(p, rq, DEQUEUE_SLEEP);
++
++ if (p->sched_contributes_to_load)
++ rq->nr_uninterruptible++;
++
++ if (p->in_iowait) {
++ atomic_inc(&rq->nr_iowait);
++ delayacct_blkio_start();
++ }
++
++ ASSERT_EXCLUSIVE_WRITER(p->on_rq);
++
++ /*
++ * The moment this write goes through, ttwu() can swoop in and migrate
++ * this task, rendering our rq->__lock ineffective.
++ *
++ * __schedule() try_to_wake_up()
++ * LOCK rq->__lock LOCK p->pi_lock
++ * pick_next_task()
++ * pick_next_task_fair()
++ * pick_next_entity()
++ * dequeue_entities()
++ * __block_task()
++ * RELEASE p->on_rq = 0 if (p->on_rq && ...)
++ * break;
++ *
++ * ACQUIRE (after ctrl-dep)
++ *
++ * cpu = select_task_rq();
++ * set_task_cpu(p, cpu);
++ * ttwu_queue()
++ * ttwu_do_activate()
++ * LOCK rq->__lock
++ * activate_task()
++ * STORE p->on_rq = 1
++ * UNLOCK rq->__lock
++ *
++ * Callers must ensure to not reference @p after this -- we no longer
++ * own it.
++ */
++ smp_store_release(&p->on_rq, 0);
++}
++
++static inline void __set_task_cpu(struct task_struct *p, unsigned int cpu)
++{
++ /*
++ * After ->cpu is set up to a new value, task_access_lock(p, ...) can be
++ * successfully executed on another CPU. We must ensure that updates of
++ * per-task data have been completed by this moment.
++ */
++ smp_wmb();
++
++ WRITE_ONCE(task_thread_info(p)->cpu, cpu);
++}
++
++void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
++{
++ unsigned int state = READ_ONCE(p->__state);
++
++ /*
++ * We should never call set_task_cpu() on a blocked task,
++ * ttwu() will sort out the placement.
++ */
++ WARN_ON_ONCE(state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq);
++
++#ifdef CONFIG_LOCKDEP
++ /*
++ * The caller should hold either p->pi_lock or rq->lock, when changing
++ * a task's CPU. ->pi_lock for waking tasks, rq->lock for runnable tasks.
++ *
++ * sched_move_task() holds both and thus holding either pins the cgroup,
++ * see task_group().
++ */
++ WARN_ON_ONCE(debug_locks && !(lockdep_is_held(&p->pi_lock) ||
++ lockdep_is_held(&task_rq(p)->lock)));
++#endif
++ /*
++ * Clearly, migrating tasks to offline CPUs is a fairly daft thing.
++ */
++ WARN_ON_ONCE(!cpu_online(new_cpu));
++
++ WARN_ON_ONCE(is_migration_disabled(p));
++ trace_sched_migrate_task(p, new_cpu);
++
++	if (task_cpu(p) != new_cpu) {
++ rseq_migrate(p);
++ sched_mm_cid_migrate_from(p);
++ perf_event_task_migrate(p);
++ }
++
++ __set_task_cpu(p, new_cpu);
++}
++
++static void
++__do_set_cpus_ptr(struct task_struct *p, const struct cpumask *new_mask)
++{
++ /*
++ * This here violates the locking rules for affinity, since we're only
++ * supposed to change these variables while holding both rq->lock and
++ * p->pi_lock.
++ *
++ * HOWEVER, it magically works, because ttwu() is the only code that
++ * accesses these variables under p->pi_lock and only does so after
++ * smp_cond_load_acquire(&p->on_cpu, !VAL), and we're in __schedule()
++ * before finish_task().
++ *
++ * XXX do further audits, this smells like something putrid.
++ */
++ WARN_ON_ONCE(!p->on_cpu);
++ p->cpus_ptr = new_mask;
++}
++
++void migrate_disable(void)
++{
++ struct task_struct *p = current;
++ int cpu;
++
++ if (p->migration_disabled) {
++#ifdef CONFIG_DEBUG_PREEMPT
++ /*
++ * Warn about overflow half-way through the range.
++ */
++ WARN_ON_ONCE((s16)p->migration_disabled < 0);
++#endif
++ p->migration_disabled++;
++ return;
++ }
++
++ guard(preempt)();
++ cpu = smp_processor_id();
++ if (cpumask_test_cpu(cpu, &p->cpus_mask)) {
++ cpu_rq(cpu)->nr_pinned++;
++ p->migration_disabled = 1;
++ /*
++ * Violates locking rules! see comment in __do_set_cpus_ptr().
++ */
++ if (p->cpus_ptr == &p->cpus_mask)
++ __do_set_cpus_ptr(p, cpumask_of(cpu));
++ }
++}
++EXPORT_SYMBOL_GPL(migrate_disable);
++
++void migrate_enable(void)
++{
++ struct task_struct *p = current;
++
++#ifdef CONFIG_DEBUG_PREEMPT
++ /*
++ * Check both overflow from migrate_disable() and superfluous
++ * migrate_enable().
++ */
++ if (WARN_ON_ONCE((s16)p->migration_disabled <= 0))
++ return;
++#endif
++
++ if (p->migration_disabled > 1) {
++ p->migration_disabled--;
++ return;
++ }
++
++ /*
++ * Ensure stop_task runs either before or after this, and that
++ * __set_cpus_allowed_ptr(SCA_MIGRATE_ENABLE) doesn't schedule().
++ */
++ guard(preempt)();
++ /*
++ * Assumption: current should be running on allowed cpu
++ */
++ WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &p->cpus_mask));
++ if (p->cpus_ptr != &p->cpus_mask)
++ __do_set_cpus_ptr(p, &p->cpus_mask);
++ /*
++ * Mustn't clear migration_disabled() until cpus_ptr points back at the
++ * regular cpus_mask, otherwise things that race (eg.
++ * select_fallback_rq) get confused.
++ */
++ barrier();
++ p->migration_disabled = 0;
++ this_rq()->nr_pinned--;
++}
++EXPORT_SYMBOL_GPL(migrate_enable);
++
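++/*
++ * Illustrative usage sketch (hypothetical caller) of the
++ * migrate_disable()/migrate_enable() pair above. Within the section the
++ * task cannot migrate off its current CPU, but it remains fully
++ * preemptible:
++ *
++ *	migrate_disable();
++ *	cpu = smp_processor_id();	(stays valid until migrate_enable())
++ *	... touch per-CPU data of this CPU ...
++ *	migrate_enable();
++ */
++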
++static void __migrate_force_enable(struct task_struct *p, struct rq *rq)
++{
++ if (likely(p->cpus_ptr != &p->cpus_mask))
++ __do_set_cpus_ptr(p, &p->cpus_mask);
++ p->migration_disabled = 0;
++ /* When p is migrate_disabled, rq->lock should be held */
++ rq->nr_pinned--;
++}
++
++static inline bool rq_has_pinned_tasks(struct rq *rq)
++{
++ return rq->nr_pinned;
++}
++
++/*
++ * Per-CPU kthreads are allowed to run on !active && online CPUs, see
++ * __set_cpus_allowed_ptr() and select_fallback_rq().
++ */
++static inline bool is_cpu_allowed(struct task_struct *p, int cpu)
++{
++ /* When not in the task's cpumask, no point in looking further. */
++ if (!cpumask_test_cpu(cpu, p->cpus_ptr))
++ return false;
++
++ /* migrate_disabled() must be allowed to finish. */
++ if (is_migration_disabled(p))
++ return cpu_online(cpu);
++
++ /* Non kernel threads are not allowed during either online or offline. */
++ if (!(p->flags & PF_KTHREAD))
++ return cpu_active(cpu) && task_cpu_possible(cpu, p);
++
++ /* KTHREAD_IS_PER_CPU is always allowed. */
++ if (kthread_is_per_cpu(p))
++ return cpu_online(cpu);
++
++ /* Regular kernel threads don't get to stay during offline. */
++ if (cpu_dying(cpu))
++ return false;
++
++ /* But are allowed during online. */
++ return cpu_online(cpu);
++}
++
++/*
++ * This is how migration works:
++ *
++ * 1) we invoke migration_cpu_stop() on the target CPU using
++ * stop_one_cpu().
++ * 2) stopper starts to run (implicitly forcing the migrated thread
++ * off the CPU)
++ * 3) it checks whether the migrated task is still in the wrong runqueue.
++ * 4) if it's in the wrong runqueue then the migration thread removes
++ * it and puts it into the right queue.
++ * 5) stopper completes and stop_one_cpu() returns and the migration
++ * is done.
++ */
++
++/*
++ * move_queued_task - move a queued task to new rq.
++ *
++ * Returns (locked) new rq. Old rq's lock is released.
++ */
++struct rq *move_queued_task(struct rq *rq, struct task_struct *p, int new_cpu)
++{
++ lockdep_assert_held(&rq->lock);
++
++ WRITE_ONCE(p->on_rq, TASK_ON_RQ_MIGRATING);
++ dequeue_task(p, rq, 0);
++ set_task_cpu(p, new_cpu);
++ raw_spin_unlock(&rq->lock);
++
++ rq = cpu_rq(new_cpu);
++
++ raw_spin_lock(&rq->lock);
++ WARN_ON_ONCE(task_cpu(p) != new_cpu);
++
++ sched_mm_cid_migrate_to(rq, p);
++
++ sched_task_sanity_check(p, rq);
++ enqueue_task(p, rq, 0);
++ WRITE_ONCE(p->on_rq, TASK_ON_RQ_QUEUED);
++ wakeup_preempt(rq);
++
++ return rq;
++}
++
++struct migration_arg {
++ struct task_struct *task;
++ int dest_cpu;
++};
++
++/*
++ * Move (not current) task off this CPU, onto the destination CPU. We're doing
++ * this because either it can't run here any more (set_cpus_allowed()
++ * away from this CPU, or CPU going down), or because we're
++ * attempting to rebalance this task on exec (sched_exec).
++ *
++ * So we race with normal scheduler movements, but that's OK, as long
++ * as the task is no longer on this CPU.
++ */
++static struct rq *__migrate_task(struct rq *rq, struct task_struct *p, int dest_cpu)
++{
++ /* Affinity changed (again). */
++ if (!is_cpu_allowed(p, dest_cpu))
++ return rq;
++
++ return move_queued_task(rq, p, dest_cpu);
++}
++
++/*
++ * migration_cpu_stop - this will be executed by a high-prio stopper thread
++ * and performs thread migration by bumping thread off CPU then
++ * 'pushing' onto another runqueue.
++ */
++static int migration_cpu_stop(void *data)
++{
++ struct migration_arg *arg = data;
++ struct task_struct *p = arg->task;
++ struct rq *rq = this_rq();
++ unsigned long flags;
++
++ /*
++ * The original target CPU might have gone down and we might
++ * be on another CPU but it doesn't matter.
++ */
++ local_irq_save(flags);
++ /*
++ * We need to explicitly wake pending tasks before running
++ * __migrate_task() such that we will not miss enforcing cpus_ptr
++ * during wakeups, see set_cpus_allowed_ptr()'s TASK_WAKING test.
++ */
++ flush_smp_call_function_queue();
++
++ raw_spin_lock(&p->pi_lock);
++ raw_spin_lock(&rq->lock);
++ /*
++ * If task_rq(p) != rq, it cannot be migrated here, because we're
++ * holding rq->lock, if p->on_rq == 0 it cannot get enqueued because
++ * we're holding p->pi_lock.
++ */
++ if (task_rq(p) == rq && task_on_rq_queued(p)) {
++ update_rq_clock(rq);
++ rq = __migrate_task(rq, p, arg->dest_cpu);
++ }
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++
++ return 0;
++}
++
++static inline void
++set_cpus_allowed_common(struct task_struct *p, struct affinity_context *ctx)
++{
++ cpumask_copy(&p->cpus_mask, ctx->new_mask);
++ p->nr_cpus_allowed = cpumask_weight(ctx->new_mask);
++
++ /*
++ * Swap in a new user_cpus_ptr if SCA_USER flag set
++ */
++ if (ctx->flags & SCA_USER)
++ swap(p->user_cpus_ptr, ctx->user_mask);
++}
++
++static void
++__do_set_cpus_allowed(struct task_struct *p, struct affinity_context *ctx)
++{
++ lockdep_assert_held(&p->pi_lock);
++ set_cpus_allowed_common(p, ctx);
++ mm_set_cpus_allowed(p->mm, ctx->new_mask);
++}
++
++/*
++ * Used for kthread_bind() and select_fallback_rq(), in both cases the user
++ * affinity (if any) should be destroyed too.
++ */
++void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
++{
++ struct affinity_context ac = {
++ .new_mask = new_mask,
++ .user_mask = NULL,
++ .flags = SCA_USER, /* clear the user requested mask */
++ };
++ union cpumask_rcuhead {
++ cpumask_t cpumask;
++ struct rcu_head rcu;
++ };
++
++ __do_set_cpus_allowed(p, &ac);
++
++ if (is_migration_disabled(p) && !cpumask_test_cpu(task_cpu(p), &p->cpus_mask))
++ __migrate_force_enable(p, task_rq(p));
++
++ /*
++ * Because this is called with p->pi_lock held, it is not possible
++ * to use kfree() here (when PREEMPT_RT=y), therefore punt to using
++ * kfree_rcu().
++ */
++ kfree_rcu((union cpumask_rcuhead *)ac.user_mask, rcu);
++}
++
++int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
++ int node)
++{
++ cpumask_t *user_mask;
++ unsigned long flags;
++
++ /*
++ * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
++ * may differ by now due to racing.
++ */
++ dst->user_cpus_ptr = NULL;
++
++ /*
++ * This check is racy and losing the race is a valid situation.
++ * It is not worth the extra overhead of taking the pi_lock on
++ * every fork/clone.
++ */
++ if (data_race(!src->user_cpus_ptr))
++ return 0;
++
++ user_mask = alloc_user_cpus_ptr(node);
++ if (!user_mask)
++ return -ENOMEM;
++
++ /*
++ * Use pi_lock to protect content of user_cpus_ptr
++ *
++ * Though unlikely, user_cpus_ptr can be reset to NULL by a concurrent
++ * do_set_cpus_allowed().
++ */
++ raw_spin_lock_irqsave(&src->pi_lock, flags);
++ if (src->user_cpus_ptr) {
++ swap(dst->user_cpus_ptr, user_mask);
++ cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
++ }
++ raw_spin_unlock_irqrestore(&src->pi_lock, flags);
++
++ if (unlikely(user_mask))
++ kfree(user_mask);
++
++ return 0;
++}
++
++static inline struct cpumask *clear_user_cpus_ptr(struct task_struct *p)
++{
++ struct cpumask *user_mask = NULL;
++
++ swap(p->user_cpus_ptr, user_mask);
++
++ return user_mask;
++}
++
++void release_user_cpus_ptr(struct task_struct *p)
++{
++ kfree(clear_user_cpus_ptr(p));
++}
++
++/**
++ * task_curr - is this task currently executing on a CPU?
++ * @p: the task in question.
++ *
++ * Return: 1 if the task is currently executing. 0 otherwise.
++ */
++inline int task_curr(const struct task_struct *p)
++{
++ return cpu_curr(task_cpu(p)) == p;
++}
++
++/***
++ * kick_process - kick a running thread to enter/exit the kernel
++ * @p: the to-be-kicked thread
++ *
++ * Cause a process which is running on another CPU to enter
++ * kernel-mode, without any delay. (to get signals handled.)
++ *
++ * NOTE: this function doesn't have to take the runqueue lock,
++ * because all it wants to ensure is that the remote task enters
++ * the kernel. If the IPI races and the task has been migrated
++ * to another CPU then no harm is done and the purpose has been
++ * achieved as well.
++ */
++void kick_process(struct task_struct *p)
++{
++ guard(preempt)();
++ int cpu = task_cpu(p);
++
++ if ((cpu != smp_processor_id()) && task_curr(p))
++ smp_send_reschedule(cpu);
++}
++EXPORT_SYMBOL_GPL(kick_process);
++
++/*
++ * ->cpus_ptr is protected by both rq->lock and p->pi_lock
++ *
++ * A few notes on cpu_active vs cpu_online:
++ *
++ * - cpu_active must be a subset of cpu_online
++ *
++ * - on CPU-up we allow per-CPU kthreads on the online && !active CPU,
++ * see __set_cpus_allowed_ptr(). At this point the newly online
++ * CPU isn't yet part of the sched domains, and balancing will not
++ * see it.
++ *
++ * - on cpu-down we clear cpu_active() to mask the sched domains and
++ * avoid the load balancer to place new tasks on the to be removed
++ * CPU. Existing tasks will remain running there and will be taken
++ * off.
++ *
++ * This means that fallback selection must not select !active CPUs.
++ * And can assume that any active CPU must be online. Conversely
++ * select_task_rq() below may allow selection of !active CPUs in order
++ * to satisfy the above rules.
++ */
++static int select_fallback_rq(int cpu, struct task_struct *p)
++{
++ int nid = cpu_to_node(cpu);
++ const struct cpumask *nodemask = NULL;
++ enum { cpuset, possible, fail } state = cpuset;
++ int dest_cpu;
++
++ /*
++ * If the node that the CPU is on has been offlined, cpu_to_node()
++ * will return -1. There is no CPU on the node, and we should
++ * select a CPU on another node.
++ */
++ if (nid != -1) {
++ nodemask = cpumask_of_node(nid);
++
++ /* Look for allowed, online CPU in same node. */
++ for_each_cpu(dest_cpu, nodemask) {
++ if (is_cpu_allowed(p, dest_cpu))
++ return dest_cpu;
++ }
++ }
++
++ for (;;) {
++ /* Any allowed, online CPU? */
++ for_each_cpu(dest_cpu, p->cpus_ptr) {
++ if (!is_cpu_allowed(p, dest_cpu))
++ continue;
++ goto out;
++ }
++
++ /* No more Mr. Nice Guy. */
++ switch (state) {
++ case cpuset:
++ if (cpuset_cpus_allowed_fallback(p)) {
++ state = possible;
++ break;
++ }
++ fallthrough;
++ case possible:
++ /*
++ * XXX When called from select_task_rq() we only
++ * hold p->pi_lock and again violate locking order.
++ *
++ * More yuck to audit.
++ */
++ do_set_cpus_allowed(p, task_cpu_fallback_mask(p));
++ state = fail;
++ break;
++
++ case fail:
++ BUG();
++ break;
++ }
++ }
++
++out:
++ if (state != cpuset) {
++ /*
++ * Don't tell them about moving exiting tasks or
++ * kernel threads (both mm NULL), since they never
++ * leave kernel.
++ */
++ if (p->mm && printk_ratelimit()) {
++ printk_deferred("process %d (%s) no longer affine to cpu%d\n",
++ task_pid_nr(p), p->comm, cpu);
++ }
++ }
++
++ return dest_cpu;
++}
++
++static inline void
++sched_preempt_mask_flush(cpumask_t *mask, int prio, int ref)
++{
++ int cpu;
++
++ cpumask_copy(mask, sched_preempt_mask + ref);
++ if (prio < ref) {
++ for_each_clear_bit(cpu, cpumask_bits(mask), nr_cpumask_bits) {
++ if (prio < cpu_rq(cpu)->prio)
++ cpumask_set_cpu(cpu, mask);
++ }
++ } else {
++ for_each_cpu_andnot(cpu, mask, sched_idle_mask) {
++ if (prio >= cpu_rq(cpu)->prio)
++ cpumask_clear_cpu(cpu, mask);
++ }
++ }
++}
++
++static inline int
++preempt_mask_check(cpumask_t *preempt_mask, const cpumask_t *allow_mask, int prio)
++{
++ cpumask_t *mask = sched_preempt_mask + prio;
++ int pr = atomic_read(&sched_prio_record);
++
++ if (pr != prio && SCHED_QUEUE_BITS - 1 != prio) {
++ sched_preempt_mask_flush(mask, prio, pr);
++ atomic_set(&sched_prio_record, prio);
++ }
++
++ return cpumask_and(preempt_mask, allow_mask, mask);
++}
++
++__read_mostly idle_select_func_t idle_select_func ____cacheline_aligned_in_smp = cpumask_and;
++
++static inline int select_task_rq(struct task_struct *p)
++{
++ cpumask_t allow_mask, mask;
++
++ if (unlikely(!cpumask_and(&allow_mask, p->cpus_ptr, cpu_active_mask)))
++ return select_fallback_rq(task_cpu(p), p);
++
++ if (idle_select_func(&mask, &allow_mask, sched_idle_mask) ||
++ preempt_mask_check(&mask, &allow_mask, task_sched_prio(p)))
++ return best_mask_cpu(task_cpu(p), &mask);
++
++ return best_mask_cpu(task_cpu(p), &allow_mask);
++}
++
++void sched_set_stop_task(int cpu, struct task_struct *stop)
++{
++ static struct lock_class_key stop_pi_lock;
++ struct sched_param stop_param = { .sched_priority = STOP_PRIO };
++ struct sched_param start_param = { .sched_priority = 0 };
++ struct task_struct *old_stop = cpu_rq(cpu)->stop;
++
++ if (stop) {
++ /*
++ * Make it appear like a SCHED_FIFO task, it's something
++ * userspace knows about and won't get confused about.
++ *
++ * Also, it will make PI more or less work without too
++ * much confusion -- but then, stop work should not
++ * rely on PI working anyway.
++ */
++ sched_setscheduler_nocheck(stop, SCHED_FIFO, &stop_param);
++
++ /*
++ * The PI code calls rt_mutex_setprio() with ->pi_lock held to
++ * adjust the effective priority of a task. As a result,
++ * rt_mutex_setprio() can trigger (RT) balancing operations,
++ * which can then trigger wakeups of the stop thread to push
++ * around the current task.
++ *
++ * The stop task itself will never be part of the PI-chain, it
++ * never blocks, therefore that ->pi_lock recursion is safe.
++ * Tell lockdep about this by placing the stop->pi_lock in its
++ * own class.
++ */
++ lockdep_set_class(&stop->pi_lock, &stop_pi_lock);
++ }
++
++ cpu_rq(cpu)->stop = stop;
++
++ if (old_stop) {
++ /*
++ * Reset it back to a normal scheduling policy so that
++ * it can die in pieces.
++ */
++ sched_setscheduler_nocheck(old_stop, SCHED_NORMAL, &start_param);
++ }
++}
++
++static int affine_move_task(struct rq *rq, struct task_struct *p, int dest_cpu,
++ raw_spinlock_t *lock, unsigned long irq_flags)
++ __releases(rq->lock)
++ __releases(p->pi_lock)
++{
++ /* Can the task run on the task's current CPU? If so, we're done */
++ if (!cpumask_test_cpu(task_cpu(p), &p->cpus_mask)) {
++ if (is_migration_disabled(p))
++ __migrate_force_enable(p, rq);
++
++ if (task_on_cpu(p) || READ_ONCE(p->__state) == TASK_WAKING) {
++ struct migration_arg arg = { p, dest_cpu };
++
++ /* Need help from migration thread: drop lock and wait. */
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
++ stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg);
++ return 0;
++ }
++ if (task_on_rq_queued(p)) {
++ /*
++ * OK, since we're going to drop the lock immediately
++ * afterwards anyway.
++ */
++ update_rq_clock(rq);
++ rq = move_queued_task(rq, p, dest_cpu);
++ lock = &rq->lock;
++ }
++ }
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
++ return 0;
++}
++
++static int __set_cpus_allowed_ptr_locked(struct task_struct *p,
++ struct affinity_context *ctx,
++ struct rq *rq,
++ raw_spinlock_t *lock,
++ unsigned long irq_flags)
++{
++ const struct cpumask *cpu_allowed_mask = task_cpu_possible_mask(p);
++ const struct cpumask *cpu_valid_mask = cpu_active_mask;
++ bool kthread = p->flags & PF_KTHREAD;
++ int dest_cpu;
++ int ret = 0;
++
++ if (kthread || is_migration_disabled(p)) {
++ /*
++ * Kernel threads are allowed on online && !active CPUs,
++ * however, during cpu-hot-unplug, even these might get pushed
++ * away if not KTHREAD_IS_PER_CPU.
++ *
++ * Specifically, migration_disabled() tasks must not fail the
++ * cpumask_any_and_distribute() pick below, esp. so on
++ * SCA_MIGRATE_ENABLE, otherwise we'll not call
++ * set_cpus_allowed_common() and actually reset p->cpus_ptr.
++ */
++ cpu_valid_mask = cpu_online_mask;
++ }
++
++ if (!kthread && !cpumask_subset(ctx->new_mask, cpu_allowed_mask)) {
++ ret = -EINVAL;
++ goto out;
++ }
++
++ /*
++ * Must re-check here, to close a race against __kthread_bind(),
++ * sched_setaffinity() is not guaranteed to observe the flag.
++ */
++ if ((ctx->flags & SCA_CHECK) && (p->flags & PF_NO_SETAFFINITY)) {
++ ret = -EINVAL;
++ goto out;
++ }
++
++ if (cpumask_equal(&p->cpus_mask, ctx->new_mask))
++ goto out;
++
++ dest_cpu = cpumask_any_and(cpu_valid_mask, ctx->new_mask);
++ if (dest_cpu >= nr_cpu_ids) {
++ ret = -EINVAL;
++ goto out;
++ }
++
++ __do_set_cpus_allowed(p, ctx);
++
++ return affine_move_task(rq, p, dest_cpu, lock, irq_flags);
++
++out:
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
++
++ return ret;
++}
++
++/*
++ * Change a given task's CPU affinity. Migrate the thread to a
++ * proper CPU and schedule it away if the CPU it's executing on
++ * is removed from the allowed bitmask.
++ *
++ * NOTE: the caller must have a valid reference to the task, the
++ * task must not exit() & deallocate itself prematurely. The
++ * call is not atomic; no spinlocks may be held.
++ */
++int __set_cpus_allowed_ptr(struct task_struct *p,
++ struct affinity_context *ctx)
++{
++ unsigned long irq_flags;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++
++ raw_spin_lock_irqsave(&p->pi_lock, irq_flags);
++ rq = __task_access_lock(p, &lock);
++ /*
++ * Masking should be skipped if SCA_USER or any of the SCA_MIGRATE_*
++ * flags are set.
++ */
++ if (p->user_cpus_ptr &&
++ !(ctx->flags & SCA_USER) &&
++ cpumask_and(rq->scratch_mask, ctx->new_mask, p->user_cpus_ptr))
++ ctx->new_mask = rq->scratch_mask;
++
++ return __set_cpus_allowed_ptr_locked(p, ctx, rq, lock, irq_flags);
++}
++
++int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask)
++{
++ struct affinity_context ac = {
++ .new_mask = new_mask,
++ .flags = 0,
++ };
++
++ return __set_cpus_allowed_ptr(p, &ac);
++}
++EXPORT_SYMBOL_GPL(set_cpus_allowed_ptr);
++
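++/*
++ * Illustrative usage sketch (hypothetical caller; target_cpu is an assumed
++ * variable): pin a task the caller owns to one CPU, later restore it to
++ * all possible CPUs. A return of -EINVAL means the requested mask had no
++ * usable CPU:
++ *
++ *	ret = set_cpus_allowed_ptr(p, cpumask_of(target_cpu));
++ *	...
++ *	ret = set_cpus_allowed_ptr(p, cpu_possible_mask);
++ */
++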
++/*
++ * Change a given task's CPU affinity to the intersection of its current
++ * affinity mask and @subset_mask, writing the resulting mask to @new_mask.
++ * If user_cpus_ptr is defined, use it as the basis for restricting CPU
++ * affinity or use cpu_online_mask instead.
++ *
++ * If the resulting mask is empty, leave the affinity unchanged and return
++ * -EINVAL.
++ */
++static int restrict_cpus_allowed_ptr(struct task_struct *p,
++ struct cpumask *new_mask,
++ const struct cpumask *subset_mask)
++{
++ struct affinity_context ac = {
++ .new_mask = new_mask,
++ .flags = 0,
++ };
++ unsigned long irq_flags;
++ raw_spinlock_t *lock;
++ struct rq *rq;
++ int err;
++
++ raw_spin_lock_irqsave(&p->pi_lock, irq_flags);
++ rq = __task_access_lock(p, &lock);
++
++ if (!cpumask_and(new_mask, task_user_cpus(p), subset_mask)) {
++ err = -EINVAL;
++ goto err_unlock;
++ }
++
++ return __set_cpus_allowed_ptr_locked(p, &ac, rq, lock, irq_flags);
++
++err_unlock:
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
++ return err;
++}
++
++/*
++ * Restrict the CPU affinity of task @p so that it is a subset of
++ * task_cpu_possible_mask() and point @p->user_cpus_ptr to a copy of the
++ * old affinity mask. If the resulting mask is empty, we warn and walk
++ * up the cpuset hierarchy until we find a suitable mask.
++ */
++void force_compatible_cpus_allowed_ptr(struct task_struct *p)
++{
++ cpumask_var_t new_mask;
++ const struct cpumask *override_mask = task_cpu_possible_mask(p);
++
++ alloc_cpumask_var(&new_mask, GFP_KERNEL);
++
++ /*
++ * __migrate_task() can fail silently in the face of concurrent
++ * offlining of the chosen destination CPU, so take the hotplug
++ * lock to ensure that the migration succeeds.
++ */
++ cpus_read_lock();
++ if (!cpumask_available(new_mask))
++ goto out_set_mask;
++
++ if (!restrict_cpus_allowed_ptr(p, new_mask, override_mask))
++ goto out_free_mask;
++
++ /*
++ * We failed to find a valid subset of the affinity mask for the
++ * task, so override it based on its cpuset hierarchy.
++ */
++ cpuset_cpus_allowed(p, new_mask);
++ override_mask = new_mask;
++
++out_set_mask:
++ if (printk_ratelimit()) {
++ printk_deferred("Overriding affinity for process %d (%s) to CPUs %*pbl\n",
++ task_pid_nr(p), p->comm,
++ cpumask_pr_args(override_mask));
++ }
++
++ WARN_ON(set_cpus_allowed_ptr(p, override_mask));
++out_free_mask:
++ cpus_read_unlock();
++ free_cpumask_var(new_mask);
++}
++
++/*
++ * Restore the affinity of a task @p which was previously restricted by a
++ * call to force_compatible_cpus_allowed_ptr().
++ *
++ * It is the caller's responsibility to serialise this with any calls to
++ * force_compatible_cpus_allowed_ptr(@p).
++ */
++void relax_compatible_cpus_allowed_ptr(struct task_struct *p)
++{
++ struct affinity_context ac = {
++ .new_mask = task_user_cpus(p),
++ .flags = 0,
++ };
++ int ret;
++
++ /*
++ * Try to restore the old affinity mask with __sched_setaffinity().
++ * Cpuset masking will be done there too.
++ */
++ ret = __sched_setaffinity(p, &ac);
++ WARN_ON_ONCE(ret);
++}
++
++static void
++ttwu_stat(struct task_struct *p, int cpu, int wake_flags)
++{
++ struct rq *rq;
++
++ if (!schedstat_enabled())
++ return;
++
++ rq = this_rq();
++
++ if (cpu == rq->cpu) {
++ __schedstat_inc(rq->ttwu_local);
++ __schedstat_inc(p->stats.nr_wakeups_local);
++ } else {
++ /* Alt schedule FW ToDo: how to do ttwu_wake_remote */
++ }
++
++ __schedstat_inc(rq->ttwu_count);
++ __schedstat_inc(p->stats.nr_wakeups);
++}
++
++/*
++ * Mark the task runnable.
++ */
++static inline void ttwu_do_wakeup(struct task_struct *p)
++{
++ WRITE_ONCE(p->__state, TASK_RUNNING);
++ trace_sched_wakeup(p);
++}
++
++static inline void
++ttwu_do_activate(struct rq *rq, struct task_struct *p, int wake_flags)
++{
++ if (p->sched_contributes_to_load)
++ rq->nr_uninterruptible--;
++
++ if (!(wake_flags & WF_MIGRATED) && p->in_iowait) {
++ delayacct_blkio_end(p);
++ atomic_dec(&task_rq(p)->nr_iowait);
++ }
++
++ activate_task(p, rq);
++ wakeup_preempt(rq);
++
++ ttwu_do_wakeup(p);
++}
++
++/*
++ * Consider @p being inside a wait loop:
++ *
++ * for (;;) {
++ * set_current_state(TASK_UNINTERRUPTIBLE);
++ *
++ * if (CONDITION)
++ * break;
++ *
++ * schedule();
++ * }
++ * __set_current_state(TASK_RUNNING);
++ *
++ * between set_current_state() and schedule(). In this case @p is still
++ * runnable, so all that needs doing is change p->state back to TASK_RUNNING in
++ * an atomic manner.
++ *
++ * By taking task_rq(p)->lock we serialize against schedule(), if @p->on_rq
++ * then schedule() must still happen and p->state can be changed to
++ * TASK_RUNNING. Otherwise we lost the race, schedule() has happened, and we
++ * need to do a full wakeup with enqueue.
++ *
++ * Returns: %true when the wakeup is done,
++ * %false otherwise.
++ */
++static int ttwu_runnable(struct task_struct *p, int wake_flags)
++{
++ struct rq *rq;
++ raw_spinlock_t *lock;
++ int ret = 0;
++
++ rq = __task_access_lock(p, &lock);
++ if (task_on_rq_queued(p)) {
++ if (!task_on_cpu(p)) {
++ /*
++ * When on_rq && !on_cpu the task is preempted, see if
++ * it should preempt the task that is current now.
++ */
++ update_rq_clock(rq);
++ wakeup_preempt(rq);
++ }
++ ttwu_do_wakeup(p);
++ ret = 1;
++ }
++ __task_access_unlock(p, lock);
++
++ return ret;
++}
++
++void sched_ttwu_pending(void *arg)
++{
++ struct llist_node *llist = arg;
++ struct rq *rq = this_rq();
++ struct task_struct *p, *t;
++ struct rq_flags rf;
++
++ if (!llist)
++ return;
++
++ rq_lock_irqsave(rq, &rf);
++ update_rq_clock(rq);
++
++ llist_for_each_entry_safe(p, t, llist, wake_entry.llist) {
++ if (WARN_ON_ONCE(p->on_cpu))
++ smp_cond_load_acquire(&p->on_cpu, !VAL);
++
++ if (WARN_ON_ONCE(task_cpu(p) != cpu_of(rq)))
++ set_task_cpu(p, cpu_of(rq));
++
++ ttwu_do_activate(rq, p, p->sched_remote_wakeup ? WF_MIGRATED : 0);
++ }
++
++ /*
++ * Must be after enqueueing at least one task such that
++ * idle_cpu() does not observe a false-negative -- if it does,
++ * it is possible for select_idle_siblings() to stack a number
++ * of tasks on this CPU during that window.
++ *
++ * It is OK to clear ttwu_pending when another task is pending.
++ * We will receive IPI after local IRQ enabled and then enqueue it.
++ * Since now nr_running > 0, idle_cpu() will always get correct result.
++ */
++ WRITE_ONCE(rq->ttwu_pending, 0);
++ rq_unlock_irqrestore(rq, &rf);
++}
++
++/*
++ * Prepare the scene for sending an IPI for a remote smp_call
++ *
++ * Returns true if the caller can proceed with sending the IPI.
++ * Returns false otherwise.
++ */
++bool call_function_single_prep_ipi(int cpu)
++{
++ if (set_nr_if_polling(cpu_rq(cpu)->idle)) {
++ trace_sched_wake_idle_without_ipi(cpu);
++ return false;
++ }
++
++ return true;
++}
++
++/*
++ * Queue a task on the target CPUs wake_list and wake the CPU via IPI if
++ * necessary. The wakee CPU on receipt of the IPI will queue the task
++ * via sched_ttwu_wakeup() for activation so the wakee incurs the cost
++ * of the wakeup instead of the waker.
++ */
++static void __ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ p->sched_remote_wakeup = !!(wake_flags & WF_MIGRATED);
++
++ WRITE_ONCE(rq->ttwu_pending, 1);
++ __smp_call_single_queue(cpu, &p->wake_entry.llist);
++}
++
++static inline bool ttwu_queue_cond(struct task_struct *p, int cpu)
++{
++ /*
++ * Do not complicate things with the async wake_list while the CPU is
++ * in hotplug state.
++ */
++ if (!cpu_active(cpu))
++ return false;
++
++ /* Ensure the task will still be allowed to run on the CPU. */
++ if (!cpumask_test_cpu(cpu, p->cpus_ptr))
++ return false;
++
++ /*
++ * If the CPU does not share cache, then queue the task on the
++ * remote rqs wakelist to avoid accessing remote data.
++ */
++ if (!cpus_share_cache(smp_processor_id(), cpu))
++ return true;
++
++ if (cpu == smp_processor_id())
++ return false;
++
++ /*
++ * If the wakee cpu is idle, or the task is descheduling and the
++ * only running task on the CPU, then use the wakelist to offload
++ * the task activation to the idle (or soon-to-be-idle) CPU as
++ * the current CPU is likely busy. nr_running is checked to
++ * avoid unnecessary task stacking.
++ *
++ * Note that we can only get here with (wakee) p->on_rq=0,
++ * p->on_cpu can be whatever, we've done the dequeue, so
++ * the wakee has been accounted out of ->nr_running.
++ */
++ if (!cpu_rq(cpu)->nr_running)
++ return true;
++
++ return false;
++}
++
++static bool ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
++{
++ if (__is_defined(ALT_SCHED_TTWU_QUEUE) && ttwu_queue_cond(p, cpu)) {
++ sched_clock_cpu(cpu); /* Sync clocks across CPUs */
++ __ttwu_queue_wakelist(p, cpu, wake_flags);
++ return true;
++ }
++
++ return false;
++}
++
++void wake_up_if_idle(int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ guard(rcu)();
++ if (is_idle_task(rcu_dereference(rq->curr))) {
++ guard(raw_spinlock_irqsave)(&rq->lock);
++ if (is_idle_task(rq->curr))
++ resched_curr(rq);
++ }
++}
++
++extern struct static_key_false sched_asym_cpucapacity;
++
++static __always_inline bool sched_asym_cpucap_active(void)
++{
++ return static_branch_unlikely(&sched_asym_cpucapacity);
++}
++
++bool cpus_equal_capacity(int this_cpu, int that_cpu)
++{
++ if (!sched_asym_cpucap_active())
++ return true;
++
++ if (this_cpu == that_cpu)
++ return true;
++
++ return arch_scale_cpu_capacity(this_cpu) == arch_scale_cpu_capacity(that_cpu);
++}
++
++bool cpus_share_cache(int this_cpu, int that_cpu)
++{
++ if (this_cpu == that_cpu)
++ return true;
++
++ return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
++}
++
++static inline void ttwu_queue(struct task_struct *p, int cpu, int wake_flags)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ if (ttwu_queue_wakelist(p, cpu, wake_flags))
++ return;
++
++ raw_spin_lock(&rq->lock);
++ update_rq_clock(rq);
++ ttwu_do_activate(rq, p, wake_flags);
++ raw_spin_unlock(&rq->lock);
++}
++
++/*
++ * Invoked from try_to_wake_up() to check whether the task can be woken up.
++ *
++ * The caller holds p::pi_lock if p != current or has preemption
++ * disabled when p == current.
++ *
++ * The rules of saved_state:
++ *
++ * The related locking code always holds p::pi_lock when updating
++ * p::saved_state, which means the code is fully serialized in both cases.
++ *
++ * For PREEMPT_RT, the lock wait and lock wakeups happen via TASK_RTLOCK_WAIT.
++ * No other bits set. This allows us to distinguish all wakeup scenarios.
++ *
++ * For FREEZER, the wakeup happens via TASK_FROZEN. No other bits set. This
++ * allows us to prevent early wakeup of tasks before they can be run on
++ * asymmetric ISA architectures (eg ARMv9).
++ */
++static __always_inline
++bool ttwu_state_match(struct task_struct *p, unsigned int state, int *success)
++{
++ int match;
++
++ if (IS_ENABLED(CONFIG_DEBUG_PREEMPT)) {
++ WARN_ON_ONCE((state & TASK_RTLOCK_WAIT) &&
++ state != TASK_RTLOCK_WAIT);
++ }
++
++ *success = !!(match = __task_state_match(p, state));
++
++ /*
++ * Saved state preserves the task state across blocking on
++ * an RT lock or TASK_FREEZABLE tasks. If the state matches,
++ * set p::saved_state to TASK_RUNNING, but do not wake the task
++ * because it waits for a lock wakeup or __thaw_task(). Also
++ * indicate success because from the regular waker's point of
++ * view this has succeeded.
++ *
++ * After acquiring the lock the task will restore p::__state
++ * from p::saved_state which ensures that the regular
++ * wakeup is not lost. The restore will also set
++ * p::saved_state to TASK_RUNNING so any further tests will
++ * not result in false positives vs. @success
++ */
++ if (match < 0)
++ p->saved_state = TASK_RUNNING;
++
++ return match > 0;
++}
++
++/*
++ * Notes on Program-Order guarantees on SMP systems.
++ *
++ * MIGRATION
++ *
++ * The basic program-order guarantee on SMP systems is that when a task [t]
++ * migrates, all its activity on its old CPU [c0] happens-before any subsequent
++ * execution on its new CPU [c1].
++ *
++ * For migration (of runnable tasks) this is provided by the following means:
++ *
++ * A) UNLOCK of the rq(c0)->lock scheduling out task t
++ * B) migration for t is required to synchronize *both* rq(c0)->lock and
++ * rq(c1)->lock (if not at the same time, then in that order).
++ * C) LOCK of the rq(c1)->lock scheduling in task
++ *
++ * Transitivity guarantees that B happens after A and C after B.
++ * Note: we only require RCpc transitivity.
++ * Note: the CPU doing B need not be c0 or c1
++ *
++ * Example:
++ *
++ * CPU0 CPU1 CPU2
++ *
++ * LOCK rq(0)->lock
++ * sched-out X
++ * sched-in Y
++ * UNLOCK rq(0)->lock
++ *
++ * LOCK rq(0)->lock // orders against CPU0
++ * dequeue X
++ * UNLOCK rq(0)->lock
++ *
++ * LOCK rq(1)->lock
++ * enqueue X
++ * UNLOCK rq(1)->lock
++ *
++ * LOCK rq(1)->lock // orders against CPU2
++ * sched-out Z
++ * sched-in X
++ * UNLOCK rq(1)->lock
++ *
++ *
++ * BLOCKING -- aka. SLEEP + WAKEUP
++ *
++ * For blocking we (obviously) need to provide the same guarantee as for
++ * migration. However the means are completely different as there is no lock
++ * chain to provide order. Instead we do:
++ *
++ * 1) smp_store_release(X->on_cpu, 0) -- finish_task()
++ * 2) smp_cond_load_acquire(!X->on_cpu) -- try_to_wake_up()
++ *
++ * Example:
++ *
++ * CPU0 (schedule) CPU1 (try_to_wake_up) CPU2 (schedule)
++ *
++ * LOCK rq(0)->lock LOCK X->pi_lock
++ * dequeue X
++ * sched-out X
++ * smp_store_release(X->on_cpu, 0);
++ *
++ * smp_cond_load_acquire(&X->on_cpu, !VAL);
++ * X->state = WAKING
++ * set_task_cpu(X,2)
++ *
++ * LOCK rq(2)->lock
++ * enqueue X
++ * X->state = RUNNING
++ * UNLOCK rq(2)->lock
++ *
++ * LOCK rq(2)->lock // orders against CPU1
++ * sched-out Z
++ * sched-in X
++ * UNLOCK rq(2)->lock
++ *
++ * UNLOCK X->pi_lock
++ * UNLOCK rq(0)->lock
++ *
++ *
++ * However; for wakeups there is a second guarantee we must provide, namely we
++ * must observe the state that led to our wakeup. That is, not only must our
++ * task observe its own prior state, it must also observe the stores prior to
++ * its wakeup.
++ *
++ * This means that any means of doing remote wakeups must order the CPU doing
++ * the wakeup against the CPU the task is going to end up running on. This,
++ * however, is already required for the regular Program-Order guarantee above,
++ * since the waking CPU is the one issuing the ACQUIRE (smp_cond_load_acquire).
++ *
++ */
++
++/**
++ * try_to_wake_up - wake up a thread
++ * @p: the thread to be awakened
++ * @state: the mask of task states that can be woken
++ * @wake_flags: wake modifier flags (WF_*)
++ *
++ * Conceptually does:
++ *
++ * If (@state & @p->state) @p->state = TASK_RUNNING.
++ *
++ * If the task was not queued/runnable, also place it back on a runqueue.
++ *
++ * This function is atomic against schedule() which would dequeue the task.
++ *
++ * It issues a full memory barrier before accessing @p->state, see the comment
++ * with set_current_state().
++ *
++ * Uses p->pi_lock to serialize against concurrent wake-ups.
++ *
++ * Relies on p->pi_lock stabilizing:
++ * - p->sched_class
++ * - p->cpus_ptr
++ * - p->sched_task_group
++ * in order to do migration, see its use of select_task_rq()/set_task_cpu().
++ *
++ * Tries really hard to only take one task_rq(p)->lock for performance.
++ * Takes rq->lock in:
++ * - ttwu_runnable() -- old rq, unavoidable, see comment there;
++ * - ttwu_queue() -- new rq, for enqueue of the task;
++ * - psi_ttwu_dequeue() -- much sadness :-( accounting will kill us.
++ *
++ * As a consequence we race really badly with just about everything. See the
++ * many memory barriers and their comments for details.
++ *
++ * Return: %true if @p->state changes (an actual wakeup was done),
++ * %false otherwise.
++ */
++int try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
++{
++ guard(preempt)();
++ int cpu, success = 0;
++
++ if (p == current) {
++ /*
++ * We're waking current, this means 'p->on_rq' and 'task_cpu(p)
++ * == smp_processor_id()'. Together this means we can special
++ * case the whole 'p->on_rq && ttwu_runnable()' case below
++ * without taking any locks.
++ *
++ * In particular:
++ * - we rely on Program-Order guarantees for all the ordering,
++ * - we're serialized against set_special_state() by virtue of
++ * it disabling IRQs (this allows not taking ->pi_lock).
++ */
++ if (!ttwu_state_match(p, state, &success))
++ goto out;
++
++ trace_sched_waking(p);
++ ttwu_do_wakeup(p);
++ goto out;
++ }
++
++ /*
++ * If we are going to wake up a thread waiting for CONDITION we
++ * need to ensure that CONDITION=1 done by the caller can not be
++ * reordered with p->state check below. This pairs with smp_store_mb()
++ * in set_current_state() that the waiting thread does.
++ */
++ scoped_guard (raw_spinlock_irqsave, &p->pi_lock) {
++ smp_mb__after_spinlock();
++ if (!ttwu_state_match(p, state, &success))
++ break;
++
++ trace_sched_waking(p);
++
++ /*
++ * Ensure we load p->on_rq _after_ p->state, otherwise it would
++ * be possible to, falsely, observe p->on_rq == 0 and get stuck
++ * in smp_cond_load_acquire() below.
++ *
++ * sched_ttwu_pending() try_to_wake_up()
++ * STORE p->on_rq = 1 LOAD p->state
++ * UNLOCK rq->lock
++ *
++ * __schedule() (switch to task 'p')
++ * LOCK rq->lock smp_rmb();
++ * smp_mb__after_spinlock();
++ * UNLOCK rq->lock
++ *
++ * [task p]
++ * STORE p->state = UNINTERRUPTIBLE LOAD p->on_rq
++ *
++ * Pairs with the LOCK+smp_mb__after_spinlock() on rq->lock in
++ * __schedule(). See the comment for smp_mb__after_spinlock().
++ *
++ * A similar smp_rmb() lives in __task_needs_rq_lock().
++ */
++ smp_rmb();
++ if (READ_ONCE(p->on_rq) && ttwu_runnable(p, wake_flags))
++ break;
++
++ /*
++ * Ensure we load p->on_cpu _after_ p->on_rq, otherwise it would be
++ * possible to, falsely, observe p->on_cpu == 0.
++ *
++ * One must be running (->on_cpu == 1) in order to remove oneself
++ * from the runqueue.
++ *
++ * __schedule() (switch to task 'p') try_to_wake_up()
++ * STORE p->on_cpu = 1 LOAD p->on_rq
++ * UNLOCK rq->lock
++ *
++ * __schedule() (put 'p' to sleep)
++ * LOCK rq->lock smp_rmb();
++ * smp_mb__after_spinlock();
++ * STORE p->on_rq = 0 LOAD p->on_cpu
++ *
++ * Pairs with the LOCK+smp_mb__after_spinlock() on rq->lock in
++ * __schedule(). See the comment for smp_mb__after_spinlock().
++ *
++ * Form a control-dep-acquire with p->on_rq == 0 above, to ensure
++ * schedule()'s deactivate_task() has 'happened' and p will no longer
++ * care about its own p->state. See the comment in __schedule().
++ */
++ smp_acquire__after_ctrl_dep();
++
++ /*
++ * We're doing the wakeup (@success == 1), they did a dequeue (p->on_rq
++ * == 0), which means we need to do an enqueue, change p->state to
++ * TASK_WAKING such that we can unlock p->pi_lock before doing the
++ * enqueue, such as ttwu_queue_wakelist().
++ */
++ WRITE_ONCE(p->__state, TASK_WAKING);
++
++ /*
++ * If the owning (remote) CPU is still in the middle of schedule() with
++ * this task as prev, consider queueing p on the remote CPU's wake_list,
++ * which potentially sends an IPI instead of spinning on p->on_cpu, to
++ * let the waker make forward progress. This is safe because IRQs are
++ * disabled and the IPI will deliver after on_cpu is cleared.
++ *
++ * Ensure we load task_cpu(p) after p->on_cpu:
++ *
++ * set_task_cpu(p, cpu);
++ * STORE p->cpu = @cpu
++ * __schedule() (switch to task 'p')
++ * LOCK rq->lock
++ * smp_mb__after_spin_lock() smp_cond_load_acquire(&p->on_cpu)
++ * STORE p->on_cpu = 1 LOAD p->cpu
++ *
++ * to ensure we observe the correct CPU on which the task is currently
++ * scheduling.
++ */
++ if (smp_load_acquire(&p->on_cpu) &&
++ ttwu_queue_wakelist(p, task_cpu(p), wake_flags))
++ break;
++
++ /*
++ * If the owning (remote) CPU is still in the middle of schedule() with
++ * this task as prev, wait until it's done referencing the task.
++ *
++ * Pairs with the smp_store_release() in finish_task().
++ *
++ * This ensures that tasks getting woken will be fully ordered against
++ * their previous state and preserve Program Order.
++ */
++ smp_cond_load_acquire(&p->on_cpu, !VAL);
++
++ sched_task_ttwu(p);
++
++ if ((wake_flags & WF_CURRENT_CPU) &&
++ cpumask_test_cpu(smp_processor_id(), p->cpus_ptr))
++ cpu = smp_processor_id();
++ else
++ cpu = select_task_rq(p);
++
++ if (cpu != task_cpu(p)) {
++ if (p->in_iowait) {
++ delayacct_blkio_end(p);
++ atomic_dec(&task_rq(p)->nr_iowait);
++ }
++
++ wake_flags |= WF_MIGRATED;
++ set_task_cpu(p, cpu);
++ }
++
++ ttwu_queue(p, cpu, wake_flags);
++ }
++out:
++ if (success)
++ ttwu_stat(p, task_cpu(p), wake_flags);
++
++ return success;
++}
++
++static bool __task_needs_rq_lock(struct task_struct *p)
++{
++ unsigned int state = READ_ONCE(p->__state);
++
++ /*
++ * Since pi->lock blocks try_to_wake_up(), we don't need rq->lock when
++ * the task is blocked. Make sure to check @state since ttwu() can drop
++ * locks at the end, see ttwu_queue_wakelist().
++ */
++ if (state == TASK_RUNNING || state == TASK_WAKING)
++ return true;
++
++ /*
++ * Ensure we load p->on_rq after p->__state, otherwise it would be
++ * possible to, falsely, observe p->on_rq == 0.
++ *
++ * See try_to_wake_up() for a longer comment.
++ */
++ smp_rmb();
++ if (p->on_rq)
++ return true;
++
++ /*
++ * Ensure the task has finished __schedule() and will not be referenced
++ * anymore. Again, see try_to_wake_up() for a longer comment.
++ */
++ smp_rmb();
++ smp_cond_load_acquire(&p->on_cpu, !VAL);
++
++ return false;
++}
++
++/**
++ * task_call_func - Invoke a function on task in fixed state
++ * @p: Process for which the function is to be invoked, can be @current.
++ * @func: Function to invoke.
++ * @arg: Argument to function.
++ *
++ * Fix the task in its current state by avoiding wakeups and/or rq operations
++ * and call @func(@arg) on it. This function can use task_is_runnable() and
++ * task_curr() to work out what the state is, if required. Given that @func
++ * can be invoked with a runqueue lock held, it had better be quite
++ * lightweight.
++ *
++ * Returns:
++ * Whatever @func returns
++ */
++int task_call_func(struct task_struct *p, task_call_f func, void *arg)
++{
++ struct rq *rq = NULL;
++ struct rq_flags rf;
++ int ret;
++
++ raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
++
++ if (__task_needs_rq_lock(p))
++ rq = __task_rq_lock(p, &rf);
++
++ /*
++ * At this point the task is pinned; either:
++ * - blocked and we're holding off wakeups (pi->lock)
++ * - woken, and we're holding off enqueue (rq->lock)
++ * - queued, and we're holding off schedule (rq->lock)
++ * - running, and we're holding off de-schedule (rq->lock)
++ *
++ * The called function (@func) can use: task_curr(), p->on_rq and
++ * p->__state to differentiate between these states.
++ */
++ ret = func(p, arg);
++
++ if (rq)
++ __task_rq_unlock(rq, &rf);
++
++ raw_spin_unlock_irqrestore(&p->pi_lock, rf.flags);
++ return ret;
++}
++
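++/*
++ * Illustrative usage sketch (get_state_cb is a hypothetical helper):
++ * sample a task's state while task_call_func() keeps it pinned. The
++ * callback may run under a runqueue lock, so keep it lightweight:
++ *
++ *	static int get_state_cb(struct task_struct *p, void *arg)
++ *	{
++ *		*(unsigned int *)arg = READ_ONCE(p->__state);
++ *		return 0;
++ *	}
++ *
++ *	unsigned int state;
++ *	task_call_func(p, get_state_cb, &state);
++ */
++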
++/**
++ * cpu_curr_snapshot - Return a snapshot of the currently running task
++ * @cpu: The CPU on which to snapshot the task.
++ *
++ * Returns the task_struct pointer of the task "currently" running on
++ * the specified CPU. If the same task is running on that CPU throughout,
++ * the return value will be a pointer to that task's task_struct structure.
++ * If the CPU did any context switches even vaguely concurrently with the
++ * execution of this function, the return value will be a pointer to the
++ * task_struct structure of a randomly chosen task that was running on
++ * that CPU somewhere around the time that this function was executing.
++ *
++ * If the specified CPU was offline, the return value is whatever it
++ * is, perhaps a pointer to the task_struct structure of that CPU's idle
++ * task, but there is no guarantee. Callers wishing a useful return
++ * value must take some action to ensure that the specified CPU remains
++ * online throughout.
++ *
++ * This function executes full memory barriers before and after fetching
++ * the pointer, which permits the caller to confine this function's fetch
++ * with respect to the caller's accesses to other shared variables.
++ */
++struct task_struct *cpu_curr_snapshot(int cpu)
++{
++ struct task_struct *t;
++
++ smp_mb(); /* Pairing determined by caller's synchronization design. */
++ t = rcu_dereference(cpu_curr(cpu));
++ smp_mb(); /* Pairing determined by caller's synchronization design. */
++ return t;
++}
++
++/**
++ * wake_up_process - Wake up a specific process
++ * @p: The process to be woken up.
++ *
++ * Attempt to wake up the nominated process and move it to the set of runnable
++ * processes.
++ *
++ * Return: 1 if the process was woken up, 0 if it was already running.
++ *
++ * This function executes a full memory barrier before accessing the task state.
++ */
++int wake_up_process(struct task_struct *p)
++{
++ return try_to_wake_up(p, TASK_NORMAL, 0);
++}
++EXPORT_SYMBOL(wake_up_process);
++
++int wake_up_state(struct task_struct *p, unsigned int state)
++{
++ return try_to_wake_up(p, state, 0);
++}
++
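++/*
++ * Waker-side sketch (CONDITION and waiter_task are hypothetical) matching
++ * the wait loop shown in the comment above ttwu_runnable(): publish the
++ * condition first, then wake. No extra barrier is needed here because
++ * try_to_wake_up() issues a full memory barrier before it reads the task
++ * state:
++ *
++ *	WRITE_ONCE(CONDITION, 1);
++ *	wake_up_process(waiter_task);
++ */
++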
++/*
++ * Perform scheduler related setup for a newly forked process p.
++ * p is forked by current.
++ *
++ * __sched_fork() is basic setup which is also used by sched_init() to
++ * initialize the boot CPU's idle task.
++ */
++static inline void __sched_fork(unsigned long clone_flags, struct task_struct *p)
++{
++ p->on_rq = 0;
++ p->on_cpu = 0;
++ p->utime = 0;
++ p->stime = 0;
++ p->sched_time = 0;
++
++#ifdef CONFIG_SCHEDSTATS
++ /* Even if schedstat is disabled, there should not be garbage */
++ memset(&p->stats, 0, sizeof(p->stats));
++#endif
++
++#ifdef CONFIG_PREEMPT_NOTIFIERS
++ INIT_HLIST_HEAD(&p->preempt_notifiers);
++#endif
++
++#ifdef CONFIG_COMPACTION
++ p->capture_control = NULL;
++#endif
++ p->wake_entry.u_flags = CSD_TYPE_TTWU;
++ init_sched_mm_cid(p);
++}
++
++/*
++ * fork()/clone()-time setup:
++ */
++int sched_fork(unsigned long clone_flags, struct task_struct *p)
++{
++ __sched_fork(clone_flags, p);
++ /*
++ * We mark the process as NEW here. This guarantees that
++ * nobody will actually run it, and a signal or other external
++ * event cannot wake it up and insert it on the runqueue either.
++ */
++ p->__state = TASK_NEW;
++
++ /*
++ * Make sure we do not leak PI boosting priority to the child.
++ */
++ p->prio = current->normal_prio;
++
++ /*
++ * Revert to default priority/policy on fork if requested.
++ */
++ if (unlikely(p->sched_reset_on_fork)) {
++ if (task_has_rt_policy(p)) {
++ p->policy = SCHED_NORMAL;
++ p->static_prio = NICE_TO_PRIO(0);
++ p->rt_priority = 0;
++ } else if (PRIO_TO_NICE(p->static_prio) < 0)
++ p->static_prio = NICE_TO_PRIO(0);
++
++ p->prio = p->normal_prio = p->static_prio;
++
++ /*
++ * We don't need the reset flag anymore after the fork. It has
++ * fulfilled its duty:
++ */
++ p->sched_reset_on_fork = 0;
++ }
++
++#ifdef CONFIG_SCHED_INFO
++ if (unlikely(sched_info_on()))
++ memset(&p->sched_info, 0, sizeof(p->sched_info));
++#endif
++ init_task_preempt_count(p);
++
++ return 0;
++}
++
++int sched_cgroup_fork(struct task_struct *p, struct kernel_clone_args *kargs)
++{
++ unsigned long flags;
++ struct rq *rq;
++
++ /*
++ * Because we're not yet on the pid-hash, p->pi_lock isn't strictly
++ * required yet, but lockdep gets upset if rules are violated.
++ */
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++ /*
++ * Share the timeslice between parent and child, thus the
++ * total amount of pending timeslices in the system doesn't change,
++ * resulting in more scheduling fairness.
++ */
++ rq = this_rq();
++ raw_spin_lock(&rq->lock);
++
++ rq->curr->time_slice /= 2;
++ p->time_slice = rq->curr->time_slice;
++#ifdef CONFIG_SCHED_HRTICK
++ hrtick_start(rq, rq->curr->time_slice);
++#endif
++
++ if (p->time_slice < RESCHED_NS) {
++ p->time_slice = sysctl_sched_base_slice;
++ resched_curr(rq);
++ }
++ sched_task_fork(p, rq);
++ raw_spin_unlock(&rq->lock);
++
++ rseq_migrate(p);
++ /*
++ * We're setting the CPU for the first time, we don't migrate,
++ * so use __set_task_cpu().
++ */
++ __set_task_cpu(p, smp_processor_id());
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++
++ return 0;
++}
++
++void sched_cancel_fork(struct task_struct *p)
++{
++}
++
++void sched_post_fork(struct task_struct *p)
++{
++}
++
++#ifdef CONFIG_SCHEDSTATS
++
++DEFINE_STATIC_KEY_FALSE(sched_schedstats);
++
++static void set_schedstats(bool enabled)
++{
++ if (enabled)
++ static_branch_enable(&sched_schedstats);
++ else
++ static_branch_disable(&sched_schedstats);
++}
++
++void force_schedstat_enabled(void)
++{
++ if (!schedstat_enabled()) {
++ pr_info("kernel profiling enabled schedstats, disable via kernel.sched_schedstats.\n");
++ static_branch_enable(&sched_schedstats);
++ }
++}
++
++static int __init setup_schedstats(char *str)
++{
++ int ret = 0;
++ if (!str)
++ goto out;
++
++ if (!strcmp(str, "enable")) {
++ set_schedstats(true);
++ ret = 1;
++ } else if (!strcmp(str, "disable")) {
++ set_schedstats(false);
++ ret = 1;
++ }
++out:
++ if (!ret)
++ pr_warn("Unable to parse schedstats=\n");
++
++ return ret;
++}
++__setup("schedstats=", setup_schedstats);
++
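++/*
++ * Usage note: schedstats can be toggled either at boot time via the
++ * command line option parsed above, or at runtime through the sysctl
++ * registered below:
++ *
++ *	schedstats=enable
++ *	sysctl kernel.sched_schedstats=1
++ */
++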
++#ifdef CONFIG_PROC_SYSCTL
++static int sysctl_schedstats(const struct ctl_table *table, int write, void *buffer,
++ size_t *lenp, loff_t *ppos)
++{
++ struct ctl_table t;
++ int err;
++ int state = static_branch_likely(&sched_schedstats);
++
++ if (write && !capable(CAP_SYS_ADMIN))
++ return -EPERM;
++
++ t = *table;
++ t.data = &state;
++ err = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
++ if (err < 0)
++ return err;
++ if (write)
++ set_schedstats(state);
++ return err;
++}
++#endif /* CONFIG_PROC_SYSCTL */
++#endif /* CONFIG_SCHEDSTATS */
++
++#ifdef CONFIG_SYSCTL
++static const struct ctl_table sched_core_sysctls[] = {
++#ifdef CONFIG_SCHEDSTATS
++ {
++ .procname = "sched_schedstats",
++ .data = NULL,
++ .maxlen = sizeof(unsigned int),
++ .mode = 0644,
++ .proc_handler = sysctl_schedstats,
++ .extra1 = SYSCTL_ZERO,
++ .extra2 = SYSCTL_ONE,
++ },
++#endif /* CONFIG_SCHEDSTATS */
++};
++static int __init sched_core_sysctl_init(void)
++{
++ register_sysctl_init("kernel", sched_core_sysctls);
++ return 0;
++}
++late_initcall(sched_core_sysctl_init);
++#endif /* CONFIG_SYSCTL */
++
++/*
++ * wake_up_new_task - wake up a newly created task for the first time.
++ *
++ * This function will do some initial scheduler statistics housekeeping
++ * that must be done for every newly created context, then puts the task
++ * on the runqueue and wakes it.
++ */
++void wake_up_new_task(struct task_struct *p)
++{
++ unsigned long flags;
++ struct rq *rq;
++
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++ WRITE_ONCE(p->__state, TASK_RUNNING);
++ rq = cpu_rq(select_task_rq(p));
++ rseq_migrate(p);
++ /*
++ * Fork balancing, do it here and not earlier because:
++ * - cpus_ptr can change in the fork path
++ * - any previously selected CPU might disappear through hotplug
++ *
++ * Use __set_task_cpu() to avoid calling sched_class::migrate_task_rq,
++ * as we're not fully set-up yet.
++ */
++ __set_task_cpu(p, cpu_of(rq));
++
++ raw_spin_lock(&rq->lock);
++ update_rq_clock(rq);
++
++ activate_task(p, rq);
++ trace_sched_wakeup_new(p);
++ wakeup_preempt(rq);
++
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++}
++
++#ifdef CONFIG_PREEMPT_NOTIFIERS
++
++static DEFINE_STATIC_KEY_FALSE(preempt_notifier_key);
++
++void preempt_notifier_inc(void)
++{
++ static_branch_inc(&preempt_notifier_key);
++}
++EXPORT_SYMBOL_GPL(preempt_notifier_inc);
++
++void preempt_notifier_dec(void)
++{
++ static_branch_dec(&preempt_notifier_key);
++}
++EXPORT_SYMBOL_GPL(preempt_notifier_dec);
++
++/**
++ * preempt_notifier_register - tell me when current is being preempted & rescheduled
++ * @notifier: notifier struct to register
++ */
++void preempt_notifier_register(struct preempt_notifier *notifier)
++{
++ if (!static_branch_unlikely(&preempt_notifier_key))
++ WARN(1, "registering preempt_notifier while notifiers disabled\n");
++
++ hlist_add_head(&notifier->link, &current->preempt_notifiers);
++}
++EXPORT_SYMBOL_GPL(preempt_notifier_register);
++
++/**
++ * preempt_notifier_unregister - no longer interested in preemption notifications
++ * @notifier: notifier struct to unregister
++ *
++ * This is *not* safe to call from within a preemption notifier.
++ */
++void preempt_notifier_unregister(struct preempt_notifier *notifier)
++{
++ hlist_del(&notifier->link);
++}
++EXPORT_SYMBOL_GPL(preempt_notifier_unregister);
++
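++/*
++ * Registration sketch (example_ops, its callbacks and the notifier
++ * variable are hypothetical), following the pattern of in-tree users such
++ * as KVM: bump the static key with preempt_notifier_inc() first, then
++ * initialize and register the notifier for the current task:
++ *
++ *	static struct preempt_ops example_ops = {
++ *		.sched_in	= example_sched_in,
++ *		.sched_out	= example_sched_out,
++ *	};
++ *
++ *	preempt_notifier_inc();
++ *	preempt_notifier_init(&notifier, &example_ops);
++ *	preempt_notifier_register(&notifier);
++ */
++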
++static void __fire_sched_in_preempt_notifiers(struct task_struct *curr)
++{
++ struct preempt_notifier *notifier;
++
++ hlist_for_each_entry(notifier, &curr->preempt_notifiers, link)
++ notifier->ops->sched_in(notifier, raw_smp_processor_id());
++}
++
++static __always_inline void fire_sched_in_preempt_notifiers(struct task_struct *curr)
++{
++ if (static_branch_unlikely(&preempt_notifier_key))
++ __fire_sched_in_preempt_notifiers(curr);
++}
++
++static void
++__fire_sched_out_preempt_notifiers(struct task_struct *curr,
++ struct task_struct *next)
++{
++ struct preempt_notifier *notifier;
++
++ hlist_for_each_entry(notifier, &curr->preempt_notifiers, link)
++ notifier->ops->sched_out(notifier, next);
++}
++
++static __always_inline void
++fire_sched_out_preempt_notifiers(struct task_struct *curr,
++ struct task_struct *next)
++{
++ if (static_branch_unlikely(&preempt_notifier_key))
++ __fire_sched_out_preempt_notifiers(curr, next);
++}
++
++#else /* !CONFIG_PREEMPT_NOTIFIERS: */
++
++static inline void fire_sched_in_preempt_notifiers(struct task_struct *curr)
++{
++}
++
++static inline void
++fire_sched_out_preempt_notifiers(struct task_struct *curr,
++ struct task_struct *next)
++{
++}
++
++#endif /* !CONFIG_PREEMPT_NOTIFIERS */
++
++static inline void prepare_task(struct task_struct *next)
++{
++ /*
++ * Claim the task as running, we do this before switching to it
++ * such that any running task will have this set.
++ *
++ * See the smp_load_acquire(&p->on_cpu) case in ttwu() and
++ * its ordering comment.
++ */
++ WRITE_ONCE(next->on_cpu, 1);
++}
++
++static inline void finish_task(struct task_struct *prev)
++{
++ /*
++ * This must be the very last reference to @prev from this CPU. After
++ * p->on_cpu is cleared, the task can be moved to a different CPU. We
++ * must ensure this doesn't happen until the switch is completely
++ * finished.
++ *
++ * In particular, the load of prev->state in finish_task_switch() must
++ * happen before this.
++ *
++ * Pairs with the smp_cond_load_acquire() in try_to_wake_up().
++ */
++ smp_store_release(&prev->on_cpu, 0);
++}
++
++static void do_balance_callbacks(struct rq *rq, struct balance_callback *head)
++{
++ void (*func)(struct rq *rq);
++ struct balance_callback *next;
++
++ lockdep_assert_held(&rq->lock);
++
++ while (head) {
++ func = (void (*)(struct rq *))head->func;
++ next = head->next;
++ head->next = NULL;
++ head = next;
++
++ func(rq);
++ }
++}
++
++static void balance_push(struct rq *rq);
++
++/*
++ * balance_push_callback is a right abuse of the callback interface and plays
++ * by significantly different rules.
++ *
++ * Where the normal balance_callback's purpose is to be run in the same context
++ * that queued it (only later, when it's safe to drop rq->lock again),
++ * balance_push_callback is specifically targeted at __schedule().
++ *
++ * This abuse is tolerated because it places all the unlikely/odd cases behind
++ * a single test, namely: rq->balance_callback == NULL.
++ */
++struct balance_callback balance_push_callback = {
++ .next = NULL,
++ .func = balance_push,
++};
++
++static inline struct balance_callback *
++__splice_balance_callbacks(struct rq *rq, bool split)
++{
++ struct balance_callback *head = rq->balance_callback;
++
++ if (likely(!head))
++ return NULL;
++
++ lockdep_assert_rq_held(rq);
++ /*
++ * Must not take balance_push_callback off the list when
++ * splice_balance_callbacks() and balance_callbacks() are not
++ * in the same rq->lock section.
++ *
++ * In that case it would be possible for __schedule() to interleave
++ * and observe the list empty.
++ */
++ if (split && head == &balance_push_callback)
++ head = NULL;
++ else
++ rq->balance_callback = NULL;
++
++ return head;
++}
++
++struct balance_callback *splice_balance_callbacks(struct rq *rq)
++{
++ return __splice_balance_callbacks(rq, true);
++}
++
++static void __balance_callbacks(struct rq *rq)
++{
++ do_balance_callbacks(rq, __splice_balance_callbacks(rq, false));
++}
++
++void balance_callbacks(struct rq *rq, struct balance_callback *head)
++{
++ unsigned long flags;
++
++ if (unlikely(head)) {
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ do_balance_callbacks(rq, head);
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++ }
++}
++
++static inline void
++prepare_lock_switch(struct rq *rq, struct task_struct *next)
++{
++ /*
++ * The runqueue lock will be released by the next
++ * task (which is an invalid locking op but in the case
++ * of the scheduler it's an obvious special case), so we
++ * do an early lockdep release here:
++ */
++ spin_release(&rq->lock.dep_map, _THIS_IP_);
++#ifdef CONFIG_DEBUG_SPINLOCK
++ /* this is a valid case when another task releases the spinlock */
++ rq->lock.owner = next;
++#endif
++}
++
++static inline void finish_lock_switch(struct rq *rq)
++{
++ /*
++ * If we are tracking spinlock dependencies then we have to
++ * fix up the runqueue lock - which gets 'carried over' from
++ * prev into current:
++ */
++ spin_acquire(&rq->lock.dep_map, 0, 0, _THIS_IP_);
++ __balance_callbacks(rq);
++ raw_spin_unlock_irq(&rq->lock);
++}
++
++/*
++ * NOP if the arch has not defined these:
++ */
++
++#ifndef prepare_arch_switch
++# define prepare_arch_switch(next) do { } while (0)
++#endif
++
++#ifndef finish_arch_post_lock_switch
++# define finish_arch_post_lock_switch() do { } while (0)
++#endif
++
++static inline void kmap_local_sched_out(void)
++{
++#ifdef CONFIG_KMAP_LOCAL
++ if (unlikely(current->kmap_ctrl.idx))
++ __kmap_local_sched_out();
++#endif
++}
++
++static inline void kmap_local_sched_in(void)
++{
++#ifdef CONFIG_KMAP_LOCAL
++ if (unlikely(current->kmap_ctrl.idx))
++ __kmap_local_sched_in();
++#endif
++}
++
++/**
++ * prepare_task_switch - prepare to switch tasks
++ * @rq: the runqueue preparing to switch
++ * @next: the task we are going to switch to.
++ *
++ * This is called with the rq lock held and interrupts off. It must
++ * be paired with a subsequent finish_task_switch after the context
++ * switch.
++ *
++ * prepare_task_switch sets up locking and calls architecture specific
++ * hooks.
++ */
++static inline void
++prepare_task_switch(struct rq *rq, struct task_struct *prev,
++ struct task_struct *next)
++{
++ kcov_prepare_switch(prev);
++ sched_info_switch(rq, prev, next);
++ perf_event_task_sched_out(prev, next);
++ rseq_preempt(prev);
++ fire_sched_out_preempt_notifiers(prev, next);
++ kmap_local_sched_out();
++ prepare_task(next);
++ prepare_arch_switch(next);
++}
++
++/**
++ * finish_task_switch - clean up after a task-switch
++ * @rq: runqueue associated with task-switch
++ * @prev: the thread we just switched away from.
++ *
++ * finish_task_switch must be called after the context switch, paired
++ * with a prepare_task_switch call before the context switch.
++ * finish_task_switch will reconcile locking set up by prepare_task_switch,
++ * and do any other architecture-specific cleanup actions.
++ *
++ * Note that we may have delayed dropping an mm in context_switch(). If
++ * so, we finish that here outside of the runqueue lock. (Doing it
++ * with the lock held can cause deadlocks; see schedule() for
++ * details.)
++ *
++ * The context switch has flipped the stack from under us and restored the
++ * local variables which were saved when this task called schedule() in the
++ * past. 'prev == current' is still correct but we need to recalculate this_rq
++ * because prev may have moved to another CPU.
++ */
++static struct rq *finish_task_switch(struct task_struct *prev)
++ __releases(rq->lock)
++{
++ struct rq *rq = this_rq();
++ struct mm_struct *mm = rq->prev_mm;
++ unsigned int prev_state;
++
++ /*
++ * The previous task will have left us with a preempt_count of 2
++ * because it left us after:
++ *
++ * schedule()
++ * preempt_disable(); // 1
++ * __schedule()
++ * raw_spin_lock_irq(&rq->lock) // 2
++ *
++ * Also, see FORK_PREEMPT_COUNT.
++ */
++ if (WARN_ONCE(preempt_count() != 2*PREEMPT_DISABLE_OFFSET,
++ "corrupted preempt_count: %s/%d/0x%x\n",
++ current->comm, current->pid, preempt_count()))
++ preempt_count_set(FORK_PREEMPT_COUNT);
++
++ rq->prev_mm = NULL;
++
++ /*
++ * A task struct has one reference for the use as "current".
++ * If a task dies, then it sets TASK_DEAD in tsk->state and calls
++ * schedule one last time. The schedule call will never return, and
++ * the scheduled task must drop that reference.
++ *
++ * We must observe prev->state before clearing prev->on_cpu (in
++ * finish_task), otherwise a concurrent wakeup can get prev
++ * running on another CPU and we could race with its RUNNING -> DEAD
++ * transition, resulting in a double drop.
++ */
++ prev_state = READ_ONCE(prev->__state);
++ vtime_task_switch(prev);
++ perf_event_task_sched_in(prev, current);
++ finish_task(prev);
++ tick_nohz_task_switch();
++ finish_lock_switch(rq);
++ finish_arch_post_lock_switch();
++ kcov_finish_switch(current);
++ /*
++ * kmap_local_sched_out() is invoked with rq::lock held and
++ * interrupts disabled. There is no requirement for that, but the
++ * sched out code does not have an interrupt enabled section.
++ * Restoring the maps on sched in does not require interrupts being
++ * disabled either.
++ */
++ kmap_local_sched_in();
++
++ fire_sched_in_preempt_notifiers(current);
++ /*
++ * When switching through a kernel thread, the loop in
++ * membarrier_{private,global}_expedited() may have observed that
++ * kernel thread and not issued an IPI. It is therefore possible to
++ * schedule between user->kernel->user threads without passing though
++ * switch_mm(). Membarrier requires a barrier after storing to
++ * rq->curr, before returning to userspace, so provide them here:
++ *
++ * - a full memory barrier for {PRIVATE,GLOBAL}_EXPEDITED, implicitly
++ * provided by mmdrop_lazy_tlb(),
++ * - a sync_core for SYNC_CORE.
++ */
++ if (mm) {
++ membarrier_mm_sync_core_before_usermode(mm);
++ mmdrop_lazy_tlb_sched(mm);
++ }
++ if (unlikely(prev_state == TASK_DEAD)) {
++ /* Task is done with its stack. */
++ put_task_stack(prev);
++
++ put_task_struct_rcu_user(prev);
++ }
++
++ return rq;
++}
++
++/**
++ * schedule_tail - first thing a freshly forked thread must call.
++ * @prev: the thread we just switched away from.
++ */
++asmlinkage __visible void schedule_tail(struct task_struct *prev)
++ __releases(rq->lock)
++{
++ /*
++ * New tasks start with FORK_PREEMPT_COUNT, see there and
++ * finish_task_switch() for details.
++ *
++ * finish_task_switch() will drop rq->lock() and lower preempt_count
++ * and the preempt_enable() will end up enabling preemption (on
++ * PREEMPT_COUNT kernels).
++ */
++
++ finish_task_switch(prev);
++ /*
++ * This is a special case: the newly created task has just
++ * switched the context for the first time. It is returning from
++ * schedule for the first time in this path.
++ */
++ trace_sched_exit_tp(true);
++ preempt_enable();
++
++ if (current->set_child_tid)
++ put_user(task_pid_vnr(current), current->set_child_tid);
++
++ calculate_sigpending();
++}
++
++/*
++ * context_switch - switch to the new MM and the new thread's register state.
++ */
++static __always_inline struct rq *
++context_switch(struct rq *rq, struct task_struct *prev,
++ struct task_struct *next)
++{
++ prepare_task_switch(rq, prev, next);
++
++ /*
++ * For paravirt, this is coupled with an exit in switch_to to
++ * combine the page table reload and the switch backend into
++ * one hypercall.
++ */
++ arch_start_context_switch(prev);
++
++ /*
++ * kernel -> kernel lazy + transfer active
++ * user -> kernel lazy + mmgrab_lazy_tlb() active
++ *
++ * kernel -> user switch + mmdrop_lazy_tlb() active
++ * user -> user switch
++ *
++ * switch_mm_cid() needs to be updated if the barriers provided
++ * by context_switch() are modified.
++ */
++ if (!next->mm) { // to kernel
++ enter_lazy_tlb(prev->active_mm, next);
++
++ next->active_mm = prev->active_mm;
++ if (prev->mm) // from user
++ mmgrab_lazy_tlb(prev->active_mm);
++ else
++ prev->active_mm = NULL;
++ } else { // to user
++ membarrier_switch_mm(rq, prev->active_mm, next->mm);
++ /*
++ * sys_membarrier() requires an smp_mb() between setting
++ * rq->curr / membarrier_switch_mm() and returning to userspace.
++ *
++ * The below provides this either through switch_mm(), or in
++ * case 'prev->active_mm == next->mm' through
++ * finish_task_switch()'s mmdrop().
++ */
++ switch_mm_irqs_off(prev->active_mm, next->mm, next);
++ lru_gen_use_mm(next->mm);
++
++ if (!prev->mm) { // from kernel
++ /* will mmdrop_lazy_tlb() in finish_task_switch(). */
++ rq->prev_mm = prev->active_mm;
++ prev->active_mm = NULL;
++ }
++ }
++
++ /* switch_mm_cid() requires the memory barriers above. */
++ switch_mm_cid(rq, prev, next);
++
++ prepare_lock_switch(rq, next);
++
++ /* Here we just switch the register state and the stack. */
++ switch_to(prev, next, prev);
++ barrier();
++
++ return finish_task_switch(prev);
++}
++
++/*
++ * nr_running, nr_uninterruptible and nr_context_switches:
++ *
++ * externally visible scheduler statistics: current number of runnable
++ * threads, total number of context switches performed since bootup.
++ */
++unsigned int nr_running(void)
++{
++ unsigned int i, sum = 0;
++
++ for_each_online_cpu(i)
++ sum += cpu_rq(i)->nr_running;
++
++ return sum;
++}
++
++/*
++ * Check if only the current task is running on the CPU.
++ *
++ * Caution: this function does not check that the caller has disabled
++ * preemption, thus the result might have a time-of-check-to-time-of-use
++ * race. The caller is responsible for using it correctly, for example:
++ *
++ * - from a non-preemptible section (of course)
++ *
++ * - from a thread that is bound to a single CPU
++ *
++ * - in a loop with very short iterations (e.g. a polling loop)
++ */
++bool single_task_running(void)
++{
++ return raw_rq()->nr_running == 1;
++}
++EXPORT_SYMBOL(single_task_running);
++
++unsigned long long nr_context_switches_cpu(int cpu)
++{
++ return cpu_rq(cpu)->nr_switches;
++}
++
++unsigned long long nr_context_switches(void)
++{
++ int i;
++ unsigned long long sum = 0;
++
++ for_each_possible_cpu(i)
++ sum += cpu_rq(i)->nr_switches;
++
++ return sum;
++}
++
++/*
++ * Consumers of these two interfaces, such as the cpuidle menu governor, are
++ * using nonsensical data: they prefer shallow idle state selection for a CPU
++ * that has IO-wait pending, even though the waiting task might not even end
++ * up running on that CPU when it does become runnable.
++ */
++
++unsigned int nr_iowait_cpu(int cpu)
++{
++ return atomic_read(&cpu_rq(cpu)->nr_iowait);
++}
++
++/*
++ * IO-wait accounting, and how it's mostly bollocks (on SMP).
++ *
++ * The idea behind IO-wait accounting is to account the idle time that we could
++ * have spent running if it were not for IO. That is, if we were to improve the
++ * storage performance, we'd have a proportional reduction in IO-wait time.
++ *
++ * This all works nicely on UP, where, when a task blocks on IO, we account
++ * idle time as IO-wait, because if the storage were faster, it could've been
++ * running and we'd not be idle.
++ *
++ * This has been extended to SMP, by doing the same for each CPU. This however
++ * is broken.
++ *
++ * Imagine for instance the case where two tasks block on one CPU; only that
++ * CPU will have IO-wait accounted, while the other has regular idle. Even
++ * though, if the storage were faster, both could've run at the same time,
++ * utilising both CPUs.
++ *
++ * This means that, when looking globally, the current IO-wait accounting on
++ * SMP is a lower bound, due to under-accounting.
++ *
++ * Worse, since the numbers are provided per CPU, they are sometimes
++ * interpreted per CPU, and that is nonsensical. A blocked task isn't strictly
++ * associated with any one particular CPU; it can wake on a different CPU than
++ * the one it blocked on. This means the per-CPU IO-wait number is meaningless.
++ *
++ * Task CPU affinities can make all that even more 'interesting'.
++ */
++
++unsigned int nr_iowait(void)
++{
++ unsigned int i, sum = 0;
++
++ for_each_possible_cpu(i)
++ sum += nr_iowait_cpu(i);
++
++ return sum;
++}
++
++/*
++ * sched_exec - execve() is a valuable balancing opportunity, because at
++ * this point the task has the smallest effective memory and cache
++ * footprint.
++ */
++void sched_exec(void)
++{
++}
++
++DEFINE_PER_CPU(struct kernel_stat, kstat);
++DEFINE_PER_CPU(struct kernel_cpustat, kernel_cpustat);
++
++EXPORT_PER_CPU_SYMBOL(kstat);
++EXPORT_PER_CPU_SYMBOL(kernel_cpustat);
++
++static inline void update_curr(struct rq *rq, struct task_struct *p)
++{
++ s64 ns = rq->clock_task - p->last_ran;
++
++ p->sched_time += ns;
++ cgroup_account_cputime(p, ns);
++ account_group_exec_runtime(p, ns);
++
++ p->time_slice -= ns;
++ p->last_ran = rq->clock_task;
++}
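++
++/*
++ * Worked example (illustrative, hypothetical numbers): if a task was handed
++ * a 4 ms slice and rq->clock_task has advanced by 1.5 ms since p->last_ran,
++ * the update above charges ns = 1,500,000 to p->sched_time and to the
++ * cgroup / thread-group accounting, and leaves p->time_slice at 2,500,000.
++ * Once time_slice drops below RESCHED_NS, scheduler_task_tick() below
++ * requests a reschedule.
++ */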
++
++/*
++ * Return accounted runtime for the task.
++ * Return separately the current's pending runtime that has not been
++ * accounted yet.
++ */
++unsigned long long task_sched_runtime(struct task_struct *p)
++{
++ unsigned long flags;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++ u64 ns;
++
++#ifdef CONFIG_64BIT
++ /*
++ * 64-bit doesn't need locks to atomically read a 64-bit value.
++ * So we have an optimization chance when the task's delta_exec is 0.
++ * Reading ->on_cpu is racy, but this is OK.
++ *
++ * If we race with it leaving CPU, we'll take a lock. So we're correct.
++ * If we race with it entering CPU, unaccounted time is 0. This is
++ * indistinguishable from the read occurring a few cycles earlier.
++ * If we see ->on_cpu without ->on_rq, the task is leaving, and has
++ * been accounted, so we're correct here as well.
++ */
++ if (!p->on_cpu || !task_on_rq_queued(p))
++ return tsk_seruntime(p);
++#endif
++
++ rq = task_access_lock_irqsave(p, &lock, &flags);
++ /*
++ * Must be ->curr _and_ ->on_rq. If dequeued, we would
++ * project cycles that may never be accounted to this
++ * thread, breaking clock_gettime().
++ */
++ if (p == rq->curr && task_on_rq_queued(p)) {
++ update_rq_clock(rq);
++ update_curr(rq, p);
++ }
++ ns = tsk_seruntime(p);
++ task_access_unlock_irqrestore(p, lock, &flags);
++
++ return ns;
++}
++
++/* This manages tasks that have run out of timeslice during a scheduler_tick */
++static inline void scheduler_task_tick(struct rq *rq)
++{
++ struct task_struct *p = rq->curr;
++
++ if (is_idle_task(p))
++ return;
++
++ update_curr(rq, p);
++ cpufreq_update_util(rq, 0);
++
++ /*
++ * Tasks that have less than RESCHED_NS of time slice left will be
++ * rescheduled.
++ */
++ if (p->time_slice >= RESCHED_NS)
++ return;
++ set_tsk_need_resched(p);
++ set_preempt_need_resched();
++}
++
++static u64 cpu_resched_latency(struct rq *rq)
++{
++ int latency_warn_ms = READ_ONCE(sysctl_resched_latency_warn_ms);
++ u64 resched_latency, now = rq_clock(rq);
++ static bool warned_once;
++
++ if (sysctl_resched_latency_warn_once && warned_once)
++ return 0;
++
++ if (!need_resched() || !latency_warn_ms)
++ return 0;
++
++ if (system_state == SYSTEM_BOOTING)
++ return 0;
++
++ if (!rq->last_seen_need_resched_ns) {
++ rq->last_seen_need_resched_ns = now;
++ rq->ticks_without_resched = 0;
++ return 0;
++ }
++
++ rq->ticks_without_resched++;
++ resched_latency = now - rq->last_seen_need_resched_ns;
++ if (resched_latency <= latency_warn_ms * NSEC_PER_MSEC)
++ return 0;
++
++ warned_once = true;
++
++ return resched_latency;
++}
++
++static int __init setup_resched_latency_warn_ms(char *str)
++{
++ long val;
++
++ if ((kstrtol(str, 0, &val))) {
++ pr_warn("Unable to set resched_latency_warn_ms\n");
++ return 1;
++ }
++
++ sysctl_resched_latency_warn_ms = val;
++ return 1;
++}
++__setup("resched_latency_warn_ms=", setup_resched_latency_warn_ms);
++
++/*
++ * This function gets called by the timer code, with HZ frequency.
++ * We call it with interrupts disabled.
++ */
++void sched_tick(void)
++{
++ int cpu __maybe_unused = smp_processor_id();
++ struct rq *rq = cpu_rq(cpu);
++ struct task_struct *curr = rq->curr;
++ u64 resched_latency;
++
++ if (housekeeping_cpu(cpu, HK_TYPE_KERNEL_NOISE))
++ arch_scale_freq_tick();
++
++ sched_clock_tick();
++
++ raw_spin_lock(&rq->lock);
++ update_rq_clock(rq);
++
++ if (dynamic_preempt_lazy() && tif_test_bit(TIF_NEED_RESCHED_LAZY))
++ resched_curr(rq);
++
++ scheduler_task_tick(rq);
++ if (sched_feat(LATENCY_WARN))
++ resched_latency = cpu_resched_latency(rq);
++ calc_global_load_tick(rq);
++
++ task_tick_mm_cid(rq, rq->curr);
++
++ raw_spin_unlock(&rq->lock);
++
++ if (sched_feat(LATENCY_WARN) && resched_latency)
++ resched_latency_warn(cpu, resched_latency);
++
++ perf_event_task_tick();
++
++ if (curr->flags & PF_WQ_WORKER)
++ wq_worker_tick(curr);
++}
++
++#ifdef CONFIG_NO_HZ_FULL
++
++struct tick_work {
++ int cpu;
++ atomic_t state;
++ struct delayed_work work;
++};
++/* Values for ->state, see diagram below. */
++#define TICK_SCHED_REMOTE_OFFLINE 0
++#define TICK_SCHED_REMOTE_OFFLINING 1
++#define TICK_SCHED_REMOTE_RUNNING 2
++
++/*
++ * State diagram for ->state:
++ *
++ *
++ * TICK_SCHED_REMOTE_OFFLINE
++ * | ^
++ * | |
++ * | | sched_tick_remote()
++ * | |
++ * | |
++ * +--TICK_SCHED_REMOTE_OFFLINING
++ * | ^
++ * | |
++ * sched_tick_start() | | sched_tick_stop()
++ * | |
++ * V |
++ * TICK_SCHED_REMOTE_RUNNING
++ *
++ *
++ * Other transitions get WARN_ON_ONCE(), except that sched_tick_remote()
++ * and sched_tick_start() are happy to leave the state in RUNNING.
++ */
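++
++/*
++ * In the code below, sched_tick_start() and sched_tick_stop() drive the
++ * OFFLINE -> RUNNING and RUNNING -> OFFLINING transitions with atomic_xchg(),
++ * while sched_tick_remote() retires OFFLINING -> OFFLINE via
++ * atomic_fetch_add_unless(&twork->state, -1, TICK_SCHED_REMOTE_RUNNING),
++ * i.e. it decrements the state unless the tick is still RUNNING.
++ */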
++
++static struct tick_work __percpu *tick_work_cpu;
++
++static void sched_tick_remote(struct work_struct *work)
++{
++ struct delayed_work *dwork = to_delayed_work(work);
++ struct tick_work *twork = container_of(dwork, struct tick_work, work);
++ int cpu = twork->cpu;
++ struct rq *rq = cpu_rq(cpu);
++ int os;
++
++ /*
++ * Handle the tick only if it appears the remote CPU is running in full
++ * dynticks mode. The check is racy by nature, but missing a tick or
++ * having one too many is no big deal because the scheduler tick updates
++ * statistics and checks timeslices in a time-independent way, regardless
++ * of when exactly it is running.
++ */
++ if (tick_nohz_tick_stopped_cpu(cpu)) {
++ guard(raw_spinlock_irqsave)(&rq->lock);
++ struct task_struct *curr = rq->curr;
++
++ if (cpu_online(cpu)) {
++ update_rq_clock(rq);
++
++ if (!is_idle_task(curr)) {
++ /*
++ * Make sure the next tick runs within a
++ * reasonable amount of time.
++ */
++ u64 delta = rq_clock_task(rq) - curr->last_ran;
++ WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
++ }
++ scheduler_task_tick(rq);
++
++ calc_load_nohz_remote(rq);
++ }
++ }
++
++ /*
++ * Run the remote tick once per second (1Hz). This arbitrary
++ * frequency is large enough to avoid overload but short enough
++ * to keep scheduler internal stats reasonably up to date. But
++ * first update state to reflect hotplug activity if required.
++ */
++ os = atomic_fetch_add_unless(&twork->state, -1, TICK_SCHED_REMOTE_RUNNING);
++ WARN_ON_ONCE(os == TICK_SCHED_REMOTE_OFFLINE);
++ if (os == TICK_SCHED_REMOTE_RUNNING)
++ queue_delayed_work(system_unbound_wq, dwork, HZ);
++}
++
++static void sched_tick_start(int cpu)
++{
++ int os;
++ struct tick_work *twork;
++
++ if (housekeeping_cpu(cpu, HK_TYPE_KERNEL_NOISE))
++ return;
++
++ WARN_ON_ONCE(!tick_work_cpu);
++
++ twork = per_cpu_ptr(tick_work_cpu, cpu);
++ os = atomic_xchg(&twork->state, TICK_SCHED_REMOTE_RUNNING);
++ WARN_ON_ONCE(os == TICK_SCHED_REMOTE_RUNNING);
++ if (os == TICK_SCHED_REMOTE_OFFLINE) {
++ twork->cpu = cpu;
++ INIT_DELAYED_WORK(&twork->work, sched_tick_remote);
++ queue_delayed_work(system_unbound_wq, &twork->work, HZ);
++ }
++}
++
++#ifdef CONFIG_HOTPLUG_CPU
++static void sched_tick_stop(int cpu)
++{
++ struct tick_work *twork;
++ int os;
++
++ if (housekeeping_cpu(cpu, HK_TYPE_KERNEL_NOISE))
++ return;
++
++ WARN_ON_ONCE(!tick_work_cpu);
++
++ twork = per_cpu_ptr(tick_work_cpu, cpu);
++ /* There cannot be competing actions, but don't rely on stop-machine. */
++ os = atomic_xchg(&twork->state, TICK_SCHED_REMOTE_OFFLINING);
++ WARN_ON_ONCE(os != TICK_SCHED_REMOTE_RUNNING);
++ /* Don't cancel, as this would mess up the state machine. */
++}
++#endif /* CONFIG_HOTPLUG_CPU */
++
++int __init sched_tick_offload_init(void)
++{
++ tick_work_cpu = alloc_percpu(struct tick_work);
++ BUG_ON(!tick_work_cpu);
++ return 0;
++}
++
++#else /* !CONFIG_NO_HZ_FULL: */
++static inline void sched_tick_start(int cpu) { }
++static inline void sched_tick_stop(int cpu) { }
++#endif /* !CONFIG_NO_HZ_FULL */
++
++#if defined(CONFIG_PREEMPTION) && (defined(CONFIG_DEBUG_PREEMPT) || \
++ defined(CONFIG_PREEMPT_TRACER))
++/*
++ * If the value passed in is equal to the current preempt count
++ * then we just disabled preemption. Start timing the latency.
++ */
++static inline void preempt_latency_start(int val)
++{
++ if (preempt_count() == val) {
++ unsigned long ip = get_lock_parent_ip();
++#ifdef CONFIG_DEBUG_PREEMPT
++ current->preempt_disable_ip = ip;
++#endif
++ trace_preempt_off(CALLER_ADDR0, ip);
++ }
++}
++
++void preempt_count_add(int val)
++{
++#ifdef CONFIG_DEBUG_PREEMPT
++ /*
++ * Underflow?
++ */
++ if (DEBUG_LOCKS_WARN_ON((preempt_count() < 0)))
++ return;
++#endif
++ __preempt_count_add(val);
++#ifdef CONFIG_DEBUG_PREEMPT
++ /*
++ * Spinlock count overflowing soon?
++ */
++ DEBUG_LOCKS_WARN_ON((preempt_count() & PREEMPT_MASK) >=
++ PREEMPT_MASK - 10);
++#endif
++ preempt_latency_start(val);
++}
++EXPORT_SYMBOL(preempt_count_add);
++NOKPROBE_SYMBOL(preempt_count_add);
++
++/*
++ * If the value passed in is equal to the current preempt count
++ * then we just enabled preemption. Stop timing the latency.
++ */
++static inline void preempt_latency_stop(int val)
++{
++ if (preempt_count() == val)
++ trace_preempt_on(CALLER_ADDR0, get_lock_parent_ip());
++}
++
++void preempt_count_sub(int val)
++{
++#ifdef CONFIG_DEBUG_PREEMPT
++ /*
++ * Underflow?
++ */
++ if (DEBUG_LOCKS_WARN_ON(val > preempt_count()))
++ return;
++ /*
++ * Is the spinlock portion underflowing?
++ */
++ if (DEBUG_LOCKS_WARN_ON((val < PREEMPT_MASK) &&
++ !(preempt_count() & PREEMPT_MASK)))
++ return;
++#endif
++
++ preempt_latency_stop(val);
++ __preempt_count_sub(val);
++}
++EXPORT_SYMBOL(preempt_count_sub);
++NOKPROBE_SYMBOL(preempt_count_sub);
++
++#else
++static inline void preempt_latency_start(int val) { }
++static inline void preempt_latency_stop(int val) { }
++#endif
++
++static inline unsigned long get_preempt_disable_ip(struct task_struct *p)
++{
++#ifdef CONFIG_DEBUG_PREEMPT
++ return p->preempt_disable_ip;
++#else
++ return 0;
++#endif
++}
++
++/*
++ * Print scheduling while atomic bug:
++ */
++static noinline void __schedule_bug(struct task_struct *prev)
++{
++ /* Save this before calling printk(), since that will clobber it */
++ unsigned long preempt_disable_ip = get_preempt_disable_ip(current);
++
++ if (oops_in_progress)
++ return;
++
++ printk(KERN_ERR "BUG: scheduling while atomic: %s/%d/0x%08x\n",
++ prev->comm, prev->pid, preempt_count());
++
++ debug_show_held_locks(prev);
++ print_modules();
++ if (irqs_disabled())
++ print_irqtrace_events(prev);
++ if (IS_ENABLED(CONFIG_DEBUG_PREEMPT)) {
++ pr_err("Preemption disabled at:");
++ print_ip_sym(KERN_ERR, preempt_disable_ip);
++ }
++ check_panic_on_warn("scheduling while atomic");
++
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++}
++
++/*
++ * Various schedule()-time debugging checks and statistics:
++ */
++static inline void schedule_debug(struct task_struct *prev, bool preempt)
++{
++#ifdef CONFIG_SCHED_STACK_END_CHECK
++ if (task_stack_end_corrupted(prev))
++ panic("corrupted stack end detected inside scheduler\n");
++
++ if (task_scs_end_corrupted(prev))
++ panic("corrupted shadow stack detected inside scheduler\n");
++#endif
++
++#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
++ if (!preempt && READ_ONCE(prev->__state) && prev->non_block_count) {
++ printk(KERN_ERR "BUG: scheduling in a non-blocking section: %s/%d/%i\n",
++ prev->comm, prev->pid, prev->non_block_count);
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++ }
++#endif
++
++ if (unlikely(in_atomic_preempt_off())) {
++ __schedule_bug(prev);
++ preempt_count_set(PREEMPT_DISABLED);
++ }
++ rcu_sleep_check();
++ WARN_ON_ONCE(ct_state() == CT_STATE_USER);
++
++ profile_hit(SCHED_PROFILING, __builtin_return_address(0));
++
++ schedstat_inc(this_rq()->sched_count);
++}
++
++#ifdef ALT_SCHED_DEBUG
++void alt_sched_debug(void)
++{
++ printk(KERN_INFO "sched: pending: 0x%04lx, idle: 0x%04lx, sg_idle: 0x%04lx,"
++ " ecore_idle: 0x%04lx\n",
++ sched_rq_pending_mask.bits[0],
++ sched_idle_mask->bits[0],
++ sched_pcore_idle_mask->bits[0],
++ sched_ecore_idle_mask->bits[0]);
++}
++#endif
++
++
++#ifdef CONFIG_PREEMPT_RT
++#define SCHED_NR_MIGRATE_BREAK 8
++#else /* !CONFIG_PREEMPT_RT: */
++#define SCHED_NR_MIGRATE_BREAK 32
++#endif /* !CONFIG_PREEMPT_RT */
++
++__read_mostly unsigned int sysctl_sched_nr_migrate = SCHED_NR_MIGRATE_BREAK;
++
++/*
++ * Migrate pending tasks in @rq to @dest_cpu
++ */
++static inline int
++migrate_pending_tasks(struct rq *rq, struct rq *dest_rq, const int dest_cpu)
++{
++ struct task_struct *p, *skip = rq->curr;
++ int nr_migrated = 0;
++ int nr_tries = min(rq->nr_running / 2, sysctl_sched_nr_migrate);
++
++ /* Workaround to check that rq->curr is still on the rq */
++ if (!task_on_rq_queued(skip))
++ return 0;
++
++ while (skip != rq->idle && nr_tries &&
++ (p = sched_rq_next_task(skip, rq)) != rq->idle) {
++ skip = sched_rq_next_task(p, rq);
++ if (cpumask_test_cpu(dest_cpu, p->cpus_ptr)) {
++ __SCHED_DEQUEUE_TASK(p, rq, 0, );
++ set_task_cpu(p, dest_cpu);
++ sched_task_sanity_check(p, dest_rq);
++ sched_mm_cid_migrate_to(dest_rq, p);
++ __SCHED_ENQUEUE_TASK(p, dest_rq, 0, );
++ nr_migrated++;
++ }
++ nr_tries--;
++ }
++
++ return nr_migrated;
++}
++
++static inline int take_other_rq_tasks(struct rq *rq, int cpu)
++{
++ cpumask_t *topo_mask, *end_mask, chk;
++
++ if (unlikely(!rq->online))
++ return 0;
++
++ if (cpumask_empty(&sched_rq_pending_mask))
++ return 0;
++
++ topo_mask = per_cpu(sched_cpu_topo_masks, cpu);
++ end_mask = per_cpu(sched_cpu_topo_end_mask, cpu);
++ do {
++ int i;
++
++ if (!cpumask_and(&chk, &sched_rq_pending_mask, topo_mask))
++ continue;
++
++ for_each_cpu_wrap(i, &chk, cpu) {
++ int nr_migrated;
++ struct rq *src_rq;
++
++ src_rq = cpu_rq(i);
++ if (!do_raw_spin_trylock(&src_rq->lock))
++ continue;
++ spin_acquire(&src_rq->lock.dep_map,
++ SINGLE_DEPTH_NESTING, 1, _RET_IP_);
++
++ if ((nr_migrated = migrate_pending_tasks(src_rq, rq, cpu))) {
++ sub_nr_running(src_rq, nr_migrated);
++
++ spin_release(&src_rq->lock.dep_map, _RET_IP_);
++ do_raw_spin_unlock(&src_rq->lock);
++
++ add_nr_running(rq, nr_migrated);
++
++ update_sched_preempt_mask(rq);
++ cpufreq_update_util(rq, 0);
++
++ return 1;
++ }
++
++ spin_release(&src_rq->lock.dep_map, _RET_IP_);
++ do_raw_spin_unlock(&src_rq->lock);
++ }
++ } while (++topo_mask < end_mask);
++
++ return 0;
++}
++
++static inline void time_slice_expired(struct task_struct *p, struct rq *rq)
++{
++ p->time_slice = sysctl_sched_base_slice;
++
++ sched_task_renew(p, rq);
++
++ if (SCHED_FIFO != p->policy && task_on_rq_queued(p))
++ requeue_task(p, rq);
++}
++
++static inline int balance_select_task_rq(struct task_struct *p, cpumask_t *avail_mask)
++{
++ cpumask_t mask;
++
++ if (!preempt_mask_check(&mask, avail_mask, task_sched_prio(p)))
++ return -1;
++
++ if (cpumask_and(&mask, &mask, p->cpus_ptr))
++ return best_mask_cpu(task_cpu(p), &mask);
++
++ return task_cpu(p);
++}
++
++static inline void
++__move_queued_task(struct rq *rq, struct task_struct *p, struct rq *dest_rq, int dest_cpu)
++{
++ WRITE_ONCE(p->on_rq, TASK_ON_RQ_MIGRATING);
++ dequeue_task(p, rq, 0);
++ set_task_cpu(p, dest_cpu);
++
++ sched_mm_cid_migrate_to(dest_rq, p);
++
++ sched_task_sanity_check(p, dest_rq);
++ enqueue_task(p, dest_rq, 0);
++ WRITE_ONCE(p->on_rq, TASK_ON_RQ_QUEUED);
++ wakeup_preempt(dest_rq);
++}
++
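++/*
++ * prio_balance - periodically push queued tasks away from a busy CPU.
++ *
++ * Invoked from __schedule() below when the CPU keeps running its current
++ * task: at most once every two base slices, and only while no CPU in the
++ * system is idle, walk the tasks queued behind rq->curr and try to move
++ * each migratable one to another active CPU chosen by
++ * balance_select_task_rq(), taking the destination rq lock
++ * opportunistically with a trylock.
++ */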
++static inline void prio_balance(struct rq *rq, const int cpu)
++{
++ struct task_struct *p, *next;
++ cpumask_t mask;
++
++ if (!rq->online)
++ return;
++
++ if (!cpumask_empty(sched_idle_mask))
++ return;
++
++ if (0 == rq->prio_balance_time)
++ return;
++
++ if (rq->clock - rq->prio_balance_time < sysctl_sched_base_slice << 1)
++ return;
++
++ rq->prio_balance_time = rq->clock;
++
++ cpumask_copy(&mask, cpu_active_mask);
++ cpumask_clear_cpu(cpu, &mask);
++
++ p = sched_rq_next_task(rq->curr, rq);
++ while (p != rq->idle) {
++ next = sched_rq_next_task(p, rq);
++ if (!is_migration_disabled(p)) {
++ int dest_cpu;
++
++ dest_cpu = balance_select_task_rq(p, &mask);
++ if (dest_cpu < 0)
++ return;
++
++ if (cpu != dest_cpu) {
++ struct rq *dest_rq = cpu_rq(dest_cpu);
++
++ if (do_raw_spin_trylock(&dest_rq->lock)) {
++ cpumask_clear_cpu(dest_cpu, &mask);
++
++ spin_acquire(&dest_rq->lock.dep_map,
++ SINGLE_DEPTH_NESTING, 1, _RET_IP_);
++
++ __move_queued_task(rq, p, dest_rq, dest_cpu);
++
++ spin_release(&dest_rq->lock.dep_map, _RET_IP_);
++ do_raw_spin_unlock(&dest_rq->lock);
++ }
++ }
++ }
++ p = next;
++ }
++}
++
++/*
++ * Timeslices below RESCHED_NS are considered as good as expired as there's no
++ * point rescheduling when there's so little time left.
++ */
++static inline void check_curr(struct task_struct *p, struct rq *rq)
++{
++ if (unlikely(rq->idle == p))
++ return;
++
++ update_curr(rq, p);
++
++ if (p->time_slice < RESCHED_NS)
++ time_slice_expired(p, rq);
++}
++
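++/*
++ * Pick what runs next on @cpu: take the first task queued on @rq. If that
++ * turns out to be the idle task, first try to pull pending work from
++ * topologically close CPUs via take_other_rq_tasks(); only when nothing can
++ * be pulled (and after giving any rq->balance_func a chance to run) does
++ * the CPU actually go idle.
++ */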
++static inline struct task_struct *
++choose_next_task(struct rq *rq, int cpu)
++{
++ struct task_struct *next = sched_rq_first_task(rq);
++
++ if (next == rq->idle) {
++ if (!take_other_rq_tasks(rq, cpu)) {
++ if (likely(rq->balance_func && rq->online))
++ rq->balance_func(rq, cpu);
++
++ schedstat_inc(rq->sched_goidle);
++ /*printk(KERN_INFO "sched: choose_next_task(%d) idle %px\n", cpu, next);*/
++ return next;
++ }
++ next = sched_rq_first_task(rq);
++ }
++#ifdef CONFIG_SCHED_HRTICK
++ hrtick_start(rq, next->time_slice);
++#endif
++ /*printk(KERN_INFO "sched: choose_next_task(%d) next %px\n", cpu, next);*/
++ return next;
++}
++
++/*
++ * Constants for the sched_mode argument of __schedule().
++ *
++ * The mode argument allows RT enabled kernels to differentiate a
++ * preemption from blocking on an 'sleeping' spin/rwlock.
++ */
++#define SM_IDLE (-1)
++#define SM_NONE 0
++#define SM_PREEMPT 1
++#define SM_RTLOCK_WAIT 2
++
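++/*
++ * In this file, schedule()/__schedule_loop() pass SM_NONE, schedule_idle()
++ * passes SM_IDLE, the preempt_schedule*() paths pass SM_PREEMPT and, on
++ * PREEMPT_RT, schedule_rtlock() passes SM_RTLOCK_WAIT.
++ */
++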
++/*
++ * Helper function for __schedule()
++ *
++ * If a task does not have signals pending, deactivate it.
++ * Otherwise mark the task's __state as TASK_RUNNING.
++ */
++static bool try_to_block_task(struct rq *rq, struct task_struct *p,
++ unsigned long *task_state_p)
++{
++ unsigned long task_state = *task_state_p;
++ if (signal_pending_state(task_state, p)) {
++ WRITE_ONCE(p->__state, TASK_RUNNING);
++ *task_state_p = TASK_RUNNING;
++ return false;
++ }
++ p->sched_contributes_to_load =
++ (task_state & TASK_UNINTERRUPTIBLE) &&
++ !(task_state & TASK_NOLOAD) &&
++ !(task_state & TASK_FROZEN);
++
++ /*
++ * __schedule() ttwu()
++ * prev_state = prev->state; if (p->on_rq && ...)
++ * if (prev_state) goto out;
++ * p->on_rq = 0; smp_acquire__after_ctrl_dep();
++ * p->state = TASK_WAKING
++ *
++ * Where __schedule() and ttwu() have matching control dependencies.
++ *
++ * After this, schedule() must not care about p->state any more.
++ */
++ sched_task_deactivate(p, rq);
++ block_task(rq, p);
++ return true;
++}
++
++/*
++ * schedule() is the main scheduler function.
++ *
++ * The main means of driving the scheduler and thus entering this function are:
++ *
++ * 1. Explicit blocking: mutex, semaphore, waitqueue, etc.
++ *
++ * 2. TIF_NEED_RESCHED flag is checked on interrupt and userspace return
++ * paths. For example, see arch/x86/entry_64.S.
++ *
++ * To drive preemption between tasks, the scheduler sets the flag in timer
++ * interrupt handler sched_tick().
++ *
++ * 3. Wakeups don't really cause entry into schedule(). They add a
++ * task to the run-queue and that's it.
++ *
++ * Now, if the new task added to the run-queue preempts the current
++ * task, then the wakeup sets TIF_NEED_RESCHED and schedule() gets
++ * called on the nearest possible occasion:
++ *
++ * - If the kernel is preemptible (CONFIG_PREEMPTION=y):
++ *
++ * - in syscall or exception context, at the next outermost
++ * preempt_enable(). (this might be as soon as the wake_up()'s
++ * spin_unlock()!)
++ *
++ * - in IRQ context, return from interrupt-handler to
++ * preemptible context
++ *
++ * - If the kernel is not preemptible (CONFIG_PREEMPTION is not set)
++ * then at the next:
++ *
++ * - cond_resched() call
++ * - explicit schedule() call
++ * - return from syscall or exception to user-space
++ * - return from interrupt-handler to user-space
++ *
++ * WARNING: must be called with preemption disabled!
++ */
++static void __sched notrace __schedule(int sched_mode)
++{
++ struct task_struct *prev, *next;
++ /*
++ * On PREEMPT_RT kernel, SM_RTLOCK_WAIT is noted
++ * as a preemption by schedule_debug() and RCU.
++ */
++ bool preempt = sched_mode > SM_NONE;
++ bool is_switch = false;
++ unsigned long *switch_count;
++ unsigned long prev_state;
++ struct rq *rq;
++ int cpu;
++
++ /* Trace preemptions consistently with task switches */
++ trace_sched_entry_tp(preempt);
++
++ cpu = smp_processor_id();
++ rq = cpu_rq(cpu);
++ prev = rq->curr;
++
++ schedule_debug(prev, preempt);
++
++ /* bypass the sched_feat(HRTICK) check, which Alt schedule FW doesn't support */
++ hrtick_clear(rq);
++
++ klp_sched_try_switch(prev);
++
++ local_irq_disable();
++ rcu_note_context_switch(preempt);
++
++ /*
++ * Make sure that signal_pending_state()->signal_pending() below
++ * can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
++ * done by the caller to avoid the race with signal_wake_up():
++ *
++ * __set_current_state(@state) signal_wake_up()
++ * schedule() set_tsk_thread_flag(p, TIF_SIGPENDING)
++ * wake_up_state(p, state)
++ * LOCK rq->lock LOCK p->pi_state
++ * smp_mb__after_spinlock() smp_mb__after_spinlock()
++ * if (signal_pending_state()) if (p->state & @state)
++ *
++ * Also, the membarrier system call requires a full memory barrier
++ * after coming from user-space, before storing to rq->curr; this
++ * barrier matches a full barrier in the proximity of the membarrier
++ * system call exit.
++ */
++ raw_spin_lock(&rq->lock);
++ smp_mb__after_spinlock();
++
++ update_rq_clock(rq);
++
++ switch_count = &prev->nivcsw;
++
++ /* Task state changes only considers SM_PREEMPT as preemption */
++ preempt = sched_mode == SM_PREEMPT;
++
++ /*
++ * We must load prev->state once (task_struct::state is volatile), such
++ * that we form a control dependency vs deactivate_task() below.
++ */
++ prev_state = READ_ONCE(prev->__state);
++ if (sched_mode == SM_IDLE) {
++ if (!rq->nr_running) {
++ next = prev;
++ goto picked;
++ }
++ } else if (!preempt && prev_state) {
++ try_to_block_task(rq, prev, &prev_state);
++ switch_count = &prev->nvcsw;
++ }
++
++ check_curr(prev, rq);
++
++ next = choose_next_task(rq, cpu);
++picked:
++ clear_tsk_need_resched(prev);
++ clear_preempt_need_resched();
++ rq->last_seen_need_resched_ns = 0;
++
++ is_switch = prev != next;
++ if (likely(is_switch)) {
++ next->last_ran = rq->clock_task;
++
++ /*printk(KERN_INFO "sched: %px -> %px\n", prev, next);*/
++ rq->nr_switches++;
++ /*
++ * RCU users of rcu_dereference(rq->curr) may not see
++ * changes to task_struct made by pick_next_task().
++ */
++ RCU_INIT_POINTER(rq->curr, next);
++ /*
++ * The membarrier system call requires each architecture
++ * to have a full memory barrier after updating
++ * rq->curr, before returning to user-space.
++ *
++ * Here are the schemes providing that barrier on the
++ * various architectures:
++ * - mm ? switch_mm() : mmdrop() for x86, s390, sparc, PowerPC,
++ * RISC-V. switch_mm() relies on membarrier_arch_switch_mm()
++ * on PowerPC and on RISC-V.
++ * - finish_lock_switch() for weakly-ordered
++ * architectures where spin_unlock is a full barrier,
++ * - switch_to() for arm64 (weakly-ordered, spin_unlock
++ * is a RELEASE barrier),
++ *
++ * The barrier matches a full barrier in the proximity of
++ * the membarrier system call entry.
++ *
++ * On RISC-V, this barrier pairing is also needed for the
++ * SYNC_CORE command when switching between processes, cf.
++ * the inline comments in membarrier_arch_switch_mm().
++ */
++ ++*switch_count;
++
++ trace_sched_switch(preempt, prev, next, prev_state);
++
++ /* Also unlocks the rq: */
++ rq = context_switch(rq, prev, next);
++
++ cpu = cpu_of(rq);
++ } else {
++ __balance_callbacks(rq);
++ prio_balance(rq, cpu);
++ raw_spin_unlock_irq(&rq->lock);
++ }
++ trace_sched_exit_tp(is_switch);
++}
++
++void __noreturn do_task_dead(void)
++{
++ /* Causes final put_task_struct in finish_task_switch(): */
++ set_special_state(TASK_DEAD);
++
++ /* Tell freezer to ignore us: */
++ current->flags |= PF_NOFREEZE;
++
++ __schedule(SM_NONE);
++ BUG();
++
++ /* Avoid "noreturn function does return" - but don't continue if BUG() is a NOP: */
++ for (;;)
++ cpu_relax();
++}
++
++static inline void sched_submit_work(struct task_struct *tsk)
++{
++ static DEFINE_WAIT_OVERRIDE_MAP(sched_map, LD_WAIT_CONFIG);
++ unsigned int task_flags;
++
++ /*
++ * Establish LD_WAIT_CONFIG context to ensure none of the code called
++ * will use a blocking primitive -- which would lead to recursion.
++ */
++ lock_map_acquire_try(&sched_map);
++
++ task_flags = tsk->flags;
++ /*
++ * If a worker goes to sleep, notify and ask workqueue whether it
++ * wants to wake up a task to maintain concurrency.
++ */
++ if (task_flags & PF_WQ_WORKER)
++ wq_worker_sleeping(tsk);
++ else if (task_flags & PF_IO_WORKER)
++ io_wq_worker_sleeping(tsk);
++
++ /*
++ * spinlock and rwlock must not flush block requests. This will
++ * deadlock if the callback attempts to acquire a lock which is
++ * already acquired.
++ */
++ WARN_ON_ONCE(current->__state & TASK_RTLOCK_WAIT);
++
++ /*
++ * If we are going to sleep and we have plugged IO queued,
++ * make sure to submit it to avoid deadlocks.
++ */
++ blk_flush_plug(tsk->plug, true);
++
++ lock_map_release(&sched_map);
++}
++
++static void sched_update_worker(struct task_struct *tsk)
++{
++ if (tsk->flags & (PF_WQ_WORKER | PF_IO_WORKER | PF_BLOCK_TS)) {
++ if (tsk->flags & PF_BLOCK_TS)
++ blk_plug_invalidate_ts(tsk);
++ if (tsk->flags & PF_WQ_WORKER)
++ wq_worker_running(tsk);
++ else if (tsk->flags & PF_IO_WORKER)
++ io_wq_worker_running(tsk);
++ }
++}
++
++static __always_inline void __schedule_loop(int sched_mode)
++{
++ do {
++ preempt_disable();
++ __schedule(sched_mode);
++ sched_preempt_enable_no_resched();
++ } while (need_resched());
++}
++
++asmlinkage __visible void __sched schedule(void)
++{
++ struct task_struct *tsk = current;
++
++#ifdef CONFIG_RT_MUTEXES
++ lockdep_assert(!tsk->sched_rt_mutex);
++#endif
++
++ if (!task_is_running(tsk))
++ sched_submit_work(tsk);
++ __schedule_loop(SM_NONE);
++ sched_update_worker(tsk);
++}
++EXPORT_SYMBOL(schedule);
++
++/*
++ * synchronize_rcu_tasks() makes sure that no task is stuck in preempted
++ * state (have scheduled out non-voluntarily) by making sure that all
++ * tasks have either left the run queue or have gone into user space.
++ * As idle tasks do not do either, they must not ever be preempted
++ * (schedule out non-voluntarily).
++ *
++ * schedule_idle() is similar to schedule_preempt_disabled() except that it
++ * never enables preemption because it does not call sched_submit_work().
++ */
++void __sched schedule_idle(void)
++{
++ /*
++ * As this skips calling sched_submit_work(), which the idle task does
++ * regardless because that function is a NOP when the task is in a
++ * TASK_RUNNING state, make sure this isn't used someplace that the
++ * current task can be in any other state. Note, idle is always in the
++ * TASK_RUNNING state.
++ */
++ WARN_ON_ONCE(current->__state);
++ do {
++ __schedule(SM_IDLE);
++ } while (need_resched());
++}
++
++#if defined(CONFIG_CONTEXT_TRACKING_USER) && !defined(CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK)
++asmlinkage __visible void __sched schedule_user(void)
++{
++ /*
++ * If we come here after a random call to set_need_resched(),
++ * or we have been woken up remotely but the IPI has not yet arrived,
++ * we haven't yet exited the RCU idle mode. Do it here manually until
++ * we find a better solution.
++ *
++ * NB: There are buggy callers of this function. Ideally we
++ * should warn if prev_state != CT_STATE_USER, but that will trigger
++ * too frequently to make sense yet.
++ */
++ enum ctx_state prev_state = exception_enter();
++ schedule();
++ exception_exit(prev_state);
++}
++#endif
++
++/**
++ * schedule_preempt_disabled - called with preemption disabled
++ *
++ * Returns with preemption disabled. Note: preempt_count must be 1
++ */
++void __sched schedule_preempt_disabled(void)
++{
++ sched_preempt_enable_no_resched();
++ schedule();
++ preempt_disable();
++}
++
++#ifdef CONFIG_PREEMPT_RT
++void __sched notrace schedule_rtlock(void)
++{
++ __schedule_loop(SM_RTLOCK_WAIT);
++}
++NOKPROBE_SYMBOL(schedule_rtlock);
++#endif
++
++static void __sched notrace preempt_schedule_common(void)
++{
++ do {
++ /*
++ * Because the function tracer can trace preempt_count_sub()
++ * and it also uses preempt_enable/disable_notrace(), if
++ * NEED_RESCHED is set, the preempt_enable_notrace() called
++ * by the function tracer will call this function again and
++ * cause infinite recursion.
++ *
++ * Preemption must be disabled here before the function
++ * tracer can trace. Break up preempt_disable() into two
++ * calls. One to disable preemption without fear of being
++ * traced. The other to still record the preemption latency,
++ * which can also be traced by the function tracer.
++ */
++ preempt_disable_notrace();
++ preempt_latency_start(1);
++ __schedule(SM_PREEMPT);
++ preempt_latency_stop(1);
++ preempt_enable_no_resched_notrace();
++
++ /*
++ * Check again in case we missed a preemption opportunity
++ * between schedule and now.
++ */
++ } while (need_resched());
++}
++
++#ifdef CONFIG_PREEMPTION
++/*
++ * This is the entry point to schedule() from in-kernel preemption
++ * off of preempt_enable.
++ */
++asmlinkage __visible void __sched notrace preempt_schedule(void)
++{
++ /*
++ * If there is a non-zero preempt_count or interrupts are disabled,
++ * we do not want to preempt the current task. Just return..
++ */
++ if (likely(!preemptible()))
++ return;
++
++ preempt_schedule_common();
++}
++NOKPROBE_SYMBOL(preempt_schedule);
++EXPORT_SYMBOL(preempt_schedule);
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++# ifdef CONFIG_HAVE_PREEMPT_DYNAMIC_CALL
++# ifndef preempt_schedule_dynamic_enabled
++# define preempt_schedule_dynamic_enabled preempt_schedule
++# define preempt_schedule_dynamic_disabled NULL
++# endif
++DEFINE_STATIC_CALL(preempt_schedule, preempt_schedule_dynamic_enabled);
++EXPORT_STATIC_CALL_TRAMP(preempt_schedule);
++# elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
++static DEFINE_STATIC_KEY_TRUE(sk_dynamic_preempt_schedule);
++void __sched notrace dynamic_preempt_schedule(void)
++{
++ if (!static_branch_unlikely(&sk_dynamic_preempt_schedule))
++ return;
++ preempt_schedule();
++}
++NOKPROBE_SYMBOL(dynamic_preempt_schedule);
++EXPORT_SYMBOL(dynamic_preempt_schedule);
++# endif
++#endif /* CONFIG_PREEMPT_DYNAMIC */
++
++/**
++ * preempt_schedule_notrace - preempt_schedule called by tracing
++ *
++ * The tracing infrastructure uses preempt_enable_notrace to prevent
++ * recursion and tracing preempt enabling caused by the tracing
++ * infrastructure itself. But as tracing can happen in areas coming
++ * from userspace or just about to enter userspace, a preempt enable
++ * can occur before user_exit() is called. This will cause the scheduler
++ * to be called when the system is still in usermode.
++ *
++ * To prevent this, the preempt_enable_notrace will use this function
++ * instead of preempt_schedule() to exit user context if needed before
++ * calling the scheduler.
++ */
++asmlinkage __visible void __sched notrace preempt_schedule_notrace(void)
++{
++ enum ctx_state prev_ctx;
++
++ if (likely(!preemptible()))
++ return;
++
++ do {
++ /*
++ * Because the function tracer can trace preempt_count_sub()
++ * and it also uses preempt_enable/disable_notrace(), if
++ * NEED_RESCHED is set, the preempt_enable_notrace() called
++ * by the function tracer will call this function again and
++ * cause infinite recursion.
++ *
++ * Preemption must be disabled here before the function
++ * tracer can trace. Break up preempt_disable() into two
++ * calls. One to disable preemption without fear of being
++ * traced. The other to still record the preemption latency,
++ * which can also be traced by the function tracer.
++ */
++ preempt_disable_notrace();
++ preempt_latency_start(1);
++ /*
++ * Needs preempt disabled in case user_exit() is traced
++ * and the tracer calls preempt_enable_notrace() causing
++ * an infinite recursion.
++ */
++ prev_ctx = exception_enter();
++ __schedule(SM_PREEMPT);
++ exception_exit(prev_ctx);
++
++ preempt_latency_stop(1);
++ preempt_enable_no_resched_notrace();
++ } while (need_resched());
++}
++EXPORT_SYMBOL_GPL(preempt_schedule_notrace);
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++# ifdef CONFIG_HAVE_PREEMPT_DYNAMIC_CALL
++# ifndef preempt_schedule_notrace_dynamic_enabled
++# define preempt_schedule_notrace_dynamic_enabled preempt_schedule_notrace
++# define preempt_schedule_notrace_dynamic_disabled NULL
++# endif
++DEFINE_STATIC_CALL(preempt_schedule_notrace, preempt_schedule_notrace_dynamic_enabled);
++EXPORT_STATIC_CALL_TRAMP(preempt_schedule_notrace);
++# elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
++static DEFINE_STATIC_KEY_TRUE(sk_dynamic_preempt_schedule_notrace);
++void __sched notrace dynamic_preempt_schedule_notrace(void)
++{
++ if (!static_branch_unlikely(&sk_dynamic_preempt_schedule_notrace))
++ return;
++ preempt_schedule_notrace();
++}
++NOKPROBE_SYMBOL(dynamic_preempt_schedule_notrace);
++EXPORT_SYMBOL(dynamic_preempt_schedule_notrace);
++# endif
++#endif /* CONFIG_PREEMPT_DYNAMIC */
++
++#endif /* CONFIG_PREEMPTION */
++
++/*
++ * This is the entry point to schedule() from kernel preemption
++ * off of IRQ context.
++ * Note that this is called and returns with IRQs disabled. This will
++ * protect us against recursive calling from IRQ contexts.
++ */
++asmlinkage __visible void __sched preempt_schedule_irq(void)
++{
++ enum ctx_state prev_state;
++
++ /* Catch callers which need to be fixed */
++ BUG_ON(preempt_count() || !irqs_disabled());
++
++ prev_state = exception_enter();
++
++ do {
++ preempt_disable();
++ local_irq_enable();
++ __schedule(SM_PREEMPT);
++ local_irq_disable();
++ sched_preempt_enable_no_resched();
++ } while (need_resched());
++
++ exception_exit(prev_state);
++}
++
++int default_wake_function(wait_queue_entry_t *curr, unsigned mode, int wake_flags,
++ void *key)
++{
++ WARN_ON_ONCE(wake_flags & ~(WF_SYNC|WF_CURRENT_CPU));
++ return try_to_wake_up(curr->private, mode, wake_flags);
++}
++EXPORT_SYMBOL(default_wake_function);
++
++void check_task_changed(struct task_struct *p, struct rq *rq)
++{
++ /* Trigger resched if task sched_prio has been modified. */
++ if (task_on_rq_queued(p)) {
++ update_rq_clock(rq);
++ requeue_task(p, rq);
++ wakeup_preempt(rq);
++ }
++}
++
++void __setscheduler_prio(struct task_struct *p, int prio)
++{
++ p->prio = prio;
++}
++
++#ifdef CONFIG_RT_MUTEXES
++
++/*
++ * Would be more useful with typeof()/auto_type but they don't mix with
++ * bit-fields. Since it's a local thing, use int. Keep the generic sounding
++ * name such that if someone were to implement this function we get to compare
++ * notes.
++ */
++#define fetch_and_set(x, v) ({ int _x = (x); (x) = (v); _x; })
++
++void rt_mutex_pre_schedule(void)
++{
++ lockdep_assert(!fetch_and_set(current->sched_rt_mutex, 1));
++ sched_submit_work(current);
++}
++
++void rt_mutex_schedule(void)
++{
++ lockdep_assert(current->sched_rt_mutex);
++ __schedule_loop(SM_NONE);
++}
++
++void rt_mutex_post_schedule(void)
++{
++ sched_update_worker(current);
++ lockdep_assert(fetch_and_set(current->sched_rt_mutex, 0));
++}
++
++/*
++ * rt_mutex_setprio - set the current priority of a task
++ * @p: task to boost
++ * @pi_task: donor task
++ *
++ * This function changes the 'effective' priority of a task. It does
++ * not touch ->normal_prio like __setscheduler().
++ *
++ * Used by the rt_mutex code to implement priority inheritance
++ * logic. Call site only calls if the priority of the task changed.
++ */
++void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task)
++{
++ int prio;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++
++ /* XXX used to be waiter->prio, not waiter->task->prio */
++ prio = __rt_effective_prio(pi_task, p->normal_prio);
++
++ /*
++ * If nothing changed; bail early.
++ */
++ if (p->pi_top_task == pi_task && prio == p->prio)
++ return;
++
++ rq = __task_access_lock(p, &lock);
++ /*
++ * Set under pi_lock && rq->lock, such that the value can be used under
++ * either lock.
++ *
++ * Note that there is plenty of trickery involved in making this pointer cache work
++ * right. rt_mutex_slowunlock()+rt_mutex_postunlock() work together to
++ * ensure a task is de-boosted (pi_task is set to NULL) before the
++ * task is allowed to run again (and can exit). This ensures the pointer
++ * points to a blocked task -- which guarantees the task is present.
++ */
++ p->pi_top_task = pi_task;
++
++ /*
++ * For FIFO/RR we only need to set prio; if that matches, we're done.
++ */
++ if (prio == p->prio)
++ goto out_unlock;
++
++ /*
++ * Idle task boosting is a no-no in general. There is one
++ * exception, when PREEMPT_RT and NOHZ is active:
++ *
++ * The idle task calls get_next_timer_interrupt() and holds
++ * the timer wheel base->lock on the CPU and another CPU wants
++ * to access the timer (probably to cancel it). We can safely
++ * ignore the boosting request, as the idle CPU runs this code
++ * with interrupts disabled and will complete the lock
++ * protected section without being interrupted. So there is no
++ * real need to boost.
++ */
++ if (unlikely(p == rq->idle)) {
++ WARN_ON(p != rq->curr);
++ WARN_ON(p->pi_blocked_on);
++ goto out_unlock;
++ }
++
++ trace_sched_pi_setprio(p, pi_task);
++
++ __setscheduler_prio(p, prio);
++
++ check_task_changed(p, rq);
++out_unlock:
++ /* Avoid rq from going away on us: */
++ preempt_disable();
++
++ if (task_on_rq_queued(p))
++ __balance_callbacks(rq);
++ __task_access_unlock(p, lock);
++
++ preempt_enable();
++}
++#endif /* CONFIG_RT_MUTEXES */
++
++#if !defined(CONFIG_PREEMPTION) || defined(CONFIG_PREEMPT_DYNAMIC)
++int __sched __cond_resched(void)
++{
++ if (should_resched(0) && !irqs_disabled()) {
++ preempt_schedule_common();
++ return 1;
++ }
++ /*
++ * In PREEMPT_RCU kernels, ->rcu_read_lock_nesting tells the tick
++ * whether the current CPU is in an RCU read-side critical section,
++ * so the tick can report quiescent states even for CPUs looping
++ * in kernel context. In contrast, in non-preemptible kernels,
++ * RCU readers leave no in-memory hints, which means that CPU-bound
++ * processes executing in kernel context might never report an
++ * RCU quiescent state. Therefore, the following code causes
++ * cond_resched() to report a quiescent state, but only when RCU
++ * is in urgent need of one.
++ * A third case, preemptible, but non-PREEMPT_RCU provides for
++ * urgently needed quiescent states via rcu_flavor_sched_clock_irq().
++ */
++#ifndef CONFIG_PREEMPT_RCU
++ rcu_all_qs();
++#endif
++ return 0;
++}
++EXPORT_SYMBOL(__cond_resched);
++#endif
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++# ifdef CONFIG_HAVE_PREEMPT_DYNAMIC_CALL
++# define cond_resched_dynamic_enabled __cond_resched
++# define cond_resched_dynamic_disabled ((void *)&__static_call_return0)
++DEFINE_STATIC_CALL_RET0(cond_resched, __cond_resched);
++EXPORT_STATIC_CALL_TRAMP(cond_resched);
++
++# define might_resched_dynamic_enabled __cond_resched
++# define might_resched_dynamic_disabled ((void *)&__static_call_return0)
++DEFINE_STATIC_CALL_RET0(might_resched, __cond_resched);
++EXPORT_STATIC_CALL_TRAMP(might_resched);
++# elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
++static DEFINE_STATIC_KEY_FALSE(sk_dynamic_cond_resched);
++int __sched dynamic_cond_resched(void)
++{
++ if (!static_branch_unlikely(&sk_dynamic_cond_resched))
++ return 0;
++ return __cond_resched();
++}
++EXPORT_SYMBOL(dynamic_cond_resched);
++
++static DEFINE_STATIC_KEY_FALSE(sk_dynamic_might_resched);
++int __sched dynamic_might_resched(void)
++{
++ if (!static_branch_unlikely(&sk_dynamic_might_resched))
++ return 0;
++ return __cond_resched();
++}
++EXPORT_SYMBOL(dynamic_might_resched);
++# endif
++#endif /* CONFIG_PREEMPT_DYNAMIC */
++
++/*
++ * __cond_resched_lock() - if a reschedule is pending, drop the given lock,
++ * call schedule, and on return reacquire the lock.
++ *
++ * This works OK both with and without CONFIG_PREEMPTION. We do strange low-level
++ * operations here to prevent schedule() from being called twice (once via
++ * spin_unlock(), once by hand).
++ */
++int __cond_resched_lock(spinlock_t *lock)
++{
++ int resched = should_resched(PREEMPT_LOCK_OFFSET);
++ int ret = 0;
++
++ lockdep_assert_held(lock);
++
++ if (spin_needbreak(lock) || resched) {
++ spin_unlock(lock);
++ if (!_cond_resched())
++ cpu_relax();
++ ret = 1;
++ spin_lock(lock);
++ }
++ return ret;
++}
++EXPORT_SYMBOL(__cond_resched_lock);
++
++int __cond_resched_rwlock_read(rwlock_t *lock)
++{
++ int resched = should_resched(PREEMPT_LOCK_OFFSET);
++ int ret = 0;
++
++ lockdep_assert_held_read(lock);
++
++ if (rwlock_needbreak(lock) || resched) {
++ read_unlock(lock);
++ if (!_cond_resched())
++ cpu_relax();
++ ret = 1;
++ read_lock(lock);
++ }
++ return ret;
++}
++EXPORT_SYMBOL(__cond_resched_rwlock_read);
++
++int __cond_resched_rwlock_write(rwlock_t *lock)
++{
++ int resched = should_resched(PREEMPT_LOCK_OFFSET);
++ int ret = 0;
++
++ lockdep_assert_held_write(lock);
++
++ if (rwlock_needbreak(lock) || resched) {
++ write_unlock(lock);
++ if (!_cond_resched())
++ cpu_relax();
++ ret = 1;
++ write_lock(lock);
++ }
++ return ret;
++}
++EXPORT_SYMBOL(__cond_resched_rwlock_write);
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++
++# ifdef CONFIG_GENERIC_ENTRY
++# include <linux/entry-common.h>
++# endif
++
++/*
++ * SC:cond_resched
++ * SC:might_resched
++ * SC:preempt_schedule
++ * SC:preempt_schedule_notrace
++ * SC:irqentry_exit_cond_resched
++ *
++ *
++ * NONE:
++ * cond_resched <- __cond_resched
++ * might_resched <- RET0
++ * preempt_schedule <- NOP
++ * preempt_schedule_notrace <- NOP
++ * irqentry_exit_cond_resched <- NOP
++ * dynamic_preempt_lazy <- false
++ *
++ * VOLUNTARY:
++ * cond_resched <- __cond_resched
++ * might_resched <- __cond_resched
++ * preempt_schedule <- NOP
++ * preempt_schedule_notrace <- NOP
++ * irqentry_exit_cond_resched <- NOP
++ * dynamic_preempt_lazy <- false
++ *
++ * FULL:
++ * cond_resched <- RET0
++ * might_resched <- RET0
++ * preempt_schedule <- preempt_schedule
++ * preempt_schedule_notrace <- preempt_schedule_notrace
++ * irqentry_exit_cond_resched <- irqentry_exit_cond_resched
++ * dynamic_preempt_lazy <- false
++ *
++ * LAZY:
++ * cond_resched <- RET0
++ * might_resched <- RET0
++ * preempt_schedule <- preempt_schedule
++ * preempt_schedule_notrace <- preempt_schedule_notrace
++ * irqentry_exit_cond_resched <- irqentry_exit_cond_resched
++ * dynamic_preempt_lazy <- true
++ */
++
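++/*
++ * The table above is realized by __sched_dynamic_update() below: each mode
++ * enables or disables the corresponding static calls (or static keys). The
++ * boot-time selection comes from the "preempt=" parameter handled by
++ * setup_preempt_mode(); in mainline the mode can also be changed at run time
++ * through the debugfs "preempt" knob, which likewise ends up in
++ * sched_dynamic_update().
++ */
++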
++enum {
++ preempt_dynamic_undefined = -1,
++ preempt_dynamic_none,
++ preempt_dynamic_voluntary,
++ preempt_dynamic_full,
++ preempt_dynamic_lazy,
++};
++
++int preempt_dynamic_mode = preempt_dynamic_undefined;
++
++int sched_dynamic_mode(const char *str)
++{
++# ifndef CONFIG_PREEMPT_RT
++ if (!strcmp(str, "none"))
++ return preempt_dynamic_none;
++
++ if (!strcmp(str, "voluntary"))
++ return preempt_dynamic_voluntary;
++# endif
++
++ if (!strcmp(str, "full"))
++ return preempt_dynamic_full;
++
++# ifdef CONFIG_ARCH_HAS_PREEMPT_LAZY
++ if (!strcmp(str, "lazy"))
++ return preempt_dynamic_lazy;
++# endif
++
++ return -EINVAL;
++}
++
++# define preempt_dynamic_key_enable(f) static_key_enable(&sk_dynamic_##f.key)
++# define preempt_dynamic_key_disable(f) static_key_disable(&sk_dynamic_##f.key)
++
++# if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
++# define preempt_dynamic_enable(f) static_call_update(f, f##_dynamic_enabled)
++# define preempt_dynamic_disable(f) static_call_update(f, f##_dynamic_disabled)
++# elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
++# define preempt_dynamic_enable(f) preempt_dynamic_key_enable(f)
++# define preempt_dynamic_disable(f) preempt_dynamic_key_disable(f)
++# else
++# error "Unsupported PREEMPT_DYNAMIC mechanism"
++# endif
++
++static DEFINE_MUTEX(sched_dynamic_mutex);
++
++static void __sched_dynamic_update(int mode)
++{
++ /*
++ * Avoid {NONE,VOLUNTARY} -> FULL transitions from ever ending up in
++ * the ZERO state, which is invalid.
++ */
++ preempt_dynamic_enable(cond_resched);
++ preempt_dynamic_enable(might_resched);
++ preempt_dynamic_enable(preempt_schedule);
++ preempt_dynamic_enable(preempt_schedule_notrace);
++ preempt_dynamic_enable(irqentry_exit_cond_resched);
++ preempt_dynamic_key_disable(preempt_lazy);
++
++ switch (mode) {
++ case preempt_dynamic_none:
++ preempt_dynamic_enable(cond_resched);
++ preempt_dynamic_disable(might_resched);
++ preempt_dynamic_disable(preempt_schedule);
++ preempt_dynamic_disable(preempt_schedule_notrace);
++ preempt_dynamic_disable(irqentry_exit_cond_resched);
++ preempt_dynamic_key_disable(preempt_lazy);
++ if (mode != preempt_dynamic_mode)
++ pr_info("Dynamic Preempt: none\n");
++ break;
++
++ case preempt_dynamic_voluntary:
++ preempt_dynamic_enable(cond_resched);
++ preempt_dynamic_enable(might_resched);
++ preempt_dynamic_disable(preempt_schedule);
++ preempt_dynamic_disable(preempt_schedule_notrace);
++ preempt_dynamic_disable(irqentry_exit_cond_resched);
++ preempt_dynamic_key_disable(preempt_lazy);
++ if (mode != preempt_dynamic_mode)
++ pr_info("Dynamic Preempt: voluntary\n");
++ break;
++
++ case preempt_dynamic_full:
++ preempt_dynamic_enable(cond_resched);
++ preempt_dynamic_disable(might_resched);
++ preempt_dynamic_enable(preempt_schedule);
++ preempt_dynamic_enable(preempt_schedule_notrace);
++ preempt_dynamic_enable(irqentry_exit_cond_resched);
++ preempt_dynamic_key_disable(preempt_lazy);
++ if (mode != preempt_dynamic_mode)
++ pr_info("Dynamic Preempt: full\n");
++ break;
++
++ case preempt_dynamic_lazy:
++ preempt_dynamic_disable(cond_resched);
++ preempt_dynamic_disable(might_resched);
++ preempt_dynamic_enable(preempt_schedule);
++ preempt_dynamic_enable(preempt_schedule_notrace);
++ preempt_dynamic_enable(irqentry_exit_cond_resched);
++ preempt_dynamic_key_enable(preempt_lazy);
++ if (mode != preempt_dynamic_mode)
++ pr_info("Dynamic Preempt: lazy\n");
++ break;
++ }
++
++ preempt_dynamic_mode = mode;
++}
++
++void sched_dynamic_update(int mode)
++{
++ mutex_lock(&sched_dynamic_mutex);
++ __sched_dynamic_update(mode);
++ mutex_unlock(&sched_dynamic_mutex);
++}
++
++static int __init setup_preempt_mode(char *str)
++{
++ int mode = sched_dynamic_mode(str);
++ if (mode < 0) {
++ pr_warn("Dynamic Preempt: unsupported mode: %s\n", str);
++ return 0;
++ }
++
++ sched_dynamic_update(mode);
++ return 1;
++}
++__setup("preempt=", setup_preempt_mode);
++
++static void __init preempt_dynamic_init(void)
++{
++ if (preempt_dynamic_mode == preempt_dynamic_undefined) {
++ if (IS_ENABLED(CONFIG_PREEMPT_NONE)) {
++ sched_dynamic_update(preempt_dynamic_none);
++ } else if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY)) {
++ sched_dynamic_update(preempt_dynamic_voluntary);
++ } else if (IS_ENABLED(CONFIG_PREEMPT_LAZY)) {
++ sched_dynamic_update(preempt_dynamic_lazy);
++ } else {
++ /* Default static call setting, nothing to do */
++ WARN_ON_ONCE(!IS_ENABLED(CONFIG_PREEMPT));
++ preempt_dynamic_mode = preempt_dynamic_full;
++ pr_info("Dynamic Preempt: full\n");
++ }
++ }
++}
++
++# define PREEMPT_MODEL_ACCESSOR(mode) \
++ bool preempt_model_##mode(void) \
++ { \
++ WARN_ON_ONCE(preempt_dynamic_mode == preempt_dynamic_undefined); \
++ return preempt_dynamic_mode == preempt_dynamic_##mode; \
++ } \
++ EXPORT_SYMBOL_GPL(preempt_model_##mode)
++
++PREEMPT_MODEL_ACCESSOR(none);
++PREEMPT_MODEL_ACCESSOR(voluntary);
++PREEMPT_MODEL_ACCESSOR(full);
++PREEMPT_MODEL_ACCESSOR(lazy);
++
++#else /* !CONFIG_PREEMPT_DYNAMIC: */
++
++#define preempt_dynamic_mode -1
++
++static inline void preempt_dynamic_init(void) { }
++
++#endif /* CONFIG_PREEMPT_DYNAMIC */
++
++const char *preempt_modes[] = {
++ "none", "voluntary", "full", "lazy", NULL,
++};
++
++const char *preempt_model_str(void)
++{
++ bool brace = IS_ENABLED(CONFIG_PREEMPT_RT) &&
++ (IS_ENABLED(CONFIG_PREEMPT_DYNAMIC) ||
++ IS_ENABLED(CONFIG_PREEMPT_LAZY));
++ static char buf[128];
++
++ if (IS_ENABLED(CONFIG_PREEMPT_BUILD)) {
++ struct seq_buf s;
++
++ seq_buf_init(&s, buf, sizeof(buf));
++ seq_buf_puts(&s, "PREEMPT");
++
++ if (IS_ENABLED(CONFIG_PREEMPT_RT))
++ seq_buf_printf(&s, "%sRT%s",
++ brace ? "_{" : "_",
++ brace ? "," : "");
++
++ if (IS_ENABLED(CONFIG_PREEMPT_DYNAMIC)) {
++ seq_buf_printf(&s, "(%s)%s",
++ preempt_dynamic_mode >= 0 ?
++ preempt_modes[preempt_dynamic_mode] : "undef",
++ brace ? "}" : "");
++ return seq_buf_str(&s);
++ }
++
++ if (IS_ENABLED(CONFIG_PREEMPT_LAZY)) {
++ seq_buf_printf(&s, "LAZY%s",
++ brace ? "}" : "");
++ return seq_buf_str(&s);
++ }
++
++ return seq_buf_str(&s);
++ }
++
++ if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY_BUILD))
++ return "VOLUNTARY";
++
++ return "NONE";
++}
++
++int io_schedule_prepare(void)
++{
++ int old_iowait = current->in_iowait;
++
++ current->in_iowait = 1;
++ blk_flush_plug(current->plug, true);
++ return old_iowait;
++}
++
++void io_schedule_finish(int token)
++{
++ current->in_iowait = token;
++}
++
++/*
++ * This task is about to go to sleep on IO. Increment rq->nr_iowait so
++ * that process accounting knows that this is a task in IO wait state.
++ *
++ * But don't do that if it is a deliberate, throttling IO wait (this task
++ * has set its backing_dev_info: the queue against which it should throttle)
++ */
++
++long __sched io_schedule_timeout(long timeout)
++{
++ int token;
++ long ret;
++
++ token = io_schedule_prepare();
++ ret = schedule_timeout(timeout);
++ io_schedule_finish(token);
++
++ return ret;
++}
++EXPORT_SYMBOL(io_schedule_timeout);
++
++void __sched io_schedule(void)
++{
++ int token;
++
++ token = io_schedule_prepare();
++ schedule();
++ io_schedule_finish(token);
++}
++EXPORT_SYMBOL(io_schedule);
++
++void sched_show_task(struct task_struct *p)
++{
++ unsigned long free;
++ int ppid;
++
++ if (!try_get_task_stack(p))
++ return;
++
++ pr_info("task:%-15.15s state:%c", p->comm, task_state_to_char(p));
++
++ if (task_is_running(p))
++ pr_cont(" running task ");
++ free = stack_not_used(p);
++ ppid = 0;
++ rcu_read_lock();
++ if (pid_alive(p))
++ ppid = task_pid_nr(rcu_dereference(p->real_parent));
++ rcu_read_unlock();
++ pr_cont(" stack:%-5lu pid:%-5d tgid:%-5d ppid:%-6d task_flags:0x%04x flags:0x%08lx\n",
++ free, task_pid_nr(p), task_tgid_nr(p),
++ ppid, p->flags, read_task_thread_flags(p));
++
++ print_worker_info(KERN_INFO, p);
++ print_stop_info(KERN_INFO, p);
++ show_stack(p, NULL, KERN_INFO);
++ put_task_stack(p);
++}
++EXPORT_SYMBOL_GPL(sched_show_task);
++
++static inline bool
++state_filter_match(unsigned long state_filter, struct task_struct *p)
++{
++ unsigned int state = READ_ONCE(p->__state);
++
++ /* no filter, everything matches */
++ if (!state_filter)
++ return true;
++
++ /* filter, but doesn't match */
++ if (!(state & state_filter))
++ return false;
++
++ /*
++ * When looking for TASK_UNINTERRUPTIBLE skip TASK_IDLE (allows
++ * TASK_KILLABLE).
++ */
++ if (state_filter == TASK_UNINTERRUPTIBLE && (state & TASK_NOLOAD))
++ return false;
++
++ return true;
++}
++
++
++void show_state_filter(unsigned int state_filter)
++{
++ struct task_struct *g, *p;
++
++ rcu_read_lock();
++ for_each_process_thread(g, p) {
++ /*
++ * reset the NMI-timeout, listing all files on a slow
++ * console might take a lot of time:
++ * Also, reset softlockup watchdogs on all CPUs, because
++ * another CPU might be blocked waiting for us to process
++ * an IPI.
++ */
++ touch_nmi_watchdog();
++ touch_all_softlockup_watchdogs();
++ if (state_filter_match(state_filter, p))
++ sched_show_task(p);
++ }
++
++ /* TODO: Alt schedule FW should support this
++ if (!state_filter)
++ sysrq_sched_debug_show();
++ */
++ rcu_read_unlock();
++ /*
++ * Only show locks if all tasks are dumped:
++ */
++ if (!state_filter)
++ debug_show_all_locks();
++}
++
++void dump_cpu_task(int cpu)
++{
++ if (in_hardirq() && cpu == smp_processor_id()) {
++ struct pt_regs *regs;
++
++ regs = get_irq_regs();
++ if (regs) {
++ show_regs(regs);
++ return;
++ }
++ }
++
++ if (trigger_single_cpu_backtrace(cpu))
++ return;
++
++ pr_info("Task dump for CPU %d:\n", cpu);
++ sched_show_task(cpu_curr(cpu));
++}
++
++/**
++ * init_idle - set up an idle thread for a given CPU
++ * @idle: task in question
++ * @cpu: CPU the idle task belongs to
++ *
++ * NOTE: this function does not set the idle thread's NEED_RESCHED
++ * flag, to make booting more robust.
++ */
++void __init init_idle(struct task_struct *idle, int cpu)
++{
++ struct affinity_context ac = (struct affinity_context) {
++ .new_mask = cpumask_of(cpu),
++ .flags = 0,
++ };
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++
++ raw_spin_lock_irqsave(&idle->pi_lock, flags);
++ raw_spin_lock(&rq->lock);
++
++ idle->last_ran = rq->clock_task;
++ idle->__state = TASK_RUNNING;
++ /*
++ * PF_KTHREAD should already be set at this point; regardless, make it
++ * look like a proper per-CPU kthread.
++ */
++ idle->flags |= PF_KTHREAD | PF_NO_SETAFFINITY;
++ kthread_set_per_cpu(idle, cpu);
++
++ sched_queue_init_idle(&rq->queue, idle);
++
++ /*
++ * No validation and serialization required at boot time and for
++ * setting up the idle tasks of not yet online CPUs.
++ */
++ set_cpus_allowed_common(idle, &ac);
++
++ /* Silence PROVE_RCU */
++ rcu_read_lock();
++ __set_task_cpu(idle, cpu);
++ rcu_read_unlock();
++
++ rq->idle = idle;
++ rcu_assign_pointer(rq->curr, idle);
++ idle->on_cpu = 1;
++
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&idle->pi_lock, flags);
++
++ /* Set the preempt count _outside_ the spinlocks! */
++ init_idle_preempt_count(idle, cpu);
++
++ ftrace_graph_init_idle_task(idle, cpu);
++ vtime_init_idle(idle, cpu);
++ sprintf(idle->comm, "%s/%d", INIT_TASK_COMM, cpu);
++}
++
++int cpuset_cpumask_can_shrink(const struct cpumask __maybe_unused *cur,
++ const struct cpumask __maybe_unused *trial)
++{
++ return 1;
++}
++
++int task_can_attach(struct task_struct *p)
++{
++ int ret = 0;
++
++ /*
++ * Kthreads which disallow setaffinity shouldn't be moved
++ * to a new cpuset; we don't want to change their CPU
++ * affinity and isolating such threads by their set of
++ * allowed nodes is unnecessary. Thus, cpusets are not
++ * applicable for such threads. This prevents checking for
++ * success of set_cpus_allowed_ptr() on all attached tasks
++ * before cpus_mask may be changed.
++ */
++ if (p->flags & PF_NO_SETAFFINITY)
++ ret = -EINVAL;
++
++ return ret;
++}
++
++bool sched_smp_initialized __read_mostly;
++
++#ifdef CONFIG_HOTPLUG_CPU
++/*
++ * Invoked on the outgoing CPU in context of the CPU hotplug thread
++ * after ensuring that there are no user space tasks left on the CPU.
++ *
++ * If there is a lazy mm in use on the hotplug thread, drop it and
++ * switch to init_mm.
++ *
++ * The reference count on init_mm is dropped in finish_cpu().
++ */
++static void sched_force_init_mm(void)
++{
++ struct mm_struct *mm = current->active_mm;
++
++ if (mm != &init_mm) {
++ mmgrab_lazy_tlb(&init_mm);
++ local_irq_disable();
++ current->active_mm = &init_mm;
++ switch_mm_irqs_off(mm, &init_mm, current);
++ local_irq_enable();
++ finish_arch_post_lock_switch();
++ mmdrop_lazy_tlb(mm);
++ }
++
++ /* finish_cpu(), as ran on the BP, will clean up the active_mm state */
++}
++
++static int __balance_push_cpu_stop(void *arg)
++{
++ struct task_struct *p = arg;
++ struct rq *rq = this_rq();
++ struct rq_flags rf;
++ int cpu;
++
++ raw_spin_lock_irq(&p->pi_lock);
++ rq_lock(rq, &rf);
++
++ update_rq_clock(rq);
++
++ if (task_rq(p) == rq && task_on_rq_queued(p)) {
++ cpu = select_fallback_rq(rq->cpu, p);
++ rq = __migrate_task(rq, p, cpu);
++ }
++
++ rq_unlock(rq, &rf);
++ raw_spin_unlock_irq(&p->pi_lock);
++
++ put_task_struct(p);
++
++ return 0;
++}
++
++static DEFINE_PER_CPU(struct cpu_stop_work, push_work);
++
++/*
++ * This is enabled below SCHED_AP_ACTIVE; when !cpu_active(), but only
++ * effective when the hotplug motion is down.
++ */
++static void balance_push(struct rq *rq)
++{
++ struct task_struct *push_task = rq->curr;
++
++ lockdep_assert_held(&rq->lock);
++
++ /*
++ * Ensure the thing is persistent until balance_push_set(.on = false);
++ */
++ rq->balance_callback = &balance_push_callback;
++
++ /*
++ * Only active while going offline and when invoked on the outgoing
++ * CPU.
++ */
++ if (!cpu_dying(rq->cpu) || rq != this_rq())
++ return;
++
++ /*
++ * Both the cpu-hotplug and stop task are in this case and are
++ * required to complete the hotplug process.
++ */
++ if (kthread_is_per_cpu(push_task) ||
++ is_migration_disabled(push_task)) {
++
++ /*
++ * If this is the idle task on the outgoing CPU try to wake
++ * up the hotplug control thread which might wait for the
++ * last task to vanish. The rcuwait_active() check is
++ * accurate here because the waiter is pinned on this CPU
++ * and can't obviously be running in parallel.
++ *
++ * On RT kernels this also has to check whether there are
++ * pinned and scheduled out tasks on the runqueue. They
++ * need to leave the migrate disabled section first.
++ */
++ if (!rq->nr_running && !rq_has_pinned_tasks(rq) &&
++ rcuwait_active(&rq->hotplug_wait)) {
++ raw_spin_unlock(&rq->lock);
++ rcuwait_wake_up(&rq->hotplug_wait);
++ raw_spin_lock(&rq->lock);
++ }
++ return;
++ }
++
++ get_task_struct(push_task);
++ /*
++ * Temporarily drop rq->lock such that we can wake-up the stop task.
++ * Both preemption and IRQs are still disabled.
++ */
++ preempt_disable();
++ raw_spin_unlock(&rq->lock);
++ stop_one_cpu_nowait(rq->cpu, __balance_push_cpu_stop, push_task,
++ this_cpu_ptr(&push_work));
++ preempt_enable();
++ /*
++ * At this point need_resched() is true and we'll take the loop in
++ * schedule(). The next pick is obviously going to be the stop task
++ * which kthread_is_per_cpu() and will push this task away.
++ */
++ raw_spin_lock(&rq->lock);
++}
++
++static void balance_push_set(int cpu, bool on)
++{
++ struct rq *rq = cpu_rq(cpu);
++ struct rq_flags rf;
++
++ rq_lock_irqsave(rq, &rf);
++ if (on) {
++ WARN_ON_ONCE(rq->balance_callback);
++ rq->balance_callback = &balance_push_callback;
++ } else if (rq->balance_callback == &balance_push_callback) {
++ rq->balance_callback = NULL;
++ }
++ rq_unlock_irqrestore(rq, &rf);
++}
++
++/*
++ * Invoked from a CPUs hotplug control thread after the CPU has been marked
++ * inactive. All tasks which are not per CPU kernel threads are either
++ * pushed off this CPU now via balance_push() or placed on a different CPU
++ * during wakeup. Wait until the CPU is quiescent.
++ */
++static void balance_hotplug_wait(void)
++{
++ struct rq *rq = this_rq();
++
++ rcuwait_wait_event(&rq->hotplug_wait,
++ rq->nr_running == 1 && !rq_has_pinned_tasks(rq),
++ TASK_UNINTERRUPTIBLE);
++}
++
++#else /* !CONFIG_HOTPLUG_CPU: */
++
++static void balance_push(struct rq *rq)
++{
++}
++
++static void balance_push_set(int cpu, bool on)
++{
++}
++
++static inline void balance_hotplug_wait(void)
++{
++}
++#endif /* !CONFIG_HOTPLUG_CPU */
++
++static void set_rq_offline(struct rq *rq)
++{
++ if (rq->online) {
++ update_rq_clock(rq);
++ rq->online = false;
++ }
++}
++
++static void set_rq_online(struct rq *rq)
++{
++ if (!rq->online)
++ rq->online = true;
++}
++
++static inline void sched_set_rq_online(struct rq *rq, int cpu)
++{
++ unsigned long flags;
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ set_rq_online(rq);
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++}
++
++static inline void sched_set_rq_offline(struct rq *rq, int cpu)
++{
++ unsigned long flags;
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ set_rq_offline(rq);
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++}
++
++/*
++ * used to mark begin/end of suspend/resume:
++ */
++static int num_cpus_frozen;
++
++/*
++ * Update cpusets according to cpu_active mask. If cpusets are
++ * disabled, cpuset_update_active_cpus() becomes a simple wrapper
++ * around partition_sched_domains().
++ *
++ * If we come here as part of a suspend/resume, don't touch cpusets because we
++ * want to restore it back to its original state upon resume anyway.
++ */
++static void cpuset_cpu_active(void)
++{
++ if (cpuhp_tasks_frozen) {
++ /*
++ * num_cpus_frozen tracks how many CPUs are involved in suspend
++ * resume sequence. As long as this is not the last online
++ * operation in the resume sequence, just build a single sched
++ * domain, ignoring cpusets.
++ */
++ cpuset_reset_sched_domains();
++ if (--num_cpus_frozen)
++ return;
++ /*
++ * This is the last CPU online operation. So fall through and
++ * restore the original sched domains by considering the
++ * cpuset configurations.
++ */
++ cpuset_force_rebuild();
++ }
++
++ cpuset_update_active_cpus();
++}
++
++static void cpuset_cpu_inactive(unsigned int cpu)
++{
++ if (!cpuhp_tasks_frozen) {
++ cpuset_update_active_cpus();
++ } else {
++ num_cpus_frozen++;
++ cpuset_reset_sched_domains();
++ }
++}
++
++static inline void sched_smt_present_inc(int cpu)
++{
++#ifdef CONFIG_SCHED_SMT
++ if (cpumask_weight(cpu_smt_mask(cpu)) == 2) {
++ static_branch_inc_cpuslocked(&sched_smt_present);
++ cpumask_or(&sched_smt_mask, &sched_smt_mask, cpu_smt_mask(cpu));
++ }
++#endif /* CONFIG_SCHED_SMT */
++}
++
++static inline void sched_smt_present_dec(int cpu)
++{
++#ifdef CONFIG_SCHED_SMT
++ if (cpumask_weight(cpu_smt_mask(cpu)) == 2) {
++ static_branch_dec_cpuslocked(&sched_smt_present);
++ if (!static_branch_likely(&sched_smt_present))
++ cpumask_clear(sched_pcore_idle_mask);
++ cpumask_andnot(&sched_smt_mask, &sched_smt_mask, cpu_smt_mask(cpu));
++ }
++#endif /* CONFIG_SCHED_SMT */
++}
++
++int sched_cpu_activate(unsigned int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ /*
++ * Clear the balance_push callback and prepare to schedule
++ * regular tasks.
++ */
++ balance_push_set(cpu, false);
++
++ set_cpu_active(cpu, true);
++
++ if (sched_smp_initialized)
++ cpuset_cpu_active();
++
++ /*
++ * Put the rq online, if not already. This happens:
++ *
++ * 1) In the early boot process, because we build the real domains
++ * after all cpus have been brought up.
++ *
++ * 2) At runtime, if cpuset_cpu_active() fails to rebuild the
++ * domains.
++ */
++ sched_set_rq_online(rq, cpu);
++
++ /*
++ * When going up, increment the number of cores with SMT present.
++ */
++ sched_smt_present_inc(cpu);
++
++ return 0;
++}
++
++int sched_cpu_deactivate(unsigned int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ set_cpu_active(cpu, false);
++
++ /*
++ * From this point forward, this CPU will refuse to run any task that
++ * is not: migrate_disable() or KTHREAD_IS_PER_CPU, and will actively
++ * push those tasks away until this gets cleared, see
++ * sched_cpu_dying().
++ */
++ balance_push_set(cpu, true);
++
++ /*
++ * We've cleared cpu_active_mask, wait for all preempt-disabled and RCU
++ * users of this state to go away such that all new such users will
++ * observe it.
++ *
++ * Specifically, we rely on ttwu to no longer target this CPU, see
++ * ttwu_queue_cond() and is_cpu_allowed().
++ *
++ * Do the sync before parking smpboot threads to take care of the RCU boost case.
++ */
++ synchronize_rcu();
++
++ sched_set_rq_offline(rq, cpu);
++
++ /*
++ * When going down, decrement the number of cores with SMT present.
++ */
++ sched_smt_present_dec(cpu);
++
++ if (!sched_smp_initialized)
++ return 0;
++
++ cpuset_cpu_inactive(cpu);
++
++ return 0;
++}
++
++static void sched_rq_cpu_starting(unsigned int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ rq->calc_load_update = calc_load_update;
++}
++
++int sched_cpu_starting(unsigned int cpu)
++{
++ sched_rq_cpu_starting(cpu);
++ sched_tick_start(cpu);
++ return 0;
++}
++
++#ifdef CONFIG_HOTPLUG_CPU
++
++/*
++ * Invoked immediately before the stopper thread is invoked to bring the
++ * CPU down completely. At this point all per CPU kthreads except the
++ * hotplug thread (current) and the stopper thread (inactive) have been
++ * either parked or have been unbound from the outgoing CPU. Ensure that
++ * any of those which might be on the way out are gone.
++ *
++ * If after this point a bound task is being woken on this CPU then the
++ * responsible hotplug callback has failed to do its job.
++ * sched_cpu_dying() will catch it with the appropriate fireworks.
++ */
++int sched_cpu_wait_empty(unsigned int cpu)
++{
++ balance_hotplug_wait();
++ sched_force_init_mm();
++ return 0;
++}
++
++/*
++ * Since this CPU is going 'away' for a while, fold any nr_active delta we
++ * might have. Called from the CPU stopper task after ensuring that the
++ * stopper is the last running task on the CPU, so nr_active count is
++ * stable. We need to take the tear-down thread which is calling this into
++ * account, so we hand in adjust = 1 to the load calculation.
++ *
++ * Also see the comment "Global load-average calculations".
++ */
++static void calc_load_migrate(struct rq *rq)
++{
++ long delta = calc_load_fold_active(rq, 1);
++
++ if (delta)
++ atomic_long_add(delta, &calc_load_tasks);
++}
++
++static void dump_rq_tasks(struct rq *rq, const char *loglvl)
++{
++ struct task_struct *g, *p;
++ int cpu = cpu_of(rq);
++
++ lockdep_assert_held(&rq->lock);
++
++ printk("%sCPU%d enqueued tasks (%u total):\n", loglvl, cpu, rq->nr_running);
++ for_each_process_thread(g, p) {
++ if (task_cpu(p) != cpu)
++ continue;
++
++ if (!task_on_rq_queued(p))
++ continue;
++
++ printk("%s\tpid: %d, name: %s\n", loglvl, p->pid, p->comm);
++ }
++}
++
++int sched_cpu_dying(unsigned int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++
++ /* Handle pending wakeups and then migrate everything off */
++ sched_tick_stop(cpu);
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ if (rq->nr_running != 1 || rq_has_pinned_tasks(rq)) {
++ WARN(true, "Dying CPU not properly vacated!");
++ dump_rq_tasks(rq, KERN_WARNING);
++ }
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++ calc_load_migrate(rq);
++ hrtick_clear(rq);
++ return 0;
++}
++#endif /* CONFIG_HOTPLUG_CPU */
++
++static void sched_init_topology_cpumask_early(void)
++{
++ int cpu;
++ cpumask_t *tmp;
++
++ for_each_possible_cpu(cpu) {
++ /* init topo masks */
++ tmp = per_cpu(sched_cpu_topo_masks, cpu);
++
++ cpumask_copy(tmp, cpu_possible_mask);
++ per_cpu(sched_cpu_llc_mask, cpu) = tmp;
++ per_cpu(sched_cpu_topo_end_mask, cpu) = ++tmp;
++ }
++}
++
++#define TOPOLOGY_CPUMASK(name, mask, last)\
++ if (cpumask_and(topo, topo, mask)) { \
++ cpumask_copy(topo, mask); \
++ printk(KERN_INFO "sched: cpu#%02d topo: 0x%08lx - "#name, \
++ cpu, (topo++)->bits[0]); \
++ } \
++ if (!last) \
++ bitmap_complement(cpumask_bits(topo), cpumask_bits(mask), \
++ nr_cpumask_bits);
++
++static void sched_init_topology_cpumask(void)
++{
++ int cpu;
++ cpumask_t *topo;
++
++ for_each_online_cpu(cpu) {
++ topo = per_cpu(sched_cpu_topo_masks, cpu);
++
++ bitmap_complement(cpumask_bits(topo), cpumask_bits(cpumask_of(cpu)),
++ nr_cpumask_bits);
++#ifdef CONFIG_SCHED_SMT
++ TOPOLOGY_CPUMASK(smt, topology_sibling_cpumask(cpu), false);
++#endif /* CONFIG_SCHED_SMT */
++ TOPOLOGY_CPUMASK(cluster, topology_cluster_cpumask(cpu), false);
++
++ per_cpu(sd_llc_id, cpu) = cpumask_first(cpu_coregroup_mask(cpu));
++ per_cpu(sched_cpu_llc_mask, cpu) = topo;
++ TOPOLOGY_CPUMASK(coregroup, cpu_coregroup_mask(cpu), false);
++
++ TOPOLOGY_CPUMASK(core, topology_core_cpumask(cpu), false);
++
++ TOPOLOGY_CPUMASK(others, cpu_online_mask, true);
++
++ per_cpu(sched_cpu_topo_end_mask, cpu) = topo;
++ printk(KERN_INFO "sched: cpu#%02d llc_id = %d, llc_mask idx = %d\n",
++ cpu, per_cpu(sd_llc_id, cpu),
++ (int) (per_cpu(sched_cpu_llc_mask, cpu) -
++ per_cpu(sched_cpu_topo_masks, cpu)));
++ }
++}
++
++void __init sched_init_smp(void)
++{
++ /* Move init over to a non-isolated CPU */
++ if (set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_TYPE_DOMAIN)) < 0)
++ BUG();
++ current->flags &= ~PF_NO_SETAFFINITY;
++
++ sched_init_topology();
++ sched_init_topology_cpumask();
++
++ sched_smp_initialized = true;
++}
++
++static int __init migration_init(void)
++{
++ sched_cpu_starting(smp_processor_id());
++ return 0;
++}
++early_initcall(migration_init);
++
++int in_sched_functions(unsigned long addr)
++{
++ return in_lock_functions(addr) ||
++ (addr >= (unsigned long)__sched_text_start
++ && addr < (unsigned long)__sched_text_end);
++}
++
++#ifdef CONFIG_CGROUP_SCHED
++/*
++ * Default task group.
++ * Every task in the system belongs to this group at bootup.
++ */
++struct task_group root_task_group;
++LIST_HEAD(task_groups);
++
++/* Cacheline aligned slab cache for task_group */
++static struct kmem_cache *task_group_cache __ro_after_init;
++#endif /* CONFIG_CGROUP_SCHED */
++
++void __init sched_init(void)
++{
++ int i;
++ struct rq *rq;
++
++ printk(KERN_INFO "sched/alt: "ALT_SCHED_NAME" CPU Scheduler "ALT_SCHED_VERSION\
++ " by Alfred Chen.\n");
++
++ wait_bit_init();
++
++ for (i = 0; i < SCHED_QUEUE_BITS; i++)
++ cpumask_copy(sched_preempt_mask + i, cpu_present_mask);
++
++#ifdef CONFIG_CGROUP_SCHED
++ task_group_cache = KMEM_CACHE(task_group, 0);
++
++ list_add(&root_task_group.list, &task_groups);
++ INIT_LIST_HEAD(&root_task_group.children);
++ INIT_LIST_HEAD(&root_task_group.siblings);
++#endif /* CONFIG_CGROUP_SCHED */
++ for_each_possible_cpu(i) {
++ rq = cpu_rq(i);
++
++ sched_queue_init(&rq->queue);
++ rq->prio = IDLE_TASK_SCHED_PRIO;
++ rq->prio_balance_time = 0;
++#ifdef CONFIG_SCHED_PDS
++ rq->prio_idx = rq->prio;
++#endif
++
++ raw_spin_lock_init(&rq->lock);
++ rq->nr_running = rq->nr_uninterruptible = 0;
++ rq->calc_load_active = 0;
++ rq->calc_load_update = jiffies + LOAD_FREQ;
++ rq->online = false;
++ rq->cpu = i;
++
++ rq->clear_idle_mask_func = cpumask_clear_cpu;
++ rq->set_idle_mask_func = cpumask_set_cpu;
++ rq->balance_func = NULL;
++ rq->active_balance_arg.active = 0;
++
++#ifdef CONFIG_NO_HZ_COMMON
++ INIT_CSD(&rq->nohz_csd, nohz_csd_func, rq);
++#endif
++ rq->balance_callback = &balance_push_callback;
++#ifdef CONFIG_HOTPLUG_CPU
++ rcuwait_init(&rq->hotplug_wait);
++#endif
++ rq->nr_switches = 0;
++
++ hrtick_rq_init(rq);
++ atomic_set(&rq->nr_iowait, 0);
++
++ zalloc_cpumask_var_node(&rq->scratch_mask, GFP_KERNEL, cpu_to_node(i));
++ }
++ /* Set rq->online for cpu 0 */
++ cpu_rq(0)->online = true;
++ /*
++ * The boot idle thread does lazy MMU switching as well:
++ */
++ mmgrab_lazy_tlb(&init_mm);
++ enter_lazy_tlb(&init_mm, current);
++
++ /*
++ * The idle task doesn't need the kthread struct to function, but it
++ * is dressed up as a per-CPU kthread and thus needs to play the part
++ * if we want to avoid special-casing it in code that deals with per-CPU
++ * kthreads.
++ */
++ WARN_ON(!set_kthread_struct(current));
++
++ /*
++ * Make us the idle thread. Technically, schedule() should not be
++ * called from this thread, however somewhere below it might be,
++ * but because we are the idle thread, we just pick up running again
++ * when this runqueue becomes "idle".
++ */
++ __sched_fork(0, current);
++ init_idle(current, smp_processor_id());
++
++ calc_load_update = jiffies + LOAD_FREQ;
++
++ idle_thread_set_boot_cpu();
++ balance_push_set(smp_processor_id(), false);
++
++ sched_init_topology_cpumask_early();
++
++ preempt_dynamic_init();
++}
++
++#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
++
++void __might_sleep(const char *file, int line)
++{
++ unsigned int state = get_current_state();
++ /*
++ * Blocking primitives will set (and therefore destroy) current->state.
++ * Since we will exit with TASK_RUNNING, make sure we enter with it;
++ * otherwise we will destroy state.
++ */
++ WARN_ONCE(state != TASK_RUNNING && current->task_state_change,
++ "do not call blocking ops when !TASK_RUNNING; "
++ "state=%x set at [<%p>] %pS\n", state,
++ (void *)current->task_state_change,
++ (void *)current->task_state_change);
++
++ __might_resched(file, line, 0);
++}
++EXPORT_SYMBOL(__might_sleep);
++
++static void print_preempt_disable_ip(int preempt_offset, unsigned long ip)
++{
++ if (!IS_ENABLED(CONFIG_DEBUG_PREEMPT))
++ return;
++
++ if (preempt_count() == preempt_offset)
++ return;
++
++ pr_err("Preemption disabled at:");
++ print_ip_sym(KERN_ERR, ip);
++}
++
++static inline bool resched_offsets_ok(unsigned int offsets)
++{
++ unsigned int nested = preempt_count();
++
++ nested += rcu_preempt_depth() << MIGHT_RESCHED_RCU_SHIFT;
++
++ return nested == offsets;
++}
++
++void __might_resched(const char *file, int line, unsigned int offsets)
++{
++ /* Ratelimiting timestamp: */
++ static unsigned long prev_jiffy;
++
++ unsigned long preempt_disable_ip;
++
++ /* WARN_ON_ONCE() by default, no rate limit required: */
++ rcu_sleep_check();
++
++ if ((resched_offsets_ok(offsets) && !irqs_disabled() &&
++ !is_idle_task(current) && !current->non_block_count) ||
++ system_state == SYSTEM_BOOTING || system_state > SYSTEM_RUNNING ||
++ oops_in_progress)
++ return;
++ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
++ return;
++ prev_jiffy = jiffies;
++
++ /* Save this before calling printk(), since that will clobber it: */
++ preempt_disable_ip = get_preempt_disable_ip(current);
++
++ pr_err("BUG: sleeping function called from invalid context at %s:%d\n",
++ file, line);
++ pr_err("in_atomic(): %d, irqs_disabled(): %d, non_block: %d, pid: %d, name: %s\n",
++ in_atomic(), irqs_disabled(), current->non_block_count,
++ current->pid, current->comm);
++ pr_err("preempt_count: %x, expected: %x\n", preempt_count(),
++ offsets & MIGHT_RESCHED_PREEMPT_MASK);
++
++ if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
++ pr_err("RCU nest depth: %d, expected: %u\n",
++ rcu_preempt_depth(), offsets >> MIGHT_RESCHED_RCU_SHIFT);
++ }
++
++ if (task_stack_end_corrupted(current))
++ pr_emerg("Thread overran stack, or stack corrupted\n");
++
++ debug_show_held_locks(current);
++ if (irqs_disabled())
++ print_irqtrace_events(current);
++
++ print_preempt_disable_ip(offsets & MIGHT_RESCHED_PREEMPT_MASK,
++ preempt_disable_ip);
++
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++}
++EXPORT_SYMBOL(__might_resched);
++
++void __cant_sleep(const char *file, int line, int preempt_offset)
++{
++ static unsigned long prev_jiffy;
++
++ if (irqs_disabled())
++ return;
++
++ if (!IS_ENABLED(CONFIG_PREEMPT_COUNT))
++ return;
++
++ if (preempt_count() > preempt_offset)
++ return;
++
++ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
++ return;
++ prev_jiffy = jiffies;
++
++ printk(KERN_ERR "BUG: assuming atomic context at %s:%d\n", file, line);
++ printk(KERN_ERR "in_atomic(): %d, irqs_disabled(): %d, pid: %d, name: %s\n",
++ in_atomic(), irqs_disabled(),
++ current->pid, current->comm);
++
++ debug_show_held_locks(current);
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++}
++EXPORT_SYMBOL_GPL(__cant_sleep);
++
++void __cant_migrate(const char *file, int line)
++{
++ static unsigned long prev_jiffy;
++
++ if (irqs_disabled())
++ return;
++
++ if (is_migration_disabled(current))
++ return;
++
++ if (!IS_ENABLED(CONFIG_PREEMPT_COUNT))
++ return;
++
++ if (preempt_count() > 0)
++ return;
++
++ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
++ return;
++ prev_jiffy = jiffies;
++
++ pr_err("BUG: assuming non migratable context at %s:%d\n", file, line);
++ pr_err("in_atomic(): %d, irqs_disabled(): %d, migration_disabled() %u pid: %d, name: %s\n",
++ in_atomic(), irqs_disabled(), is_migration_disabled(current),
++ current->pid, current->comm);
++
++ debug_show_held_locks(current);
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++}
++EXPORT_SYMBOL_GPL(__cant_migrate);
++#endif /* CONFIG_DEBUG_ATOMIC_SLEEP */
++
++#ifdef CONFIG_MAGIC_SYSRQ
++void normalize_rt_tasks(void)
++{
++ struct task_struct *g, *p;
++ struct sched_attr attr = {
++ .sched_policy = SCHED_NORMAL,
++ };
++
++ read_lock(&tasklist_lock);
++ for_each_process_thread(g, p) {
++ /*
++ * Only normalize user tasks:
++ */
++ if (p->flags & PF_KTHREAD)
++ continue;
++
++ schedstat_set(p->stats.wait_start, 0);
++ schedstat_set(p->stats.sleep_start, 0);
++ schedstat_set(p->stats.block_start, 0);
++
++ if (!rt_or_dl_task(p)) {
++ /*
++ * Renice negative nice level userspace
++ * tasks back to 0:
++ */
++ if (task_nice(p) < 0)
++ set_user_nice(p, 0);
++ continue;
++ }
++
++ __sched_setscheduler(p, &attr, false, false);
++ }
++ read_unlock(&tasklist_lock);
++}
++#endif /* CONFIG_MAGIC_SYSRQ */
++
++#ifdef CONFIG_KGDB_KDB
++/*
++ * These functions are only useful for KDB.
++ *
++ * They can only be called when the whole system has been
++ * stopped - every CPU needs to be quiescent, and no scheduling
++ * activity can take place. Using them for anything else would
++ * be a serious bug, and as a result, they aren't even visible
++ * under any other configuration.
++ */
++
++/**
++ * curr_task - return the current task for a given CPU.
++ * @cpu: the processor in question.
++ *
++ * ONLY VALID WHEN THE WHOLE SYSTEM IS STOPPED!
++ *
++ * Return: The current task for @cpu.
++ */
++struct task_struct *curr_task(int cpu)
++{
++ return cpu_curr(cpu);
++}
++
++#endif /* CONFIG_KGDB_KDB */
++
++#ifdef CONFIG_CGROUP_SCHED
++static void sched_free_group(struct task_group *tg)
++{
++ kmem_cache_free(task_group_cache, tg);
++}
++
++static void sched_free_group_rcu(struct rcu_head *rhp)
++{
++ sched_free_group(container_of(rhp, struct task_group, rcu));
++}
++
++static void sched_unregister_group(struct task_group *tg)
++{
++ /*
++ * We have to wait for yet another RCU grace period to expire, as
++ * print_cfs_stats() might run concurrently.
++ */
++ call_rcu(&tg->rcu, sched_free_group_rcu);
++}
++
++/* allocate runqueue etc for a new task group */
++struct task_group *sched_create_group(struct task_group *parent)
++{
++ struct task_group *tg;
++
++ tg = kmem_cache_alloc(task_group_cache, GFP_KERNEL | __GFP_ZERO);
++ if (!tg)
++ return ERR_PTR(-ENOMEM);
++
++ return tg;
++}
++
++void sched_online_group(struct task_group *tg, struct task_group *parent)
++{
++}
++
++/* RCU callback to free various structures associated with a task group */
++static void sched_unregister_group_rcu(struct rcu_head *rhp)
++{
++ /* Now it should be safe to free those cfs_rqs: */
++ sched_unregister_group(container_of(rhp, struct task_group, rcu));
++}
++
++void sched_destroy_group(struct task_group *tg)
++{
++ /* Wait for possible concurrent references to cfs_rqs to complete: */
++ call_rcu(&tg->rcu, sched_unregister_group_rcu);
++}
++
++void sched_release_group(struct task_group *tg)
++{
++}
++
++static inline struct task_group *css_tg(struct cgroup_subsys_state *css)
++{
++ return css ? container_of(css, struct task_group, css) : NULL;
++}
++
++static struct cgroup_subsys_state *
++cpu_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
++{
++ struct task_group *parent = css_tg(parent_css);
++ struct task_group *tg;
++
++ if (!parent) {
++ /* This is early initialization for the top cgroup */
++ return &root_task_group.css;
++ }
++
++ tg = sched_create_group(parent);
++ if (IS_ERR(tg))
++ return ERR_PTR(-ENOMEM);
++ return &tg->css;
++}
++
++/* Expose task group only after completing cgroup initialization */
++static int cpu_cgroup_css_online(struct cgroup_subsys_state *css)
++{
++ struct task_group *tg = css_tg(css);
++ struct task_group *parent = css_tg(css->parent);
++
++ if (parent)
++ sched_online_group(tg, parent);
++ return 0;
++}
++
++static void cpu_cgroup_css_released(struct cgroup_subsys_state *css)
++{
++ struct task_group *tg = css_tg(css);
++
++ sched_release_group(tg);
++}
++
++static void cpu_cgroup_css_free(struct cgroup_subsys_state *css)
++{
++ struct task_group *tg = css_tg(css);
++
++ /*
++ * Relies on the RCU grace period between css_released() and this.
++ */
++ sched_unregister_group(tg);
++}
++
++#ifdef CONFIG_RT_GROUP_SCHED
++static int cpu_cgroup_can_attach(struct cgroup_taskset *tset)
++{
++ return 0;
++}
++#endif /* CONFIG_RT_GROUP_SCHED */
++
++static void cpu_cgroup_attach(struct cgroup_taskset *tset)
++{
++}
++
++#ifdef CONFIG_GROUP_SCHED_WEIGHT
++static int sched_group_set_shares(struct task_group *tg, unsigned long shares)
++{
++ return 0;
++}
++
++static int sched_group_set_idle(struct task_group *tg, long idle)
++{
++ return 0;
++}
++
++static int cpu_shares_write_u64(struct cgroup_subsys_state *css,
++ struct cftype *cftype, u64 shareval)
++{
++ return sched_group_set_shares(css_tg(css), shareval);
++}
++
++static u64 cpu_shares_read_u64(struct cgroup_subsys_state *css,
++ struct cftype *cft)
++{
++ return 0;
++}
++
++static s64 cpu_idle_read_s64(struct cgroup_subsys_state *css,
++ struct cftype *cft)
++{
++ return 0;
++}
++
++static int cpu_idle_write_s64(struct cgroup_subsys_state *css,
++ struct cftype *cft, s64 idle)
++{
++ return sched_group_set_idle(css_tg(css), idle);
++}
++#endif /* CONFIG_GROUP_SCHED_WEIGHT */
++
++#ifdef CONFIG_CFS_BANDWIDTH
++static s64 cpu_cfs_quota_read_s64(struct cgroup_subsys_state *css,
++ struct cftype *cft)
++{
++ return 0;
++}
++
++static int cpu_cfs_quota_write_s64(struct cgroup_subsys_state *css,
++ struct cftype *cftype, s64 cfs_quota_us)
++{
++ return 0;
++}
++
++static u64 cpu_cfs_period_read_u64(struct cgroup_subsys_state *css,
++ struct cftype *cft)
++{
++ return 0;
++}
++
++static int cpu_cfs_period_write_u64(struct cgroup_subsys_state *css,
++ struct cftype *cftype, u64 cfs_period_us)
++{
++ return 0;
++}
++
++static u64 cpu_cfs_burst_read_u64(struct cgroup_subsys_state *css,
++ struct cftype *cft)
++{
++ return 0;
++}
++
++static int cpu_cfs_burst_write_u64(struct cgroup_subsys_state *css,
++ struct cftype *cftype, u64 cfs_burst_us)
++{
++ return 0;
++}
++
++static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
++{
++ return 0;
++}
++
++static int cpu_cfs_local_stat_show(struct seq_file *sf, void *v)
++{
++ return 0;
++}
++#endif /* CONFIG_CFS_BANDWIDTH */
++
++#ifdef CONFIG_RT_GROUP_SCHED
++static int cpu_rt_runtime_write(struct cgroup_subsys_state *css,
++ struct cftype *cft, s64 val)
++{
++ return 0;
++}
++
++static s64 cpu_rt_runtime_read(struct cgroup_subsys_state *css,
++ struct cftype *cft)
++{
++ return 0;
++}
++
++static int cpu_rt_period_write_uint(struct cgroup_subsys_state *css,
++ struct cftype *cftype, u64 rt_period_us)
++{
++ return 0;
++}
++
++static u64 cpu_rt_period_read_uint(struct cgroup_subsys_state *css,
++ struct cftype *cft)
++{
++ return 0;
++}
++#endif /* CONFIG_RT_GROUP_SCHED */
++
++#ifdef CONFIG_UCLAMP_TASK_GROUP
++static int cpu_uclamp_min_show(struct seq_file *sf, void *v)
++{
++ return 0;
++}
++
++static int cpu_uclamp_max_show(struct seq_file *sf, void *v)
++{
++ return 0;
++}
++
++static ssize_t cpu_uclamp_min_write(struct kernfs_open_file *of,
++ char *buf, size_t nbytes,
++ loff_t off)
++{
++ return nbytes;
++}
++
++static ssize_t cpu_uclamp_max_write(struct kernfs_open_file *of,
++ char *buf, size_t nbytes,
++ loff_t off)
++{
++ return nbytes;
++}
++#endif /* CONFIG_UCLAMP_TASK_GROUP */
++
++static struct cftype cpu_legacy_files[] = {
++#ifdef CONFIG_GROUP_SCHED_WEIGHT
++ {
++ .name = "shares",
++ .read_u64 = cpu_shares_read_u64,
++ .write_u64 = cpu_shares_write_u64,
++ },
++ {
++ .name = "idle",
++ .read_s64 = cpu_idle_read_s64,
++ .write_s64 = cpu_idle_write_s64,
++ },
++#endif /* CONFIG_GROUP_SCHED_WEIGHT */
++#ifdef CONFIG_CFS_BANDWIDTH
++ {
++ .name = "cfs_quota_us",
++ .read_s64 = cpu_cfs_quota_read_s64,
++ .write_s64 = cpu_cfs_quota_write_s64,
++ },
++ {
++ .name = "cfs_period_us",
++ .read_u64 = cpu_cfs_period_read_u64,
++ .write_u64 = cpu_cfs_period_write_u64,
++ },
++ {
++ .name = "cfs_burst_us",
++ .read_u64 = cpu_cfs_burst_read_u64,
++ .write_u64 = cpu_cfs_burst_write_u64,
++ },
++ {
++ .name = "stat",
++ .seq_show = cpu_cfs_stat_show,
++ },
++ {
++ .name = "stat.local",
++ .seq_show = cpu_cfs_local_stat_show,
++ },
++#endif /* CONFIG_CFS_BANDWIDTH */
++#ifdef CONFIG_RT_GROUP_SCHED
++ {
++ .name = "rt_runtime_us",
++ .read_s64 = cpu_rt_runtime_read,
++ .write_s64 = cpu_rt_runtime_write,
++ },
++ {
++ .name = "rt_period_us",
++ .read_u64 = cpu_rt_period_read_uint,
++ .write_u64 = cpu_rt_period_write_uint,
++ },
++#endif /* CONFIG_RT_GROUP_SCHED */
++#ifdef CONFIG_UCLAMP_TASK_GROUP
++ {
++ .name = "uclamp.min",
++ .flags = CFTYPE_NOT_ON_ROOT,
++ .seq_show = cpu_uclamp_min_show,
++ .write = cpu_uclamp_min_write,
++ },
++ {
++ .name = "uclamp.max",
++ .flags = CFTYPE_NOT_ON_ROOT,
++ .seq_show = cpu_uclamp_max_show,
++ .write = cpu_uclamp_max_write,
++ },
++#endif /* CONFIG_UCLAMP_TASK_GROUP */
++ { } /* Terminate */
++};
++
++#ifdef CONFIG_GROUP_SCHED_WEIGHT
++static u64 cpu_weight_read_u64(struct cgroup_subsys_state *css,
++ struct cftype *cft)
++{
++ return 0;
++}
++
++static int cpu_weight_write_u64(struct cgroup_subsys_state *css,
++ struct cftype *cft, u64 weight)
++{
++ return 0;
++}
++
++static s64 cpu_weight_nice_read_s64(struct cgroup_subsys_state *css,
++ struct cftype *cft)
++{
++ return 0;
++}
++
++static int cpu_weight_nice_write_s64(struct cgroup_subsys_state *css,
++ struct cftype *cft, s64 nice)
++{
++ return 0;
++}
++#endif /* CONFIG_GROUP_SCHED_WEIGHT */
++
++#ifdef CONFIG_CFS_BANDWIDTH
++static int cpu_max_show(struct seq_file *sf, void *v)
++{
++ return 0;
++}
++
++static ssize_t cpu_max_write(struct kernfs_open_file *of,
++ char *buf, size_t nbytes, loff_t off)
++{
++ return nbytes;
++}
++#endif /* CONFIG_CFS_BANDWIDTH */
++
++static struct cftype cpu_files[] = {
++#ifdef CONFIG_GROUP_SCHED_WEIGHT
++ {
++ .name = "weight",
++ .flags = CFTYPE_NOT_ON_ROOT,
++ .read_u64 = cpu_weight_read_u64,
++ .write_u64 = cpu_weight_write_u64,
++ },
++ {
++ .name = "weight.nice",
++ .flags = CFTYPE_NOT_ON_ROOT,
++ .read_s64 = cpu_weight_nice_read_s64,
++ .write_s64 = cpu_weight_nice_write_s64,
++ },
++ {
++ .name = "idle",
++ .flags = CFTYPE_NOT_ON_ROOT,
++ .read_s64 = cpu_idle_read_s64,
++ .write_s64 = cpu_idle_write_s64,
++ },
++#endif /* CONFIG_GROUP_SCHED_WEIGHT */
++#ifdef CONFIG_CFS_BANDWIDTH
++ {
++ .name = "max",
++ .flags = CFTYPE_NOT_ON_ROOT,
++ .seq_show = cpu_max_show,
++ .write = cpu_max_write,
++ },
++ {
++ .name = "max.burst",
++ .flags = CFTYPE_NOT_ON_ROOT,
++ .read_u64 = cpu_cfs_burst_read_u64,
++ .write_u64 = cpu_cfs_burst_write_u64,
++ },
++#endif /* CONFIG_CFS_BANDWIDTH */
++#ifdef CONFIG_UCLAMP_TASK_GROUP
++ {
++ .name = "uclamp.min",
++ .flags = CFTYPE_NOT_ON_ROOT,
++ .seq_show = cpu_uclamp_min_show,
++ .write = cpu_uclamp_min_write,
++ },
++ {
++ .name = "uclamp.max",
++ .flags = CFTYPE_NOT_ON_ROOT,
++ .seq_show = cpu_uclamp_max_show,
++ .write = cpu_uclamp_max_write,
++ },
++#endif /* CONFIG_UCLAMP_TASK_GROUP */
++ { } /* terminate */
++};
++
++static int cpu_extra_stat_show(struct seq_file *sf,
++ struct cgroup_subsys_state *css)
++{
++ return 0;
++}
++
++static int cpu_local_stat_show(struct seq_file *sf,
++ struct cgroup_subsys_state *css)
++{
++ return 0;
++}
++
++struct cgroup_subsys cpu_cgrp_subsys = {
++ .css_alloc = cpu_cgroup_css_alloc,
++ .css_online = cpu_cgroup_css_online,
++ .css_released = cpu_cgroup_css_released,
++ .css_free = cpu_cgroup_css_free,
++ .css_extra_stat_show = cpu_extra_stat_show,
++ .css_local_stat_show = cpu_local_stat_show,
++#ifdef CONFIG_RT_GROUP_SCHED
++ .can_attach = cpu_cgroup_can_attach,
++#endif /* CONFIG_RT_GROUP_SCHED */
++ .attach = cpu_cgroup_attach,
++ .legacy_cftypes = cpu_legacy_files,
++ .dfl_cftypes = cpu_files,
++ .early_init = true,
++ .threaded = true,
++};
++#endif /* CONFIG_CGROUP_SCHED */
++
++#undef CREATE_TRACE_POINTS
++
++#ifdef CONFIG_SCHED_MM_CID
++
++/*
++ * @cid_lock: Guarantee forward-progress of cid allocation.
++ *
++ * Concurrency ID allocation within a bitmap is mostly lock-free. The cid_lock
++ * is only used when contention is detected by the lock-free allocation so
++ * forward progress can be guaranteed.
++ */
++DEFINE_RAW_SPINLOCK(cid_lock);
++
++/*
++ * @use_cid_lock: Select cid allocation behavior: lock-free vs spinlock.
++ *
++ * When @use_cid_lock is 0, the cid allocation is lock-free. When contention is
++ * detected, it is set to 1 to ensure that all newly coming allocations are
++ * serialized by @cid_lock until the allocation which detected contention
++ * completes and sets @use_cid_lock back to 0. This guarantees forward progress
++ * of a cid allocation.
++ */
++int use_cid_lock;
++
++/*
++ * mm_cid remote-clear implements a lock-free algorithm to clear per-mm/cpu cid
++ * concurrently with respect to the execution of the source runqueue context
++ * switch.
++ *
++ * There is one basic property we want to guarantee here:
++ *
++ * (1) Remote-clear should _never_ mark a per-cpu cid UNSET when it is actively
++ * used by a task. That would lead to concurrent allocation of the cid and
++ * userspace corruption.
++ *
++ * Provide this guarantee by introducing a Dekker memory ordering to guarantee
++ * that a pair of loads observe at least one of a pair of stores, which can be
++ * shown as:
++ *
++ * X = Y = 0
++ *
++ * w[X]=1 w[Y]=1
++ * MB MB
++ * r[Y]=y r[X]=x
++ *
++ * Which guarantees that x==0 && y==0 is impossible. But rather than using
++ * values 0 and 1, this algorithm cares about specific state transitions of the
++ * runqueue current task (as updated by the scheduler context switch), and the
++ * per-mm/cpu cid value.
++ *
++ * Let's introduce task (Y) which has task->mm == mm and task (N) which has
++ * task->mm != mm for the rest of the discussion. There are two scheduler state
++ * transitions on context switch we care about:
++ *
++ * (TSA) Store to rq->curr with transition from (N) to (Y)
++ *
++ * (TSB) Store to rq->curr with transition from (Y) to (N)
++ *
++ * On the remote-clear side, there is one transition we care about:
++ *
++ * (TMA) cmpxchg to *pcpu_cid to set the LAZY flag
++ *
++ * There is also a transition to UNSET state which can be performed from all
++ * sides (scheduler, remote-clear). It is always performed with a cmpxchg which
++ * guarantees that only a single thread will succeed:
++ *
++ * (TMB) cmpxchg to *pcpu_cid to mark UNSET
++ *
++ * Just to be clear, what we do _not_ want to happen is a transition to UNSET
++ * when a thread is actively using the cid (property (1)).
++ *
++ * Let's look at the relevant combinations of TSA/TSB and TMA transitions.
++ *
++ * Scenario A) (TSA)+(TMA) (from next task perspective)
++ *
++ * CPU0 CPU1
++ *
++ * Context switch CS-1 Remote-clear
++ * - store to rq->curr: (N)->(Y) (TSA) - cmpxchg to *pcpu_id to LAZY (TMA)
++ * (implied barrier after cmpxchg)
++ * - switch_mm_cid()
++ * - memory barrier (see switch_mm_cid()
++ * comment explaining how this barrier
++ * is combined with other scheduler
++ * barriers)
++ * - mm_cid_get (next)
++ * - READ_ONCE(*pcpu_cid) - rcu_dereference(src_rq->curr)
++ *
++ * This Dekker ensures that either task (Y) is observed by the
++ * rcu_dereference() or the LAZY flag is observed by READ_ONCE(), or both are
++ * observed.
++ *
++ * If task (Y) store is observed by rcu_dereference(), it means that there is
++ * still an active task on the cpu. Remote-clear will therefore not transition
++ * to UNSET, which fulfills property (1).
++ *
++ * If task (Y) is not observed, but the lazy flag is observed by READ_ONCE(),
++ * it will move its state to UNSET, which clears the percpu cid perhaps
++ * uselessly (which is not an issue for correctness). Because task (Y) is not
++ * observed, CPU1 can move ahead to set the state to UNSET. Because moving
++ * state to UNSET is done with a cmpxchg expecting that the old state has the
++ * LAZY flag set, only one thread will successfully UNSET.
++ *
++ * If both states (LAZY flag and task (Y)) are observed, the thread on CPU0
++ * will observe the LAZY flag and transition to UNSET (perhaps uselessly), and
++ * CPU1 will observe task (Y) and do nothing more, which is fine.
++ *
++ * What we are effectively preventing with this Dekker is a scenario where
++ * neither LAZY flag nor store (Y) are observed, which would fail property (1)
++ * because this would UNSET a cid which is actively used.
++ */
++
++void sched_mm_cid_migrate_from(struct task_struct *t)
++{
++ t->migrate_from_cpu = task_cpu(t);
++}
++
++static
++int __sched_mm_cid_migrate_from_fetch_cid(struct rq *src_rq,
++ struct task_struct *t,
++ struct mm_cid *src_pcpu_cid)
++{
++ struct mm_struct *mm = t->mm;
++ struct task_struct *src_task;
++ int src_cid, last_mm_cid;
++
++ if (!mm)
++ return -1;
++
++ last_mm_cid = t->last_mm_cid;
++ /*
++ * If the migrated task has no last cid, or if the current
++ * task on src rq uses the cid, it means the source cid does not need
++ * to be moved to the destination cpu.
++ */
++ if (last_mm_cid == -1)
++ return -1;
++ src_cid = READ_ONCE(src_pcpu_cid->cid);
++ if (!mm_cid_is_valid(src_cid) || last_mm_cid != src_cid)
++ return -1;
++
++ /*
++ * If we observe an active task using the mm on this rq, it means we
++ * are not the last task to be migrated from this cpu for this mm, so
++ * there is no need to move src_cid to the destination cpu.
++ */
++ guard(rcu)();
++ src_task = rcu_dereference(src_rq->curr);
++ if (READ_ONCE(src_task->mm_cid_active) && src_task->mm == mm) {
++ t->last_mm_cid = -1;
++ return -1;
++ }
++
++ return src_cid;
++}
++
++static
++int __sched_mm_cid_migrate_from_try_steal_cid(struct rq *src_rq,
++ struct task_struct *t,
++ struct mm_cid *src_pcpu_cid,
++ int src_cid)
++{
++ struct task_struct *src_task;
++ struct mm_struct *mm = t->mm;
++ int lazy_cid;
++
++ if (src_cid == -1)
++ return -1;
++
++ /*
++ * Attempt to clear the source cpu cid to move it to the destination
++ * cpu.
++ */
++ lazy_cid = mm_cid_set_lazy_put(src_cid);
++ if (!try_cmpxchg(&src_pcpu_cid->cid, &src_cid, lazy_cid))
++ return -1;
++
++ /*
++ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
++ * rq->curr->mm matches the scheduler barrier in context_switch()
++ * between store to rq->curr and load of prev and next task's
++ * per-mm/cpu cid.
++ *
++ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
++ * rq->curr->mm_cid_active matches the barrier in
++ * sched_mm_cid_exit_signals(), sched_mm_cid_before_execve(), and
++ * sched_mm_cid_after_execve() between store to t->mm_cid_active and
++ * load of per-mm/cpu cid.
++ */
++
++ /*
++ * If we observe an active task using the mm on this rq after setting
++ * the lazy-put flag, this task will be responsible for transitioning
++ * from lazy-put flag set to MM_CID_UNSET.
++ */
++ scoped_guard (rcu) {
++ src_task = rcu_dereference(src_rq->curr);
++ if (READ_ONCE(src_task->mm_cid_active) && src_task->mm == mm) {
++ /*
++ * We observed an active task for this mm, there is therefore
++ * no point in moving this cid to the destination cpu.
++ */
++ t->last_mm_cid = -1;
++ return -1;
++ }
++ }
++
++ /*
++ * The src_cid is unused, so it can be unset.
++ */
++ if (!try_cmpxchg(&src_pcpu_cid->cid, &lazy_cid, MM_CID_UNSET))
++ return -1;
++ WRITE_ONCE(src_pcpu_cid->recent_cid, MM_CID_UNSET);
++ return src_cid;
++}
++
++/*
++ * Migration to dst cpu. Called with dst_rq lock held.
++ * Interrupts are disabled, which keeps the window of cid ownership without the
++ * source rq lock held small.
++ */
++void sched_mm_cid_migrate_to(struct rq *dst_rq, struct task_struct *t)
++{
++ struct mm_cid *src_pcpu_cid, *dst_pcpu_cid;
++ struct mm_struct *mm = t->mm;
++ int src_cid, src_cpu;
++ bool dst_cid_is_set;
++ struct rq *src_rq;
++
++ lockdep_assert_rq_held(dst_rq);
++
++ if (!mm)
++ return;
++ src_cpu = t->migrate_from_cpu;
++ if (src_cpu == -1) {
++ t->last_mm_cid = -1;
++ return;
++ }
++ /*
++ * Move the src cid if the dst cid is unset. This keeps id
++ * allocation closest to 0 in cases where few threads migrate around
++ * many CPUs.
++ *
++ * If destination cid or recent cid is already set, we may have
++ * to just clear the src cid to ensure compactness in frequent
++ * migrations scenarios.
++ *
++ * It is not useful to clear the src cid when the number of threads is
++ * greater or equal to the number of allowed CPUs, because user-space
++ * can expect that the number of allowed cids can reach the number of
++ * allowed CPUs.
++ */
++ dst_pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu_of(dst_rq));
++ dst_cid_is_set = !mm_cid_is_unset(READ_ONCE(dst_pcpu_cid->cid)) ||
++ !mm_cid_is_unset(READ_ONCE(dst_pcpu_cid->recent_cid));
++ if (dst_cid_is_set && atomic_read(&mm->mm_users) >= READ_ONCE(mm->nr_cpus_allowed))
++ return;
++ src_pcpu_cid = per_cpu_ptr(mm->pcpu_cid, src_cpu);
++ src_rq = cpu_rq(src_cpu);
++ src_cid = __sched_mm_cid_migrate_from_fetch_cid(src_rq, t, src_pcpu_cid);
++ if (src_cid == -1)
++ return;
++ src_cid = __sched_mm_cid_migrate_from_try_steal_cid(src_rq, t, src_pcpu_cid,
++ src_cid);
++ if (src_cid == -1)
++ return;
++ if (dst_cid_is_set) {
++ __mm_cid_put(mm, src_cid);
++ return;
++ }
++ /* Move src_cid to dst cpu. */
++ mm_cid_snapshot_time(dst_rq, mm);
++ WRITE_ONCE(dst_pcpu_cid->cid, src_cid);
++ WRITE_ONCE(dst_pcpu_cid->recent_cid, src_cid);
++}
++
++static void sched_mm_cid_remote_clear(struct mm_struct *mm, struct mm_cid *pcpu_cid,
++ int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ struct task_struct *t;
++ int cid, lazy_cid;
++
++ cid = READ_ONCE(pcpu_cid->cid);
++ if (!mm_cid_is_valid(cid))
++ return;
++
++ /*
++ * Clear the cpu cid if it is set to keep cid allocation compact. If
++ * there happen to be other tasks left on the source cpu using this
++ * mm, the next task using this mm will reallocate its cid on context
++ * switch.
++ */
++ lazy_cid = mm_cid_set_lazy_put(cid);
++ if (!try_cmpxchg(&pcpu_cid->cid, &cid, lazy_cid))
++ return;
++
++ /*
++ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
++ * rq->curr->mm matches the scheduler barrier in context_switch()
++ * between store to rq->curr and load of prev and next task's
++ * per-mm/cpu cid.
++ *
++ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
++ * rq->curr->mm_cid_active matches the barrier in
++ * sched_mm_cid_exit_signals(), sched_mm_cid_before_execve(), and
++ * sched_mm_cid_after_execve() between store to t->mm_cid_active and
++ * load of per-mm/cpu cid.
++ */
++
++ /*
++ * If we observe an active task using the mm on this rq after setting
++ * the lazy-put flag, that task will be responsible for transitioning
++ * from lazy-put flag set to MM_CID_UNSET.
++ */
++ scoped_guard (rcu) {
++ t = rcu_dereference(rq->curr);
++ if (READ_ONCE(t->mm_cid_active) && t->mm == mm)
++ return;
++ }
++
++ /*
++ * The cid is unused, so it can be unset.
++ * Disable interrupts to keep the window of cid ownership without rq
++ * lock small.
++ */
++ scoped_guard (irqsave) {
++ if (try_cmpxchg(&pcpu_cid->cid, &lazy_cid, MM_CID_UNSET))
++ __mm_cid_put(mm, cid);
++ }
++}
++
++static void sched_mm_cid_remote_clear_old(struct mm_struct *mm, int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ struct mm_cid *pcpu_cid;
++ struct task_struct *curr;
++ u64 rq_clock;
++
++ /*
++ * rq->clock load is racy on 32-bit but one spurious clear once in a
++ * while is irrelevant.
++ */
++ rq_clock = READ_ONCE(rq->clock);
++ pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu);
++
++ /*
++ * In order to take care of infrequently scheduled tasks, bump the time
++ * snapshot associated with this cid if an active task using the mm is
++ * observed on this rq.
++ */
++ scoped_guard (rcu) {
++ curr = rcu_dereference(rq->curr);
++ if (READ_ONCE(curr->mm_cid_active) && curr->mm == mm) {
++ WRITE_ONCE(pcpu_cid->time, rq_clock);
++ return;
++ }
++ }
++
++ if (rq_clock < pcpu_cid->time + SCHED_MM_CID_PERIOD_NS)
++ return;
++ sched_mm_cid_remote_clear(mm, pcpu_cid, cpu);
++}
++
++static void sched_mm_cid_remote_clear_weight(struct mm_struct *mm, int cpu,
++ int weight)
++{
++ struct mm_cid *pcpu_cid;
++ int cid;
++
++ pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu);
++ cid = READ_ONCE(pcpu_cid->cid);
++ if (!mm_cid_is_valid(cid) || cid < weight)
++ return;
++ sched_mm_cid_remote_clear(mm, pcpu_cid, cpu);
++}
++
++static void task_mm_cid_work(struct callback_head *work)
++{
++ unsigned long now = jiffies, old_scan, next_scan;
++ struct task_struct *t = current;
++ struct cpumask *cidmask;
++ struct mm_struct *mm;
++ int weight, cpu;
++
++ WARN_ON_ONCE(t != container_of(work, struct task_struct, cid_work));
++
++ work->next = work; /* Prevent double-add */
++ if (t->flags & PF_EXITING)
++ return;
++ mm = t->mm;
++ if (!mm)
++ return;
++ old_scan = READ_ONCE(mm->mm_cid_next_scan);
++ next_scan = now + msecs_to_jiffies(MM_CID_SCAN_DELAY);
++ if (!old_scan) {
++ unsigned long res;
++
++ res = cmpxchg(&mm->mm_cid_next_scan, old_scan, next_scan);
++ if (res != old_scan)
++ old_scan = res;
++ else
++ old_scan = next_scan;
++ }
++ if (time_before(now, old_scan))
++ return;
++ if (!try_cmpxchg(&mm->mm_cid_next_scan, &old_scan, next_scan))
++ return;
++ cidmask = mm_cidmask(mm);
++ /* Clear cids that were not recently used. */
++ for_each_possible_cpu(cpu)
++ sched_mm_cid_remote_clear_old(mm, cpu);
++ weight = cpumask_weight(cidmask);
++ /*
++ * Clear cids that are greater or equal to the cidmask weight to
++ * recompact it.
++ */
++ for_each_possible_cpu(cpu)
++ sched_mm_cid_remote_clear_weight(mm, cpu, weight);
++}
++
++void init_sched_mm_cid(struct task_struct *t)
++{
++ struct mm_struct *mm = t->mm;
++ int mm_users = 0;
++
++ if (mm) {
++ mm_users = atomic_read(&mm->mm_users);
++ if (mm_users == 1)
++ mm->mm_cid_next_scan = jiffies + msecs_to_jiffies(MM_CID_SCAN_DELAY);
++ }
++ t->cid_work.next = &t->cid_work; /* Protect against double add */
++ init_task_work(&t->cid_work, task_mm_cid_work);
++}
++
++void task_tick_mm_cid(struct rq *rq, struct task_struct *curr)
++{
++ struct callback_head *work = &curr->cid_work;
++ unsigned long now = jiffies;
++
++ if (!curr->mm || (curr->flags & (PF_EXITING | PF_KTHREAD)) ||
++ work->next != work)
++ return;
++ if (time_before(now, READ_ONCE(curr->mm->mm_cid_next_scan)))
++ return;
++
++ /* No page allocation under rq lock */
++ task_work_add(curr, work, TWA_RESUME);
++}
++
++void sched_mm_cid_exit_signals(struct task_struct *t)
++{
++ struct mm_struct *mm = t->mm;
++ struct rq *rq;
++
++ if (!mm)
++ return;
++
++ preempt_disable();
++ rq = this_rq();
++ guard(rq_lock_irqsave)(rq);
++ preempt_enable_no_resched(); /* holding spinlock */
++ WRITE_ONCE(t->mm_cid_active, 0);
++ /*
++ * Store t->mm_cid_active before loading per-mm/cpu cid.
++ * Matches barrier in sched_mm_cid_remote_clear_old().
++ */
++ smp_mb();
++ mm_cid_put(mm);
++ t->last_mm_cid = t->mm_cid = -1;
++}
++
++void sched_mm_cid_before_execve(struct task_struct *t)
++{
++ struct mm_struct *mm = t->mm;
++ struct rq *rq;
++
++ if (!mm)
++ return;
++
++ preempt_disable();
++ rq = this_rq();
++ guard(rq_lock_irqsave)(rq);
++ preempt_enable_no_resched(); /* holding spinlock */
++ WRITE_ONCE(t->mm_cid_active, 0);
++ /*
++ * Store t->mm_cid_active before loading per-mm/cpu cid.
++ * Matches barrier in sched_mm_cid_remote_clear_old().
++ */
++ smp_mb();
++ mm_cid_put(mm);
++ t->last_mm_cid = t->mm_cid = -1;
++}
++
++void sched_mm_cid_after_execve(struct task_struct *t)
++{
++ struct mm_struct *mm = t->mm;
++ struct rq *rq;
++
++ if (!mm)
++ return;
++
++ preempt_disable();
++ rq = this_rq();
++ scoped_guard (rq_lock_irqsave, rq) {
++ preempt_enable_no_resched(); /* holding spinlock */
++ WRITE_ONCE(t->mm_cid_active, 1);
++ /*
++ * Store t->mm_cid_active before loading per-mm/cpu cid.
++ * Matches barrier in sched_mm_cid_remote_clear_old().
++ */
++ smp_mb();
++ t->last_mm_cid = t->mm_cid = mm_cid_get(rq, t, mm);
++ }
++ rseq_set_notify_resume(t);
++}
++
++void sched_mm_cid_fork(struct task_struct *t)
++{
++ WARN_ON_ONCE(!t->mm || t->mm_cid != -1);
++ t->mm_cid_active = 1;
++}
++#endif /* CONFIG_SCHED_MM_CID */
+diff --git a/kernel/sched/alt_core.h b/kernel/sched/alt_core.h
+new file mode 100644
+index 000000000000..bb9512c76566
+--- /dev/null
++++ b/kernel/sched/alt_core.h
+@@ -0,0 +1,177 @@
++#ifndef _KERNEL_SCHED_ALT_CORE_H
++#define _KERNEL_SCHED_ALT_CORE_H
++
++/*
++ * Compile time debug macro
++ * #define ALT_SCHED_DEBUG
++ */
++
++/*
++ * Task related inlined functions
++ */
++static inline bool is_migration_disabled(struct task_struct *p)
++{
++ return p->migration_disabled;
++}
++
++/* rt_prio(prio) defined in include/linux/sched/rt.h */
++#define rt_task(p) rt_prio((p)->prio)
++#define rt_policy(policy) ((policy) == SCHED_FIFO || (policy) == SCHED_RR)
++#define task_has_rt_policy(p) (rt_policy((p)->policy))
++
++struct affinity_context {
++ const struct cpumask *new_mask;
++ struct cpumask *user_mask;
++ unsigned int flags;
++};
++
++/* CONFIG_SCHED_CLASS_EXT is not supported */
++#define scx_switched_all() false
++
++#define SCA_CHECK 0x01
++#define SCA_MIGRATE_DISABLE 0x02
++#define SCA_MIGRATE_ENABLE 0x04
++#define SCA_USER 0x08
++
++extern int __set_cpus_allowed_ptr(struct task_struct *p, struct affinity_context *ctx);
++
++static inline cpumask_t *alloc_user_cpus_ptr(int node)
++{
++ /*
++ * See do_set_cpus_allowed() above for the rcu_head usage.
++ */
++ int size = max_t(int, cpumask_size(), sizeof(struct rcu_head));
++
++ return kmalloc_node(size, GFP_KERNEL, node);
++}
++
++#ifdef CONFIG_RT_MUTEXES
++
++static inline int __rt_effective_prio(struct task_struct *pi_task, int prio)
++{
++ if (pi_task)
++ prio = min(prio, pi_task->prio);
++
++ return prio;
++}
++
++static inline int rt_effective_prio(struct task_struct *p, int prio)
++{
++ struct task_struct *pi_task = rt_mutex_get_top_task(p);
++
++ return __rt_effective_prio(pi_task, prio);
++}
++
++#else /* !CONFIG_RT_MUTEXES: */
++
++static inline int rt_effective_prio(struct task_struct *p, int prio)
++{
++ return prio;
++}
++
++#endif /* !CONFIG_RT_MUTEXES */
++
++extern int __sched_setscheduler(struct task_struct *p, const struct sched_attr *attr, bool user, bool pi);
++extern int __sched_setaffinity(struct task_struct *p, struct affinity_context *ctx);
++extern void __setscheduler_prio(struct task_struct *p, int prio);
++
++/*
++ * Context API
++ */
++static inline struct rq *__task_access_lock(struct task_struct *p, raw_spinlock_t **plock)
++{
++ struct rq *rq;
++ for (;;) {
++ rq = task_rq(p);
++ if (p->on_cpu || task_on_rq_queued(p)) {
++ raw_spin_lock(&rq->lock);
++ if (likely((p->on_cpu || task_on_rq_queued(p)) && rq == task_rq(p))) {
++ *plock = &rq->lock;
++ return rq;
++ }
++ raw_spin_unlock(&rq->lock);
++ } else if (task_on_rq_migrating(p)) {
++ do {
++ cpu_relax();
++ } while (unlikely(task_on_rq_migrating(p)));
++ } else {
++ *plock = NULL;
++ return rq;
++ }
++ }
++}
++
++static inline void __task_access_unlock(struct task_struct *p, raw_spinlock_t *lock)
++{
++ if (NULL != lock)
++ raw_spin_unlock(lock);
++}
++
++void check_task_changed(struct task_struct *p, struct rq *rq);
++
++/*
++ * RQ related inlined functions
++ */
++
++/*
++ * This routine assumes that the idle task is always in the queue
++ */
++static inline struct task_struct *sched_rq_first_task(struct rq *rq)
++{
++ const struct list_head *head = &rq->queue.heads[sched_rq_prio_idx(rq)];
++
++ return list_first_entry(head, struct task_struct, sq_node);
++}
++
++static inline struct task_struct * sched_rq_next_task(struct task_struct *p, struct rq *rq)
++{
++ struct list_head *next = p->sq_node.next;
++
++ if (&rq->queue.heads[0] <= next && next < &rq->queue.heads[SCHED_LEVELS]) {
++ struct list_head *head;
++ unsigned long idx = next - &rq->queue.heads[0];
++
++ idx = find_next_bit(rq->queue.bitmap, SCHED_QUEUE_BITS,
++ sched_idx2prio(idx, rq) + 1);
++ head = &rq->queue.heads[sched_prio2idx(idx, rq)];
++
++ return list_first_entry(head, struct task_struct, sq_node);
++ }
++
++ return list_next_entry(p, sq_node);
++}
++
++extern void requeue_task(struct task_struct *p, struct rq *rq);
++
++#ifdef ALT_SCHED_DEBUG
++extern void alt_sched_debug(void);
++#else
++static inline void alt_sched_debug(void) {}
++#endif
++
++extern int sched_yield_type;
++
++extern cpumask_t sched_rq_pending_mask ____cacheline_aligned_in_smp;
++
++DECLARE_STATIC_KEY_FALSE(sched_smt_present);
++DECLARE_PER_CPU_ALIGNED(cpumask_t *, sched_cpu_llc_mask);
++
++extern cpumask_t sched_smt_mask ____cacheline_aligned_in_smp;
++
++extern cpumask_t *const sched_idle_mask;
++extern cpumask_t *const sched_sg_idle_mask;
++extern cpumask_t *const sched_pcore_idle_mask;
++extern cpumask_t *const sched_ecore_idle_mask;
++
++extern struct rq *move_queued_task(struct rq *rq, struct task_struct *p, int new_cpu);
++
++typedef bool (*idle_select_func_t)(struct cpumask *dstp, const struct cpumask *src1p,
++ const struct cpumask *src2p);
++
++extern idle_select_func_t idle_select_func;
++
++/* balance callback */
++extern struct balance_callback *splice_balance_callbacks(struct rq *rq);
++extern void balance_callbacks(struct rq *rq, struct balance_callback *head);
++
++#endif /* _KERNEL_SCHED_ALT_CORE_H */
+diff --git a/kernel/sched/alt_debug.c b/kernel/sched/alt_debug.c
+new file mode 100644
+index 000000000000..1dbd7eb6a434
+--- /dev/null
++++ b/kernel/sched/alt_debug.c
+@@ -0,0 +1,32 @@
++/*
++ * kernel/sched/alt_debug.c
++ *
++ * Print the alt scheduler debugging details
++ *
++ * Author: Alfred Chen
++ * Date : 2020
++ */
++#include "sched.h"
++#include "linux/sched/debug.h"
++
++/*
++ * This allows printing both to /proc/sched_debug and
++ * to the console
++ */
++#define SEQ_printf(m, x...) \
++ do { \
++ if (m) \
++ seq_printf(m, x); \
++ else \
++ pr_cont(x); \
++ } while (0)
++
++void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
++ struct seq_file *m)
++{
++ SEQ_printf(m, "%s (%d, #threads: %d)\n", p->comm, task_pid_nr_ns(p, ns),
++ get_nr_threads(p));
++}
++
++void proc_sched_set_task(struct task_struct *p)
++{}
+diff --git a/kernel/sched/alt_sched.h b/kernel/sched/alt_sched.h
+new file mode 100644
+index 000000000000..5b9a53c669f5
+--- /dev/null
++++ b/kernel/sched/alt_sched.h
+@@ -0,0 +1,1018 @@
++#ifndef _KERNEL_SCHED_ALT_SCHED_H
++#define _KERNEL_SCHED_ALT_SCHED_H
++
++#include <linux/context_tracking.h>
++#include <linux/profile.h>
++#include <linux/stop_machine.h>
++#include <linux/syscalls.h>
++#include <linux/tick.h>
++
++#include <trace/events/power.h>
++#include <trace/events/sched.h>
++
++#include "../workqueue_internal.h"
++
++#include "cpupri.h"
++
++#ifdef CONFIG_CGROUP_SCHED
++/* task group related information */
++struct task_group {
++ struct cgroup_subsys_state css;
++
++ struct rcu_head rcu;
++ struct list_head list;
++
++ struct task_group *parent;
++ struct list_head siblings;
++ struct list_head children;
++};
++
++extern struct task_group *sched_create_group(struct task_group *parent);
++extern void sched_online_group(struct task_group *tg,
++ struct task_group *parent);
++extern void sched_destroy_group(struct task_group *tg);
++extern void sched_release_group(struct task_group *tg);
++#endif /* CONFIG_CGROUP_SCHED */
++
++#define MIN_SCHED_NORMAL_PRIO (32)
++/*
++ * levels: RT(0-24), reserved(25-31), NORMAL(32-63), cpu idle task(64)
++ *
++ * -- BMQ --
++ * NORMAL: (lower boost range 12, NICE_WIDTH 40, higher boost range 12) / 2
++ * -- PDS --
++ * NORMAL: SCHED_EDGE_DELTA + ((NICE_WIDTH 40) / 2)
++ */
++#define SCHED_LEVELS (64 + 1)
++
++#define IDLE_TASK_SCHED_PRIO (SCHED_LEVELS - 1)
++
++/*
++ * Increase resolution of nice-level calculations for 64-bit architectures.
++ * The extra resolution improves shares distribution and load balancing of
++ * low-weight task groups (eg. nice +19 on an autogroup), deeper taskgroup
++ * hierarchies, especially on larger systems. This is not a user-visible change
++ * and does not change the user-interface for setting shares/weights.
++ *
++ * We increase resolution only if we have enough bits to allow this increased
++ * resolution (i.e. 64-bit). The costs for increasing resolution when 32-bit
++ * are pretty high and the returns do not justify the increased costs.
++ *
++ * Really only required when CONFIG_FAIR_GROUP_SCHED=y is also set, but to
++ * increase coverage and consistency always enable it on 64-bit platforms.
++ */
++#ifdef CONFIG_64BIT
++# define NICE_0_LOAD_SHIFT (SCHED_FIXEDPOINT_SHIFT + SCHED_FIXEDPOINT_SHIFT)
++# define scale_load(w) ((w) << SCHED_FIXEDPOINT_SHIFT)
++# define scale_load_down(w) \
++({ \
++ unsigned long __w = (w); \
++ if (__w) \
++ __w = max(2UL, __w >> SCHED_FIXEDPOINT_SHIFT); \
++ __w; \
++})
++#else
++# define NICE_0_LOAD_SHIFT (SCHED_FIXEDPOINT_SHIFT)
++# define scale_load(w) (w)
++# define scale_load_down(w) (w)
++#endif
++
++/* task_struct::on_rq states: */
++#define TASK_ON_RQ_QUEUED 1
++#define TASK_ON_RQ_MIGRATING 2
++
++static inline int task_on_rq_queued(struct task_struct *p)
++{
++ return READ_ONCE(p->on_rq) == TASK_ON_RQ_QUEUED;
++}
++
++static inline int task_on_rq_migrating(struct task_struct *p)
++{
++ return READ_ONCE(p->on_rq) == TASK_ON_RQ_MIGRATING;
++}
++
++/* Wake flags. The first three directly map to some SD flag value */
++#define WF_EXEC 0x02 /* Wakeup after exec; maps to SD_BALANCE_EXEC */
++#define WF_FORK 0x04 /* Wakeup after fork; maps to SD_BALANCE_FORK */
++#define WF_TTWU 0x08 /* Wakeup; maps to SD_BALANCE_WAKE */
++
++#define WF_SYNC 0x10 /* Waker goes to sleep after wakeup */
++#define WF_MIGRATED 0x20 /* Internal use, task got migrated */
++#define WF_CURRENT_CPU 0x40 /* Prefer to move the wakee to the current CPU. */
++
++static_assert(WF_EXEC == SD_BALANCE_EXEC);
++static_assert(WF_FORK == SD_BALANCE_FORK);
++static_assert(WF_TTWU == SD_BALANCE_WAKE);
++
++#define SCHED_QUEUE_BITS (SCHED_LEVELS - 1)
++
++struct sched_queue {
++ DECLARE_BITMAP(bitmap, SCHED_QUEUE_BITS);
++ struct list_head heads[SCHED_LEVELS];
++};
++
++struct rq;
++struct cpuidle_state;
++
++struct balance_callback {
++ struct balance_callback *next;
++ void (*func)(struct rq *rq);
++};
++
++typedef void (*balance_func_t)(struct rq *rq, int cpu);
++typedef void (*set_idle_mask_func_t)(unsigned int cpu, struct cpumask *dstp);
++typedef void (*clear_idle_mask_func_t)(int cpu, struct cpumask *dstp);
++
++struct balance_arg {
++ struct task_struct *task;
++ int active;
++ cpumask_t *cpumask;
++};
++
++/*
++ * This is the main, per-CPU runqueue data structure.
++ * This data should only be modified by the local cpu.
++ */
++struct rq {
++ /* runqueue lock: */
++ raw_spinlock_t lock;
++
++ struct task_struct __rcu *curr;
++ struct task_struct *idle;
++ struct task_struct *stop;
++ struct mm_struct *prev_mm;
++
++ struct sched_queue queue ____cacheline_aligned;
++
++ int prio;
++#ifdef CONFIG_SCHED_PDS
++ int prio_idx;
++ u64 time_edge;
++#endif
++
++ /* switch count */
++ u64 nr_switches;
++
++ atomic_t nr_iowait;
++
++ u64 last_seen_need_resched_ns;
++ int ticks_without_resched;
++
++#ifdef CONFIG_MEMBARRIER
++ int membarrier_state;
++#endif
++
++ set_idle_mask_func_t set_idle_mask_func;
++ clear_idle_mask_func_t clear_idle_mask_func;
++
++ int cpu; /* cpu of this runqueue */
++ bool online;
++
++ unsigned int ttwu_pending;
++ unsigned char nohz_idle_balance;
++ unsigned char idle_balance;
++
++#ifdef CONFIG_HAVE_SCHED_AVG_IRQ
++ struct sched_avg avg_irq;
++#endif
++
++ balance_func_t balance_func;
++ struct balance_arg active_balance_arg ____cacheline_aligned;
++ struct cpu_stop_work active_balance_work;
++
++ struct balance_callback *balance_callback;
++
++#ifdef CONFIG_HOTPLUG_CPU
++ struct rcuwait hotplug_wait;
++#endif
++ unsigned int nr_pinned;
++
++#ifdef CONFIG_IRQ_TIME_ACCOUNTING
++ u64 prev_irq_time;
++#endif /* CONFIG_IRQ_TIME_ACCOUNTING */
++#ifdef CONFIG_PARAVIRT
++ u64 prev_steal_time;
++#endif /* CONFIG_PARAVIRT */
++#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
++ u64 prev_steal_time_rq;
++#endif /* CONFIG_PARAVIRT_TIME_ACCOUNTING */
++
++ /* For general cpu load util */
++ s32 load_history;
++ u64 load_block;
++ u64 load_stamp;
++
++ /* calc_load related fields */
++ unsigned long calc_load_update;
++ long calc_load_active;
++
++ /* Ensure that all clocks are in the same cache line */
++ u64 clock ____cacheline_aligned;
++ u64 clock_task;
++ u64 prio_balance_time;
++
++ unsigned int nr_running;
++ unsigned long nr_uninterruptible;
++
++#ifdef CONFIG_SCHED_HRTICK
++ call_single_data_t hrtick_csd;
++ struct hrtimer hrtick_timer;
++ ktime_t hrtick_time;
++#endif
++
++#ifdef CONFIG_SCHEDSTATS
++
++ /* latency stats */
++ struct sched_info rq_sched_info;
++ unsigned long long rq_cpu_time;
++ /* could above be rq->cfs_rq.exec_clock + rq->rt_rq.rt_runtime ? */
++
++ /* sys_sched_yield() stats */
++ unsigned int yld_count;
++
++ /* schedule() stats */
++ unsigned int sched_switch;
++ unsigned int sched_count;
++ unsigned int sched_goidle;
++
++ /* try_to_wake_up() stats */
++ unsigned int ttwu_count;
++ unsigned int ttwu_local;
++#endif /* CONFIG_SCHEDSTATS */
++
++#ifdef CONFIG_CPU_IDLE
++ /* Must be inspected within a rcu lock section */
++ struct cpuidle_state *idle_state;
++#endif
++
++#ifdef CONFIG_NO_HZ_COMMON
++ call_single_data_t nohz_csd;
++ atomic_t nohz_flags;
++#endif /* CONFIG_NO_HZ_COMMON */
++
++ /* Scratch cpumask to be temporarily used under rq_lock */
++ cpumask_var_t scratch_mask;
++};
++
++extern unsigned int sysctl_sched_base_slice;
++
++extern unsigned long rq_load_util(struct rq *rq, unsigned long max);
++
++extern unsigned long calc_load_update;
++extern atomic_long_t calc_load_tasks;
++
++extern void calc_global_load_tick(struct rq *this_rq);
++extern long calc_load_fold_active(struct rq *this_rq, long adjust);
++
++DECLARE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
++#define cpu_rq(cpu) (&per_cpu(runqueues, (cpu)))
++#define this_rq() this_cpu_ptr(&runqueues)
++#define task_rq(p) cpu_rq(task_cpu(p))
++#define cpu_curr(cpu) (cpu_rq(cpu)->curr)
++#define raw_rq() raw_cpu_ptr(&runqueues)
++
++#ifdef CONFIG_SYSCTL
++void register_sched_domain_sysctl(void);
++void unregister_sched_domain_sysctl(void);
++#else
++static inline void register_sched_domain_sysctl(void)
++{
++}
++static inline void unregister_sched_domain_sysctl(void)
++{
++}
++#endif
++
++extern bool sched_smp_initialized;
++
++enum {
++#ifdef CONFIG_SCHED_SMT
++ SMT_LEVEL_SPACE_HOLDER,
++#endif
++ COREGROUP_LEVEL_SPACE_HOLDER,
++ CORE_LEVEL_SPACE_HOLDER,
++ OTHER_LEVEL_SPACE_HOLDER,
++ NR_CPU_AFFINITY_LEVELS
++};
++
++DECLARE_PER_CPU_ALIGNED(cpumask_t [NR_CPU_AFFINITY_LEVELS], sched_cpu_topo_masks);
++
++static inline int
++__best_mask_cpu(const cpumask_t *cpumask, const cpumask_t *mask)
++{
++ int cpu;
++
++ while ((cpu = cpumask_any_and(cpumask, mask)) >= nr_cpu_ids)
++ mask++;
++
++ return cpu;
++}
++
++static inline int best_mask_cpu(int cpu, const cpumask_t *mask)
++{
++ return __best_mask_cpu(mask, per_cpu(sched_cpu_topo_masks, cpu));
++}
++
++extern void resched_latency_warn(int cpu, u64 latency);
++
++#ifndef arch_scale_freq_tick
++static __always_inline
++void arch_scale_freq_tick(void)
++{
++}
++#endif
++
++#ifndef arch_scale_freq_capacity
++static __always_inline
++unsigned long arch_scale_freq_capacity(int cpu)
++{
++ return SCHED_CAPACITY_SCALE;
++}
++#endif
++
++static inline u64 __rq_clock_broken(struct rq *rq)
++{
++ return READ_ONCE(rq->clock);
++}
++
++static inline u64 rq_clock(struct rq *rq)
++{
++ /*
++ * Relax lockdep_assert_held() checking as in VRQ, a call to
++ * sched_info_xxxx() may not hold rq->lock
++ * lockdep_assert_held(&rq->lock);
++ */
++ return rq->clock;
++}
++
++static inline u64 rq_clock_task(struct rq *rq)
++{
++ /*
++ * Relax lockdep_assert_held() checking as in VRQ, a call to
++ * sched_info_xxxx() may not hold rq->lock
++ * lockdep_assert_held(&rq->lock);
++ */
++ return rq->clock_task;
++}
++
++/*
++ * {de,en}queue flags:
++ *
++ * DEQUEUE_SLEEP - task is no longer runnable
++ * ENQUEUE_WAKEUP - task just became runnable
++ *
++ */
++
++#define DEQUEUE_SLEEP 0x01
++
++#define ENQUEUE_WAKEUP 0x01
++
++
++/*
++ * Below are scheduler APIs which are used in other kernel code.
++ * They use the dummy rq_flags.
++ * ToDo : BMQ needs to support these APIs for compatibility with mainline
++ * scheduler code.
++ */
++struct rq_flags {
++ unsigned long flags;
++};
++
++struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
++ __acquires(rq->lock);
++
++struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
++ __acquires(p->pi_lock)
++ __acquires(rq->lock);
++
++static inline void __task_rq_unlock(struct rq *rq, struct rq_flags *rf)
++ __releases(rq->lock)
++{
++ raw_spin_unlock(&rq->lock);
++}
++
++static inline void
++task_rq_unlock(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
++ __releases(rq->lock)
++ __releases(p->pi_lock)
++{
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, rf->flags);
++}
++
++static inline void
++rq_lock(struct rq *rq, struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ raw_spin_lock(&rq->lock);
++}
++
++static inline void
++rq_unlock(struct rq *rq, struct rq_flags *rf)
++ __releases(rq->lock)
++{
++ raw_spin_unlock(&rq->lock);
++}
++
++static inline void
++rq_lock_irq(struct rq *rq, struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ raw_spin_lock_irq(&rq->lock);
++}
++
++static inline void
++rq_unlock_irq(struct rq *rq, struct rq_flags *rf)
++ __releases(rq->lock)
++{
++ raw_spin_unlock_irq(&rq->lock);
++}
++
++static inline struct rq *
++this_rq_lock_irq(struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ struct rq *rq;
++
++ local_irq_disable();
++ rq = this_rq();
++ raw_spin_lock(&rq->lock);
++
++ return rq;
++}
++
++static inline raw_spinlock_t *__rq_lockp(struct rq *rq)
++{
++ return &rq->lock;
++}
++
++static inline raw_spinlock_t *rq_lockp(struct rq *rq)
++{
++ return __rq_lockp(rq);
++}
++
++static inline void lockdep_assert_rq_held(struct rq *rq)
++{
++ lockdep_assert_held(__rq_lockp(rq));
++}
++
++extern void raw_spin_rq_lock_nested(struct rq *rq, int subclass);
++extern void raw_spin_rq_unlock(struct rq *rq);
++
++static inline void raw_spin_rq_lock(struct rq *rq)
++{
++ raw_spin_rq_lock_nested(rq, 0);
++}
++
++static inline void raw_spin_rq_lock_irq(struct rq *rq)
++{
++ local_irq_disable();
++ raw_spin_rq_lock(rq);
++}
++
++static inline void raw_spin_rq_unlock_irq(struct rq *rq)
++{
++ raw_spin_rq_unlock(rq);
++ local_irq_enable();
++}
++
++static inline int task_current(struct rq *rq, struct task_struct *p)
++{
++ return rq->curr == p;
++}
++
++static inline bool task_on_cpu(struct task_struct *p)
++{
++ return p->on_cpu;
++}
++
++extern struct static_key_false sched_schedstats;
++
++#ifdef CONFIG_CPU_IDLE
++static inline void idle_set_state(struct rq *rq,
++ struct cpuidle_state *idle_state)
++{
++ rq->idle_state = idle_state;
++}
++
++static inline struct cpuidle_state *idle_get_state(struct rq *rq)
++{
++ WARN_ON(!rcu_read_lock_held());
++ return rq->idle_state;
++}
++#else
++static inline void idle_set_state(struct rq *rq,
++ struct cpuidle_state *idle_state)
++{
++}
++
++static inline struct cpuidle_state *idle_get_state(struct rq *rq)
++{
++ return NULL;
++}
++#endif
++
++static inline int cpu_of(const struct rq *rq)
++{
++ return rq->cpu;
++}
++
++extern void resched_cpu(int cpu);
++
++#include "stats.h"
++
++#ifdef CONFIG_NO_HZ_COMMON
++#define NOHZ_BALANCE_KICK_BIT 0
++#define NOHZ_STATS_KICK_BIT 1
++
++#define NOHZ_BALANCE_KICK BIT(NOHZ_BALANCE_KICK_BIT)
++#define NOHZ_STATS_KICK BIT(NOHZ_STATS_KICK_BIT)
++
++#define NOHZ_KICK_MASK (NOHZ_BALANCE_KICK | NOHZ_STATS_KICK)
++
++#define nohz_flags(cpu) (&cpu_rq(cpu)->nohz_flags)
++
++/* TODO: needed?
++extern void nohz_balance_exit_idle(struct rq *rq);
++#else
++static inline void nohz_balance_exit_idle(struct rq *rq) { }
++*/
++#endif
++
++#ifdef CONFIG_IRQ_TIME_ACCOUNTING
++struct irqtime {
++ u64 total;
++ u64 tick_delta;
++ u64 irq_start_time;
++ struct u64_stats_sync sync;
++};
++
++DECLARE_PER_CPU(struct irqtime, cpu_irqtime);
++extern int sched_clock_irqtime;
++
++static inline int irqtime_enabled(void)
++{
++ return sched_clock_irqtime;
++}
++
++/*
++ * Returns the irqtime minus the softirq time computed by ksoftirqd.
++ * Otherwise ksoftirqd's sum_exec_runtime has its own runtime subtracted
++ * and never moves forward.
++ */
++static inline u64 irq_time_read(int cpu)
++{
++ struct irqtime *irqtime = &per_cpu(cpu_irqtime, cpu);
++ unsigned int seq;
++ u64 total;
++
++ do {
++ seq = __u64_stats_fetch_begin(&irqtime->sync);
++ total = irqtime->total;
++ } while (__u64_stats_fetch_retry(&irqtime->sync, seq));
++
++ return total;
++}
++#else
++
++static inline int irqtime_enabled(void)
++{
++ return 0;
++}
++
++#endif /* CONFIG_IRQ_TIME_ACCOUNTING */
++
++#ifdef CONFIG_CPU_FREQ
++DECLARE_PER_CPU(struct update_util_data __rcu *, cpufreq_update_util_data);
++#endif /* CONFIG_CPU_FREQ */
++
++#ifdef CONFIG_NO_HZ_FULL
++extern int __init sched_tick_offload_init(void);
++#else
++static inline int sched_tick_offload_init(void) { return 0; }
++#endif
++
++#ifdef arch_scale_freq_capacity
++#ifndef arch_scale_freq_invariant
++#define arch_scale_freq_invariant() (true)
++#endif
++#else /* arch_scale_freq_capacity */
++#define arch_scale_freq_invariant() (false)
++#endif
++
++unsigned long sugov_effective_cpu_perf(int cpu, unsigned long actual,
++ unsigned long min,
++ unsigned long max);
++
++extern void schedule_idle(void);
++
++#define cap_scale(v, s) ((v)*(s) >> SCHED_CAPACITY_SHIFT)
++
++/*
++ * !! For sched_setattr_nocheck() (kernel) only !!
++ *
++ * This is actually gross. :(
++ *
++ * It is used to make schedutil kworker(s) higher priority than SCHED_DEADLINE
++ * tasks, but still be able to sleep. We need this on platforms that cannot
++ * atomically change clock frequency. Remove once fast switching will be
++ * available on such platforms.
++ *
++ * SUGOV stands for SchedUtil GOVernor.
++ */
++#define SCHED_FLAG_SUGOV 0x10000000
++
++#ifdef CONFIG_MEMBARRIER
++/*
++ * The scheduler provides memory barriers required by membarrier between:
++ * - prior user-space memory accesses and store to rq->membarrier_state,
++ * - store to rq->membarrier_state and following user-space memory accesses.
++ * In the same way it provides those guarantees around store to rq->curr.
++ */
++static inline void membarrier_switch_mm(struct rq *rq,
++ struct mm_struct *prev_mm,
++ struct mm_struct *next_mm)
++{
++ int membarrier_state;
++
++ if (prev_mm == next_mm)
++ return;
++
++ membarrier_state = atomic_read(&next_mm->membarrier_state);
++ if (READ_ONCE(rq->membarrier_state) == membarrier_state)
++ return;
++
++ WRITE_ONCE(rq->membarrier_state, membarrier_state);
++}
++#else
++static inline void membarrier_switch_mm(struct rq *rq,
++ struct mm_struct *prev_mm,
++ struct mm_struct *next_mm)
++{
++}
++#endif
++
++#ifdef CONFIG_NUMA
++extern int sched_numa_find_closest(const struct cpumask *cpus, int cpu);
++#else
++static inline int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
++{
++ return nr_cpu_ids;
++}
++#endif
++
++extern void swake_up_all_locked(struct swait_queue_head *q);
++extern void __prepare_to_swait(struct swait_queue_head *q, struct swait_queue *wait);
++
++extern int try_to_wake_up(struct task_struct *tsk, unsigned int state, int wake_flags);
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++extern int preempt_dynamic_mode;
++extern int sched_dynamic_mode(const char *str);
++extern void sched_dynamic_update(int mode);
++#endif
++extern const char *preempt_modes[];
++
++static inline void nohz_run_idle_balance(int cpu) { }
++
++static inline unsigned long
++uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id)
++{
++ if (clamp_id == UCLAMP_MIN)
++ return 0;
++
++ return SCHED_CAPACITY_SCALE;
++}
++
++static inline bool uclamp_rq_is_capped(struct rq *rq) { return false; }
++
++static inline bool uclamp_is_used(void)
++{
++ return false;
++}
++
++static inline unsigned long
++uclamp_rq_get(struct rq *rq, enum uclamp_id clamp_id)
++{
++ if (clamp_id == UCLAMP_MIN)
++ return 0;
++
++ return SCHED_CAPACITY_SCALE;
++}
++
++static inline void
++uclamp_rq_set(struct rq *rq, enum uclamp_id clamp_id, unsigned int value)
++{
++}
++
++static inline bool uclamp_rq_is_idle(struct rq *rq)
++{
++ return false;
++}
++
++#ifdef CONFIG_SCHED_MM_CID
++
++#define SCHED_MM_CID_PERIOD_NS (100ULL * 1000000) /* 100ms */
++#define MM_CID_SCAN_DELAY 100 /* 100ms */
++
++extern raw_spinlock_t cid_lock;
++extern int use_cid_lock;
++
++extern void sched_mm_cid_migrate_from(struct task_struct *t);
++extern void sched_mm_cid_migrate_to(struct rq *dst_rq, struct task_struct *t);
++extern void task_tick_mm_cid(struct rq *rq, struct task_struct *curr);
++extern void init_sched_mm_cid(struct task_struct *t);
++
++static inline void __mm_cid_put(struct mm_struct *mm, int cid)
++{
++ if (cid < 0)
++ return;
++ cpumask_clear_cpu(cid, mm_cidmask(mm));
++}
++
++/*
++ * The per-mm/cpu cid can have the MM_CID_LAZY_PUT flag set or transition to
++ * the MM_CID_UNSET state without holding the rq lock, but the rq lock needs to
++ * be held to transition to other states.
++ *
++ * State transitions synchronized with cmpxchg or try_cmpxchg need to be
++ * consistent across cpus, which prevents use of this_cpu_cmpxchg.
++ */
++static inline void mm_cid_put_lazy(struct task_struct *t)
++{
++ struct mm_struct *mm = t->mm;
++ struct mm_cid __percpu *pcpu_cid = mm->pcpu_cid;
++ int cid;
++
++ lockdep_assert_irqs_disabled();
++ cid = __this_cpu_read(pcpu_cid->cid);
++ if (!mm_cid_is_lazy_put(cid) ||
++ !try_cmpxchg(&this_cpu_ptr(pcpu_cid)->cid, &cid, MM_CID_UNSET))
++ return;
++ __mm_cid_put(mm, mm_cid_clear_lazy_put(cid));
++}
++
++static inline int mm_cid_pcpu_unset(struct mm_struct *mm)
++{
++ struct mm_cid __percpu *pcpu_cid = mm->pcpu_cid;
++ int cid, res;
++
++ lockdep_assert_irqs_disabled();
++ cid = __this_cpu_read(pcpu_cid->cid);
++ for (;;) {
++ if (mm_cid_is_unset(cid))
++ return MM_CID_UNSET;
++ /*
++ * Attempt transition from valid or lazy-put to unset.
++ */
++ res = cmpxchg(&this_cpu_ptr(pcpu_cid)->cid, cid, MM_CID_UNSET);
++ if (res == cid)
++ break;
++ cid = res;
++ }
++ return cid;
++}
++
++static inline void mm_cid_put(struct mm_struct *mm)
++{
++ int cid;
++
++ lockdep_assert_irqs_disabled();
++ cid = mm_cid_pcpu_unset(mm);
++ if (cid == MM_CID_UNSET)
++ return;
++ __mm_cid_put(mm, mm_cid_clear_lazy_put(cid));
++}
++
++static inline int __mm_cid_try_get(struct task_struct *t, struct mm_struct *mm)
++{
++ struct cpumask *cidmask = mm_cidmask(mm);
++ struct mm_cid __percpu *pcpu_cid = mm->pcpu_cid;
++ int cid, max_nr_cid, allowed_max_nr_cid;
++
++ /*
++ * After shrinking the number of threads or reducing the number
++ * of allowed cpus, reduce the value of max_nr_cid so expansion
++ * of cid allocation will preserve cache locality if the number
++ * of threads or allowed cpus increase again.
++ */
++ max_nr_cid = atomic_read(&mm->max_nr_cid);
++ while ((allowed_max_nr_cid = min_t(int, READ_ONCE(mm->nr_cpus_allowed),
++ atomic_read(&mm->mm_users))),
++ max_nr_cid > allowed_max_nr_cid) {
++ /* atomic_try_cmpxchg loads previous mm->max_nr_cid into max_nr_cid. */
++ if (atomic_try_cmpxchg(&mm->max_nr_cid, &max_nr_cid, allowed_max_nr_cid)) {
++ max_nr_cid = allowed_max_nr_cid;
++ break;
++ }
++ }
++ /* Try to re-use recent cid. This improves cache locality. */
++ cid = __this_cpu_read(pcpu_cid->recent_cid);
++ if (!mm_cid_is_unset(cid) && cid < max_nr_cid &&
++ !cpumask_test_and_set_cpu(cid, cidmask))
++ return cid;
++ /*
++ * Expand cid allocation if the maximum number of concurrency
++ * IDs allocated (max_nr_cid) is below the number of cpus allowed
++ * and the number of threads. Expanding cid allocation as much as
++ * possible improves cache locality.
++ */
++ cid = max_nr_cid;
++ while (cid < READ_ONCE(mm->nr_cpus_allowed) && cid < atomic_read(&mm->mm_users)) {
++ /* atomic_try_cmpxchg loads previous mm->max_nr_cid into cid. */
++ if (!atomic_try_cmpxchg(&mm->max_nr_cid, &cid, cid + 1))
++ continue;
++ if (!cpumask_test_and_set_cpu(cid, cidmask))
++ return cid;
++ }
++ /*
++ * Find the first available concurrency id.
++ * Retry finding first zero bit if the mask is temporarily
++ * filled. This only happens during concurrent remote-clear
++ * which owns a cid without holding a rq lock.
++ */
++ for (;;) {
++ cid = cpumask_first_zero(cidmask);
++ if (cid < READ_ONCE(mm->nr_cpus_allowed))
++ break;
++ cpu_relax();
++ }
++ if (cpumask_test_and_set_cpu(cid, cidmask))
++ return -1;
++
++ return cid;
++}
++
++/*
++ * Save a snapshot of the current runqueue time of this cpu
++ * with the per-cpu cid value, which allows estimating how recently it was used.
++ */
++static inline void mm_cid_snapshot_time(struct rq *rq, struct mm_struct *mm)
++{
++ struct mm_cid *pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu_of(rq));
++
++ lockdep_assert_rq_held(rq);
++ WRITE_ONCE(pcpu_cid->time, rq->clock);
++}
++
++static inline int __mm_cid_get(struct rq *rq, struct task_struct *t,
++ struct mm_struct *mm)
++{
++ int cid;
++
++ /*
++ * All allocations (even those using the cid_lock) are lock-free. If
++ * use_cid_lock is set, hold the cid_lock to perform cid allocation to
++ * guarantee forward progress.
++ */
++ if (!READ_ONCE(use_cid_lock)) {
++ cid = __mm_cid_try_get(t, mm);
++ if (cid >= 0)
++ goto end;
++ raw_spin_lock(&cid_lock);
++ } else {
++ raw_spin_lock(&cid_lock);
++ cid = __mm_cid_try_get(t, mm);
++ if (cid >= 0)
++ goto unlock;
++ }
++
++ /*
++ * cid concurrently allocated. Retry while forcing following
++ * allocations to use the cid_lock to ensure forward progress.
++ */
++ WRITE_ONCE(use_cid_lock, 1);
++ /*
++ * Set use_cid_lock before allocation. Only care about program order
++ * because this is only required for forward progress.
++ */
++ barrier();
++ /*
++ * Retry until it succeeds. It is guaranteed to eventually succeed once
++ * all newly incoming allocations observe the use_cid_lock flag set.
++ */
++ do {
++ cid = __mm_cid_try_get(t, mm);
++ cpu_relax();
++ } while (cid < 0);
++ /*
++ * Allocate before clearing use_cid_lock. Only care about
++ * program order because this is for forward progress.
++ */
++ barrier();
++ WRITE_ONCE(use_cid_lock, 0);
++unlock:
++ raw_spin_unlock(&cid_lock);
++end:
++ mm_cid_snapshot_time(rq, mm);
++ return cid;
++}
++
++static inline int mm_cid_get(struct rq *rq, struct task_struct *t,
++ struct mm_struct *mm)
++{
++ struct mm_cid __percpu *pcpu_cid = mm->pcpu_cid;
++ struct cpumask *cpumask;
++ int cid;
++
++ lockdep_assert_rq_held(rq);
++ cpumask = mm_cidmask(mm);
++ cid = __this_cpu_read(pcpu_cid->cid);
++ if (mm_cid_is_valid(cid)) {
++ mm_cid_snapshot_time(rq, mm);
++ return cid;
++ }
++ if (mm_cid_is_lazy_put(cid)) {
++ if (try_cmpxchg(&this_cpu_ptr(pcpu_cid)->cid, &cid, MM_CID_UNSET))
++ __mm_cid_put(mm, mm_cid_clear_lazy_put(cid));
++ }
++ cid = __mm_cid_get(rq, t, mm);
++ __this_cpu_write(pcpu_cid->cid, cid);
++ __this_cpu_write(pcpu_cid->recent_cid, cid);
++
++ return cid;
++}
++
++static inline void switch_mm_cid(struct rq *rq,
++ struct task_struct *prev,
++ struct task_struct *next)
++{
++ /*
++ * Provide a memory barrier between rq->curr store and load of
++ * {prev,next}->mm->pcpu_cid[cpu] on rq->curr->mm transition.
++ *
++ * Should be adapted if context_switch() is modified.
++ */
++ if (!next->mm) { // to kernel
++ /*
++ * user -> kernel transition does not guarantee a barrier, but
++ * we can use the fact that it performs an atomic operation in
++ * mmgrab().
++ */
++ if (prev->mm) // from user
++ smp_mb__after_mmgrab();
++ /*
++ * kernel -> kernel transition does not change rq->curr->mm
++ * state. It stays NULL.
++ */
++ } else { // to user
++ /*
++ * kernel -> user transition does not provide a barrier
++ * between rq->curr store and load of {prev,next}->mm->pcpu_cid[cpu].
++ * Provide it here.
++ */
++ if (!prev->mm) // from kernel
++ smp_mb();
++ /*
++ * user -> user transition guarantees a memory barrier through
++ * switch_mm() when current->mm changes. If current->mm is
++ * unchanged, no barrier is needed.
++ */
++ }
++ if (prev->mm_cid_active) {
++ mm_cid_snapshot_time(rq, prev->mm);
++ mm_cid_put_lazy(prev);
++ prev->mm_cid = -1;
++ }
++ if (next->mm_cid_active)
++ next->last_mm_cid = next->mm_cid = mm_cid_get(rq, next, next->mm);
++}
++
++#else
++static inline void switch_mm_cid(struct rq *rq, struct task_struct *prev, struct task_struct *next) { }
++static inline void sched_mm_cid_migrate_from(struct task_struct *t) { }
++static inline void sched_mm_cid_migrate_to(struct rq *dst_rq, struct task_struct *t) { }
++static inline void task_tick_mm_cid(struct rq *rq, struct task_struct *curr) { }
++static inline void init_sched_mm_cid(struct task_struct *t) { }
++#endif
++
++extern struct balance_callback balance_push_callback;
++
++static inline void
++queue_balance_callback(struct rq *rq,
++ struct balance_callback *head,
++ void (*func)(struct rq *rq))
++{
++ lockdep_assert_rq_held(rq);
++
++ /*
++ * Don't (re)queue an already queued item; nor queue anything when
++ * balance_push() is active, see the comment with
++ * balance_push_callback.
++ */
++ if (unlikely(head->next || rq->balance_callback == &balance_push_callback))
++ return;
++
++ head->func = func;
++ head->next = rq->balance_callback;
++ rq->balance_callback = head;
++}
++
++#ifdef CONFIG_SCHED_BMQ
++#include "bmq.h"
++#endif
++#ifdef CONFIG_SCHED_PDS
++#include "pds.h"
++#endif
++
++#endif /* _KERNEL_SCHED_ALT_SCHED_H */
+diff --git a/kernel/sched/alt_topology.c b/kernel/sched/alt_topology.c
+new file mode 100644
+index 000000000000..376a08a5afda
+--- /dev/null
++++ b/kernel/sched/alt_topology.c
+@@ -0,0 +1,347 @@
++#include "alt_core.h"
++#include "alt_topology.h"
++
++static cpumask_t sched_pcore_mask ____cacheline_aligned_in_smp;
++
++static int __init sched_pcore_mask_setup(char *str)
++{
++ if (cpulist_parse(str, &sched_pcore_mask))
++ pr_warn("sched/alt: pcore_cpus= incorrect CPU range\n");
++
++ return 0;
++}
++__setup("pcore_cpus=", sched_pcore_mask_setup);
++
++/*
++ * set/clear idle mask functions
++ */
++#ifdef CONFIG_SCHED_SMT
++static void set_idle_mask_smt(unsigned int cpu, struct cpumask *dstp)
++{
++ cpumask_set_cpu(cpu, dstp);
++ if (cpumask_subset(cpu_smt_mask(cpu), sched_idle_mask))
++ cpumask_or(sched_sg_idle_mask, sched_sg_idle_mask, cpu_smt_mask(cpu));
++}
++
++static void clear_idle_mask_smt(int cpu, struct cpumask *dstp)
++{
++ cpumask_clear_cpu(cpu, dstp);
++ cpumask_andnot(sched_sg_idle_mask, sched_sg_idle_mask, cpu_smt_mask(cpu));
++}
++#endif
++
++static void set_idle_mask_pcore(unsigned int cpu, struct cpumask *dstp)
++{
++ cpumask_set_cpu(cpu, dstp);
++ cpumask_set_cpu(cpu, sched_pcore_idle_mask);
++}
++
++static void clear_idle_mask_pcore(int cpu, struct cpumask *dstp)
++{
++ cpumask_clear_cpu(cpu, dstp);
++ cpumask_clear_cpu(cpu, sched_pcore_idle_mask);
++}
++
++static void set_idle_mask_ecore(unsigned int cpu, struct cpumask *dstp)
++{
++ cpumask_set_cpu(cpu, dstp);
++ cpumask_set_cpu(cpu, sched_ecore_idle_mask);
++}
++
++static void clear_idle_mask_ecore(int cpu, struct cpumask *dstp)
++{
++ cpumask_clear_cpu(cpu, dstp);
++ cpumask_clear_cpu(cpu, sched_ecore_idle_mask);
++}
++
++/*
++ * Idle cpu/rq selection functions
++ */
++#ifdef CONFIG_SCHED_SMT
++static bool p1_idle_select_func(struct cpumask *dstp, const struct cpumask *src1p,
++ const struct cpumask *src2p)
++{
++ return cpumask_and(dstp, src1p, src2p + 1) ||
++ cpumask_and(dstp, src1p, src2p);
++}
++#endif
++
++static bool p1p2_idle_select_func(struct cpumask *dstp, const struct cpumask *src1p,
++ const struct cpumask *src2p)
++{
++ return cpumask_and(dstp, src1p, src2p + 1) ||
++ cpumask_and(dstp, src1p, src2p + 2) ||
++ cpumask_and(dstp, src1p, src2p);
++}
++
++/* common balance functions */
++static int active_balance_cpu_stop(void *data)
++{
++ struct balance_arg *arg = data;
++ struct task_struct *p = arg->task;
++ struct rq *rq = this_rq();
++ unsigned long flags;
++ cpumask_t tmp;
++
++ local_irq_save(flags);
++
++ raw_spin_lock(&p->pi_lock);
++ raw_spin_lock(&rq->lock);
++
++ arg->active = 0;
++
++ if (task_on_rq_queued(p) && task_rq(p) == rq &&
++ cpumask_and(&tmp, p->cpus_ptr, arg->cpumask) &&
++ !is_migration_disabled(p)) {
++ int dcpu = __best_mask_cpu(&tmp, per_cpu(sched_cpu_llc_mask, cpu_of(rq)));
++ rq = move_queued_task(rq, p, dcpu);
++ }
++
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++
++ return 0;
++}
++
++/* trigger_active_balance - for @rq */
++static inline int
++trigger_active_balance(struct rq *src_rq, struct rq *rq, cpumask_t *target_mask)
++{
++ struct balance_arg *arg;
++ unsigned long flags;
++ struct task_struct *p;
++ int res;
++
++ if (!raw_spin_trylock_irqsave(&rq->lock, flags))
++ return 0;
++
++ arg = &rq->active_balance_arg;
++ res = (1 == rq->nr_running) && \
++ !is_migration_disabled((p = sched_rq_first_task(rq))) && \
++ cpumask_intersects(p->cpus_ptr, target_mask) && \
++ !arg->active;
++ if (res) {
++ arg->task = p;
++ arg->cpumask = target_mask;
++
++ arg->active = 1;
++ }
++
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++ if (res) {
++ preempt_disable();
++ raw_spin_unlock(&src_rq->lock);
++
++ stop_one_cpu_nowait(cpu_of(rq), active_balance_cpu_stop, arg,
++ &rq->active_balance_work);
++
++ preempt_enable();
++ raw_spin_lock(&src_rq->lock);
++ }
++
++ return res;
++}
++
++static inline int
++ecore_source_balance(struct rq *rq, cpumask_t *single_task_mask, cpumask_t *target_mask)
++{
++ if (cpumask_andnot(single_task_mask, single_task_mask, &sched_pcore_mask)) {
++ int i, cpu = cpu_of(rq);
++
++ for_each_cpu_wrap(i, single_task_mask, cpu)
++ if (trigger_active_balance(rq, cpu_rq(i), target_mask))
++ return 1;
++ }
++
++ return 0;
++}
++
++static DEFINE_PER_CPU(struct balance_callback, active_balance_head);
++
++#ifdef CONFIG_SCHED_SMT
++static inline int
++smt_pcore_source_balance(struct rq *rq, cpumask_t *single_task_mask, cpumask_t *target_mask)
++{
++ cpumask_t smt_single_mask;
++
++ if (cpumask_and(&smt_single_mask, single_task_mask, &sched_smt_mask)) {
++ int i, cpu = cpu_of(rq);
++
++ for_each_cpu_wrap(i, &smt_single_mask, cpu) {
++ if (cpumask_subset(cpu_smt_mask(i), &smt_single_mask) &&
++ trigger_active_balance(rq, cpu_rq(i), target_mask))
++ return 1;
++ }
++ }
++
++ return 0;
++}
++
++/* smt p core balance functions */
++static inline void smt_pcore_balance(struct rq *rq)
++{
++ cpumask_t single_task_mask;
++
++ if (cpumask_andnot(&single_task_mask, cpu_active_mask, sched_idle_mask) &&
++ cpumask_andnot(&single_task_mask, &single_task_mask, &sched_rq_pending_mask) &&
++ (/* smt core group balance */
++ (static_key_count(&sched_smt_present.key) > 1 &&
++ smt_pcore_source_balance(rq, &single_task_mask, sched_sg_idle_mask)
++ ) ||
++ /* e core to idle smt core balance */
++ ecore_source_balance(rq, &single_task_mask, sched_sg_idle_mask)))
++ return;
++}
++
++static void smt_pcore_balance_func(struct rq *rq, const int cpu)
++{
++ if (cpumask_test_cpu(cpu, sched_sg_idle_mask))
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), smt_pcore_balance);
++}
++
++/* smt balance functions */
++static inline void smt_balance(struct rq *rq)
++{
++ cpumask_t single_task_mask;
++
++ if (cpumask_andnot(&single_task_mask, cpu_active_mask, sched_idle_mask) &&
++ cpumask_andnot(&single_task_mask, &single_task_mask, &sched_rq_pending_mask) &&
++ static_key_count(&sched_smt_present.key) > 1 &&
++ smt_pcore_source_balance(rq, &single_task_mask, sched_sg_idle_mask))
++ return;
++}
++
++static void smt_balance_func(struct rq *rq, const int cpu)
++{
++ if (cpumask_test_cpu(cpu, sched_sg_idle_mask))
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), smt_balance);
++}
++
++/* e core balance functions */
++static inline void ecore_balance(struct rq *rq)
++{
++ cpumask_t single_task_mask;
++
++ if (cpumask_andnot(&single_task_mask, cpu_active_mask, sched_idle_mask) &&
++ cpumask_andnot(&single_task_mask, &single_task_mask, &sched_rq_pending_mask) &&
++ /* smt occupied p core to idle e core balance */
++ smt_pcore_source_balance(rq, &single_task_mask, sched_ecore_idle_mask))
++ return;
++}
++
++static void ecore_balance_func(struct rq *rq, const int cpu)
++{
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), ecore_balance);
++}
++#endif /* CONFIG_SCHED_SMT */
++
++/* p core balance functions */
++static inline void pcore_balance(struct rq *rq)
++{
++ cpumask_t single_task_mask;
++
++ if (cpumask_andnot(&single_task_mask, cpu_active_mask, sched_idle_mask) &&
++ cpumask_andnot(&single_task_mask, &single_task_mask, &sched_rq_pending_mask) &&
++ /* idle e core to p core balance */
++ ecore_source_balance(rq, &single_task_mask, sched_pcore_idle_mask))
++ return;
++}
++
++static void pcore_balance_func(struct rq *rq, const int cpu)
++{
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), pcore_balance);
++}
++
++#ifdef ALT_SCHED_DEBUG
++#define SCHED_DEBUG_INFO(...) printk(KERN_INFO __VA_ARGS__)
++#else
++#define SCHED_DEBUG_INFO(...) do { } while(0)
++#endif
++
++#define SET_IDLE_SELECT_FUNC(func) \
++{ \
++ idle_select_func = func; \
++ printk(KERN_INFO "sched: "#func); \
++}
++
++#define SET_RQ_BALANCE_FUNC(rq, cpu, func) \
++{ \
++ rq->balance_func = func; \
++ SCHED_DEBUG_INFO("sched: cpu#%02d -> "#func, cpu); \
++}
++
++#define SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_func, clear_func) \
++{ \
++ rq->set_idle_mask_func = set_func; \
++ rq->clear_idle_mask_func = clear_func; \
++ SCHED_DEBUG_INFO("sched: cpu#%02d -> "#set_func" "#clear_func, cpu); \
++}
++
++void sched_init_topology(void)
++{
++ int cpu;
++ struct rq *rq;
++ cpumask_t sched_ecore_mask = { CPU_BITS_NONE };
++ int ecore_present = 0;
++
++#ifdef CONFIG_SCHED_SMT
++ if (!cpumask_empty(&sched_smt_mask))
++ printk(KERN_INFO "sched: smt mask: 0x%08lx\n", sched_smt_mask.bits[0]);
++#endif
++
++ if (!cpumask_empty(&sched_pcore_mask)) {
++ cpumask_andnot(&sched_ecore_mask, cpu_online_mask, &sched_pcore_mask);
++ printk(KERN_INFO "sched: pcore mask: 0x%08lx, ecore mask: 0x%08lx\n",
++ sched_pcore_mask.bits[0], sched_ecore_mask.bits[0]);
++
++ ecore_present = !cpumask_empty(&sched_ecore_mask);
++ }
++
++#ifdef CONFIG_SCHED_SMT
++ /* idle select function */
++ if (cpumask_equal(&sched_smt_mask, cpu_online_mask)) {
++ SET_IDLE_SELECT_FUNC(p1_idle_select_func);
++ } else
++#endif
++ if (!cpumask_empty(&sched_pcore_mask)) {
++ SET_IDLE_SELECT_FUNC(p1p2_idle_select_func);
++ }
++
++ for_each_online_cpu(cpu) {
++ rq = cpu_rq(cpu);
++ /* take the chance to reset the time slice for idle tasks */
++ rq->idle->time_slice = sysctl_sched_base_slice;
++
++#ifdef CONFIG_SCHED_SMT
++ if (cpumask_weight(cpu_smt_mask(cpu)) > 1) {
++ SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_idle_mask_smt, clear_idle_mask_smt);
++
++ if (cpumask_test_cpu(cpu, &sched_pcore_mask) &&
++ !cpumask_intersects(&sched_ecore_mask, &sched_smt_mask)) {
++ SET_RQ_BALANCE_FUNC(rq, cpu, smt_pcore_balance_func);
++ } else {
++ SET_RQ_BALANCE_FUNC(rq, cpu, smt_balance_func);
++ }
++
++ continue;
++ }
++#endif
++ /* !SMT or only one cpu in sg */
++ if (cpumask_test_cpu(cpu, &sched_pcore_mask)) {
++ SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_idle_mask_pcore, clear_idle_mask_pcore);
++
++ if (ecore_present)
++ SET_RQ_BALANCE_FUNC(rq, cpu, pcore_balance_func);
++
++ continue;
++ }
++ if (cpumask_test_cpu(cpu, &sched_ecore_mask)) {
++ SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_idle_mask_ecore, clear_idle_mask_ecore);
++#ifdef CONFIG_SCHED_SMT
++ if (cpumask_intersects(&sched_pcore_mask, &sched_smt_mask))
++ SET_RQ_BALANCE_FUNC(rq, cpu, ecore_balance_func);
++#endif
++ }
++ }
++}
+diff --git a/kernel/sched/alt_topology.h b/kernel/sched/alt_topology.h
+new file mode 100644
+index 000000000000..076174cd2bc6
+--- /dev/null
++++ b/kernel/sched/alt_topology.h
+@@ -0,0 +1,6 @@
++#ifndef _KERNEL_SCHED_ALT_TOPOLOGY_H
++#define _KERNEL_SCHED_ALT_TOPOLOGY_H
++
++extern void sched_init_topology(void);
++
++#endif /* _KERNEL_SCHED_ALT_TOPOLOGY_H */
+diff --git a/kernel/sched/bmq.h b/kernel/sched/bmq.h
+new file mode 100644
+index 000000000000..5a7835246ec3
+--- /dev/null
++++ b/kernel/sched/bmq.h
+@@ -0,0 +1,103 @@
++#ifndef _KERNEL_SCHED_BMQ_H
++#define _KERNEL_SCHED_BMQ_H
++
++#define ALT_SCHED_NAME "BMQ"
++
++/*
++ * BMQ only routines
++ */
++static inline void boost_task(struct task_struct *p, int n)
++{
++ int limit;
++
++ switch (p->policy) {
++ case SCHED_NORMAL:
++ limit = -MAX_PRIORITY_ADJ;
++ break;
++ case SCHED_BATCH:
++ limit = 0;
++ break;
++ default:
++ return;
++ }
++
++ p->boost_prio = max(limit, p->boost_prio - n);
++}
++
++static inline void deboost_task(struct task_struct *p)
++{
++ if (p->boost_prio < MAX_PRIORITY_ADJ)
++ p->boost_prio++;
++}
++
++/*
++ * Common interfaces
++ */
++static inline void sched_timeslice_imp(const int timeslice_ms) {}
++
++/* This API is used in task_prio(); its return value is read by human users */
++static inline int
++task_sched_prio_normal(const struct task_struct *p, const struct rq *rq)
++{
++ return p->prio + p->boost_prio - MIN_NORMAL_PRIO;
++}
++
++static inline int task_sched_prio(const struct task_struct *p)
++{
++ return (p->prio < MIN_NORMAL_PRIO)? (p->prio >> 2) :
++ MIN_SCHED_NORMAL_PRIO + (p->prio + p->boost_prio - MIN_NORMAL_PRIO) / 2;
++}
++
++#define TASK_SCHED_PRIO_IDX(p, rq, idx, prio) \
++ prio = task_sched_prio(p); \
++ idx = prio;
++
++static inline int sched_prio2idx(int prio, struct rq *rq)
++{
++ return prio;
++}
++
++static inline int sched_idx2prio(int idx, struct rq *rq)
++{
++ return idx;
++}
++
++static inline int sched_rq_prio_idx(struct rq *rq)
++{
++ return rq->prio;
++}
++
++static inline int task_running_nice(struct task_struct *p)
++{
++ return (p->prio + p->boost_prio > DEFAULT_PRIO);
++}
++
++static inline void sched_update_rq_clock(struct rq *rq) {}
++
++static inline void sched_task_renew(struct task_struct *p, const struct rq *rq)
++{
++ deboost_task(p);
++}
++
++static inline void sched_task_sanity_check(struct task_struct *p, struct rq *rq) {}
++static inline void sched_task_fork(struct task_struct *p, struct rq *rq) {}
++
++static inline void do_sched_yield_type_1(struct task_struct *p, struct rq *rq)
++{
++ p->boost_prio = MAX_PRIORITY_ADJ;
++}
++
++static inline void sched_task_ttwu(struct task_struct *p)
++{
++ s64 delta = this_rq()->clock_task - p->last_ran;
++
++ if (likely(delta > 0))
++ boost_task(p, delta >> 22);
++}
++
++static inline void sched_task_deactivate(struct task_struct *p, struct rq *rq)
++{
++ boost_task(p, 1);
++}
++
++#endif /* _KERNEL_SCHED_BMQ_H */
+diff --git a/kernel/sched/build_policy.c b/kernel/sched/build_policy.c
+index c4a488e67aa7..42575d419e28 100644
+--- a/kernel/sched/build_policy.c
++++ b/kernel/sched/build_policy.c
+@@ -49,13 +49,17 @@
+
+ #include "idle.c"
+
+-#include "rt.c"
+-#include "cpudeadline.c"
++#ifndef CONFIG_SCHED_ALT
++# include "rt.c"
++# include "cpudeadline.c"
++#endif
+
+ #include "pelt.c"
+
+ #include "cputime.c"
++#ifndef CONFIG_SCHED_ALT
+ #include "deadline.c"
++#endif
+
+ #ifdef CONFIG_SCHED_CLASS_EXT
+ # include "ext.c"
+diff --git a/kernel/sched/build_utility.c b/kernel/sched/build_utility.c
+index e2cf3b08d4e9..a64bf71a6c69 100644
+--- a/kernel/sched/build_utility.c
++++ b/kernel/sched/build_utility.c
+@@ -56,6 +56,10 @@
+
+ #include "clock.c"
+
++#ifdef CONFIG_SCHED_ALT
++# include "alt_topology.c"
++#endif
++
+ #ifdef CONFIG_CGROUP_CPUACCT
+ # include "cpuacct.c"
+ #endif
+@@ -68,7 +72,7 @@
+ # include "cpufreq_schedutil.c"
+ #endif
+
+-#include "debug.c"
++# include "debug.c"
+
+ #ifdef CONFIG_SCHEDSTATS
+ # include "stats.c"
+@@ -81,7 +85,9 @@
+ #include "wait.c"
+
+ #include "cpupri.c"
+-#include "stop_task.c"
++#ifndef CONFIG_SCHED_ALT
++# include "stop_task.c"
++#endif
+
+ #include "topology.c"
+
+diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
+index 0ab5f9d4bc59..60f374ffa96d 100644
+--- a/kernel/sched/cpufreq_schedutil.c
++++ b/kernel/sched/cpufreq_schedutil.c
+@@ -225,6 +225,7 @@ unsigned long sugov_effective_cpu_perf(int cpu, unsigned long actual,
+
+ static void sugov_get_util(struct sugov_cpu *sg_cpu, unsigned long boost)
+ {
++#ifndef CONFIG_SCHED_ALT
+ unsigned long min, max, util = scx_cpuperf_target(sg_cpu->cpu);
+
+ if (!scx_switched_all())
+@@ -233,6 +234,10 @@ static void sugov_get_util(struct sugov_cpu *sg_cpu, unsigned long boost)
+ util = max(util, boost);
+ sg_cpu->bw_min = min;
+ sg_cpu->util = sugov_effective_cpu_perf(sg_cpu->cpu, util, min, max);
++#else /* CONFIG_SCHED_ALT */
++ sg_cpu->bw_min = 0;
++ sg_cpu->util = rq_load_util(cpu_rq(sg_cpu->cpu), arch_scale_cpu_capacity(sg_cpu->cpu));
++#endif /* CONFIG_SCHED_ALT */
+ }
+
+ /**
+@@ -392,8 +397,10 @@ static inline bool sugov_hold_freq(struct sugov_cpu *sg_cpu) { return false; }
+ */
+ static inline void ignore_dl_rate_limit(struct sugov_cpu *sg_cpu)
+ {
++#ifndef CONFIG_SCHED_ALT
+ if (cpu_bw_dl(cpu_rq(sg_cpu->cpu)) > sg_cpu->bw_min)
+- sg_cpu->sg_policy->need_freq_update = true;
++ sg_cpu->sg_policy->limits_changed = true;
++#endif
+ }
+
+ static inline bool sugov_update_single_common(struct sugov_cpu *sg_cpu,
+@@ -687,6 +694,7 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
+ }
+
+ ret = sched_setattr_nocheck(thread, &attr);
++
+ if (ret) {
+ kthread_stop(thread);
+ pr_warn("%s: failed to set SCHED_DEADLINE\n", __func__);
+diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
+index 7097de2c8cda..52b5626ce7b6 100644
+--- a/kernel/sched/cputime.c
++++ b/kernel/sched/cputime.c
+@@ -127,7 +127,7 @@ void account_user_time(struct task_struct *p, u64 cputime)
+ p->utime += cputime;
+ account_group_user_time(p, cputime);
+
+- index = (task_nice(p) > 0) ? CPUTIME_NICE : CPUTIME_USER;
++ index = task_running_nice(p) ? CPUTIME_NICE : CPUTIME_USER;
+
+ /* Add user time to cpustat. */
+ task_group_account_field(p, index, cputime);
+@@ -151,7 +151,7 @@ void account_guest_time(struct task_struct *p, u64 cputime)
+ p->gtime += cputime;
+
+ /* Add guest time to cpustat. */
+- if (task_nice(p) > 0) {
++ if (task_running_nice(p)) {
+ task_group_account_field(p, CPUTIME_NICE, cputime);
+ cpustat[CPUTIME_GUEST_NICE] += cputime;
+ } else {
+@@ -289,7 +289,7 @@ static inline u64 account_other_time(u64 max)
+ #ifdef CONFIG_64BIT
+ static inline u64 read_sum_exec_runtime(struct task_struct *t)
+ {
+- return t->se.sum_exec_runtime;
++ return tsk_seruntime(t);
+ }
+ #else /* !CONFIG_64BIT: */
+ static u64 read_sum_exec_runtime(struct task_struct *t)
+@@ -299,7 +299,7 @@ static u64 read_sum_exec_runtime(struct task_struct *t)
+ struct rq *rq;
+
+ rq = task_rq_lock(t, &rf);
+- ns = t->se.sum_exec_runtime;
++ ns = tsk_seruntime(t);
+ task_rq_unlock(rq, t, &rf);
+
+ return ns;
+@@ -624,7 +624,7 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
+ void task_cputime_adjusted(struct task_struct *p, u64 *ut, u64 *st)
+ {
+ struct task_cputime cputime = {
+- .sum_exec_runtime = p->se.sum_exec_runtime,
++ .sum_exec_runtime = tsk_seruntime(p),
+ };
+
+ if (task_cputime(p, &cputime.utime, &cputime.stime))
+diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
+index 02e16b70a790..2687421bc524 100644
+--- a/kernel/sched/debug.c
++++ b/kernel/sched/debug.c
+@@ -10,6 +10,7 @@
+ #include <linux/nmi.h>
+ #include "sched.h"
+
++#ifndef CONFIG_SCHED_ALT
+ /*
+ * This allows printing both to /sys/kernel/debug/sched/debug and
+ * to the console
+@@ -215,6 +216,8 @@ static const struct file_operations sched_scaling_fops = {
+ .release = single_release,
+ };
+
++#endif /* !CONFIG_SCHED_ALT */
++
+ #ifdef CONFIG_PREEMPT_DYNAMIC
+
+ static ssize_t sched_dynamic_write(struct file *filp, const char __user *ubuf,
+@@ -280,6 +283,7 @@ static const struct file_operations sched_dynamic_fops = {
+
+ #endif /* CONFIG_PREEMPT_DYNAMIC */
+
++#ifndef CONFIG_SCHED_ALT
+ __read_mostly bool sched_debug_verbose;
+
+ static struct dentry *sd_dentry;
+@@ -464,9 +468,11 @@ static const struct file_operations fair_server_period_fops = {
+ .llseek = seq_lseek,
+ .release = single_release,
+ };
++#endif /* !CONFIG_SCHED_ALT */
+
+ static struct dentry *debugfs_sched;
+
++#ifndef CONFIG_SCHED_ALT
+ static void debugfs_fair_server_init(void)
+ {
+ struct dentry *d_fair;
+@@ -487,6 +493,7 @@ static void debugfs_fair_server_init(void)
+ debugfs_create_file("period", 0644, d_cpu, (void *) cpu, &fair_server_period_fops);
+ }
+ }
++#endif /* !CONFIG_SCHED_ALT */
+
+ static __init int sched_init_debug(void)
+ {
+@@ -494,14 +501,17 @@ static __init int sched_init_debug(void)
+
+ debugfs_sched = debugfs_create_dir("sched", NULL);
+
++#ifndef CONFIG_SCHED_ALT
+ debugfs_create_file("features", 0644, debugfs_sched, NULL, &sched_feat_fops);
+ debugfs_create_file_unsafe("verbose", 0644, debugfs_sched, &sched_debug_verbose, &sched_verbose_fops);
++#endif /* !CONFIG_SCHED_ALT */
+ #ifdef CONFIG_PREEMPT_DYNAMIC
+ debugfs_create_file("preempt", 0644, debugfs_sched, NULL, &sched_dynamic_fops);
+ #endif
+
+ debugfs_create_u32("base_slice_ns", 0644, debugfs_sched, &sysctl_sched_base_slice);
+
++#ifndef CONFIG_SCHED_ALT
+ debugfs_create_u32("latency_warn_ms", 0644, debugfs_sched, &sysctl_resched_latency_warn_ms);
+ debugfs_create_u32("latency_warn_once", 0644, debugfs_sched, &sysctl_resched_latency_warn_once);
+
+@@ -524,13 +534,18 @@ static __init int sched_init_debug(void)
+ #endif /* CONFIG_NUMA_BALANCING */
+
+ debugfs_create_file("debug", 0444, debugfs_sched, NULL, &sched_debug_fops);
++#endif /* !CONFIG_SCHED_ALT */
+
++#ifndef CONFIG_SCHED_ALT
+ debugfs_fair_server_init();
++#endif /* !CONFIG_SCHED_ALT */
+
+ return 0;
+ }
+ late_initcall(sched_init_debug);
+
++#ifndef CONFIG_SCHED_ALT
++
+ static cpumask_var_t sd_sysctl_cpus;
+
+ static int sd_flags_show(struct seq_file *m, void *v)
+@@ -1263,6 +1278,11 @@ void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
+
+ sched_show_numa(p, m);
+ }
++#else
++void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
++ struct seq_file *m)
++{ }
++#endif /* !CONFIG_SCHED_ALT */
+
+ void proc_sched_set_task(struct task_struct *p)
+ {
+diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
+index c39b089d4f09..9903f1d96dc3 100644
+--- a/kernel/sched/idle.c
++++ b/kernel/sched/idle.c
+@@ -428,6 +428,7 @@ void cpu_startup_entry(enum cpuhp_state state)
+ do_idle();
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ /*
+ * idle-task scheduling class.
+ */
+@@ -539,3 +540,4 @@ DEFINE_SCHED_CLASS(idle) = {
+ .switched_to = switched_to_idle,
+ .update_curr = update_curr_idle,
+ };
++#endif
+diff --git a/kernel/sched/pds.h b/kernel/sched/pds.h
+new file mode 100644
+index 000000000000..fe3099071eb7
+--- /dev/null
++++ b/kernel/sched/pds.h
+@@ -0,0 +1,139 @@
++#ifndef _KERNEL_SCHED_PDS_H
++#define _KERNEL_SCHED_PDS_H
++
++#define ALT_SCHED_NAME "PDS"
++
++static const u64 RT_MASK = ((1ULL << MIN_SCHED_NORMAL_PRIO) - 1);
++
++#define SCHED_NORMAL_PRIO_NUM (32)
++#define SCHED_EDGE_DELTA (SCHED_NORMAL_PRIO_NUM - NICE_WIDTH / 2)
++
++/* PDS assumes SCHED_NORMAL_PRIO_NUM is a power of 2 */
++#define SCHED_NORMAL_PRIO_MOD(x) ((x) & (SCHED_NORMAL_PRIO_NUM - 1))
++
++/* default time slice 4ms -> shift 22, 2 time slice slots -> shift 23 */
++static __read_mostly int sched_timeslice_shift = 23;
++
++/*
++ * Common interfaces
++ */
++static inline int
++task_sched_prio_normal(const struct task_struct *p, const struct rq *rq)
++{
++ u64 sched_dl = max(p->deadline, rq->time_edge);
++
++#ifdef ALT_SCHED_DEBUG
++ if (WARN_ONCE(sched_dl - rq->time_edge > NORMAL_PRIO_NUM - 1,
++ "pds: task_sched_prio_normal() delta %lld\n", sched_dl - rq->time_edge))
++ return SCHED_NORMAL_PRIO_NUM - 1;
++#endif
++
++ return sched_dl - rq->time_edge;
++}
++
++static inline int task_sched_prio(const struct task_struct *p)
++{
++ return (p->prio < MIN_NORMAL_PRIO) ? (p->prio >> 2) :
++ MIN_SCHED_NORMAL_PRIO + task_sched_prio_normal(p, task_rq(p));
++}
++
++#define TASK_SCHED_PRIO_IDX(p, rq, idx, prio) \
++ if (p->prio < MIN_NORMAL_PRIO) { \
++ prio = p->prio >> 2; \
++ idx = prio; \
++ } else { \
++ u64 sched_dl = max(p->deadline, rq->time_edge); \
++ prio = MIN_SCHED_NORMAL_PRIO + sched_dl - rq->time_edge; \
++ idx = MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(sched_dl); \
++ }
++
++static inline int sched_prio2idx(int sched_prio, struct rq *rq)
++{
++ return (IDLE_TASK_SCHED_PRIO == sched_prio || sched_prio < MIN_SCHED_NORMAL_PRIO) ?
++ sched_prio :
++ MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(sched_prio + rq->time_edge);
++}
++
++static inline int sched_idx2prio(int sched_idx, struct rq *rq)
++{
++ return (sched_idx < MIN_SCHED_NORMAL_PRIO) ?
++ sched_idx :
++ MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(sched_idx - rq->time_edge);
++}
++
++static inline int sched_rq_prio_idx(struct rq *rq)
++{
++ return rq->prio_idx;
++}
++
++static inline int task_running_nice(struct task_struct *p)
++{
++ return (p->prio > DEFAULT_PRIO);
++}
++
++static inline void sched_update_rq_clock(struct rq *rq)
++{
++ struct list_head head;
++ u64 old = rq->time_edge;
++ u64 now = rq->clock >> sched_timeslice_shift;
++ u64 prio, delta;
++ DECLARE_BITMAP(normal, SCHED_QUEUE_BITS);
++
++ if (now == old)
++ return;
++
++ rq->time_edge = now;
++ delta = min_t(u64, SCHED_NORMAL_PRIO_NUM, now - old);
++ INIT_LIST_HEAD(&head);
++
++ prio = MIN_SCHED_NORMAL_PRIO;
++ for_each_set_bit_from(prio, rq->queue.bitmap, MIN_SCHED_NORMAL_PRIO + delta)
++ list_splice_tail_init(rq->queue.heads + MIN_SCHED_NORMAL_PRIO +
++ SCHED_NORMAL_PRIO_MOD(prio + old), &head);
++
++ bitmap_shift_right(normal, rq->queue.bitmap, delta, SCHED_QUEUE_BITS);
++ if (!list_empty(&head)) {
++ u64 idx = MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(now);
++
++ __list_splice(&head, rq->queue.heads + idx, rq->queue.heads[idx].next);
++ set_bit(MIN_SCHED_NORMAL_PRIO, normal);
++ }
++ bitmap_replace(rq->queue.bitmap, normal, rq->queue.bitmap,
++ (const unsigned long *)&RT_MASK, SCHED_QUEUE_BITS);
++
++ if (rq->prio < MIN_SCHED_NORMAL_PRIO || IDLE_TASK_SCHED_PRIO == rq->prio)
++ return;
++
++ rq->prio = max_t(u64, MIN_SCHED_NORMAL_PRIO, rq->prio - delta);
++ rq->prio_idx = sched_prio2idx(rq->prio, rq);
++}
++
++static inline void sched_task_renew(struct task_struct *p, const struct rq *rq)
++{
++ if (p->prio >= MIN_NORMAL_PRIO)
++ p->deadline = rq->time_edge + SCHED_EDGE_DELTA +
++ (p->static_prio - (MAX_PRIO - NICE_WIDTH)) / 2;
++}
++
++static inline void sched_task_sanity_check(struct task_struct *p, struct rq *rq)
++{
++ u64 max_dl = rq->time_edge + SCHED_EDGE_DELTA + NICE_WIDTH / 2 - 1;
++ if (unlikely(p->deadline > max_dl))
++ p->deadline = max_dl;
++}
++
++static inline void sched_task_fork(struct task_struct *p, struct rq *rq)
++{
++ sched_task_renew(p, rq);
++}
++
++static inline void do_sched_yield_type_1(struct task_struct *p, struct rq *rq)
++{
++ p->time_slice = sysctl_sched_base_slice;
++ sched_task_renew(p, rq);
++}
++
++static inline void sched_task_ttwu(struct task_struct *p) {}
++static inline void sched_task_deactivate(struct task_struct *p, struct rq *rq) {}
++
++#endif /* _KERNEL_SCHED_PDS_H */
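For reference, the deadline arithmetic in sched_task_renew() above can be checked with a
minimal stand-alone sketch. It assumes the stock kernel values NICE_WIDTH=40 and
MAX_PRIO=140 from include/linux/sched/prio.h; the program itself is only illustrative and
not part of the patch:

        #include <stdio.h>

        /* Assumed stock kernel constants (include/linux/sched/prio.h). */
        #define NICE_WIDTH              40
        #define MAX_PRIO                140

        /* Constants lifted from pds.h above. */
        #define SCHED_NORMAL_PRIO_NUM   32
        #define SCHED_EDGE_DELTA        (SCHED_NORMAL_PRIO_NUM - NICE_WIDTH / 2)

        int main(void)
        {
                int nice;

                for (nice = -20; nice <= 19; nice += 13) {
                        /* NICE_TO_PRIO(nice): static_prio = MAX_PRIO - 20 + nice */
                        int static_prio = MAX_PRIO - 20 + nice;
                        /* Offset that sched_task_renew() adds to rq->time_edge. */
                        int slots = SCHED_EDGE_DELTA +
                                    (static_prio - (MAX_PRIO - NICE_WIDTH)) / 2;

                        printf("nice %3d -> deadline = time_edge + %2d\n", nice, slots);
                }
                return 0;
        }

So the whole nice range maps onto a window of 12 to 31 slots past rq->time_edge, which is
what places lower-nice tasks into earlier priority queues once task_sched_prio_normal()
subtracts time_edge again.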
+diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c
+index fa83bbaf4f3e..e5a8e94e6a8e 100644
+--- a/kernel/sched/pelt.c
++++ b/kernel/sched/pelt.c
+@@ -267,6 +267,7 @@ ___update_load_avg(struct sched_avg *sa, unsigned long load)
+ WRITE_ONCE(sa->util_avg, sa->util_sum / divider);
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ /*
+ * sched_entity:
+ *
+@@ -384,8 +385,9 @@ int update_dl_rq_load_avg(u64 now, struct rq *rq, int running)
+
+ return 0;
+ }
++#endif
+
+-#ifdef CONFIG_SCHED_HW_PRESSURE
++#if defined(CONFIG_SCHED_HW_PRESSURE) && !defined(CONFIG_SCHED_ALT)
+ /*
+ * hardware:
+ *
+@@ -469,6 +471,7 @@ int update_irq_load_avg(struct rq *rq, u64 running)
+ }
+ #endif /* CONFIG_HAVE_SCHED_AVG_IRQ */
+
++#ifndef CONFIG_SCHED_ALT
+ /*
+ * Load avg and utiliztion metrics need to be updated periodically and before
+ * consumption. This function updates the metrics for all subsystems except for
+@@ -488,3 +491,4 @@ bool update_other_load_avgs(struct rq *rq)
+ update_hw_load_avg(rq_clock_task(rq), rq, hw_pressure) |
+ update_irq_load_avg(rq, 0);
+ }
++#endif /* !CONFIG_SCHED_ALT */
+diff --git a/kernel/sched/pelt.h b/kernel/sched/pelt.h
+index 62c3fa543c0f..de93412fd3ad 100644
+--- a/kernel/sched/pelt.h
++++ b/kernel/sched/pelt.h
+@@ -5,14 +5,16 @@
+
+ #include "sched-pelt.h"
+
++#ifndef CONFIG_SCHED_ALT
+ int __update_load_avg_blocked_se(u64 now, struct sched_entity *se);
+ int __update_load_avg_se(u64 now, struct cfs_rq *cfs_rq, struct sched_entity *se);
+ int __update_load_avg_cfs_rq(u64 now, struct cfs_rq *cfs_rq);
+ int update_rt_rq_load_avg(u64 now, struct rq *rq, int running);
+ int update_dl_rq_load_avg(u64 now, struct rq *rq, int running);
+ bool update_other_load_avgs(struct rq *rq);
++#endif
+
+-#ifdef CONFIG_SCHED_HW_PRESSURE
++#if defined(CONFIG_SCHED_HW_PRESSURE) && !defined(CONFIG_SCHED_ALT)
+ int update_hw_load_avg(u64 now, struct rq *rq, u64 capacity);
+
+ static inline u64 hw_load_avg(struct rq *rq)
+@@ -49,6 +51,7 @@ static inline u32 get_pelt_divider(struct sched_avg *avg)
+ return PELT_MIN_DIVIDER + avg->period_contrib;
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ static inline void cfs_se_util_change(struct sched_avg *avg)
+ {
+ unsigned int enqueued;
+@@ -185,5 +188,6 @@ static inline u64 cfs_rq_clock_pelt(struct cfs_rq *cfs_rq)
+ return rq_clock_pelt(rq_of(cfs_rq));
+ }
+ #endif /* !CONFIG_CFS_BANDWIDTH */
++#endif /* CONFIG_SCHED_ALT */
+
+ #endif /* _KERNEL_SCHED_PELT_H */
+diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
+index cf2109b67f9a..a800b6b50264 100644
+--- a/kernel/sched/sched.h
++++ b/kernel/sched/sched.h
+@@ -5,6 +5,10 @@
+ #ifndef _KERNEL_SCHED_SCHED_H
+ #define _KERNEL_SCHED_SCHED_H
+
++#ifdef CONFIG_SCHED_ALT
++#include "alt_sched.h"
++#else
++
+ #include <linux/sched/affinity.h>
+ #include <linux/sched/autogroup.h>
+ #include <linux/sched/cpufreq.h>
+@@ -3900,4 +3904,9 @@ void sched_enq_and_set_task(struct sched_enq_and_set_ctx *ctx);
+
+ #include "ext.h"
+
++static inline int task_running_nice(struct task_struct *p)
++{
++ return (task_nice(p) > 0);
++}
++#endif /* !CONFIG_SCHED_ALT */
+ #endif /* _KERNEL_SCHED_SCHED_H */
+diff --git a/kernel/sched/stats.c b/kernel/sched/stats.c
+index d1c9429a4ac5..cc3764073dd3 100644
+--- a/kernel/sched/stats.c
++++ b/kernel/sched/stats.c
+@@ -115,8 +115,10 @@ static int show_schedstat(struct seq_file *seq, void *v)
+ seq_printf(seq, "timestamp %lu\n", jiffies);
+ } else {
+ struct rq *rq;
++#ifndef CONFIG_SCHED_ALT
+ struct sched_domain *sd;
+ int dcount = 0;
++#endif
+ cpu = (unsigned long)(v - 2);
+ rq = cpu_rq(cpu);
+
+@@ -131,6 +133,7 @@ static int show_schedstat(struct seq_file *seq, void *v)
+
+ seq_printf(seq, "\n");
+
++#ifndef CONFIG_SCHED_ALT
+ /* domain-specific stats */
+ rcu_read_lock();
+ for_each_domain(cpu, sd) {
+@@ -161,6 +164,7 @@ static int show_schedstat(struct seq_file *seq, void *v)
+ sd->ttwu_move_balance);
+ }
+ rcu_read_unlock();
++#endif
+ }
+ return 0;
+ }
+diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
+index 26f3fd4d34ce..a3e389997138 100644
+--- a/kernel/sched/stats.h
++++ b/kernel/sched/stats.h
+@@ -89,6 +89,7 @@ static inline void rq_sched_info_depart (struct rq *rq, unsigned long long delt
+
+ #endif /* CONFIG_SCHEDSTATS */
+
++#ifndef CONFIG_SCHED_ALT
+ #ifdef CONFIG_FAIR_GROUP_SCHED
+ struct sched_entity_stats {
+ struct sched_entity se;
+@@ -105,6 +106,7 @@ __schedstats_from_se(struct sched_entity *se)
+ #endif
+ return &task_of(se)->stats;
+ }
++#endif /* CONFIG_SCHED_ALT */
+
+ #ifdef CONFIG_PSI
+ void psi_task_change(struct task_struct *task, int clear, int set);
+diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
+index 77ae87f36e84..a2b06eba44e7 100644
+--- a/kernel/sched/syscalls.c
++++ b/kernel/sched/syscalls.c
+@@ -16,6 +16,14 @@
+ #include "sched.h"
+ #include "autogroup.h"
+
++#ifdef CONFIG_SCHED_ALT
++#include "alt_core.h"
++
++static inline int __normal_prio(int policy, int rt_prio, int static_prio)
++{
++ return rt_policy(policy) ? (MAX_RT_PRIO - 1 - rt_prio) : static_prio;
++}
++#else /* !CONFIG_SCHED_ALT */
+ static inline int __normal_prio(int policy, int rt_prio, int nice)
+ {
+ int prio;
+@@ -29,6 +37,7 @@ static inline int __normal_prio(int policy, int rt_prio, int nice)
+
+ return prio;
+ }
++#endif /* !CONFIG_SCHED_ALT */
+
+ /*
+ * Calculate the expected normal priority: i.e. priority
+@@ -39,7 +48,11 @@ static inline int __normal_prio(int policy, int rt_prio, int nice)
+ */
+ static inline int normal_prio(struct task_struct *p)
+ {
++#ifdef CONFIG_SCHED_ALT
++ return __normal_prio(p->policy, p->rt_priority, p->static_prio);
++#else /* !CONFIG_SCHED_ALT */
+ return __normal_prio(p->policy, p->rt_priority, PRIO_TO_NICE(p->static_prio));
++#endif /* !CONFIG_SCHED_ALT */
+ }
+
+ /*
+@@ -64,6 +77,37 @@ static int effective_prio(struct task_struct *p)
+
+ void set_user_nice(struct task_struct *p, long nice)
+ {
++#ifdef CONFIG_SCHED_ALT
++ unsigned long flags;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++
++ if (task_nice(p) == nice || nice < MIN_NICE || nice > MAX_NICE)
++ return;
++ /*
++ * We have to be careful, if called from sys_setpriority(),
++ * the task might be in the middle of scheduling on another CPU.
++ */
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++ rq = __task_access_lock(p, &lock);
++
++ p->static_prio = NICE_TO_PRIO(nice);
++ /*
++ * The RT priorities are set via sched_setscheduler(), but we still
++ * allow the 'normal' nice value to be set - but as expected
++ * it won't have any effect on scheduling until the task is
++ * not SCHED_NORMAL/SCHED_BATCH:
++ */
++ if (task_has_rt_policy(p))
++ goto out_unlock;
++
++ p->prio = effective_prio(p);
++
++ check_task_changed(p, rq);
++out_unlock:
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++#else
+ bool queued, running;
+ struct rq *rq;
+ int old_prio;
+@@ -112,6 +156,7 @@ void set_user_nice(struct task_struct *p, long nice)
+ * lowered its priority, then reschedule its CPU:
+ */
+ p->sched_class->prio_changed(rq, p, old_prio);
++#endif /* !CONFIG_SCHED_ALT */
+ }
+ EXPORT_SYMBOL(set_user_nice);
+
+@@ -190,7 +235,19 @@ SYSCALL_DEFINE1(nice, int, increment)
+ */
+ int task_prio(const struct task_struct *p)
+ {
++#ifdef CONFIG_SCHED_ALT
++/*
++ * sched policy              return value    kernel prio     user prio/nice
++ *
++ * (BMQ)normal, batch, idle  [0 ... 53]      [100 ... 139]   0/[-20 ... 19]/[-7 ... 7]
++ * (PDS)normal, batch, idle  [0 ... 39]      100             0/[-20 ... 19]
++ * fifo, rr                  [-1 ... -100]   [99 ... 0]      [0 ... 99]
++ */
++ return (p->prio < MAX_RT_PRIO) ? p->prio - MAX_RT_PRIO :
++ task_sched_prio_normal(p, task_rq(p));
++#else
+ return p->prio - MAX_RT_PRIO;
++#endif /* !CONFIG_SCHED_ALT */
+ }
+
+ /**
+@@ -297,11 +354,16 @@ static void __setscheduler_params(struct task_struct *p,
+
+ p->policy = policy;
+
++#ifndef CONFIG_SCHED_ALT
+ if (dl_policy(policy))
+ __setparam_dl(p, attr);
+ else if (fair_policy(policy))
+ __setparam_fair(p, attr);
++#else /* !CONFIG_SCHED_ALT */
++ p->static_prio = NICE_TO_PRIO(attr->sched_nice);
++#endif /* CONFIG_SCHED_ALT */
+
++#ifndef CONFIG_SCHED_ALT
+ /* rt-policy tasks do not have a timerslack */
+ if (rt_or_dl_task_policy(p)) {
+ p->timer_slack_ns = 0;
+@@ -309,6 +371,7 @@ static void __setscheduler_params(struct task_struct *p,
+ /* when switching back to non-rt policy, restore timerslack */
+ p->timer_slack_ns = p->default_timer_slack_ns;
+ }
++#endif /* !CONFIG_SCHED_ALT */
+
+ /*
+ * __sched_setscheduler() ensures attr->sched_priority == 0 when
+@@ -317,7 +380,9 @@ static void __setscheduler_params(struct task_struct *p,
+ */
+ p->rt_priority = attr->sched_priority;
+ p->normal_prio = normal_prio(p);
++#ifndef CONFIG_SCHED_ALT
+ set_load_weight(p, true);
++#endif /* !CONFIG_SCHED_ALT */
+ }
+
+ /*
+@@ -333,6 +398,8 @@ static bool check_same_owner(struct task_struct *p)
+ uid_eq(cred->euid, pcred->uid));
+ }
+
++#ifndef CONFIG_SCHED_ALT
++
+ #ifdef CONFIG_UCLAMP_TASK
+
+ static int uclamp_validate(struct task_struct *p,
+@@ -446,6 +513,7 @@ static inline int uclamp_validate(struct task_struct *p,
+ static void __setscheduler_uclamp(struct task_struct *p,
+ const struct sched_attr *attr) { }
+ #endif /* !CONFIG_UCLAMP_TASK */
++#endif /* !CONFIG_SCHED_ALT */
+
+ /*
+ * Allow unprivileged RT tasks to decrease priority.
+@@ -456,11 +524,13 @@ static int user_check_sched_setscheduler(struct task_struct *p,
+ const struct sched_attr *attr,
+ int policy, int reset_on_fork)
+ {
++#ifndef CONFIG_SCHED_ALT
+ if (fair_policy(policy)) {
+ if (attr->sched_nice < task_nice(p) &&
+ !is_nice_reduction(p, attr->sched_nice))
+ goto req_priv;
+ }
++#endif /* !CONFIG_SCHED_ALT */
+
+ if (rt_policy(policy)) {
+ unsigned long rlim_rtprio = task_rlimit(p, RLIMIT_RTPRIO);
+@@ -475,6 +545,7 @@ static int user_check_sched_setscheduler(struct task_struct *p,
+ goto req_priv;
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ /*
+ * Can't set/change SCHED_DEADLINE policy at all for now
+ * (safest behavior); in the future we would like to allow
+@@ -492,6 +563,7 @@ static int user_check_sched_setscheduler(struct task_struct *p,
+ if (!is_nice_reduction(p, task_nice(p)))
+ goto req_priv;
+ }
++#endif /* !CONFIG_SCHED_ALT */
+
+ /* Can't change other user's priorities: */
+ if (!check_same_owner(p))
+@@ -514,6 +586,158 @@ int __sched_setscheduler(struct task_struct *p,
+ const struct sched_attr *attr,
+ bool user, bool pi)
+ {
++#ifdef CONFIG_SCHED_ALT
++ const struct sched_attr dl_squash_attr = {
++ .size = sizeof(struct sched_attr),
++ .sched_policy = SCHED_FIFO,
++ .sched_nice = 0,
++ .sched_priority = 99,
++ };
++ int oldpolicy = -1, policy = attr->sched_policy;
++ int retval, newprio;
++ struct balance_callback *head;
++ unsigned long flags;
++ struct rq *rq;
++ int reset_on_fork;
++ raw_spinlock_t *lock;
++
++ /* The pi code expects interrupts enabled */
++ BUG_ON(pi && in_interrupt());
++
++ /*
++ * Alt schedule FW supports SCHED_DEADLINE by squashing it as prio 0 SCHED_FIFO
++ */
++ if (unlikely(SCHED_DEADLINE == policy)) {
++ attr = &dl_squash_attr;
++ policy = attr->sched_policy;
++ }
++recheck:
++ /* Double check policy once rq lock held */
++ if (policy < 0) {
++ reset_on_fork = p->sched_reset_on_fork;
++ policy = oldpolicy = p->policy;
++ } else {
++ reset_on_fork = !!(attr->sched_flags & SCHED_RESET_ON_FORK);
++
++ if (policy > SCHED_IDLE)
++ return -EINVAL;
++ }
++
++ if (attr->sched_flags & ~(SCHED_FLAG_ALL))
++ return -EINVAL;
++
++ /*
++ * Valid priorities for SCHED_FIFO and SCHED_RR are
++ * 1..MAX_RT_PRIO-1, valid priority for SCHED_NORMAL and
++ * SCHED_BATCH and SCHED_IDLE is 0.
++ */
++ if (attr->sched_priority < 0 ||
++ (p->mm && attr->sched_priority > MAX_RT_PRIO - 1) ||
++ (!p->mm && attr->sched_priority > MAX_RT_PRIO - 1))
++ return -EINVAL;
++ if ((SCHED_RR == policy || SCHED_FIFO == policy) !=
++ (attr->sched_priority != 0))
++ return -EINVAL;
++
++ if (user) {
++ retval = user_check_sched_setscheduler(p, attr, policy, reset_on_fork);
++ if (retval)
++ return retval;
++
++ retval = security_task_setscheduler(p);
++ if (retval)
++ return retval;
++ }
++
++ /*
++ * Make sure no PI-waiters arrive (or leave) while we are
++ * changing the priority of the task:
++ */
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++
++ /*
++ * To be able to change p->policy safely, task_access_lock()
++ * must be called.
++ * If task_access_lock() is used here:
++ * For a task p which is not running, reading rq->stop is
++ * racy but acceptable as ->stop doesn't change much.
++ * An enhancement can be made to read rq->stop safely.
++ */
++ rq = __task_access_lock(p, &lock);
++
++ /*
++ * Changing the policy of the stop threads is a very bad idea
++ */
++ if (p == rq->stop) {
++ retval = -EINVAL;
++ goto unlock;
++ }
++
++ /*
++ * If not changing anything there's no need to proceed further:
++ */
++ if (unlikely(policy == p->policy)) {
++ if (rt_policy(policy) && attr->sched_priority != p->rt_priority)
++ goto change;
++ if (!rt_policy(policy) &&
++ NICE_TO_PRIO(attr->sched_nice) != p->static_prio)
++ goto change;
++
++ p->sched_reset_on_fork = reset_on_fork;
++ retval = 0;
++ goto unlock;
++ }
++change:
++
++ /* Re-check policy now with rq lock held */
++ if (unlikely(oldpolicy != -1 && oldpolicy != p->policy)) {
++ policy = oldpolicy = -1;
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++ goto recheck;
++ }
++
++ p->sched_reset_on_fork = reset_on_fork;
++
++ newprio = __normal_prio(policy, attr->sched_priority, NICE_TO_PRIO(attr->sched_nice));
++ if (pi) {
++ /*
++ * Take priority boosted tasks into account. If the new
++ * effective priority is unchanged, we just store the new
++ * normal parameters and do not touch the scheduler class and
++ * the runqueue. This will be done when the task deboost
++ * itself.
++ */
++ newprio = rt_effective_prio(p, newprio);
++ }
++
++ if (!(attr->sched_flags & SCHED_FLAG_KEEP_PARAMS)) {
++ __setscheduler_params(p, attr);
++ __setscheduler_prio(p, newprio);
++ }
++
++ check_task_changed(p, rq);
++
++ /* Avoid rq from going away on us: */
++ preempt_disable();
++ head = splice_balance_callbacks(rq);
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++
++ if (pi)
++ rt_mutex_adjust_pi(p);
++
++ /* Run balance callbacks after we've adjusted the PI chain: */
++ balance_callbacks(rq, head);
++ preempt_enable();
++
++ return 0;
++
++unlock:
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++ return retval;
++#else /* !CONFIG_SCHED_ALT */
+ int oldpolicy = -1, policy = attr->sched_policy;
+ int retval, oldprio, newprio, queued, running;
+ const struct sched_class *prev_class, *next_class;
+@@ -750,6 +974,7 @@ int __sched_setscheduler(struct task_struct *p,
+ if (cpuset_locked)
+ cpuset_unlock();
+ return retval;
++#endif /* !CONFIG_SCHED_ALT */
+ }
+
+ static int _sched_setscheduler(struct task_struct *p, int policy,
+@@ -761,8 +986,10 @@ static int _sched_setscheduler(struct task_struct *p, int policy,
+ .sched_nice = PRIO_TO_NICE(p->static_prio),
+ };
+
++#ifndef CONFIG_SCHED_ALT
+ if (p->se.custom_slice)
+ attr.sched_runtime = p->se.slice;
++#endif /* !CONFIG_SCHED_ALT */
+
+ /* Fixup the legacy SCHED_RESET_ON_FORK hack. */
+ if ((policy != SETPARAM_POLICY) && (policy & SCHED_RESET_ON_FORK)) {
+@@ -930,13 +1157,18 @@ static int sched_copy_attr(struct sched_attr __user *uattr, struct sched_attr *a
+
+ static void get_params(struct task_struct *p, struct sched_attr *attr)
+ {
+- if (task_has_dl_policy(p)) {
++#ifndef CONFIG_SCHED_ALT
++ if (task_has_dl_policy(p))
+ __getparam_dl(p, attr);
+- } else if (task_has_rt_policy(p)) {
++ else
++#endif
++ if (task_has_rt_policy(p)) {
+ attr->sched_priority = p->rt_priority;
+ } else {
+ attr->sched_nice = task_nice(p);
++#ifndef CONFIG_SCHED_ALT
+ attr->sched_runtime = p->se.slice;
++#endif
+ }
+ }
+
+@@ -1117,6 +1349,7 @@ SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
+
+ int dl_task_check_affinity(struct task_struct *p, const struct cpumask *mask)
+ {
++#ifndef CONFIG_SCHED_ALT
+ /*
+ * If the task isn't a deadline task or admission control is
+ * disabled then we don't care about affinity changes.
+@@ -1140,6 +1373,7 @@ int dl_task_check_affinity(struct task_struct *p, const struct cpumask *mask)
+ guard(rcu)();
+ if (!cpumask_subset(task_rq(p)->rd->span, mask))
+ return -EBUSY;
++#endif
+
+ return 0;
+ }
+@@ -1163,9 +1397,11 @@ int __sched_setaffinity(struct task_struct *p, struct affinity_context *ctx)
+ ctx->new_mask = new_mask;
+ ctx->flags |= SCA_CHECK;
+
++#ifndef CONFIG_SCHED_ALT
+ retval = dl_task_check_affinity(p, new_mask);
+ if (retval)
+ goto out_free_new_mask;
++#endif
+
+ retval = __set_cpus_allowed_ptr(p, ctx);
+ if (retval)
+@@ -1345,13 +1581,34 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
+
+ static void do_sched_yield(void)
+ {
+- struct rq_flags rf;
+ struct rq *rq;
++ struct rq_flags rf;
++
++#ifdef CONFIG_SCHED_ALT
++ struct task_struct *p;
++
++ if (!sched_yield_type)
++ return;
+
+ rq = this_rq_lock_irq(&rf);
+
++ schedstat_inc(rq->yld_count);
++
++ p = current;
++ if (rt_task(p)) {
++ if (task_on_rq_queued(p))
++ requeue_task(p, rq);
++ } else if (rq->nr_running > 1) {
++ do_sched_yield_type_1(p, rq);
++ if (task_on_rq_queued(p))
++ requeue_task(p, rq);
++ }
++#else /* !CONFIG_SCHED_ALT */
++ rq = this_rq_lock_irq(&rf);
++
+ schedstat_inc(rq->yld_count);
+ current->sched_class->yield_task(rq);
++#endif /* !CONFIG_SCHED_ALT */
+
+ preempt_disable();
+ rq_unlock_irq(rq, &rf);
+@@ -1420,6 +1677,9 @@ EXPORT_SYMBOL(yield);
+ */
+ int __sched yield_to(struct task_struct *p, bool preempt)
+ {
++#ifdef CONFIG_SCHED_ALT
++ return 0;
++#else /* !CONFIG_SCHED_ALT */
+ struct task_struct *curr = current;
+ struct rq *rq, *p_rq;
+ int yielded = 0;
+@@ -1465,6 +1725,7 @@ int __sched yield_to(struct task_struct *p, bool preempt)
+ schedule();
+
+ return yielded;
++#endif /* !CONFIG_SCHED_ALT */
+ }
+ EXPORT_SYMBOL_GPL(yield_to);
+
+@@ -1485,7 +1746,9 @@ SYSCALL_DEFINE1(sched_get_priority_max, int, policy)
+ case SCHED_RR:
+ ret = MAX_RT_PRIO-1;
+ break;
++#ifndef CONFIG_SCHED_ALT
+ case SCHED_DEADLINE:
++#endif
+ case SCHED_NORMAL:
+ case SCHED_BATCH:
+ case SCHED_IDLE:
+@@ -1513,7 +1776,9 @@ SYSCALL_DEFINE1(sched_get_priority_min, int, policy)
+ case SCHED_RR:
+ ret = 1;
+ break;
++#ifndef CONFIG_SCHED_ALT
+ case SCHED_DEADLINE:
++#endif
+ case SCHED_NORMAL:
+ case SCHED_BATCH:
+ case SCHED_IDLE:
+@@ -1525,7 +1790,9 @@ SYSCALL_DEFINE1(sched_get_priority_min, int, policy)
+
+ static int sched_rr_get_interval(pid_t pid, struct timespec64 *t)
+ {
++#ifndef CONFIG_SCHED_ALT
+ unsigned int time_slice = 0;
++#endif
+ int retval;
+
+ if (pid < 0)
+@@ -1540,6 +1807,7 @@ static int sched_rr_get_interval(pid_t pid, struct timespec64 *t)
+ if (retval)
+ return retval;
+
++#ifndef CONFIG_SCHED_ALT
+ scoped_guard (task_rq_lock, p) {
+ struct rq *rq = scope.rq;
+ if (p->sched_class->get_rr_interval)
+@@ -1548,6 +1816,13 @@ static int sched_rr_get_interval(pid_t pid, struct timespec64 *t)
+ }
+
+ jiffies_to_timespec64(time_slice, t);
++#else
++ }
++
++ alt_sched_debug();
++
++ *t = ns_to_timespec64(sysctl_sched_base_slice);
++#endif /* !CONFIG_SCHED_ALT */
+ return 0;
+ }
+
+diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
+index 6e2f54169e66..5a5031761477 100644
+--- a/kernel/sched/topology.c
++++ b/kernel/sched/topology.c
+@@ -3,6 +3,7 @@
+ * Scheduler topology setup/handling methods
+ */
+
++#ifndef CONFIG_SCHED_ALT
+ #include <linux/sched/isolation.h>
+ #include <linux/bsearch.h>
+ #include "sched.h"
+@@ -1497,8 +1498,10 @@ static void asym_cpu_capacity_scan(void)
+ */
+
+ static int default_relax_domain_level = -1;
++#endif /* CONFIG_SCHED_ALT */
+ int sched_domain_level_max;
+
++#ifndef CONFIG_SCHED_ALT
+ static int __init setup_relax_domain_level(char *str)
+ {
+ if (kstrtoint(str, 0, &default_relax_domain_level))
+@@ -1731,6 +1734,7 @@ sd_init(struct sched_domain_topology_level *tl,
+
+ return sd;
+ }
++#endif /* CONFIG_SCHED_ALT */
+
+ /*
+ * Topology list, bottom-up.
+@@ -1767,6 +1771,7 @@ void __init set_sched_topology(struct sched_domain_topology_level *tl)
+ sched_domain_topology_saved = NULL;
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ #ifdef CONFIG_NUMA
+
+ static const struct cpumask *sd_numa_mask(int cpu)
+@@ -2833,3 +2838,31 @@ void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
+ partition_sched_domains_locked(ndoms_new, doms_new, dattr_new);
+ sched_domains_mutex_unlock();
+ }
++#else /* CONFIG_SCHED_ALT */
++DEFINE_STATIC_KEY_FALSE(sched_asym_cpucapacity);
++
++void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
++ struct sched_domain_attr *dattr_new)
++{}
++
++#ifdef CONFIG_NUMA
++int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
++{
++ return best_mask_cpu(cpu, cpus);
++}
++
++int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
++{
++ return cpumask_nth(cpu, cpus);
++}
++
++const struct cpumask *sched_numa_hop_mask(unsigned int node, unsigned int hops)
++{
++ return ERR_PTR(-EOPNOTSUPP);
++}
++EXPORT_SYMBOL_GPL(sched_numa_hop_mask);
++#endif /* CONFIG_NUMA */
++
++void sched_update_asym_prefer_cpu(int cpu, int old_prio, int new_prio)
++{}
++#endif
+diff --git a/kernel/sysctl.c b/kernel/sysctl.c
+index cb6196e3fa99..d0446e53fd64 100644
+--- a/kernel/sysctl.c
++++ b/kernel/sysctl.c
+@@ -36,6 +36,10 @@ EXPORT_SYMBOL_GPL(sysctl_long_vals);
+ static const int ngroups_max = NGROUPS_MAX;
+ static const int cap_last_cap = CAP_LAST_CAP;
+
++#ifdef CONFIG_SCHED_ALT
++extern int sched_yield_type;
++#endif
++
+ #ifdef CONFIG_PROC_SYSCTL
+
+ /**
+@@ -1489,6 +1493,17 @@ static const struct ctl_table sysctl_subsys_table[] = {
+ .proc_handler = proc_dointvec,
+ },
+ #endif
++#ifdef CONFIG_SCHED_ALT
++ {
++ .procname = "yield_type",
++ .data = &sched_yield_type,
++ .maxlen = sizeof (int),
++ .mode = 0644,
++ .proc_handler = &proc_dointvec_minmax,
++ .extra1 = SYSCTL_ZERO,
++ .extra2 = SYSCTL_TWO,
++ },
++#endif
+ #ifdef CONFIG_SYSCTL_ARCH_UNALIGN_NO_WARN
+ {
+ .procname = "ignore-unaligned-usertrap",
+diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
+index 2e5b89d7d866..38c4526f5bc7 100644
+--- a/kernel/time/posix-cpu-timers.c
++++ b/kernel/time/posix-cpu-timers.c
+@@ -223,7 +223,7 @@ static void task_sample_cputime(struct task_struct *p, u64 *samples)
+ u64 stime, utime;
+
+ task_cputime(p, &utime, &stime);
+- store_samples(samples, stime, utime, p->se.sum_exec_runtime);
++ store_samples(samples, stime, utime, tsk_seruntime(p));
+ }
+
+ static void proc_sample_cputime_atomic(struct task_cputime_atomic *at,
+@@ -835,6 +835,7 @@ static void collect_posix_cputimers(struct posix_cputimers *pct, u64 *samples,
+ }
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ static inline void check_dl_overrun(struct task_struct *tsk)
+ {
+ if (tsk->dl.dl_overrun) {
+@@ -842,6 +843,7 @@ static inline void check_dl_overrun(struct task_struct *tsk)
+ send_signal_locked(SIGXCPU, SEND_SIG_PRIV, tsk, PIDTYPE_TGID);
+ }
+ }
++#endif
+
+ static bool check_rlimit(u64 time, u64 limit, int signo, bool rt, bool hard)
+ {
+@@ -869,8 +871,10 @@ static void check_thread_timers(struct task_struct *tsk,
+ u64 samples[CPUCLOCK_MAX];
+ unsigned long soft;
+
++#ifndef CONFIG_SCHED_ALT
+ if (dl_task(tsk))
+ check_dl_overrun(tsk);
++#endif
+
+ if (expiry_cache_is_inactive(pct))
+ return;
+@@ -884,7 +888,7 @@ static void check_thread_timers(struct task_struct *tsk,
+ soft = task_rlimit(tsk, RLIMIT_RTTIME);
+ if (soft != RLIM_INFINITY) {
+ /* Task RT timeout is accounted in jiffies. RTTIME is usec */
+- unsigned long rttime = tsk->rt.timeout * (USEC_PER_SEC / HZ);
++ unsigned long rttime = tsk_rttimeout(tsk) * (USEC_PER_SEC / HZ);
+ unsigned long hard = task_rlimit_max(tsk, RLIMIT_RTTIME);
+
+ /* At the hard limit, send SIGKILL. No further action. */
+@@ -1120,8 +1124,10 @@ static inline bool fastpath_timer_check(struct task_struct *tsk)
+ return true;
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ if (dl_task(tsk) && tsk->dl.dl_overrun)
+ return true;
++#endif
+
+ return false;
+ }
+diff --git a/kernel/time/timer.c b/kernel/time/timer.c
+index 553fa469d7cc..d7c90f6ff009 100644
+--- a/kernel/time/timer.c
++++ b/kernel/time/timer.c
+@@ -2453,7 +2453,11 @@ static void run_local_timers(void)
+ */
+ if (time_after_eq(jiffies, READ_ONCE(base->next_expiry)) ||
+ (i == BASE_DEF && tmigr_requires_handle_remote())) {
++#ifdef CONFIG_SCHED_BMQ
++ __raise_softirq_irqoff(TIMER_SOFTIRQ);
++#else
+ raise_timer_softirq(TIMER_SOFTIRQ);
++#endif
+ return;
+ }
+ }
+diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
+index dc734867f0fc..9ce22f6282eb 100644
+--- a/kernel/trace/trace_osnoise.c
++++ b/kernel/trace/trace_osnoise.c
+@@ -1645,6 +1645,9 @@ static void osnoise_sleep(bool skip_period)
+ */
+ static inline int osnoise_migration_pending(void)
+ {
++#ifdef CONFIG_SCHED_ALT
++ return 0;
++#else
+ if (!current->migration_pending)
+ return 0;
+
+@@ -1666,6 +1669,7 @@ static inline int osnoise_migration_pending(void)
+ mutex_unlock(&interface_lock);
+
+ return 1;
++#endif
+ }
+
+ /*
+diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
+index d88c44f1dfa5..4af3cbbdcccb 100644
+--- a/kernel/trace/trace_selftest.c
++++ b/kernel/trace/trace_selftest.c
+@@ -1423,10 +1423,15 @@ static int trace_wakeup_test_thread(void *data)
+ {
+ /* Make this a -deadline thread */
+ static const struct sched_attr attr = {
++#ifdef CONFIG_SCHED_ALT
++ /* No deadline on BMQ/PDS, use RR */
++ .sched_policy = SCHED_RR,
++#else
+ .sched_policy = SCHED_DEADLINE,
+ .sched_runtime = 100000ULL,
+ .sched_deadline = 10000000ULL,
+ .sched_period = 10000000ULL
++#endif
+ };
+ struct wakeup_test_data *x = data;
+
+diff --git a/kernel/workqueue.c b/kernel/workqueue.c
+index c6b79b3675c3..2872234d8620 100644
+--- a/kernel/workqueue.c
++++ b/kernel/workqueue.c
+@@ -1251,6 +1251,7 @@ static bool kick_pool(struct worker_pool *pool)
+
+ p = worker->task;
+
++#ifndef CONFIG_SCHED_ALT
+ #ifdef CONFIG_SMP
+ /*
+ * Idle @worker is about to execute @work and waking up provides an
+@@ -1280,6 +1281,8 @@ static bool kick_pool(struct worker_pool *pool)
+ }
+ }
+ #endif
++#endif /* !CONFIG_SCHED_ALT */
++
+ wake_up_process(p);
+ return true;
+ }
+@@ -1408,7 +1411,11 @@ void wq_worker_running(struct task_struct *task)
+ * CPU intensive auto-detection cares about how long a work item hogged
+ * CPU without sleeping. Reset the starting timestamp on wakeup.
+ */
++#ifdef CONFIG_SCHED_ALT
++ worker->current_at = worker->task->sched_time;
++#else
+ worker->current_at = worker->task->se.sum_exec_runtime;
++#endif
+
+ WRITE_ONCE(worker->sleeping, 0);
+ }
+@@ -1493,7 +1500,11 @@ void wq_worker_tick(struct task_struct *task)
+ * We probably want to make this prettier in the future.
+ */
+ if ((worker->flags & WORKER_NOT_RUNNING) || READ_ONCE(worker->sleeping) ||
++#ifdef CONFIG_SCHED_ALT
++ worker->task->sched_time - worker->current_at <
++#else
+ worker->task->se.sum_exec_runtime - worker->current_at <
++#endif
+ wq_cpu_intensive_thresh_us * NSEC_PER_USEC)
+ return;
+
+@@ -3164,7 +3175,11 @@ __acquires(&pool->lock)
+ worker->current_func = work->func;
+ worker->current_pwq = pwq;
+ if (worker->task)
++#ifdef CONFIG_SCHED_ALT
++ worker->current_at = worker->task->sched_time;
++#else
+ worker->current_at = worker->task->se.sum_exec_runtime;
++#endif
+ work_data = *work_data_bits(work);
+ worker->current_color = get_work_color(work_data);
+
diff --git a/5021_BMQ-and-PDS-gentoo-defaults.patch b/5021_BMQ-and-PDS-gentoo-defaults.patch
new file mode 100644
index 00000000..6dc48eec
--- /dev/null
+++ b/5021_BMQ-and-PDS-gentoo-defaults.patch
@@ -0,0 +1,13 @@
+--- a/init/Kconfig 2023-02-13 08:16:09.534315265 -0500
++++ b/init/Kconfig 2023-02-13 08:17:24.130237204 -0500
+@@ -867,8 +867,9 @@ config UCLAMP_BUCKETS_COUNT
+ If in doubt, use the default value.
+
+ menuconfig SCHED_ALT
++ depends on X86_64
+ bool "Alternative CPU Schedulers"
+- default y
++ default n
+ help
+ This feature enable alternative CPU scheduler"
+
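With the default flipped to N above, nothing changes unless the user opts in. As a rough
illustration (symbol names as used in the 5020 patch; the exact Kconfig choice layout is
not shown here), a BMQ build would carry something like

        CONFIG_SCHED_ALT=y
        CONFIG_SCHED_BMQ=y

in its .config, and the sched_yield() behaviour wired up in kernel/sched/syscalls.c and
kernel/sysctl.c above can then be tuned at runtime, e.g.

        sysctl -w kernel.yield_type=0

which, per the do_sched_yield() hunk, makes sched_yield() return without requeueing
anything; the sysctl accepts values 0-2, matching the SYSCTL_ZERO/SYSCTL_TWO bounds
registered in kernel/sysctl.c.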
* [gentoo-commits] proj/linux-patches:6.17 commit in: /
@ 2025-09-29 12:16 Arisu Tachibana
0 siblings, 0 replies; 20+ messages in thread
From: Arisu Tachibana @ 2025-09-29 12:16 UTC (permalink / raw
To: gentoo-commits
commit: 850d16b9da0ddd1d3aa834fcba00d1dcfedc84cb
Author: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
AuthorDate: Mon Sep 29 12:12:23 2025 +0000
Commit: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
CommitDate: Mon Sep 29 12:12:23 2025 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=850d16b9
Add libbpf patch to suppress adding '-Werror' if WERROR=0
Signed-off-by: Arisu Tachibana <alicef <AT> gentoo.org>
0000_README | 4 ++++
2991_libbpf_add_WERROR_option.patch | 47 +++++++++++++++++++++++++++++++++++++
2 files changed, 51 insertions(+)
diff --git a/0000_README b/0000_README
index 7dc67ad1..1e434c3a 100644
--- a/0000_README
+++ b/0000_README
@@ -70,6 +70,10 @@ Patch: 2990_libbpf-v2-workaround-Wmaybe-uninitialized-false-pos.patch
From: https://lore.kernel.org/bpf/
Desc: libbpf: workaround -Wmaybe-uninitialized false positive
+Patch: 2991_libbpf_add_WERROR_option.patch
+From: https://lore.kernel.org/bpf/
+Desc: libbpf: suppress adding '-Werror' if WERROR=0
+
Patch: 3000_Support-printing-firmware-info.patch
From: https://bugs.gentoo.org/732852
Desc: Print firmware info (Reqs CONFIG_GENTOO_PRINT_FIRMWARE_INFO). Thanks to Georgy Yakovlev
diff --git a/2991_libbpf_add_WERROR_option.patch b/2991_libbpf_add_WERROR_option.patch
new file mode 100644
index 00000000..e8649909
--- /dev/null
+++ b/2991_libbpf_add_WERROR_option.patch
@@ -0,0 +1,47 @@
+Subject: [PATCH] tools/libbpf: add WERROR option
+Date: Sat, 5 Jul 2025 11:43:12 +0100
+Message-ID: <7e6c41e47c6a8ab73945e6aac319e0dd53337e1b.1751712192.git.sam@gentoo.org>
+X-Mailer: git-send-email 2.50.0
+Precedence: bulk
+X-Mailing-List: bpf@vger.kernel.org
+List-Id: <bpf.vger.kernel.org>
+List-Subscribe: <mailto:bpf+subscribe@vger.kernel.org>
+List-Unsubscribe: <mailto:bpf+unsubscribe@vger.kernel.org>
+MIME-Version: 1.0
+Content-Transfer-Encoding: 8bit
+
+Check the 'WERROR' variable and suppress adding '-Werror' if WERROR=0.
+
+This mirrors what tools/perf and other directories in tools do to handle
+-Werror rather than adding it unconditionally.
+
+Signed-off-by: Sam James <sam@gentoo.org>
+---
+ tools/lib/bpf/Makefile | 7 ++++++-
+ 1 file changed, 6 insertions(+), 1 deletion(-)
+
+diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile
+index 168140f8e646..9563d37265da 100644
+--- a/tools/lib/bpf/Makefile
++++ b/tools/lib/bpf/Makefile
+@@ -77,10 +77,15 @@ else
+ CFLAGS := -g -O2
+ endif
+
++# Treat warnings as errors unless directed not to
++ifneq ($(WERROR),0)
++ CFLAGS += -Werror
++endif
++
+ # Append required CFLAGS
+ override CFLAGS += -std=gnu89
+ override CFLAGS += $(EXTRA_WARNINGS) -Wno-switch-enum
+-override CFLAGS += -Werror -Wall
++override CFLAGS += -Wall
+ override CFLAGS += $(INCLUDES)
+ override CFLAGS += -fvisibility=hidden
+ override CFLAGS += -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64
+--
+2.50.0
+
+
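The intended invocation mirrors what tools/perf already does; as a hedged example, a build
that must not fail on new compiler warnings would run something like

        make -C tools/lib/bpf WERROR=0

which keeps -Wall in place but drops -Werror, while a plain make (or WERROR=1) behaves
exactly as before.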
* [gentoo-commits] proj/linux-patches:6.17 commit in: /
@ 2025-10-01 6:43 Arisu Tachibana
0 siblings, 0 replies; 20+ messages in thread
From: Arisu Tachibana @ 2025-10-01 6:43 UTC (permalink / raw
To: gentoo-commits
commit: e7fe1516618bcc20ae9887a3325d9a77f59b39c6
Author: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
AuthorDate: Wed Oct 1 06:41:57 2025 +0000
Commit: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
CommitDate: Wed Oct 1 06:41:57 2025 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=e7fe1516
Update CONFIG_GCC_PLUGIN_STACKLEAK to CONFIG_KSTACK_ERASE
Ref: https://lore.kernel.org/all/20250717232519.2984886-1-kees <AT> kernel.org/
bug: #963589
Signed-off-by: Arisu Tachibana <alicef <AT> gentoo.org>
4567_distro-Gentoo-Kconfig.patch | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/4567_distro-Gentoo-Kconfig.patch b/4567_distro-Gentoo-Kconfig.patch
index 298dc6ec..5bcc8590 100644
--- a/4567_distro-Gentoo-Kconfig.patch
+++ b/4567_distro-Gentoo-Kconfig.patch
@@ -247,7 +247,7 @@
+ depends on !X86_MSR && X86_64 && GENTOO_KERNEL_SELF_PROTECTION
+ default n
+
-+ select GCC_PLUGIN_STACKLEAK
++ select KSTACK_ERASE
+ select X86_KERNEL_IBT if CC_HAS_IBT=y && HAVE_OBJTOOL=y && (!LD_IS_LLD=n || LLD_VERSION>=140000)
+ select LEGACY_VSYSCALL_NONE
+ select PAGE_TABLE_ISOLATION
@@ -273,7 +273,7 @@
+ select ARM64_BTI_KERNEL if ( ARM64_BTI=y ) && ( ARM64_PTR_AUTH_KERNEL=y ) && ( CC_HAS_BRANCH_PROT_PAC_RET_BTI=y ) && (CC_IS_GCC=n || GCC_VERSION >= 100100 ) && (CC_IS_GCC=n ) && ((FUNCTION_GRAPH_TRACE=n || DYNAMIC_FTRACE_WITH_ARG=y ))
+ select ARM64_SW_TTBR0_PAN
+ select CONFIG_UNMAP_KERNEL_AT_EL0
-+ select GCC_PLUGIN_STACKLEAK
++ select KSTACK_ERASE
+ select KASAN_HW_TAGS if HAVE_ARCH_KASAN_HW_TAGS=y
+ select RANDOMIZE_BASE
+ select RELOCATABLE
* [gentoo-commits] proj/linux-patches:6.17 commit in: /
@ 2025-10-01 18:08 Arisu Tachibana
0 siblings, 0 replies; 20+ messages in thread
From: Arisu Tachibana @ 2025-10-01 18:08 UTC (permalink / raw
To: gentoo-commits
commit: 8cdd683e38e0608446d7389b4b5943cf51827e2b
Author: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
AuthorDate: Wed Oct 1 06:41:57 2025 +0000
Commit: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
CommitDate: Wed Oct 1 18:08:30 2025 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=8cdd683e
Update CONFIG_GCC_PLUGIN_STACKLEAK to CONFIG_KSTACK_ERASE
Ref: https://lore.kernel.org/all/20250717232519.2984886-1-kees <AT> kernel.org/
bug: #963589
Signed-off-by: Arisu Tachibana <alicef <AT> gentoo.org>
4567_distro-Gentoo-Kconfig.patch | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/4567_distro-Gentoo-Kconfig.patch b/4567_distro-Gentoo-Kconfig.patch
index 298dc6ec..c34629a6 100644
--- a/4567_distro-Gentoo-Kconfig.patch
+++ b/4567_distro-Gentoo-Kconfig.patch
@@ -247,7 +247,7 @@
+ depends on !X86_MSR && X86_64 && GENTOO_KERNEL_SELF_PROTECTION
+ default n
+
-+ select GCC_PLUGIN_STACKLEAK
++ select KSTACK_ERASE if HAVE_ARCH_KSTACK_ERASE
+ select X86_KERNEL_IBT if CC_HAS_IBT=y && HAVE_OBJTOOL=y && (!LD_IS_LLD=n || LLD_VERSION>=140000)
+ select LEGACY_VSYSCALL_NONE
+ select PAGE_TABLE_ISOLATION
@@ -273,7 +273,7 @@
+ select ARM64_BTI_KERNEL if ( ARM64_BTI=y ) && ( ARM64_PTR_AUTH_KERNEL=y ) && ( CC_HAS_BRANCH_PROT_PAC_RET_BTI=y ) && (CC_IS_GCC=n || GCC_VERSION >= 100100 ) && (CC_IS_GCC=n ) && ((FUNCTION_GRAPH_TRACE=n || DYNAMIC_FTRACE_WITH_ARG=y ))
+ select ARM64_SW_TTBR0_PAN
+ select CONFIG_UNMAP_KERNEL_AT_EL0
-+ select GCC_PLUGIN_STACKLEAK
++ select KSTACK_ERASE if HAVE_ARCH_KSTACK_ERASE
+ select KASAN_HW_TAGS if HAVE_ARCH_KASAN_HW_TAGS=y
+ select RANDOMIZE_BASE
+ select RELOCATABLE
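Since both selects are now guarded by HAVE_ARCH_KSTACK_ERASE, a quick way to confirm the
rename took effect in a configured 6.17 tree is (command illustrative)

        grep -E 'KSTACK_ERASE|GCC_PLUGIN_STACKLEAK' .config

where only the KSTACK_ERASE symbols are expected once the series referenced above has
replaced the old GCC_PLUGIN_STACKLEAK option.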
* [gentoo-commits] proj/linux-patches:6.17 commit in: /
@ 2025-10-02 3:06 Arisu Tachibana
0 siblings, 0 replies; 20+ messages in thread
From: Arisu Tachibana @ 2025-10-02 3:06 UTC (permalink / raw
To: gentoo-commits
commit: 67d91772faccd0adbd05b92f00e4172de2bf767f
Author: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
AuthorDate: Thu Oct 2 03:04:31 2025 +0000
Commit: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
CommitDate: Thu Oct 2 03:04:31 2025 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=67d91772
Add patch 2101 blk-mq: fix blk_mq_tags double free while nr_requests grown
Ref: https://lore.kernel.org/all/CAFj5m9K+ct=ioJUz8v78Wr_myC7pjVnB1SAKRXc-CLysHV_5ww <AT> mail.gmail.com/
Signed-off-by: Arisu Tachibana <alicef <AT> gentoo.org>
0000_README | 4 ++
..._tags_double_free_while_nr_requests_grown.patch | 47 ++++++++++++++++++++++
2 files changed, 51 insertions(+)
diff --git a/0000_README b/0000_README
index 1e434c3a..9e8cb466 100644
--- a/0000_README
+++ b/0000_README
@@ -58,6 +58,10 @@ Patch: 2000_BT-Check-key-sizes-only-if-Secure-Simple-Pairing-enabled.patch
From: https://lore.kernel.org/linux-bluetooth/20190522070540.48895-1-marcel@holtmann.org/raw
Desc: Bluetooth: Check key sizes only when Secure Simple Pairing is enabled. See bug #686758
+Patch: 2101_blk-mq_fix_blk_mq_tags_double_free_while_nr_requests_grown.patch
+From: https://lore.kernel.org/all/CAFj5m9K+ct=ioJUz8v78Wr_myC7pjVnB1SAKRXc-CLysHV_5ww@mail.gmail.com/
+Desc: blk-mq: fix blk_mq_tags double free while nr_requests grown
+
Patch: 2901_permit-menuconfig-sorting.patch
From: https://lore.kernel.org/
Desc: menuconfig: Allow sorting the entries alphabetically
diff --git a/2101_blk-mq_fix_blk_mq_tags_double_free_while_nr_requests_grown.patch b/2101_blk-mq_fix_blk_mq_tags_double_free_while_nr_requests_grown.patch
new file mode 100644
index 00000000..e47b4b2a
--- /dev/null
+++ b/2101_blk-mq_fix_blk_mq_tags_double_free_while_nr_requests_grown.patch
@@ -0,0 +1,47 @@
+From ba28afbd9eff2a6370f23ef4e6a036ab0cfda409 Mon Sep 17 00:00:00 2001
+From: Yu Kuai <yukuai3@huawei.com>
+Date: Thu, 21 Aug 2025 14:06:12 +0800
+Subject: blk-mq: fix blk_mq_tags double free while nr_requests grown
+
+In the case where the user triggers tag growth via the queue sysfs
+attribute nr_requests, hctx->sched_tags will be freed directly and
+replaced with newly allocated tags, see blk_mq_tag_update_depth().
+
+The problem is that hctx->sched_tags comes from elevator->et->tags, while
+et->tags still points to the freed tags, hence a later elevator exit will
+try to free the tags again, causing a kernel panic.
+
+Fix this problem by replacing et->tags with new allocated tags as well.
+
+Note that there are still some long-term problems that will require some
+refactoring to be fixed thoroughly[1].
+
+[1] https://lore.kernel.org/all/20250815080216.410665-1-yukuai1@huaweicloud.com/
+Fixes: f5a6604f7a44 ("block: fix lockdep warning caused by lock dependency in elv_iosched_store")
+
+Signed-off-by: Yu Kuai <yukuai3@huawei.com>
+Reviewed-by: Ming Lei <ming.lei@redhat.com>
+Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
+Reviewed-by: Hannes Reinecke <hare@suse.de>
+Reviewed-by: Li Nan <linan122@huawei.com>
+Link: https://lore.kernel.org/r/20250821060612.1729939-3-yukuai1@huaweicloud.com
+Signed-off-by: Jens Axboe <axboe@kernel.dk>
+---
+ block/blk-mq-tag.c | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
+index d880c50629d612..5cffa5668d0c38 100644
+--- a/block/blk-mq-tag.c
++++ b/block/blk-mq-tag.c
+@@ -622,6 +622,7 @@ int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
+ return -ENOMEM;
+
+ blk_mq_free_map_and_rqs(set, *tagsptr, hctx->queue_num);
++ hctx->queue->elevator->et->tags[hctx->queue_num] = new;
+ *tagsptr = new;
+ } else {
+ /*
+--
+cgit 1.2.3-korg
+
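The trigger described in the changelog is the ordinary queue sysfs knob; growing the tag
depth on a scheduler-managed device, for example (device name illustrative)

        echo 512 > /sys/block/sda/queue/nr_requests

is what makes blk_mq_tag_update_depth() free and reallocate hctx->sched_tags, and without
the one-line fix above elevator->et->tags keeps pointing at the freed set until elevator
exit frees it a second time.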
* [gentoo-commits] proj/linux-patches:6.17 commit in: /
@ 2025-10-06 11:06 Arisu Tachibana
0 siblings, 0 replies; 20+ messages in thread
From: Arisu Tachibana @ 2025-10-06 11:06 UTC (permalink / raw
To: gentoo-commits
commit: 9b45045d324846023feefcaf22239ca6c1fd2b62
Author: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
AuthorDate: Thu Oct 2 03:55:09 2025 +0000
Commit: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
CommitDate: Thu Oct 2 03:55:09 2025 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=9b45045d
Add updated CPU optimization patch for 6.16+
Signed-off-by: Arisu Tachibana <alicef <AT> gentoo.org>
0000_README | 4 +
5010_enable-cpu-optimizations-universal.patch | 846 ++++++++++++++++++++++++++
2 files changed, 850 insertions(+)
diff --git a/0000_README b/0000_README
index 9e8cb466..f2189223 100644
--- a/0000_README
+++ b/0000_README
@@ -86,6 +86,10 @@ Patch: 4567_distro-Gentoo-Kconfig.patch
From: Tom Wijsman <TomWij@gentoo.org>
Desc: Add Gentoo Linux support config settings and defaults.
+Patch: 5010_enable-cpu-optimizations-universal.patch
+From: https://github.com/graysky2/kernel_compiler_patch
+Desc: More ISA levels and uarches for kernel 6.16+
+
Patch: 5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch
From: https://gitlab.com/alfredchen/projectc
Desc: BMQ(BitMap Queue) Scheduler. A new CPU scheduler developed from PDS(incld). Inspired by the scheduler in zircon.
diff --git a/5010_enable-cpu-optimizations-universal.patch b/5010_enable-cpu-optimizations-universal.patch
new file mode 100644
index 00000000..962a82a6
--- /dev/null
+++ b/5010_enable-cpu-optimizations-universal.patch
@@ -0,0 +1,846 @@
+From 6b1d270f55e3143bcb3ad914adf920774351a6b9 Mon Sep 17 00:00:00 2001
+From: graysky <therealgraysky AT proton DOT me>
+Date: Mon, 18 Aug 2025 04:14:48 -0400
+
+1. New generic x86-64 ISA levels
+
+These are selectable under:
+ Processor type and features ---> x86-64 compiler ISA level
+
+• x86-64 A value of (1) is the default
+• x86-64-v2 A value of (2) brings support for vector
+ instructions up to Streaming SIMD Extensions 4.2 (SSE4.2)
+ and Supplemental Streaming SIMD Extensions 3 (SSSE3), the
+ POPCNT instruction, and CMPXCHG16B.
+• x86-64-v3 A value of (3) adds vector instructions up to AVX2, MOVBE,
+ and additional bit-manipulation instructions.
+
+There is also x86-64-v4 but including this makes little sense as
+the kernel does not use any of the AVX512 instructions anyway.
+
+Users of glibc 2.33 and above can see which level is supported by running:
+ /lib/ld-linux-x86-64.so.2 --help | grep supported
+Or
+ /lib64/ld-linux-x86-64.so.2 --help | grep supported
+
+2. New micro-architectures
+
+These are selectable under:
+ Processor type and features ---> Processor family
+
+• AMD Improved K8-family
+• AMD K10-family
+• AMD Family 10h (Barcelona)
+• AMD Family 14h (Bobcat)
+• AMD Family 16h (Jaguar)
+• AMD Family 15h (Bulldozer)
+• AMD Family 15h (Piledriver)
+• AMD Family 15h (Steamroller)
+• AMD Family 15h (Excavator)
+• AMD Family 17h (Zen)
+• AMD Family 17h (Zen 2)
+• AMD Family 19h (Zen 3)**
+• AMD Family 19h (Zen 4)‡
+• AMD Family 1Ah (Zen 5)§
+• Intel Silvermont low-power processors
+• Intel Goldmont low-power processors (Apollo Lake and Denverton)
+• Intel Goldmont Plus low-power processors (Gemini Lake)
+• Intel 1st Gen Core i3/i5/i7 (Nehalem)
+• Intel 1.5 Gen Core i3/i5/i7 (Westmere)
+• Intel 2nd Gen Core i3/i5/i7 (Sandybridge)
+• Intel 3rd Gen Core i3/i5/i7 (Ivybridge)
+• Intel 4th Gen Core i3/i5/i7 (Haswell)
+• Intel 5th Gen Core i3/i5/i7 (Broadwell)
+• Intel 6th Gen Core i3/i5/i7 (Skylake)
+• Intel 6th Gen Core i7/i9 (Skylake X)
+• Intel 8th Gen Core i3/i5/i7 (Cannon Lake)
+• Intel 10th Gen Core i7/i9 (Ice Lake)
+• Intel Xeon (Cascade Lake)
+• Intel Xeon (Cooper Lake)*
+• Intel 3rd Gen 10nm++ i3/i5/i7/i9-family (Tiger Lake)*
+• Intel 4th Gen 10nm++ Xeon (Sapphire Rapids)†
+• Intel 11th Gen i3/i5/i7/i9-family (Rocket Lake)†
+• Intel 12th Gen i3/i5/i7/i9-family (Alder Lake)†
+• Intel 13th Gen i3/i5/i7/i9-family (Raptor Lake)‡
+• Intel 14th Gen i3/i5/i7/i9-family (Meteor Lake)‡
+• Intel 5th Gen 10nm++ Xeon (Emerald Rapids)‡
+
+Notes: If not otherwise noted, gcc >=9.1 is required for support.
+ *Requires gcc >=10.1 or clang >=10.0
+ **Required gcc >=10.3 or clang >=12.0
+ †Required gcc >=11.1 or clang >=12.0
+ ‡Required gcc >=13.0 or clang >=15.0.5
+ §Required gcc >14.0 or clang >=19.0?
+
+3. Auto-detected micro-architecture levels
+
+Compile by passing the '-march=native' option which, "selects the CPU
+to generate code for at compilation time by determining the processor type of
+the compiling machine. Using -march=native enables all instruction subsets
+supported by the local machine and will produce code optimized for the local
+machine under the constraints of the selected instruction set."[1]
+
+Users of Intel CPUs should select the 'Intel-Native' option and users of AMD
+CPUs should select the 'AMD-Native' option.
+
+MINOR NOTES RELATING TO INTEL ATOM PROCESSORS
+This patch also changes -march=atom to -march=bonnell in accordance with the
+gcc v4.9 changes. Upstream is using the deprecated -march=atom flag when I
+believe it should use the newer -march=bonnell flag for atom processors.[2]
+
+It is not recommended to compile on Atom-CPUs with the 'native' option.[3] The
+recommendation is to use the 'atom' option instead.
+
+BENEFITS
+Small but real speed increases are measurable using a make endpoint comparing
+a generic kernel to one built with one of the respective microarchs.
+
+See the following experimental evidence supporting this statement:
+https://github.com/graysky2/kernel_compiler_patch?tab=readme-ov-file#benchmarks
+
+REQUIREMENTS
+linux version 6.16+
+gcc version >=9.0 or clang version >=9.0
+
+ACKNOWLEDGMENTS
+This patch builds on the seminal work by Jeroen.[4]
+
+REFERENCES
+1. https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html#index-x86-Options
+2. https://bugzilla.kernel.org/show_bug.cgi?id=77461
+3. https://github.com/graysky2/kernel_gcc_patch/issues/15
+4. http://www.linuxforge.net/docs/linux/linux-gcc.php
+
+---
+ arch/x86/Kconfig.cpu | 427 ++++++++++++++++++++++++++++++++++++++++++-
+ arch/x86/Makefile | 213 ++++++++++++++++++++-
+ 2 files changed, 631 insertions(+), 9 deletions(-)
+
+diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
+index f928cf6e3252..2802936f2e62 100644
+--- a/arch/x86/Kconfig.cpu
++++ b/arch/x86/Kconfig.cpu
+@@ -31,6 +31,7 @@ choice
+ - "Pentium-4" for the Intel Pentium 4 or P4-based Celeron.
+ - "K6" for the AMD K6, K6-II and K6-III (aka K6-3D).
+ - "Athlon" for the AMD K7 family (Athlon/Duron/Thunderbird).
++ - "Opteron/Athlon64/Hammer/K8" for all K8 and newer AMD CPUs.
+ - "Crusoe" for the Transmeta Crusoe series.
+ - "Efficeon" for the Transmeta Efficeon series.
+ - "Winchip-C6" for original IDT Winchip.
+@@ -41,7 +42,10 @@ choice
+ - "CyrixIII/VIA C3" for VIA Cyrix III or VIA C3.
+ - "VIA C3-2" for VIA C3-2 "Nehemiah" (model 9 and above).
+ - "VIA C7" for VIA C7.
++ - "Intel P4" for the Pentium 4/Netburst microarchitecture.
++ - "Core 2/newer Xeon" for all core2 and newer Intel CPUs.
+ - "Intel Atom" for the Atom-microarchitecture CPUs.
++ - "Generic-x86-64" for a kernel which runs on any x86-64 CPU.
+
+ See each option's help text for additional details. If you don't know
+ what to do, choose "Pentium-Pro".
+@@ -135,10 +139,21 @@ config MPENTIUM4
+ -Mobile Pentium 4
+ -Mobile Pentium 4 M
+ -Extreme Edition (Gallatin)
++ -Prescott
++ -Prescott 2M
++ -Cedar Mill
++ -Presler
+ -Smithfield
+ Xeons (Intel Xeon, Xeon MP, Xeon LV, Xeon MV) corename:
+ -Foster
+ -Prestonia
+ -Gallatin
++ -Nocona
++ -Irwindale
++ -Cranford
++ -Potomac
++ -Paxville
++ -Dempsey
+
+ config MK6
+ bool "K6/K6-II/K6-III"
+@@ -281,6 +296,402 @@ config X86_GENERIC
+ This is really intended for distributors who need more
+ generic optimizations.
+
++choice
++ prompt "x86_64 Compiler Build Optimization"
++ depends on !X86_NATIVE_CPU
++ default GENERIC_CPU
++
++config GENERIC_CPU
++ bool "Generic-x86-64"
++ depends on X86_64
++ help
++ Generic x86-64 CPU.
++ Runs equally well on all x86-64 CPUs.
++
++config MK8
++ bool "AMD Opteron/Athlon64/Hammer/K8"
++ help
++ Select this for an AMD Opteron or Athlon64 Hammer-family processor.
++ Enables use of some extended instructions, and passes appropriate
++ optimization flags to GCC.
++
++config MK8SSE3
++ bool "AMD Opteron/Athlon64/Hammer/K8 with SSE3"
++ help
++ Select this for improved AMD Opteron or Athlon64 Hammer-family processors.
++ Enables use of some extended instructions, and passes appropriate
++ optimization flags to GCC.
++
++config MK10
++ bool "AMD 61xx/7x50/PhenomX3/X4/II/K10"
++ help
++ Select this for an AMD 61xx Eight-Core Magny-Cours, Athlon X2 7x50,
++ Phenom X3/X4/II, Athlon II X2/X3/X4, or Turion II-family processor.
++ Enables use of some extended instructions, and passes appropriate
++ optimization flags to GCC.
++
++config MBARCELONA
++ bool "AMD Barcelona"
++ help
++ Select this for AMD Family 10h Barcelona processors.
++
++ Enables -march=barcelona
++
++config MBOBCAT
++ bool "AMD Bobcat"
++ help
++ Select this for AMD Family 14h Bobcat processors.
++
++ Enables -march=btver1
++
++config MJAGUAR
++ bool "AMD Jaguar"
++ help
++ Select this for AMD Family 16h Jaguar processors.
++
++ Enables -march=btver2
++
++config MBULLDOZER
++ bool "AMD Bulldozer"
++ help
++ Select this for AMD Family 15h Bulldozer processors.
++
++ Enables -march=bdver1
++
++config MPILEDRIVER
++ bool "AMD Piledriver"
++ help
++ Select this for AMD Family 15h Piledriver processors.
++
++ Enables -march=bdver2
++
++config MSTEAMROLLER
++ bool "AMD Steamroller"
++ help
++ Select this for AMD Family 15h Steamroller processors.
++
++ Enables -march=bdver3
++
++config MEXCAVATOR
++ bool "AMD Excavator"
++ help
++ Select this for AMD Family 15h Excavator processors.
++
++ Enables -march=bdver4
++
++config MZEN
++ bool "AMD Ryzen"
++ help
++ Select this for AMD Family 17h Zen processors.
++
++ Enables -march=znver1
++
++config MZEN2
++ bool "AMD Ryzen 2"
++ help
++ Select this for AMD Family 17h Zen 2 processors.
++
++ Enables -march=znver2
++
++config MZEN3
++ bool "AMD Ryzen 3"
++ depends on (CC_IS_GCC && GCC_VERSION >= 100300) || (CC_IS_CLANG && CLANG_VERSION >= 120000)
++ help
++ Select this for AMD Family 19h Zen 3 processors.
++
++ Enables -march=znver3
++
++config MZEN4
++ bool "AMD Ryzen 4"
++ depends on (CC_IS_GCC && GCC_VERSION >= 130000) || (CC_IS_CLANG && CLANG_VERSION >= 160000)
++ help
++ Select this for AMD Family 19h Zen 4 processors.
++
++ Enables -march=znver4
++
++config MZEN5
++ bool "AMD Ryzen 5"
++ depends on (CC_IS_GCC && GCC_VERSION > 140000) || (CC_IS_CLANG && CLANG_VERSION >= 190100)
++ help
++ Select this for AMD Family 19h Zen 5 processors.
++
++ Enables -march=znver5
++
++config MPSC
++ bool "Intel P4 / older Netburst based Xeon"
++ depends on X86_64
++ help
++ Optimize for Intel Pentium 4, Pentium D and older Nocona/Dempsey
++ Xeon CPUs with Intel 64bit which is compatible with x86-64.
++ Note that the latest Xeons (Xeon 51xx and 53xx) are not based on the
++ Netburst core and shouldn't use this option. You can distinguish them
++ using the cpu family field
++ in /proc/cpuinfo. Family 15 is an older Xeon, Family 6 a newer one.
++
++config MCORE2
++ bool "Intel Core 2"
++ depends on X86_64
++ help
++
++ Select this for Intel Core 2 and newer Core 2 Xeons (Xeon 51xx and
++ 53xx) CPUs. You can distinguish newer from older Xeons by the CPU
++ family in /proc/cpuinfo. Newer ones have 6 and older ones 15
++ (not a typo)
++
++ Enables -march=core2
++
++config MNEHALEM
++ bool "Intel Nehalem"
++ depends on X86_64
++ help
++
++ Select this for 1st Gen Core processors in the Nehalem family.
++
++ Enables -march=nehalem
++
++config MWESTMERE
++ bool "Intel Westmere"
++ depends on X86_64
++ help
++
++ Select this for the Intel Westmere formerly Nehalem-C family.
++
++ Enables -march=westmere
++
++config MSILVERMONT
++ bool "Intel Silvermont"
++ depends on X86_64
++ help
++
++ Select this for the Intel Silvermont platform.
++
++ Enables -march=silvermont
++
++config MGOLDMONT
++ bool "Intel Goldmont"
++ depends on X86_64
++ help
++
++ Select this for the Intel Goldmont platform including Apollo Lake and Denverton.
++
++ Enables -march=goldmont
++
++config MGOLDMONTPLUS
++ bool "Intel Goldmont Plus"
++ depends on X86_64
++ help
++
++ Select this for the Intel Goldmont Plus platform including Gemini Lake.
++
++ Enables -march=goldmont-plus
++
++config MSANDYBRIDGE
++ bool "Intel Sandy Bridge"
++ depends on X86_64
++ help
++
++ Select this for 2nd Gen Core processors in the Sandy Bridge family.
++
++ Enables -march=sandybridge
++
++config MIVYBRIDGE
++ bool "Intel Ivy Bridge"
++ depends on X86_64
++ help
++
++ Select this for 3rd Gen Core processors in the Ivy Bridge family.
++
++ Enables -march=ivybridge
++
++config MHASWELL
++ bool "Intel Haswell"
++ depends on X86_64
++ help
++
++ Select this for 4th Gen Core processors in the Haswell family.
++
++ Enables -march=haswell
++
++config MBROADWELL
++ bool "Intel Broadwell"
++ depends on X86_64
++ help
++
++ Select this for 5th Gen Core processors in the Broadwell family.
++
++ Enables -march=broadwell
++
++config MSKYLAKE
++ bool "Intel Skylake"
++ depends on X86_64
++ help
++
++ Select this for 6th Gen Core processors in the Skylake family.
++
++ Enables -march=skylake
++
++config MSKYLAKEX
++ bool "Intel Skylake-X (7th Gen Core i7/i9)"
++ depends on X86_64
++ help
++
++ Select this for 7th Gen Core i7/i9 processors in the Skylake-X family.
++
++ Enables -march=skylake-avx512
++
++config MCANNONLAKE
++ bool "Intel Coffee Lake/Kaby Lake Refresh (8th Gen Core i3/i5/i7)"
++ depends on X86_64
++ help
++
++ Select this for 8th Gen Core i3/i5/i7 processors in the Coffee Lake or Kaby Lake Refresh families.
++
++ Enables -march=cannonlake
++
++config MICELAKE_CLIENT
++ bool "Intel Ice Lake"
++ depends on X86_64
++ help
++
++ Select this for 10th Gen Core client processors in the Ice Lake family.
++
++ Enables -march=icelake-client
++
++config MICELAKE_SERVER
++ bool "Intel Ice Lake-SP (3rd Gen Xeon Scalable)"
++ depends on X86_64
++ help
++
++ Select this for 3rd Gen Xeon Scalable processors in the Ice Lake-SP family.
++
++ Enables -march=icelake-server
++
++config MCOOPERLAKE
++ bool "Intel Cooper Lake"
++ depends on X86_64
++ depends on (CC_IS_GCC && GCC_VERSION > 100100) || (CC_IS_CLANG && CLANG_VERSION >= 100000)
++ help
++
++ Select this for Xeon processors in the Cooper Lake family.
++
++ Enables -march=cooperlake
++
++config MCASCADELAKE
++ bool "Intel Cascade Lake"
++ depends on X86_64
++ depends on (CC_IS_GCC && GCC_VERSION > 100100) || (CC_IS_CLANG && CLANG_VERSION >= 100000)
++ help
++
++ Select this for Xeon processors in the Cascade Lake family.
++
++ Enables -march=cascadelake
++
++config MTIGERLAKE
++ bool "Intel Tiger Lake"
++ depends on X86_64
++ depends on (CC_IS_GCC && GCC_VERSION > 100100) || (CC_IS_CLANG && CLANG_VERSION >= 100000)
++ help
++
++ Select this for third-generation 10 nm process processors in the Tiger Lake family.
++
++ Enables -march=tigerlake
++
++config MSAPPHIRERAPIDS
++ bool "Intel Sapphire Rapids"
++ depends on X86_64
++ depends on (CC_IS_GCC && GCC_VERSION > 110000) || (CC_IS_CLANG && CLANG_VERSION >= 120000)
++ help
++
++ Select this for fourth-generation 10 nm process processors in the Sapphire Rapids family.
++
++ Enables -march=sapphirerapids
++
++config MROCKETLAKE
++ bool "Intel Rocket Lake"
++ depends on X86_64
++ depends on (CC_IS_GCC && GCC_VERSION > 110000) || (CC_IS_CLANG && CLANG_VERSION >= 120000)
++ help
++
++ Select this for eleventh-generation processors in the Rocket Lake family.
++
++ Enables -march=rocketlake
++
++config MALDERLAKE
++ bool "Intel Alder Lake"
++ depends on X86_64
++ depends on (CC_IS_GCC && GCC_VERSION > 110000) || (CC_IS_CLANG && CLANG_VERSION >= 120000)
++ help
++
++ Select this for twelfth-generation processors in the Alder Lake family.
++
++ Enables -march=alderlake
++
++config MRAPTORLAKE
++ bool "Intel Raptor Lake"
++ depends on X86_64
++ depends on (CC_IS_GCC && GCC_VERSION >= 130000) || (CC_IS_CLANG && CLANG_VERSION >= 150500)
++ help
++
++ Select this for thirteenth-generation processors in the Raptor Lake family.
++
++ Enables -march=raptorlake
++
++config MMETEORLAKE
++ bool "Intel Meteor Lake"
++ depends on X86_64
++ depends on (CC_IS_GCC && GCC_VERSION >= 130000) || (CC_IS_CLANG && CLANG_VERSION >= 150500)
++ help
++
++ Select this for fourteenth-generation processors in the Meteor Lake family.
++
++ Enables -march=meteorlake
++
++config MEMERALDRAPIDS
++ bool "Intel Emerald Rapids"
++ depends on X86_64
++ depends on (CC_IS_GCC && GCC_VERSION > 130000) || (CC_IS_CLANG && CLANG_VERSION >= 150500)
++ help
++
++ Select this for fifth-generation Xeon Scalable processors in the Emerald Rapids family.
++
++ Enables -march=emeraldrapids
++
++config MDIAMONDRAPIDS
++ bool "Intel Diamond Rapids (7th Gen Xeon Scalable)"
++ depends on X86_64
++ depends on (CC_IS_GCC && GCC_VERSION > 150000) || (CC_IS_CLANG && CLANG_VERSION >= 200000)
++ help
++
++ Select this for seventh-generation Xeon Scalable processors in the Diamond Rapids family.
++
++ Enables -march=diamondrapids
++
++endchoice
++
++config X86_64_VERSION
++ int "x86-64 compiler ISA level"
++ range 1 3
++ depends on (CC_IS_GCC && GCC_VERSION > 110000) || (CC_IS_CLANG && CLANG_VERSION >= 120000)
++ depends on X86_64 && GENERIC_CPU
++ help
++ Specify a specific x86-64 compiler ISA level.
++
++ There are two x86-64 ISA levels that work on top of
++ the x86-64 baseline, namely x86-64-v2 and x86-64-v3.
++
++ x86-64-v2 brings support for vector instructions up to Streaming SIMD
++ Extensions 4.2 (SSE4.2) and Supplemental Streaming SIMD Extensions 3
++ (SSSE3), the POPCNT instruction, and CMPXCHG16B.
++
++ x86-64-v3 adds vector instructions up to AVX2, MOVBE, and additional
++ bit-manipulation instructions.
++
++ x86-64-v4 is not included since the kernel does not use AVX512 instructions.
++
++ You can find the best version for your CPU by running one of the following:
++ /lib/ld-linux-x86-64.so.2 --help | grep supported
++ /lib64/ld-linux-x86-64.so.2 --help | grep supported
++
+ #
+ # Define implied options from the CPU selection here
+ config X86_INTERNODE_CACHE_SHIFT
+@@ -290,8 +701,8 @@ config X86_INTERNODE_CACHE_SHIFT
+
+ config X86_L1_CACHE_SHIFT
+ int
+- default "7" if MPENTIUM4
+- default "6" if MK7 || MPENTIUMM || MATOM || MVIAC7 || X86_GENERIC || X86_64
++ default "7" if MPENTIUM4 || MPSC
++ default "6" if MK7 || MK8 || MPENTIUMM || MCORE2 || MATOM || MVIAC7 || X86_GENERIC || GENERIC_CPU || MK8SSE3 || MK10 || MBARCELONA || MBOBCAT || MJAGUAR || MBULLDOZER || MPILEDRIVER || MSTEAMROLLER || MEXCAVATOR || MZEN || MZEN2 || MZEN3 || MZEN4 || MZEN5 || MNEHALEM || MWESTMERE || MSILVERMONT || MGOLDMONT || MGOLDMONTPLUS || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE || MSKYLAKEX || MCANNONLAKE || MICELAKE_CLIENT || MICELAKE_SERVER || MCASCADELAKE || MCOOPERLAKE || MTIGERLAKE || MSAPPHIRERAPIDS || MROCKETLAKE || MALDERLAKE || MRAPTORLAKE || MMETEORLAKE || MEMERALDRAPIDS || MDIAMONDRAPIDS || X86_NATIVE_CPU
+ default "4" if MELAN || M486SX || M486 || MGEODEGX1
+ default "5" if MWINCHIP3D || MWINCHIPC6 || MCRUSOE || MEFFICEON || MCYRIXIII || MK6 || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || M586 || MVIAC3_2 || MGEODE_LX
+
+@@ -309,19 +720,19 @@ config X86_ALIGNMENT_16
+
+ config X86_INTEL_USERCOPY
+ def_bool y
+- depends on MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M586MMX || X86_GENERIC || MK7 || MEFFICEON
++ depends on MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M586MMX || X86_GENERIC || MK8 || MK7 || MEFFICEON || MCORE2 || MNEHALEM || MWESTMERE || MSILVERMONT || MGOLDMONT || MGOLDMONTPLUS || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE || MSKYLAKEX || MCANNONLAKE || MICELAKE_CLIENT || MICELAKE_SERVER || MCASCADELAKE || MCOOPERLAKE || MTIGERLAKE || MSAPPHIRERAPIDS || MROCKETLAKE || MALDERLAKE || MRAPTORLAKE || MMETEORLAKE || MEMERALDRAPIDS || MDIAMONDRAPIDS
+
+ config X86_USE_PPRO_CHECKSUM
+ def_bool y
+- depends on MWINCHIP3D || MWINCHIPC6 || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MVIAC3_2 || MVIAC7 || MEFFICEON || MGEODE_LX || MATOM
++ depends on MWINCHIP3D || MWINCHIPC6 || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MK8 || MVIAC3_2 || MVIAC7 || MEFFICEON || MGEODE_LX || MCORE2 || MATOM || MK8SSE3 || MK10 || MBARCELONA || MBOBCAT || MJAGUAR || MBULLDOZER || MPILEDRIVER || MSTEAMROLLER || MEXCAVATOR || MZEN || MZEN2 || MZEN3 || MZEN4 || MZEN5 || MNEHALEM || MWESTMERE || MSILVERMONT || MGOLDMONT || MGOLDMONTPLUS || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE || MSKYLAKEX || MCANNONLAKE || MICELAKE_CLIENT || MICELAKE_SERVER || MCASCADELAKE || MCOOPERLAKE || MTIGERLAKE || MSAPPHIRERAPIDS || MROCKETLAKE || MALDERLAKE || MRAPTORLAKE || MMETEORLAKE || MEMERALDRAPIDS || MDIAMONDRAPIDS
+
+ config X86_TSC
+ def_bool y
+- depends on (MWINCHIP3D || MCRUSOE || MEFFICEON || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || MVIAC3_2 || MVIAC7 || MGEODEGX1 || MGEODE_LX || MATOM) || X86_64
++ depends on (MWINCHIP3D || MCRUSOE || MEFFICEON || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || MK8 || MVIAC3_2 || MVIAC7 || MGEODEGX1 || MGEODE_LX || MCORE2 || MATOM) || X86_64
+
+ config X86_HAVE_PAE
+ def_bool y
+- depends on MCRUSOE || MEFFICEON || MCYRIXIII || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MVIAC7 || MATOM || X86_64
++ depends on MCRUSOE || MEFFICEON || MCYRIXIII || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MK8 || MVIAC7 || MCORE2 || MATOM || X86_64
+
+ config X86_CX8
+ def_bool y
+@@ -331,12 +742,12 @@ config X86_CX8
+ # generates cmov.
+ config X86_CMOV
+ def_bool y
+- depends on (MK7 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MVIAC3_2 || MVIAC7 || MCRUSOE || MEFFICEON || MATOM || MGEODE_LX || X86_64)
++ depends on (MK8 || MK7 || MCORE2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MVIAC3_2 || MVIAC7 || MCRUSOE || MEFFICEON || X86_64 || MATOM || MGEODE_LX)
+
+ config X86_MINIMUM_CPU_FAMILY
+ int
+ default "64" if X86_64
+- default "6" if X86_32 && (MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MVIAC3_2 || MVIAC7 || MEFFICEON || MATOM || MK7)
++ default "6" if X86_32 && (MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MVIAC3_2 || MVIAC7 || MEFFICEON || MATOM || MCORE2 || MK7 || MK8)
+ default "5" if X86_32 && X86_CX8
+ default "4"
+
+diff --git a/arch/x86/Makefile b/arch/x86/Makefile
+index 1913d342969b..6c165daccb3d 100644
+--- a/arch/x86/Makefile
++++ b/arch/x86/Makefile
+@@ -177,10 +177,221 @@ ifdef CONFIG_X86_NATIVE_CPU
+ KBUILD_CFLAGS += -march=native
+ KBUILD_RUSTFLAGS += -Ctarget-cpu=native
+ else
++ifdef CONFIG_MK8
++ KBUILD_CFLAGS += -march=k8
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=k8
++endif
++
++ifdef CONFIG_MK8SSE3
++ KBUILD_CFLAGS += -march=k8-sse3
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=k8-sse3
++endif
++
++ifdef CONFIG_MK10
++ KBUILD_CFLAGS += -march=amdfam10
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=amdfam10
++endif
++
++ifdef CONFIG_MBARCELONA
++ KBUILD_CFLAGS += -march=barcelona
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=barcelona
++endif
++
++ifdef CONFIG_MBOBCAT
++ KBUILD_CFLAGS += -march=btver1
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=btver1
++endif
++
++ifdef CONFIG_MJAGUAR
++ KBUILD_CFLAGS += -march=btver2
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=btver2
++endif
++
++ifdef CONFIG_MBULLDOZER
++ KBUILD_CFLAGS += -march=bdver1
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=bdver1
++endif
++
++ifdef CONFIG_MPILEDRIVER
++ KBUILD_CFLAGS += -march=bdver2 -mno-tbm
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=bdver2 -mno-tbm
++endif
++
++ifdef CONFIG_MSTEAMROLLER
++ KBUILD_CFLAGS += -march=bdver3 -mno-tbm
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=bdver3 -mno-tbm
++endif
++
++ifdef CONFIG_MEXCAVATOR
++ KBUILD_CFLAGS += -march=bdver4 -mno-tbm
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=bdver4 -mno-tbm
++endif
++
++ifdef CONFIG_MZEN
++ KBUILD_CFLAGS += -march=znver1
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=znver1
++endif
++
++ifdef CONFIG_MZEN2
++ KBUILD_CFLAGS += -march=znver2
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=znver2
++endif
++
++ifdef CONFIG_MZEN3
++ KBUILD_CFLAGS += -march=znver3
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=znver3
++endif
++
++ifdef CONFIG_MZEN4
++ KBUILD_CFLAGS += -march=znver4
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=znver4
++endif
++
++ifdef CONFIG_MZEN5
++ KBUILD_CFLAGS += -march=znver5
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=znver5
++endif
++
++ifdef CONFIG_MPSC
++ KBUILD_CFLAGS += -march=nocona
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=nocona
++endif
++
++ifdef CONFIG_MCORE2
++ KBUILD_CFLAGS += -march=core2
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=core2
++endif
++
++ifdef CONFIG_MNEHALEM
++ KBUILD_CFLAGS += -march=nehalem
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=nehalem
++endif
++
++ifdef CONFIG_MWESTMERE
++ KBUILD_CFLAGS += -march=westmere
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=westmere
++endif
++
++ifdef CONFIG_MSILVERMONT
++ KBUILD_CFLAGS += -march=silvermont
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=silvermont
++endif
++
++ifdef CONFIG_MGOLDMONT
++ KBUILD_CFLAGS += -march=goldmont
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=goldmont
++endif
++
++ifdef CONFIG_MGOLDMONTPLUS
++ KBUILD_CFLAGS += -march=goldmont-plus
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=goldmont-plus
++endif
++
++ifdef CONFIG_MSANDYBRIDGE
++ KBUILD_CFLAGS += -march=sandybridge
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=sandybridge
++endif
++
++ifdef CONFIG_MIVYBRIDGE
++ KBUILD_CFLAGS += -march=ivybridge
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=ivybridge
++endif
++
++ifdef CONFIG_MHASWELL
++ KBUILD_CFLAGS += -march=haswell
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=haswell
++endif
++
++ifdef CONFIG_MBROADWELL
++ KBUILD_CFLAGS += -march=broadwell
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=broadwell
++endif
++
++ifdef CONFIG_MSKYLAKE
++ KBUILD_CFLAGS += -march=skylake
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=skylake
++endif
++
++ifdef CONFIG_MSKYLAKEX
++ KBUILD_CFLAGS += -march=skylake-avx512
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=skylake-avx512
++endif
++
++ifdef CONFIG_MCANNONLAKE
++ KBUILD_CFLAGS += -march=cannonlake
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=cannonlake
++endif
++
++ifdef CONFIG_MICELAKE_CLIENT
++ KBUILD_CFLAGS += -march=icelake-client
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=icelake-client
++endif
++
++ifdef CONFIG_MICELAKE_SERVER
++ KBUILD_CFLAGS += -march=icelake-server
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=icelake-server
++endif
++
++ifdef CONFIG_MCOOPERLAKE
++ KBUILD_CFLAGS += -march=cooperlake
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=cooperlake
++endif
++
++ifdef CONFIG_MCASCADELAKE
++ KBUILD_CFLAGS += -march=cascadelake
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=cascadelake
++endif
++
++ifdef CONFIG_MTIGERLAKE
++ KBUILD_CFLAGS += -march=tigerlake
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=tigerlake
++endif
++
++ifdef CONFIG_MSAPPHIRERAPIDS
++ KBUILD_CFLAGS += -march=sapphirerapids
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=sapphirerapids
++endif
++
++ifdef CONFIG_MROCKETLAKE
++ KBUILD_CFLAGS += -march=rocketlake
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=rocketlake
++endif
++
++ifdef CONFIG_MALDERLAKE
++ KBUILD_CFLAGS += -march=alderlake
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=alderlake
++endif
++
++ifdef CONFIG_MRAPTORLAKE
++ KBUILD_CFLAGS += -march=raptorlake
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=raptorlake
++endif
++
++ifdef CONFIG_MMETEORLAKE
++ KBUILD_CFLAGS += -march=meteorlake
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=meteorlake
++endif
++
++ifdef CONFIG_MEMERALDRAPIDS
++ KBUILD_CFLAGS += -march=emeraldrapids
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=emeraldrapids
++endif
++
++ifdef CONFIG_MDIAMONDRAPIDS
++ KBUILD_CFLAGS += -march=diamondrapids
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=diamondrapids
++endif
++
++ifdef CONFIG_GENERIC_CPU
++ifeq ($(CONFIG_X86_64_VERSION),1)
+ KBUILD_CFLAGS += -march=x86-64 -mtune=generic
+ KBUILD_RUSTFLAGS += -Ctarget-cpu=x86-64 -Ztune-cpu=generic
++else
++ KBUILD_CFLAGS += -march=x86-64-v$(CONFIG_X86_64_VERSION)
++ KBUILD_RUSTFLAGS += -Ctarget-cpu=x86-64-v$(CONFIG_X86_64_VERSION)
++endif
++endif
+ endif
+-
+ KBUILD_CFLAGS += -mno-red-zone
+ KBUILD_CFLAGS += -mcmodel=kernel
+ KBUILD_RUSTFLAGS += -Cno-redzone=y
+--
+2.50.1
+
* [gentoo-commits] proj/linux-patches:6.17 commit in: /
@ 2025-10-06 11:06 Arisu Tachibana
0 siblings, 0 replies; 20+ messages in thread
From: Arisu Tachibana @ 2025-10-06 11:06 UTC (permalink / raw
To: gentoo-commits
commit: 2c4e7f5d31bdbba77f973534a8a9fe66eb80af3e
Author: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 6 11:06:02 2025 +0000
Commit: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
CommitDate: Mon Oct 6 11:06:02 2025 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=2c4e7f5d
Linux patch 6.17.2
Signed-off-by: Arisu Tachibana <alicef <AT> gentoo.org>
0000_README | 5 +
1000_linux-6.17.1.patch | 659 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 664 insertions(+)
diff --git a/0000_README b/0000_README
index f2189223..50189b54 100644
--- a/0000_README
+++ b/0000_README
@@ -42,6 +42,11 @@ EXPERIMENTAL
Individual Patch Descriptions:
--------------------------------------------------------------------------
+
+Patch: 1000_linux-6.17.1.patch
+From: https://www.kernel.org
+Desc: Linux 6.17.1
+
Patch: 1510_fs-enable-link-security-restrictions-by-default.patch
From: http://sources.debian.net/src/linux/3.16.7-ckt4-3/debian/patches/debian/fs-enable-link-security-restrictions-by-default.patch/
Desc: Enable link security restrictions by default.
diff --git a/1000_linux-6.17.1.patch b/1000_linux-6.17.1.patch
new file mode 100644
index 00000000..1266cc07
--- /dev/null
+++ b/1000_linux-6.17.1.patch
@@ -0,0 +1,659 @@
+diff --git a/Makefile b/Makefile
+index 82bb9cdf73a32b..389bfac0adaaac 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 17
+-SUBLEVEL = 0
++SUBLEVEL = 1
+ EXTRAVERSION =
+ NAME = Baby Opossum Posse
+
+diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
+index d880c50629d612..5cffa5668d0c38 100644
+--- a/block/blk-mq-tag.c
++++ b/block/blk-mq-tag.c
+@@ -622,6 +622,7 @@ int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
+ return -ENOMEM;
+
+ blk_mq_free_map_and_rqs(set, *tagsptr, hctx->queue_num);
++ hctx->queue->elevator->et->tags[hctx->queue_num] = new;
+ *tagsptr = new;
+ } else {
+ /*
+diff --git a/drivers/media/i2c/tc358743.c b/drivers/media/i2c/tc358743.c
+index 1cc7636e446d77..5042cf612d21e8 100644
+--- a/drivers/media/i2c/tc358743.c
++++ b/drivers/media/i2c/tc358743.c
+@@ -2245,10 +2245,10 @@ static int tc358743_probe(struct i2c_client *client)
+ err_work_queues:
+ cec_unregister_adapter(state->cec_adap);
+ if (!state->i2c_client->irq) {
+- timer_delete(&state->timer);
++ timer_delete_sync(&state->timer);
+ flush_work(&state->work_i2c_poll);
+ }
+- cancel_delayed_work(&state->delayed_work_enable_hotplug);
++ cancel_delayed_work_sync(&state->delayed_work_enable_hotplug);
+ mutex_destroy(&state->confctl_mutex);
+ err_hdl:
+ media_entity_cleanup(&sd->entity);
+diff --git a/drivers/media/pci/b2c2/flexcop-pci.c b/drivers/media/pci/b2c2/flexcop-pci.c
+index 486c8ec0fa60d9..ab53c5b02c48df 100644
+--- a/drivers/media/pci/b2c2/flexcop-pci.c
++++ b/drivers/media/pci/b2c2/flexcop-pci.c
+@@ -411,7 +411,7 @@ static void flexcop_pci_remove(struct pci_dev *pdev)
+ struct flexcop_pci *fc_pci = pci_get_drvdata(pdev);
+
+ if (irq_chk_intv > 0)
+- cancel_delayed_work(&fc_pci->irq_check_work);
++ cancel_delayed_work_sync(&fc_pci->irq_check_work);
+
+ flexcop_pci_dma_exit(fc_pci);
+ flexcop_device_exit(fc_pci->fc_dev);
+diff --git a/drivers/media/platform/qcom/iris/iris_buffer.c b/drivers/media/platform/qcom/iris/iris_buffer.c
+index 6425e4919e3b0b..9f664c24114936 100644
+--- a/drivers/media/platform/qcom/iris/iris_buffer.c
++++ b/drivers/media/platform/qcom/iris/iris_buffer.c
+@@ -413,6 +413,16 @@ static int iris_destroy_internal_buffers(struct iris_inst *inst, u32 plane, bool
+ }
+ }
+
++ if (force) {
++ buffers = &inst->buffers[BUF_PERSIST];
++
++ list_for_each_entry_safe(buf, next, &buffers->list, list) {
++ ret = iris_destroy_internal_buffer(inst, buf);
++ if (ret)
++ return ret;
++ }
++ }
++
+ return 0;
+ }
+
+diff --git a/drivers/media/platform/st/stm32/stm32-csi.c b/drivers/media/platform/st/stm32/stm32-csi.c
+index b69048144cc12b..fd2b6dfbd44c57 100644
+--- a/drivers/media/platform/st/stm32/stm32-csi.c
++++ b/drivers/media/platform/st/stm32/stm32-csi.c
+@@ -443,8 +443,7 @@ static void stm32_csi_phy_reg_write(struct stm32_csi_dev *csidev,
+ static int stm32_csi_start(struct stm32_csi_dev *csidev,
+ struct v4l2_subdev_state *state)
+ {
+- struct media_pad *src_pad =
+- &csidev->s_subdev->entity.pads[csidev->s_subdev_pad_nb];
++ struct media_pad *src_pad;
+ const struct stm32_csi_mbps_phy_reg *phy_regs = NULL;
+ struct v4l2_mbus_framefmt *sink_fmt;
+ const struct stm32_csi_fmts *fmt;
+@@ -466,6 +465,7 @@ static int stm32_csi_start(struct stm32_csi_dev *csidev,
+ if (!csidev->s_subdev)
+ return -EIO;
+
++ src_pad = &csidev->s_subdev->entity.pads[csidev->s_subdev_pad_nb];
+ link_freq = v4l2_get_link_freq(src_pad,
+ fmt->bpp, 2 * csidev->num_lanes);
+ if (link_freq < 0)
+diff --git a/drivers/media/rc/imon.c b/drivers/media/rc/imon.c
+index f5221b01880813..cf3e6e43c0c7e4 100644
+--- a/drivers/media/rc/imon.c
++++ b/drivers/media/rc/imon.c
+@@ -536,7 +536,9 @@ static int display_open(struct inode *inode, struct file *file)
+
+ mutex_lock(&ictx->lock);
+
+- if (!ictx->display_supported) {
++ if (ictx->disconnected) {
++ retval = -ENODEV;
++ } else if (!ictx->display_supported) {
+ pr_err("display not supported by device\n");
+ retval = -ENODEV;
+ } else if (ictx->display_isopen) {
+@@ -598,6 +600,9 @@ static int send_packet(struct imon_context *ictx)
+ int retval = 0;
+ struct usb_ctrlrequest *control_req = NULL;
+
++ if (ictx->disconnected)
++ return -ENODEV;
++
+ /* Check if we need to use control or interrupt urb */
+ if (!ictx->tx_control) {
+ pipe = usb_sndintpipe(ictx->usbdev_intf0,
+@@ -949,12 +954,14 @@ static ssize_t vfd_write(struct file *file, const char __user *buf,
+ static const unsigned char vfd_packet6[] = {
+ 0x01, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF };
+
+- if (ictx->disconnected)
+- return -ENODEV;
+-
+ if (mutex_lock_interruptible(&ictx->lock))
+ return -ERESTARTSYS;
+
++ if (ictx->disconnected) {
++ retval = -ENODEV;
++ goto exit;
++ }
++
+ if (!ictx->dev_present_intf0) {
+ pr_err_ratelimited("no iMON device present\n");
+ retval = -ENODEV;
+@@ -1029,11 +1036,13 @@ static ssize_t lcd_write(struct file *file, const char __user *buf,
+ int retval = 0;
+ struct imon_context *ictx = file->private_data;
+
+- if (ictx->disconnected)
+- return -ENODEV;
+-
+ mutex_lock(&ictx->lock);
+
++ if (ictx->disconnected) {
++ retval = -ENODEV;
++ goto exit;
++ }
++
+ if (!ictx->display_supported) {
+ pr_err_ratelimited("no iMON display present\n");
+ retval = -ENODEV;
+@@ -2499,7 +2508,11 @@ static void imon_disconnect(struct usb_interface *interface)
+ int ifnum;
+
+ ictx = usb_get_intfdata(interface);
++
++ mutex_lock(&ictx->lock);
+ ictx->disconnected = true;
++ mutex_unlock(&ictx->lock);
++
+ dev = ictx->dev;
+ ifnum = interface->cur_altsetting->desc.bInterfaceNumber;
+
+diff --git a/drivers/media/tuners/xc5000.c b/drivers/media/tuners/xc5000.c
+index 30aa4ee958bdea..ec9a3cd4784e1f 100644
+--- a/drivers/media/tuners/xc5000.c
++++ b/drivers/media/tuners/xc5000.c
+@@ -1304,7 +1304,7 @@ static void xc5000_release(struct dvb_frontend *fe)
+ mutex_lock(&xc5000_list_mutex);
+
+ if (priv) {
+- cancel_delayed_work(&priv->timer_sleep);
++ cancel_delayed_work_sync(&priv->timer_sleep);
+ hybrid_tuner_release_state(priv);
+ }
+
+diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c
+index 775bede0d93d9b..50e1589668ba50 100644
+--- a/drivers/media/usb/uvc/uvc_driver.c
++++ b/drivers/media/usb/uvc/uvc_driver.c
+@@ -137,6 +137,9 @@ struct uvc_entity *uvc_entity_by_id(struct uvc_device *dev, int id)
+ {
+ struct uvc_entity *entity;
+
++ if (id == UVC_INVALID_ENTITY_ID)
++ return NULL;
++
+ list_for_each_entry(entity, &dev->entities, list) {
+ if (entity->id == id)
+ return entity;
+@@ -795,14 +798,27 @@ static const u8 uvc_media_transport_input_guid[16] =
+ UVC_GUID_UVC_MEDIA_TRANSPORT_INPUT;
+ static const u8 uvc_processing_guid[16] = UVC_GUID_UVC_PROCESSING;
+
+-static struct uvc_entity *uvc_alloc_entity(u16 type, u16 id,
+- unsigned int num_pads, unsigned int extra_size)
++static struct uvc_entity *uvc_alloc_new_entity(struct uvc_device *dev, u16 type,
++ u16 id, unsigned int num_pads,
++ unsigned int extra_size)
+ {
+ struct uvc_entity *entity;
+ unsigned int num_inputs;
+ unsigned int size;
+ unsigned int i;
+
++ /* Per UVC 1.1+ spec 3.7.2, the ID should be non-zero. */
++ if (id == 0) {
++ dev_err(&dev->intf->dev, "Found Unit with invalid ID 0\n");
++ id = UVC_INVALID_ENTITY_ID;
++ }
++
++ /* Per UVC 1.1+ spec 3.7.2, the ID is unique. */
++ if (uvc_entity_by_id(dev, id)) {
++ dev_err(&dev->intf->dev, "Found multiple Units with ID %u\n", id);
++ id = UVC_INVALID_ENTITY_ID;
++ }
++
+ extra_size = roundup(extra_size, sizeof(*entity->pads));
+ if (num_pads)
+ num_inputs = type & UVC_TERM_OUTPUT ? num_pads : num_pads - 1;
+@@ -812,7 +828,7 @@ static struct uvc_entity *uvc_alloc_entity(u16 type, u16 id,
+ + num_inputs;
+ entity = kzalloc(size, GFP_KERNEL);
+ if (entity == NULL)
+- return NULL;
++ return ERR_PTR(-ENOMEM);
+
+ entity->id = id;
+ entity->type = type;
+@@ -924,10 +940,10 @@ static int uvc_parse_vendor_control(struct uvc_device *dev,
+ break;
+ }
+
+- unit = uvc_alloc_entity(UVC_VC_EXTENSION_UNIT, buffer[3],
+- p + 1, 2*n);
+- if (unit == NULL)
+- return -ENOMEM;
++ unit = uvc_alloc_new_entity(dev, UVC_VC_EXTENSION_UNIT,
++ buffer[3], p + 1, 2 * n);
++ if (IS_ERR(unit))
++ return PTR_ERR(unit);
+
+ memcpy(unit->guid, &buffer[4], 16);
+ unit->extension.bNumControls = buffer[20];
+@@ -1036,10 +1052,10 @@ static int uvc_parse_standard_control(struct uvc_device *dev,
+ return -EINVAL;
+ }
+
+- term = uvc_alloc_entity(type | UVC_TERM_INPUT, buffer[3],
+- 1, n + p);
+- if (term == NULL)
+- return -ENOMEM;
++ term = uvc_alloc_new_entity(dev, type | UVC_TERM_INPUT,
++ buffer[3], 1, n + p);
++ if (IS_ERR(term))
++ return PTR_ERR(term);
+
+ if (UVC_ENTITY_TYPE(term) == UVC_ITT_CAMERA) {
+ term->camera.bControlSize = n;
+@@ -1095,10 +1111,10 @@ static int uvc_parse_standard_control(struct uvc_device *dev,
+ return 0;
+ }
+
+- term = uvc_alloc_entity(type | UVC_TERM_OUTPUT, buffer[3],
+- 1, 0);
+- if (term == NULL)
+- return -ENOMEM;
++ term = uvc_alloc_new_entity(dev, type | UVC_TERM_OUTPUT,
++ buffer[3], 1, 0);
++ if (IS_ERR(term))
++ return PTR_ERR(term);
+
+ memcpy(term->baSourceID, &buffer[7], 1);
+
+@@ -1117,9 +1133,10 @@ static int uvc_parse_standard_control(struct uvc_device *dev,
+ return -EINVAL;
+ }
+
+- unit = uvc_alloc_entity(buffer[2], buffer[3], p + 1, 0);
+- if (unit == NULL)
+- return -ENOMEM;
++ unit = uvc_alloc_new_entity(dev, buffer[2], buffer[3],
++ p + 1, 0);
++ if (IS_ERR(unit))
++ return PTR_ERR(unit);
+
+ memcpy(unit->baSourceID, &buffer[5], p);
+
+@@ -1139,9 +1156,9 @@ static int uvc_parse_standard_control(struct uvc_device *dev,
+ return -EINVAL;
+ }
+
+- unit = uvc_alloc_entity(buffer[2], buffer[3], 2, n);
+- if (unit == NULL)
+- return -ENOMEM;
++ unit = uvc_alloc_new_entity(dev, buffer[2], buffer[3], 2, n);
++ if (IS_ERR(unit))
++ return PTR_ERR(unit);
+
+ memcpy(unit->baSourceID, &buffer[4], 1);
+ unit->processing.wMaxMultiplier =
+@@ -1168,9 +1185,10 @@ static int uvc_parse_standard_control(struct uvc_device *dev,
+ return -EINVAL;
+ }
+
+- unit = uvc_alloc_entity(buffer[2], buffer[3], p + 1, n);
+- if (unit == NULL)
+- return -ENOMEM;
++ unit = uvc_alloc_new_entity(dev, buffer[2], buffer[3],
++ p + 1, n);
++ if (IS_ERR(unit))
++ return PTR_ERR(unit);
+
+ memcpy(unit->guid, &buffer[4], 16);
+ unit->extension.bNumControls = buffer[20];
+@@ -1315,9 +1333,10 @@ static int uvc_gpio_parse(struct uvc_device *dev)
+ return dev_err_probe(&dev->intf->dev, irq,
+ "No IRQ for privacy GPIO\n");
+
+- unit = uvc_alloc_entity(UVC_EXT_GPIO_UNIT, UVC_EXT_GPIO_UNIT_ID, 0, 1);
+- if (!unit)
+- return -ENOMEM;
++ unit = uvc_alloc_new_entity(dev, UVC_EXT_GPIO_UNIT,
++ UVC_EXT_GPIO_UNIT_ID, 0, 1);
++ if (IS_ERR(unit))
++ return PTR_ERR(unit);
+
+ unit->gpio.gpio_privacy = gpio_privacy;
+ unit->gpio.irq = irq;
+diff --git a/drivers/media/usb/uvc/uvcvideo.h b/drivers/media/usb/uvc/uvcvideo.h
+index 757254fc4fe930..37bb8167abe9ac 100644
+--- a/drivers/media/usb/uvc/uvcvideo.h
++++ b/drivers/media/usb/uvc/uvcvideo.h
+@@ -41,6 +41,8 @@
+ #define UVC_EXT_GPIO_UNIT 0x7ffe
+ #define UVC_EXT_GPIO_UNIT_ID 0x100
+
++#define UVC_INVALID_ENTITY_ID 0xffff
++
+ /* ------------------------------------------------------------------------
+ * Driver specific constants.
+ */
+diff --git a/drivers/net/wireless/ath/ath11k/qmi.c b/drivers/net/wireless/ath/ath11k/qmi.c
+index 378ac96b861b70..1a42b4abe71682 100644
+--- a/drivers/net/wireless/ath/ath11k/qmi.c
++++ b/drivers/net/wireless/ath/ath11k/qmi.c
+@@ -2557,7 +2557,7 @@ static int ath11k_qmi_m3_load(struct ath11k_base *ab)
+ GFP_KERNEL);
+ if (!m3_mem->vaddr) {
+ ath11k_err(ab, "failed to allocate memory for M3 with size %zu\n",
+- fw->size);
++ m3_len);
+ ret = -ENOMEM;
+ goto out;
+ }
+diff --git a/drivers/net/wireless/realtek/rtw89/core.c b/drivers/net/wireless/realtek/rtw89/core.c
+index 57590f5577a360..b9c2224dde4a37 100644
+--- a/drivers/net/wireless/realtek/rtw89/core.c
++++ b/drivers/net/wireless/realtek/rtw89/core.c
+@@ -1073,6 +1073,14 @@ rtw89_core_tx_update_desc_info(struct rtw89_dev *rtwdev,
+ }
+ }
+
++static void rtw89_tx_wait_work(struct wiphy *wiphy, struct wiphy_work *work)
++{
++ struct rtw89_dev *rtwdev = container_of(work, struct rtw89_dev,
++ tx_wait_work.work);
++
++ rtw89_tx_wait_list_clear(rtwdev);
++}
++
+ void rtw89_core_tx_kick_off(struct rtw89_dev *rtwdev, u8 qsel)
+ {
+ u8 ch_dma;
+@@ -1090,6 +1098,8 @@ int rtw89_core_tx_kick_off_and_wait(struct rtw89_dev *rtwdev, struct sk_buff *sk
+ unsigned long time_left;
+ int ret = 0;
+
++ lockdep_assert_wiphy(rtwdev->hw->wiphy);
++
+ wait = kzalloc(sizeof(*wait), GFP_KERNEL);
+ if (!wait) {
+ rtw89_core_tx_kick_off(rtwdev, qsel);
+@@ -1097,18 +1107,23 @@ int rtw89_core_tx_kick_off_and_wait(struct rtw89_dev *rtwdev, struct sk_buff *sk
+ }
+
+ init_completion(&wait->completion);
++ wait->skb = skb;
+ rcu_assign_pointer(skb_data->wait, wait);
+
+ rtw89_core_tx_kick_off(rtwdev, qsel);
+ time_left = wait_for_completion_timeout(&wait->completion,
+ msecs_to_jiffies(timeout));
+- if (time_left == 0)
+- ret = -ETIMEDOUT;
+- else if (!wait->tx_done)
+- ret = -EAGAIN;
+
+- rcu_assign_pointer(skb_data->wait, NULL);
+- kfree_rcu(wait, rcu_head);
++ if (time_left == 0) {
++ ret = -ETIMEDOUT;
++ list_add_tail(&wait->list, &rtwdev->tx_waits);
++ wiphy_delayed_work_queue(rtwdev->hw->wiphy, &rtwdev->tx_wait_work,
++ RTW89_TX_WAIT_WORK_TIMEOUT);
++ } else {
++ if (!wait->tx_done)
++ ret = -EAGAIN;
++ rtw89_tx_wait_release(wait);
++ }
+
+ return ret;
+ }
+@@ -4978,6 +4993,7 @@ void rtw89_core_stop(struct rtw89_dev *rtwdev)
+ wiphy_work_cancel(wiphy, &btc->dhcp_notify_work);
+ wiphy_work_cancel(wiphy, &btc->icmp_notify_work);
+ cancel_delayed_work_sync(&rtwdev->txq_reinvoke_work);
++ wiphy_delayed_work_cancel(wiphy, &rtwdev->tx_wait_work);
+ wiphy_delayed_work_cancel(wiphy, &rtwdev->track_work);
+ wiphy_delayed_work_cancel(wiphy, &rtwdev->track_ps_work);
+ wiphy_delayed_work_cancel(wiphy, &rtwdev->chanctx_work);
+@@ -5203,6 +5219,7 @@ int rtw89_core_init(struct rtw89_dev *rtwdev)
+ INIT_LIST_HEAD(&rtwdev->scan_info.pkt_list[band]);
+ }
+ INIT_LIST_HEAD(&rtwdev->scan_info.chan_list);
++ INIT_LIST_HEAD(&rtwdev->tx_waits);
+ INIT_WORK(&rtwdev->ba_work, rtw89_core_ba_work);
+ INIT_WORK(&rtwdev->txq_work, rtw89_core_txq_work);
+ INIT_DELAYED_WORK(&rtwdev->txq_reinvoke_work, rtw89_core_txq_reinvoke_work);
+@@ -5214,6 +5231,7 @@ int rtw89_core_init(struct rtw89_dev *rtwdev)
+ wiphy_delayed_work_init(&rtwdev->coex_rfk_chk_work, rtw89_coex_rfk_chk_work);
+ wiphy_delayed_work_init(&rtwdev->cfo_track_work, rtw89_phy_cfo_track_work);
+ wiphy_delayed_work_init(&rtwdev->mcc_prepare_done_work, rtw89_mcc_prepare_done_work);
++ wiphy_delayed_work_init(&rtwdev->tx_wait_work, rtw89_tx_wait_work);
+ INIT_DELAYED_WORK(&rtwdev->forbid_ba_work, rtw89_forbid_ba_work);
+ wiphy_delayed_work_init(&rtwdev->antdiv_work, rtw89_phy_antdiv_work);
+ rtwdev->txq_wq = alloc_workqueue("rtw89_tx_wq", WQ_UNBOUND | WQ_HIGHPRI, 0);
+diff --git a/drivers/net/wireless/realtek/rtw89/core.h b/drivers/net/wireless/realtek/rtw89/core.h
+index 43e10278e14dc3..337971c744e60f 100644
+--- a/drivers/net/wireless/realtek/rtw89/core.h
++++ b/drivers/net/wireless/realtek/rtw89/core.h
+@@ -3506,9 +3506,12 @@ struct rtw89_phy_rate_pattern {
+ bool enable;
+ };
+
++#define RTW89_TX_WAIT_WORK_TIMEOUT msecs_to_jiffies(500)
+ struct rtw89_tx_wait_info {
+ struct rcu_head rcu_head;
++ struct list_head list;
+ struct completion completion;
++ struct sk_buff *skb;
+ bool tx_done;
+ };
+
+@@ -5925,6 +5928,9 @@ struct rtw89_dev {
+ /* used to protect rpwm */
+ spinlock_t rpwm_lock;
+
++ struct list_head tx_waits;
++ struct wiphy_delayed_work tx_wait_work;
++
+ struct rtw89_cam_info cam_info;
+
+ struct sk_buff_head c2h_queue;
+@@ -6181,6 +6187,26 @@ rtw89_assoc_link_rcu_dereference(struct rtw89_dev *rtwdev, u8 macid)
+ list_first_entry_or_null(&p->dlink_pool, typeof(*p->links_inst), dlink_schd); \
+ })
+
++static inline void rtw89_tx_wait_release(struct rtw89_tx_wait_info *wait)
++{
++ dev_kfree_skb_any(wait->skb);
++ kfree_rcu(wait, rcu_head);
++}
++
++static inline void rtw89_tx_wait_list_clear(struct rtw89_dev *rtwdev)
++{
++ struct rtw89_tx_wait_info *wait, *tmp;
++
++ lockdep_assert_wiphy(rtwdev->hw->wiphy);
++
++ list_for_each_entry_safe(wait, tmp, &rtwdev->tx_waits, list) {
++ if (!completion_done(&wait->completion))
++ continue;
++ list_del(&wait->list);
++ rtw89_tx_wait_release(wait);
++ }
++}
++
+ static inline int rtw89_hci_tx_write(struct rtw89_dev *rtwdev,
+ struct rtw89_core_tx_request *tx_req)
+ {
+@@ -6190,6 +6216,7 @@ static inline int rtw89_hci_tx_write(struct rtw89_dev *rtwdev,
+ static inline void rtw89_hci_reset(struct rtw89_dev *rtwdev)
+ {
+ rtwdev->hci.ops->reset(rtwdev);
++ rtw89_tx_wait_list_clear(rtwdev);
+ }
+
+ static inline int rtw89_hci_start(struct rtw89_dev *rtwdev)
+@@ -7258,11 +7285,12 @@ static inline struct sk_buff *rtw89_alloc_skb_for_rx(struct rtw89_dev *rtwdev,
+ return dev_alloc_skb(length);
+ }
+
+-static inline void rtw89_core_tx_wait_complete(struct rtw89_dev *rtwdev,
++static inline bool rtw89_core_tx_wait_complete(struct rtw89_dev *rtwdev,
+ struct rtw89_tx_skb_data *skb_data,
+ bool tx_done)
+ {
+ struct rtw89_tx_wait_info *wait;
++ bool ret = false;
+
+ rcu_read_lock();
+
+@@ -7270,11 +7298,14 @@ static inline void rtw89_core_tx_wait_complete(struct rtw89_dev *rtwdev,
+ if (!wait)
+ goto out;
+
++ ret = true;
+ wait->tx_done = tx_done;
+- complete(&wait->completion);
++ /* Don't access skb anymore after completion */
++ complete_all(&wait->completion);
+
+ out:
+ rcu_read_unlock();
++ return ret;
+ }
+
+ static inline bool rtw89_is_mlo_1_1(struct rtw89_dev *rtwdev)
+diff --git a/drivers/net/wireless/realtek/rtw89/pci.c b/drivers/net/wireless/realtek/rtw89/pci.c
+index a669f2f843aab4..4e3034b44f5641 100644
+--- a/drivers/net/wireless/realtek/rtw89/pci.c
++++ b/drivers/net/wireless/realtek/rtw89/pci.c
+@@ -464,7 +464,8 @@ static void rtw89_pci_tx_status(struct rtw89_dev *rtwdev,
+ struct rtw89_tx_skb_data *skb_data = RTW89_TX_SKB_CB(skb);
+ struct ieee80211_tx_info *info;
+
+- rtw89_core_tx_wait_complete(rtwdev, skb_data, tx_status == RTW89_TX_DONE);
++ if (rtw89_core_tx_wait_complete(rtwdev, skb_data, tx_status == RTW89_TX_DONE))
++ return;
+
+ info = IEEE80211_SKB_CB(skb);
+ ieee80211_tx_info_clear_status(info);
+diff --git a/drivers/net/wireless/realtek/rtw89/ser.c b/drivers/net/wireless/realtek/rtw89/ser.c
+index bb39fdbcba0d80..fe7beff8c42465 100644
+--- a/drivers/net/wireless/realtek/rtw89/ser.c
++++ b/drivers/net/wireless/realtek/rtw89/ser.c
+@@ -502,7 +502,9 @@ static void ser_reset_trx_st_hdl(struct rtw89_ser *ser, u8 evt)
+ }
+
+ drv_stop_rx(ser);
++ wiphy_lock(wiphy);
+ drv_trx_reset(ser);
++ wiphy_unlock(wiphy);
+
+ /* wait m3 */
+ hal_send_m2_event(ser);
+diff --git a/drivers/target/target_core_configfs.c b/drivers/target/target_core_configfs.c
+index 0904ecae253a8e..b19acd662726d4 100644
+--- a/drivers/target/target_core_configfs.c
++++ b/drivers/target/target_core_configfs.c
+@@ -2774,7 +2774,7 @@ static ssize_t target_lu_gp_members_show(struct config_item *item, char *page)
+ config_item_name(&dev->dev_group.cg_item));
+ cur_len++; /* Extra byte for NULL terminator */
+
+- if ((cur_len + len) > PAGE_SIZE) {
++ if ((cur_len + len) > PAGE_SIZE || cur_len > LU_GROUP_NAME_BUF) {
+ pr_warn("Ran out of lu_gp_show_attr"
+ "_members buffer\n");
+ break;
+diff --git a/mm/swapfile.c b/mm/swapfile.c
+index b4f3cc71258049..ad438a4d0e68ca 100644
+--- a/mm/swapfile.c
++++ b/mm/swapfile.c
+@@ -2243,6 +2243,8 @@ static int unuse_mm(struct mm_struct *mm, unsigned int type)
+ VMA_ITERATOR(vmi, mm, 0);
+
+ mmap_read_lock(mm);
++ if (check_stable_address_space(mm))
++ goto unlock;
+ for_each_vma(vmi, vma) {
+ if (vma->anon_vma && !is_vm_hugetlb_page(vma)) {
+ ret = unuse_vma(vma, type);
+@@ -2252,6 +2254,7 @@ static int unuse_mm(struct mm_struct *mm, unsigned int type)
+
+ cond_resched();
+ }
++unlock:
+ mmap_read_unlock(mm);
+ return ret;
+ }
+diff --git a/scripts/gcc-plugins/gcc-common.h b/scripts/gcc-plugins/gcc-common.h
+index 6cb6d105181520..8f1b3500f8e2dc 100644
+--- a/scripts/gcc-plugins/gcc-common.h
++++ b/scripts/gcc-plugins/gcc-common.h
+@@ -173,10 +173,17 @@ static inline opt_pass *get_pass_for_id(int id)
+ return g->get_passes()->get_pass_for_id(id);
+ }
+
++#if BUILDING_GCC_VERSION < 16000
+ #define TODO_verify_ssa TODO_verify_il
+ #define TODO_verify_flow TODO_verify_il
+ #define TODO_verify_stmts TODO_verify_il
+ #define TODO_verify_rtl_sharing TODO_verify_il
++#else
++#define TODO_verify_ssa 0
++#define TODO_verify_flow 0
++#define TODO_verify_stmts 0
++#define TODO_verify_rtl_sharing 0
++#endif
+
+ #define INSN_DELETED_P(insn) (insn)->deleted()
+
+diff --git a/sound/soc/qcom/qdsp6/topology.c b/sound/soc/qcom/qdsp6/topology.c
+index 83319a928f2917..01bb1bdee5cec1 100644
+--- a/sound/soc/qcom/qdsp6/topology.c
++++ b/sound/soc/qcom/qdsp6/topology.c
+@@ -587,8 +587,8 @@ static int audioreach_widget_load_module_common(struct snd_soc_component *compon
+ return PTR_ERR(cont);
+
+ mod = audioreach_parse_common_tokens(apm, cont, &tplg_w->priv, w);
+- if (IS_ERR(mod))
+- return PTR_ERR(mod);
++ if (IS_ERR_OR_NULL(mod))
++ return mod ? PTR_ERR(mod) : -ENODEV;
+
+ dobj = &w->dobj;
+ dobj->private = mod;
+diff --git a/sound/usb/midi.c b/sound/usb/midi.c
+index acb3bf92857c10..97e7e7662b12de 100644
+--- a/sound/usb/midi.c
++++ b/sound/usb/midi.c
+@@ -1522,15 +1522,14 @@ static void snd_usbmidi_free(struct snd_usb_midi *umidi)
+ {
+ int i;
+
++ if (!umidi->disconnected)
++ snd_usbmidi_disconnect(&umidi->list);
++
+ for (i = 0; i < MIDI_MAX_ENDPOINTS; ++i) {
+ struct snd_usb_midi_endpoint *ep = &umidi->endpoints[i];
+- if (ep->out)
+- snd_usbmidi_out_endpoint_delete(ep->out);
+- if (ep->in)
+- snd_usbmidi_in_endpoint_delete(ep->in);
++ kfree(ep->out);
+ }
+ mutex_destroy(&umidi->mutex);
+- timer_shutdown_sync(&umidi->error_timer);
+ kfree(umidi);
+ }
+
* [gentoo-commits] proj/linux-patches:6.17 commit in: /
@ 2025-10-06 11:08 Arisu Tachibana
0 siblings, 0 replies; 20+ messages in thread
From: Arisu Tachibana @ 2025-10-06 11:08 UTC (permalink / raw
To: gentoo-commits
commit: 12fb365205f621fb2cc57979b9990a87c6c46cf0
Author: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 6 11:06:02 2025 +0000
Commit: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
CommitDate: Mon Oct 6 11:08:33 2025 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=12fb3652
Linux patch 6.17.1
Signed-off-by: Arisu Tachibana <alicef <AT> gentoo.org>
0000_README | 5 +
1000_linux-6.17.1.patch | 659 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 664 insertions(+)
diff --git a/0000_README b/0000_README
index f2189223..50189b54 100644
--- a/0000_README
+++ b/0000_README
@@ -42,6 +42,11 @@ EXPERIMENTAL
Individual Patch Descriptions:
--------------------------------------------------------------------------
+
+Patch: 1000_linux-6.17.1.patch
+From: https://www.kernel.org
+Desc: Linux 6.17.1
+
Patch: 1510_fs-enable-link-security-restrictions-by-default.patch
From: http://sources.debian.net/src/linux/3.16.7-ckt4-3/debian/patches/debian/fs-enable-link-security-restrictions-by-default.patch/
Desc: Enable link security restrictions by default.
diff --git a/1000_linux-6.17.1.patch b/1000_linux-6.17.1.patch
new file mode 100644
index 00000000..1266cc07
--- /dev/null
+++ b/1000_linux-6.17.1.patch
@@ -0,0 +1,659 @@
+diff --git a/Makefile b/Makefile
+index 82bb9cdf73a32b..389bfac0adaaac 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 17
+-SUBLEVEL = 0
++SUBLEVEL = 1
+ EXTRAVERSION =
+ NAME = Baby Opossum Posse
+
+diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
+index d880c50629d612..5cffa5668d0c38 100644
+--- a/block/blk-mq-tag.c
++++ b/block/blk-mq-tag.c
+@@ -622,6 +622,7 @@ int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
+ return -ENOMEM;
+
+ blk_mq_free_map_and_rqs(set, *tagsptr, hctx->queue_num);
++ hctx->queue->elevator->et->tags[hctx->queue_num] = new;
+ *tagsptr = new;
+ } else {
+ /*
+diff --git a/drivers/media/i2c/tc358743.c b/drivers/media/i2c/tc358743.c
+index 1cc7636e446d77..5042cf612d21e8 100644
+--- a/drivers/media/i2c/tc358743.c
++++ b/drivers/media/i2c/tc358743.c
+@@ -2245,10 +2245,10 @@ static int tc358743_probe(struct i2c_client *client)
+ err_work_queues:
+ cec_unregister_adapter(state->cec_adap);
+ if (!state->i2c_client->irq) {
+- timer_delete(&state->timer);
++ timer_delete_sync(&state->timer);
+ flush_work(&state->work_i2c_poll);
+ }
+- cancel_delayed_work(&state->delayed_work_enable_hotplug);
++ cancel_delayed_work_sync(&state->delayed_work_enable_hotplug);
+ mutex_destroy(&state->confctl_mutex);
+ err_hdl:
+ media_entity_cleanup(&sd->entity);
+diff --git a/drivers/media/pci/b2c2/flexcop-pci.c b/drivers/media/pci/b2c2/flexcop-pci.c
+index 486c8ec0fa60d9..ab53c5b02c48df 100644
+--- a/drivers/media/pci/b2c2/flexcop-pci.c
++++ b/drivers/media/pci/b2c2/flexcop-pci.c
+@@ -411,7 +411,7 @@ static void flexcop_pci_remove(struct pci_dev *pdev)
+ struct flexcop_pci *fc_pci = pci_get_drvdata(pdev);
+
+ if (irq_chk_intv > 0)
+- cancel_delayed_work(&fc_pci->irq_check_work);
++ cancel_delayed_work_sync(&fc_pci->irq_check_work);
+
+ flexcop_pci_dma_exit(fc_pci);
+ flexcop_device_exit(fc_pci->fc_dev);
+diff --git a/drivers/media/platform/qcom/iris/iris_buffer.c b/drivers/media/platform/qcom/iris/iris_buffer.c
+index 6425e4919e3b0b..9f664c24114936 100644
+--- a/drivers/media/platform/qcom/iris/iris_buffer.c
++++ b/drivers/media/platform/qcom/iris/iris_buffer.c
+@@ -413,6 +413,16 @@ static int iris_destroy_internal_buffers(struct iris_inst *inst, u32 plane, bool
+ }
+ }
+
++ if (force) {
++ buffers = &inst->buffers[BUF_PERSIST];
++
++ list_for_each_entry_safe(buf, next, &buffers->list, list) {
++ ret = iris_destroy_internal_buffer(inst, buf);
++ if (ret)
++ return ret;
++ }
++ }
++
+ return 0;
+ }
+
+diff --git a/drivers/media/platform/st/stm32/stm32-csi.c b/drivers/media/platform/st/stm32/stm32-csi.c
+index b69048144cc12b..fd2b6dfbd44c57 100644
+--- a/drivers/media/platform/st/stm32/stm32-csi.c
++++ b/drivers/media/platform/st/stm32/stm32-csi.c
+@@ -443,8 +443,7 @@ static void stm32_csi_phy_reg_write(struct stm32_csi_dev *csidev,
+ static int stm32_csi_start(struct stm32_csi_dev *csidev,
+ struct v4l2_subdev_state *state)
+ {
+- struct media_pad *src_pad =
+- &csidev->s_subdev->entity.pads[csidev->s_subdev_pad_nb];
++ struct media_pad *src_pad;
+ const struct stm32_csi_mbps_phy_reg *phy_regs = NULL;
+ struct v4l2_mbus_framefmt *sink_fmt;
+ const struct stm32_csi_fmts *fmt;
+@@ -466,6 +465,7 @@ static int stm32_csi_start(struct stm32_csi_dev *csidev,
+ if (!csidev->s_subdev)
+ return -EIO;
+
++ src_pad = &csidev->s_subdev->entity.pads[csidev->s_subdev_pad_nb];
+ link_freq = v4l2_get_link_freq(src_pad,
+ fmt->bpp, 2 * csidev->num_lanes);
+ if (link_freq < 0)
+diff --git a/drivers/media/rc/imon.c b/drivers/media/rc/imon.c
+index f5221b01880813..cf3e6e43c0c7e4 100644
+--- a/drivers/media/rc/imon.c
++++ b/drivers/media/rc/imon.c
+@@ -536,7 +536,9 @@ static int display_open(struct inode *inode, struct file *file)
+
+ mutex_lock(&ictx->lock);
+
+- if (!ictx->display_supported) {
++ if (ictx->disconnected) {
++ retval = -ENODEV;
++ } else if (!ictx->display_supported) {
+ pr_err("display not supported by device\n");
+ retval = -ENODEV;
+ } else if (ictx->display_isopen) {
+@@ -598,6 +600,9 @@ static int send_packet(struct imon_context *ictx)
+ int retval = 0;
+ struct usb_ctrlrequest *control_req = NULL;
+
++ if (ictx->disconnected)
++ return -ENODEV;
++
+ /* Check if we need to use control or interrupt urb */
+ if (!ictx->tx_control) {
+ pipe = usb_sndintpipe(ictx->usbdev_intf0,
+@@ -949,12 +954,14 @@ static ssize_t vfd_write(struct file *file, const char __user *buf,
+ static const unsigned char vfd_packet6[] = {
+ 0x01, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF };
+
+- if (ictx->disconnected)
+- return -ENODEV;
+-
+ if (mutex_lock_interruptible(&ictx->lock))
+ return -ERESTARTSYS;
+
++ if (ictx->disconnected) {
++ retval = -ENODEV;
++ goto exit;
++ }
++
+ if (!ictx->dev_present_intf0) {
+ pr_err_ratelimited("no iMON device present\n");
+ retval = -ENODEV;
+@@ -1029,11 +1036,13 @@ static ssize_t lcd_write(struct file *file, const char __user *buf,
+ int retval = 0;
+ struct imon_context *ictx = file->private_data;
+
+- if (ictx->disconnected)
+- return -ENODEV;
+-
+ mutex_lock(&ictx->lock);
+
++ if (ictx->disconnected) {
++ retval = -ENODEV;
++ goto exit;
++ }
++
+ if (!ictx->display_supported) {
+ pr_err_ratelimited("no iMON display present\n");
+ retval = -ENODEV;
+@@ -2499,7 +2508,11 @@ static void imon_disconnect(struct usb_interface *interface)
+ int ifnum;
+
+ ictx = usb_get_intfdata(interface);
++
++ mutex_lock(&ictx->lock);
+ ictx->disconnected = true;
++ mutex_unlock(&ictx->lock);
++
+ dev = ictx->dev;
+ ifnum = interface->cur_altsetting->desc.bInterfaceNumber;
+
+diff --git a/drivers/media/tuners/xc5000.c b/drivers/media/tuners/xc5000.c
+index 30aa4ee958bdea..ec9a3cd4784e1f 100644
+--- a/drivers/media/tuners/xc5000.c
++++ b/drivers/media/tuners/xc5000.c
+@@ -1304,7 +1304,7 @@ static void xc5000_release(struct dvb_frontend *fe)
+ mutex_lock(&xc5000_list_mutex);
+
+ if (priv) {
+- cancel_delayed_work(&priv->timer_sleep);
++ cancel_delayed_work_sync(&priv->timer_sleep);
+ hybrid_tuner_release_state(priv);
+ }
+
+diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c
+index 775bede0d93d9b..50e1589668ba50 100644
+--- a/drivers/media/usb/uvc/uvc_driver.c
++++ b/drivers/media/usb/uvc/uvc_driver.c
+@@ -137,6 +137,9 @@ struct uvc_entity *uvc_entity_by_id(struct uvc_device *dev, int id)
+ {
+ struct uvc_entity *entity;
+
++ if (id == UVC_INVALID_ENTITY_ID)
++ return NULL;
++
+ list_for_each_entry(entity, &dev->entities, list) {
+ if (entity->id == id)
+ return entity;
+@@ -795,14 +798,27 @@ static const u8 uvc_media_transport_input_guid[16] =
+ UVC_GUID_UVC_MEDIA_TRANSPORT_INPUT;
+ static const u8 uvc_processing_guid[16] = UVC_GUID_UVC_PROCESSING;
+
+-static struct uvc_entity *uvc_alloc_entity(u16 type, u16 id,
+- unsigned int num_pads, unsigned int extra_size)
++static struct uvc_entity *uvc_alloc_new_entity(struct uvc_device *dev, u16 type,
++ u16 id, unsigned int num_pads,
++ unsigned int extra_size)
+ {
+ struct uvc_entity *entity;
+ unsigned int num_inputs;
+ unsigned int size;
+ unsigned int i;
+
++ /* Per UVC 1.1+ spec 3.7.2, the ID should be non-zero. */
++ if (id == 0) {
++ dev_err(&dev->intf->dev, "Found Unit with invalid ID 0\n");
++ id = UVC_INVALID_ENTITY_ID;
++ }
++
++ /* Per UVC 1.1+ spec 3.7.2, the ID is unique. */
++ if (uvc_entity_by_id(dev, id)) {
++ dev_err(&dev->intf->dev, "Found multiple Units with ID %u\n", id);
++ id = UVC_INVALID_ENTITY_ID;
++ }
++
+ extra_size = roundup(extra_size, sizeof(*entity->pads));
+ if (num_pads)
+ num_inputs = type & UVC_TERM_OUTPUT ? num_pads : num_pads - 1;
+@@ -812,7 +828,7 @@ static struct uvc_entity *uvc_alloc_entity(u16 type, u16 id,
+ + num_inputs;
+ entity = kzalloc(size, GFP_KERNEL);
+ if (entity == NULL)
+- return NULL;
++ return ERR_PTR(-ENOMEM);
+
+ entity->id = id;
+ entity->type = type;
+@@ -924,10 +940,10 @@ static int uvc_parse_vendor_control(struct uvc_device *dev,
+ break;
+ }
+
+- unit = uvc_alloc_entity(UVC_VC_EXTENSION_UNIT, buffer[3],
+- p + 1, 2*n);
+- if (unit == NULL)
+- return -ENOMEM;
++ unit = uvc_alloc_new_entity(dev, UVC_VC_EXTENSION_UNIT,
++ buffer[3], p + 1, 2 * n);
++ if (IS_ERR(unit))
++ return PTR_ERR(unit);
+
+ memcpy(unit->guid, &buffer[4], 16);
+ unit->extension.bNumControls = buffer[20];
+@@ -1036,10 +1052,10 @@ static int uvc_parse_standard_control(struct uvc_device *dev,
+ return -EINVAL;
+ }
+
+- term = uvc_alloc_entity(type | UVC_TERM_INPUT, buffer[3],
+- 1, n + p);
+- if (term == NULL)
+- return -ENOMEM;
++ term = uvc_alloc_new_entity(dev, type | UVC_TERM_INPUT,
++ buffer[3], 1, n + p);
++ if (IS_ERR(term))
++ return PTR_ERR(term);
+
+ if (UVC_ENTITY_TYPE(term) == UVC_ITT_CAMERA) {
+ term->camera.bControlSize = n;
+@@ -1095,10 +1111,10 @@ static int uvc_parse_standard_control(struct uvc_device *dev,
+ return 0;
+ }
+
+- term = uvc_alloc_entity(type | UVC_TERM_OUTPUT, buffer[3],
+- 1, 0);
+- if (term == NULL)
+- return -ENOMEM;
++ term = uvc_alloc_new_entity(dev, type | UVC_TERM_OUTPUT,
++ buffer[3], 1, 0);
++ if (IS_ERR(term))
++ return PTR_ERR(term);
+
+ memcpy(term->baSourceID, &buffer[7], 1);
+
+@@ -1117,9 +1133,10 @@ static int uvc_parse_standard_control(struct uvc_device *dev,
+ return -EINVAL;
+ }
+
+- unit = uvc_alloc_entity(buffer[2], buffer[3], p + 1, 0);
+- if (unit == NULL)
+- return -ENOMEM;
++ unit = uvc_alloc_new_entity(dev, buffer[2], buffer[3],
++ p + 1, 0);
++ if (IS_ERR(unit))
++ return PTR_ERR(unit);
+
+ memcpy(unit->baSourceID, &buffer[5], p);
+
+@@ -1139,9 +1156,9 @@ static int uvc_parse_standard_control(struct uvc_device *dev,
+ return -EINVAL;
+ }
+
+- unit = uvc_alloc_entity(buffer[2], buffer[3], 2, n);
+- if (unit == NULL)
+- return -ENOMEM;
++ unit = uvc_alloc_new_entity(dev, buffer[2], buffer[3], 2, n);
++ if (IS_ERR(unit))
++ return PTR_ERR(unit);
+
+ memcpy(unit->baSourceID, &buffer[4], 1);
+ unit->processing.wMaxMultiplier =
+@@ -1168,9 +1185,10 @@ static int uvc_parse_standard_control(struct uvc_device *dev,
+ return -EINVAL;
+ }
+
+- unit = uvc_alloc_entity(buffer[2], buffer[3], p + 1, n);
+- if (unit == NULL)
+- return -ENOMEM;
++ unit = uvc_alloc_new_entity(dev, buffer[2], buffer[3],
++ p + 1, n);
++ if (IS_ERR(unit))
++ return PTR_ERR(unit);
+
+ memcpy(unit->guid, &buffer[4], 16);
+ unit->extension.bNumControls = buffer[20];
+@@ -1315,9 +1333,10 @@ static int uvc_gpio_parse(struct uvc_device *dev)
+ return dev_err_probe(&dev->intf->dev, irq,
+ "No IRQ for privacy GPIO\n");
+
+- unit = uvc_alloc_entity(UVC_EXT_GPIO_UNIT, UVC_EXT_GPIO_UNIT_ID, 0, 1);
+- if (!unit)
+- return -ENOMEM;
++ unit = uvc_alloc_new_entity(dev, UVC_EXT_GPIO_UNIT,
++ UVC_EXT_GPIO_UNIT_ID, 0, 1);
++ if (IS_ERR(unit))
++ return PTR_ERR(unit);
+
+ unit->gpio.gpio_privacy = gpio_privacy;
+ unit->gpio.irq = irq;
+diff --git a/drivers/media/usb/uvc/uvcvideo.h b/drivers/media/usb/uvc/uvcvideo.h
+index 757254fc4fe930..37bb8167abe9ac 100644
+--- a/drivers/media/usb/uvc/uvcvideo.h
++++ b/drivers/media/usb/uvc/uvcvideo.h
+@@ -41,6 +41,8 @@
+ #define UVC_EXT_GPIO_UNIT 0x7ffe
+ #define UVC_EXT_GPIO_UNIT_ID 0x100
+
++#define UVC_INVALID_ENTITY_ID 0xffff
++
+ /* ------------------------------------------------------------------------
+ * Driver specific constants.
+ */
+diff --git a/drivers/net/wireless/ath/ath11k/qmi.c b/drivers/net/wireless/ath/ath11k/qmi.c
+index 378ac96b861b70..1a42b4abe71682 100644
+--- a/drivers/net/wireless/ath/ath11k/qmi.c
++++ b/drivers/net/wireless/ath/ath11k/qmi.c
+@@ -2557,7 +2557,7 @@ static int ath11k_qmi_m3_load(struct ath11k_base *ab)
+ GFP_KERNEL);
+ if (!m3_mem->vaddr) {
+ ath11k_err(ab, "failed to allocate memory for M3 with size %zu\n",
+- fw->size);
++ m3_len);
+ ret = -ENOMEM;
+ goto out;
+ }
+diff --git a/drivers/net/wireless/realtek/rtw89/core.c b/drivers/net/wireless/realtek/rtw89/core.c
+index 57590f5577a360..b9c2224dde4a37 100644
+--- a/drivers/net/wireless/realtek/rtw89/core.c
++++ b/drivers/net/wireless/realtek/rtw89/core.c
+@@ -1073,6 +1073,14 @@ rtw89_core_tx_update_desc_info(struct rtw89_dev *rtwdev,
+ }
+ }
+
++static void rtw89_tx_wait_work(struct wiphy *wiphy, struct wiphy_work *work)
++{
++ struct rtw89_dev *rtwdev = container_of(work, struct rtw89_dev,
++ tx_wait_work.work);
++
++ rtw89_tx_wait_list_clear(rtwdev);
++}
++
+ void rtw89_core_tx_kick_off(struct rtw89_dev *rtwdev, u8 qsel)
+ {
+ u8 ch_dma;
+@@ -1090,6 +1098,8 @@ int rtw89_core_tx_kick_off_and_wait(struct rtw89_dev *rtwdev, struct sk_buff *sk
+ unsigned long time_left;
+ int ret = 0;
+
++ lockdep_assert_wiphy(rtwdev->hw->wiphy);
++
+ wait = kzalloc(sizeof(*wait), GFP_KERNEL);
+ if (!wait) {
+ rtw89_core_tx_kick_off(rtwdev, qsel);
+@@ -1097,18 +1107,23 @@ int rtw89_core_tx_kick_off_and_wait(struct rtw89_dev *rtwdev, struct sk_buff *sk
+ }
+
+ init_completion(&wait->completion);
++ wait->skb = skb;
+ rcu_assign_pointer(skb_data->wait, wait);
+
+ rtw89_core_tx_kick_off(rtwdev, qsel);
+ time_left = wait_for_completion_timeout(&wait->completion,
+ msecs_to_jiffies(timeout));
+- if (time_left == 0)
+- ret = -ETIMEDOUT;
+- else if (!wait->tx_done)
+- ret = -EAGAIN;
+
+- rcu_assign_pointer(skb_data->wait, NULL);
+- kfree_rcu(wait, rcu_head);
++ if (time_left == 0) {
++ ret = -ETIMEDOUT;
++ list_add_tail(&wait->list, &rtwdev->tx_waits);
++ wiphy_delayed_work_queue(rtwdev->hw->wiphy, &rtwdev->tx_wait_work,
++ RTW89_TX_WAIT_WORK_TIMEOUT);
++ } else {
++ if (!wait->tx_done)
++ ret = -EAGAIN;
++ rtw89_tx_wait_release(wait);
++ }
+
+ return ret;
+ }
+@@ -4978,6 +4993,7 @@ void rtw89_core_stop(struct rtw89_dev *rtwdev)
+ wiphy_work_cancel(wiphy, &btc->dhcp_notify_work);
+ wiphy_work_cancel(wiphy, &btc->icmp_notify_work);
+ cancel_delayed_work_sync(&rtwdev->txq_reinvoke_work);
++ wiphy_delayed_work_cancel(wiphy, &rtwdev->tx_wait_work);
+ wiphy_delayed_work_cancel(wiphy, &rtwdev->track_work);
+ wiphy_delayed_work_cancel(wiphy, &rtwdev->track_ps_work);
+ wiphy_delayed_work_cancel(wiphy, &rtwdev->chanctx_work);
+@@ -5203,6 +5219,7 @@ int rtw89_core_init(struct rtw89_dev *rtwdev)
+ INIT_LIST_HEAD(&rtwdev->scan_info.pkt_list[band]);
+ }
+ INIT_LIST_HEAD(&rtwdev->scan_info.chan_list);
++ INIT_LIST_HEAD(&rtwdev->tx_waits);
+ INIT_WORK(&rtwdev->ba_work, rtw89_core_ba_work);
+ INIT_WORK(&rtwdev->txq_work, rtw89_core_txq_work);
+ INIT_DELAYED_WORK(&rtwdev->txq_reinvoke_work, rtw89_core_txq_reinvoke_work);
+@@ -5214,6 +5231,7 @@ int rtw89_core_init(struct rtw89_dev *rtwdev)
+ wiphy_delayed_work_init(&rtwdev->coex_rfk_chk_work, rtw89_coex_rfk_chk_work);
+ wiphy_delayed_work_init(&rtwdev->cfo_track_work, rtw89_phy_cfo_track_work);
+ wiphy_delayed_work_init(&rtwdev->mcc_prepare_done_work, rtw89_mcc_prepare_done_work);
++ wiphy_delayed_work_init(&rtwdev->tx_wait_work, rtw89_tx_wait_work);
+ INIT_DELAYED_WORK(&rtwdev->forbid_ba_work, rtw89_forbid_ba_work);
+ wiphy_delayed_work_init(&rtwdev->antdiv_work, rtw89_phy_antdiv_work);
+ rtwdev->txq_wq = alloc_workqueue("rtw89_tx_wq", WQ_UNBOUND | WQ_HIGHPRI, 0);
+diff --git a/drivers/net/wireless/realtek/rtw89/core.h b/drivers/net/wireless/realtek/rtw89/core.h
+index 43e10278e14dc3..337971c744e60f 100644
+--- a/drivers/net/wireless/realtek/rtw89/core.h
++++ b/drivers/net/wireless/realtek/rtw89/core.h
+@@ -3506,9 +3506,12 @@ struct rtw89_phy_rate_pattern {
+ bool enable;
+ };
+
++#define RTW89_TX_WAIT_WORK_TIMEOUT msecs_to_jiffies(500)
+ struct rtw89_tx_wait_info {
+ struct rcu_head rcu_head;
++ struct list_head list;
+ struct completion completion;
++ struct sk_buff *skb;
+ bool tx_done;
+ };
+
+@@ -5925,6 +5928,9 @@ struct rtw89_dev {
+ /* used to protect rpwm */
+ spinlock_t rpwm_lock;
+
++ struct list_head tx_waits;
++ struct wiphy_delayed_work tx_wait_work;
++
+ struct rtw89_cam_info cam_info;
+
+ struct sk_buff_head c2h_queue;
+@@ -6181,6 +6187,26 @@ rtw89_assoc_link_rcu_dereference(struct rtw89_dev *rtwdev, u8 macid)
+ list_first_entry_or_null(&p->dlink_pool, typeof(*p->links_inst), dlink_schd); \
+ })
+
++static inline void rtw89_tx_wait_release(struct rtw89_tx_wait_info *wait)
++{
++ dev_kfree_skb_any(wait->skb);
++ kfree_rcu(wait, rcu_head);
++}
++
++static inline void rtw89_tx_wait_list_clear(struct rtw89_dev *rtwdev)
++{
++ struct rtw89_tx_wait_info *wait, *tmp;
++
++ lockdep_assert_wiphy(rtwdev->hw->wiphy);
++
++ list_for_each_entry_safe(wait, tmp, &rtwdev->tx_waits, list) {
++ if (!completion_done(&wait->completion))
++ continue;
++ list_del(&wait->list);
++ rtw89_tx_wait_release(wait);
++ }
++}
++
+ static inline int rtw89_hci_tx_write(struct rtw89_dev *rtwdev,
+ struct rtw89_core_tx_request *tx_req)
+ {
+@@ -6190,6 +6216,7 @@ static inline int rtw89_hci_tx_write(struct rtw89_dev *rtwdev,
+ static inline void rtw89_hci_reset(struct rtw89_dev *rtwdev)
+ {
+ rtwdev->hci.ops->reset(rtwdev);
++ rtw89_tx_wait_list_clear(rtwdev);
+ }
+
+ static inline int rtw89_hci_start(struct rtw89_dev *rtwdev)
+@@ -7258,11 +7285,12 @@ static inline struct sk_buff *rtw89_alloc_skb_for_rx(struct rtw89_dev *rtwdev,
+ return dev_alloc_skb(length);
+ }
+
+-static inline void rtw89_core_tx_wait_complete(struct rtw89_dev *rtwdev,
++static inline bool rtw89_core_tx_wait_complete(struct rtw89_dev *rtwdev,
+ struct rtw89_tx_skb_data *skb_data,
+ bool tx_done)
+ {
+ struct rtw89_tx_wait_info *wait;
++ bool ret = false;
+
+ rcu_read_lock();
+
+@@ -7270,11 +7298,14 @@ static inline void rtw89_core_tx_wait_complete(struct rtw89_dev *rtwdev,
+ if (!wait)
+ goto out;
+
++ ret = true;
+ wait->tx_done = tx_done;
+- complete(&wait->completion);
++ /* Don't access skb anymore after completion */
++ complete_all(&wait->completion);
+
+ out:
+ rcu_read_unlock();
++ return ret;
+ }
+
+ static inline bool rtw89_is_mlo_1_1(struct rtw89_dev *rtwdev)
+diff --git a/drivers/net/wireless/realtek/rtw89/pci.c b/drivers/net/wireless/realtek/rtw89/pci.c
+index a669f2f843aab4..4e3034b44f5641 100644
+--- a/drivers/net/wireless/realtek/rtw89/pci.c
++++ b/drivers/net/wireless/realtek/rtw89/pci.c
+@@ -464,7 +464,8 @@ static void rtw89_pci_tx_status(struct rtw89_dev *rtwdev,
+ struct rtw89_tx_skb_data *skb_data = RTW89_TX_SKB_CB(skb);
+ struct ieee80211_tx_info *info;
+
+- rtw89_core_tx_wait_complete(rtwdev, skb_data, tx_status == RTW89_TX_DONE);
++ if (rtw89_core_tx_wait_complete(rtwdev, skb_data, tx_status == RTW89_TX_DONE))
++ return;
+
+ info = IEEE80211_SKB_CB(skb);
+ ieee80211_tx_info_clear_status(info);
+diff --git a/drivers/net/wireless/realtek/rtw89/ser.c b/drivers/net/wireless/realtek/rtw89/ser.c
+index bb39fdbcba0d80..fe7beff8c42465 100644
+--- a/drivers/net/wireless/realtek/rtw89/ser.c
++++ b/drivers/net/wireless/realtek/rtw89/ser.c
+@@ -502,7 +502,9 @@ static void ser_reset_trx_st_hdl(struct rtw89_ser *ser, u8 evt)
+ }
+
+ drv_stop_rx(ser);
++ wiphy_lock(wiphy);
+ drv_trx_reset(ser);
++ wiphy_unlock(wiphy);
+
+ /* wait m3 */
+ hal_send_m2_event(ser);
+diff --git a/drivers/target/target_core_configfs.c b/drivers/target/target_core_configfs.c
+index 0904ecae253a8e..b19acd662726d4 100644
+--- a/drivers/target/target_core_configfs.c
++++ b/drivers/target/target_core_configfs.c
+@@ -2774,7 +2774,7 @@ static ssize_t target_lu_gp_members_show(struct config_item *item, char *page)
+ config_item_name(&dev->dev_group.cg_item));
+ cur_len++; /* Extra byte for NULL terminator */
+
+- if ((cur_len + len) > PAGE_SIZE) {
++ if ((cur_len + len) > PAGE_SIZE || cur_len > LU_GROUP_NAME_BUF) {
+ pr_warn("Ran out of lu_gp_show_attr"
+ "_members buffer\n");
+ break;
+diff --git a/mm/swapfile.c b/mm/swapfile.c
+index b4f3cc71258049..ad438a4d0e68ca 100644
+--- a/mm/swapfile.c
++++ b/mm/swapfile.c
+@@ -2243,6 +2243,8 @@ static int unuse_mm(struct mm_struct *mm, unsigned int type)
+ VMA_ITERATOR(vmi, mm, 0);
+
+ mmap_read_lock(mm);
++ if (check_stable_address_space(mm))
++ goto unlock;
+ for_each_vma(vmi, vma) {
+ if (vma->anon_vma && !is_vm_hugetlb_page(vma)) {
+ ret = unuse_vma(vma, type);
+@@ -2252,6 +2254,7 @@ static int unuse_mm(struct mm_struct *mm, unsigned int type)
+
+ cond_resched();
+ }
++unlock:
+ mmap_read_unlock(mm);
+ return ret;
+ }
+diff --git a/scripts/gcc-plugins/gcc-common.h b/scripts/gcc-plugins/gcc-common.h
+index 6cb6d105181520..8f1b3500f8e2dc 100644
+--- a/scripts/gcc-plugins/gcc-common.h
++++ b/scripts/gcc-plugins/gcc-common.h
+@@ -173,10 +173,17 @@ static inline opt_pass *get_pass_for_id(int id)
+ return g->get_passes()->get_pass_for_id(id);
+ }
+
++#if BUILDING_GCC_VERSION < 16000
+ #define TODO_verify_ssa TODO_verify_il
+ #define TODO_verify_flow TODO_verify_il
+ #define TODO_verify_stmts TODO_verify_il
+ #define TODO_verify_rtl_sharing TODO_verify_il
++#else
++#define TODO_verify_ssa 0
++#define TODO_verify_flow 0
++#define TODO_verify_stmts 0
++#define TODO_verify_rtl_sharing 0
++#endif
+
+ #define INSN_DELETED_P(insn) (insn)->deleted()
+
+diff --git a/sound/soc/qcom/qdsp6/topology.c b/sound/soc/qcom/qdsp6/topology.c
+index 83319a928f2917..01bb1bdee5cec1 100644
+--- a/sound/soc/qcom/qdsp6/topology.c
++++ b/sound/soc/qcom/qdsp6/topology.c
+@@ -587,8 +587,8 @@ static int audioreach_widget_load_module_common(struct snd_soc_component *compon
+ return PTR_ERR(cont);
+
+ mod = audioreach_parse_common_tokens(apm, cont, &tplg_w->priv, w);
+- if (IS_ERR(mod))
+- return PTR_ERR(mod);
++ if (IS_ERR_OR_NULL(mod))
++ return mod ? PTR_ERR(mod) : -ENODEV;
+
+ dobj = &w->dobj;
+ dobj->private = mod;
+diff --git a/sound/usb/midi.c b/sound/usb/midi.c
+index acb3bf92857c10..97e7e7662b12de 100644
+--- a/sound/usb/midi.c
++++ b/sound/usb/midi.c
+@@ -1522,15 +1522,14 @@ static void snd_usbmidi_free(struct snd_usb_midi *umidi)
+ {
+ int i;
+
++ if (!umidi->disconnected)
++ snd_usbmidi_disconnect(&umidi->list);
++
+ for (i = 0; i < MIDI_MAX_ENDPOINTS; ++i) {
+ struct snd_usb_midi_endpoint *ep = &umidi->endpoints[i];
+- if (ep->out)
+- snd_usbmidi_out_endpoint_delete(ep->out);
+- if (ep->in)
+- snd_usbmidi_in_endpoint_delete(ep->in);
++ kfree(ep->out);
+ }
+ mutex_destroy(&umidi->mutex);
+- timer_shutdown_sync(&umidi->error_timer);
+ kfree(umidi);
+ }
+
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [gentoo-commits] proj/linux-patches:6.17 commit in: /
@ 2025-10-06 11:42 Arisu Tachibana
0 siblings, 0 replies; 20+ messages in thread
From: Arisu Tachibana @ 2025-10-06 11:42 UTC (permalink / raw
To: gentoo-commits
commit: 7512398b65dcb9185fd0523b3dcbbb09806199bc
Author: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 6 11:36:52 2025 +0000
Commit: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
CommitDate: Mon Oct 6 11:36:52 2025 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=7512398b
Add patch 1901_btrfs_fix_racy_bitfield_write_in_btrfs_clear_space_info_full.patch
Signed-off-by: Arisu Tachibana <alicef <AT> gentoo.org>
0000_README | 4 +
...ield_write_in_btrfs_clear_space_info_full.patch | 330 +++++++++++++++++++++
2 files changed, 334 insertions(+)
diff --git a/0000_README b/0000_README
index 50189b54..8b807251 100644
--- a/0000_README
+++ b/0000_README
@@ -59,6 +59,10 @@ Patch: 1730_parisc-Disable-prctl.patch
From: https://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux.git
Desc: prctl: Temporarily disable prctl(PR_SET_MDWE) on parisc
+Patch: 1901_btrfs_fix_racy_bitfield_write_in_btrfs_clear_space_info_full.patch
+From: https://lore.kernel.org/linux-btrfs/c885e50a-8076-4517-a0c0-b2dd85d5581a@suse.com/T/#m76d8b9b7f4f86aa223acb03d9f5ed0d33e59bd0c
+Desc: btrfs: fix racy bitfield write in btrfs_clear_space_info_full()
+
Patch: 2000_BT-Check-key-sizes-only-if-Secure-Simple-Pairing-enabled.patch
From: https://lore.kernel.org/linux-bluetooth/20190522070540.48895-1-marcel@holtmann.org/raw
Desc: Bluetooth: Check key sizes only when Secure Simple Pairing is enabled. See bug #686758
diff --git a/1901_btrfs_fix_racy_bitfield_write_in_btrfs_clear_space_info_full.patch b/1901_btrfs_fix_racy_bitfield_write_in_btrfs_clear_space_info_full.patch
new file mode 100644
index 00000000..58becb01
--- /dev/null
+++ b/1901_btrfs_fix_racy_bitfield_write_in_btrfs_clear_space_info_full.patch
@@ -0,0 +1,330 @@
+From: Boris Burkov <boris@bur.io>
+To: linux-btrfs@vger.kernel.org,
+ kernel-team@fb.com
+Subject: [PATCH v2] btrfs: fix racy bitfield write in btrfs_clear_space_info_full()
+Date: Thu, 2 Oct 2025 11:50:33 -0700
+Message-ID: <22e8b64df3d4984000713433a89cfc14309b75fc.1759430967.git.boris@bur.io>
+X-Mailer: git-send-email 2.50.1
+Precedence: bulk
+X-Mailing-List: linux-btrfs@vger.kernel.org
+List-Id: <linux-btrfs.vger.kernel.org>
+List-Subscribe: <mailto:linux-btrfs+subscribe@vger.kernel.org>
+List-Unsubscribe: <mailto:linux-btrfs+unsubscribe@vger.kernel.org>
+MIME-Version: 1.0
+Content-Transfer-Encoding: 8bit
+
+From the memory-barriers.txt document regarding memory barrier ordering
+guarantees:
+
+ (*) These guarantees do not apply to bitfields, because compilers often
+ generate code to modify these using non-atomic read-modify-write
+ sequences. Do not attempt to use bitfields to synchronize parallel
+ algorithms.
+
+ (*) Even in cases where bitfields are protected by locks, all fields
+ in a given bitfield must be protected by one lock. If two fields
+ in a given bitfield are protected by different locks, the compiler's
+ non-atomic read-modify-write sequences can cause an update to one
+ field to corrupt the value of an adjacent field.
+
+btrfs_space_info has a bitfield sharing an underlying word consisting of
+the fields full, chunk_alloc, and flush:
+
+struct btrfs_space_info {
+ struct btrfs_fs_info * fs_info; /* 0 8 */
+ struct btrfs_space_info * parent; /* 8 8 */
+ ...
+ int clamp; /* 172 4 */
+ unsigned int full:1; /* 176: 0 4 */
+ unsigned int chunk_alloc:1; /* 176: 1 4 */
+ unsigned int flush:1; /* 176: 2 4 */
+ ...
+
+Therefore, to be safe from parallel read-modify-writes losing a write to
+one of the bitfield members protected by a lock, all writes to all the
+bitfields must use the lock. They almost universally do, except for
+btrfs_clear_space_info_full(), which iterates over the space_infos and
+writes out found->full = 0 without a lock.
+
+Imagine that we have one thread completing a transaction in which we
+finished deleting a block_group and are thus calling
+btrfs_clear_space_info_full() while simultaneously the data reclaim
+ticket infrastructure is running do_async_reclaim_data_space():
+
+ T1 T2
+btrfs_commit_transaction
+ btrfs_clear_space_info_full
+ data_sinfo->full = 0
+ READ: full:0, chunk_alloc:0, flush:1
+ do_async_reclaim_data_space(data_sinfo)
+ spin_lock(&space_info->lock);
+ if(list_empty(tickets))
+ space_info->flush = 0;
+ READ: full: 0, chunk_alloc:0, flush:1
+ MOD/WRITE: full: 0, chunk_alloc:0, flush:0
+ spin_unlock(&space_info->lock);
+ return;
+ MOD/WRITE: full:0, chunk_alloc:0, flush:1
+
+and now data_sinfo->flush is 1 but the reclaim worker has exited. This
+breaks the invariant that flush is 0 iff there is no work queued or
+running. Once this invariant is violated, future allocations that go
+into __reserve_bytes() will add tickets to space_info->tickets but will
+see space_info->flush is set to 1 and not queue the work. After this,
+they will block forever on the resulting ticket, as it is now impossible
+to kick the worker again.
+
+I also confirmed by looking at the assembly of the affected kernel that
+it is doing RMW operations. For example, to set the flush (3rd) bit to 0,
+the assembly is:
+ andb $0xfb,0x60(%rbx)
+and similarly for setting the full (1st) bit to 0:
+ andb $0xfe,-0x20(%rax)
+
+So I think this is really a bug on practical systems. I have observed
+a number of systems in this exact state, but am currently unable to
+reproduce it.
+
+Rather than leaving this footgun lying around for the future, take
+advantage of the fact that there is room in the struct anyway and that
+it is already quite large, and simply change the three bitfield members
+to bools. This avoids writes to space_info->full having any effect on
+writes to space_info->flush, regardless of locking.
+
+Fixes: 957780eb2788 ("Btrfs: introduce ticketed enospc infrastructure")
+Signed-off-by: Boris Burkov <boris@bur.io>
+---
+Changelog:
+v2:
+- migrate the three bitfield members to bools to step around the whole
+ atomic RMW issue in the most straightforward way.
+
+---
+ fs/btrfs/block-group.c | 6 +++---
+ fs/btrfs/space-info.c | 22 +++++++++++-----------
+ fs/btrfs/space-info.h | 6 +++---
+ 3 files changed, 17 insertions(+), 17 deletions(-)
+
+diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
+index 4330f5ba02dd..cd51f50a7c8b 100644
+--- a/fs/btrfs/block-group.c
++++ b/fs/btrfs/block-group.c
+@@ -4215,7 +4215,7 @@ int btrfs_chunk_alloc(struct btrfs_trans_handle *trans,
+ mutex_unlock(&fs_info->chunk_mutex);
+ } else {
+ /* Proceed with allocation */
+- space_info->chunk_alloc = 1;
++ space_info->chunk_alloc = true;
+ wait_for_alloc = false;
+ spin_unlock(&space_info->lock);
+ }
+@@ -4264,7 +4264,7 @@ int btrfs_chunk_alloc(struct btrfs_trans_handle *trans,
+ spin_lock(&space_info->lock);
+ if (ret < 0) {
+ if (ret == -ENOSPC)
+- space_info->full = 1;
++ space_info->full = true;
+ else
+ goto out;
+ } else {
+@@ -4274,7 +4274,7 @@ int btrfs_chunk_alloc(struct btrfs_trans_handle *trans,
+
+ space_info->force_alloc = CHUNK_ALLOC_NO_FORCE;
+ out:
+- space_info->chunk_alloc = 0;
++ space_info->chunk_alloc = false;
+ spin_unlock(&space_info->lock);
+ mutex_unlock(&fs_info->chunk_mutex);
+
+diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
+index 0e5c0c80e0fe..04a07d6f8537 100644
+--- a/fs/btrfs/space-info.c
++++ b/fs/btrfs/space-info.c
+@@ -192,7 +192,7 @@ void btrfs_clear_space_info_full(struct btrfs_fs_info *info)
+ struct btrfs_space_info *found;
+
+ list_for_each_entry(found, head, list)
+- found->full = 0;
++ found->full = false;
+ }
+
+ /*
+@@ -372,7 +372,7 @@ void btrfs_add_bg_to_space_info(struct btrfs_fs_info *info,
+ space_info->bytes_readonly += block_group->bytes_super;
+ btrfs_space_info_update_bytes_zone_unusable(space_info, block_group->zone_unusable);
+ if (block_group->length > 0)
+- space_info->full = 0;
++ space_info->full = false;
+ btrfs_try_granting_tickets(info, space_info);
+ spin_unlock(&space_info->lock);
+
+@@ -1146,7 +1146,7 @@ static void do_async_reclaim_metadata_space(struct btrfs_space_info *space_info)
+ spin_lock(&space_info->lock);
+ to_reclaim = btrfs_calc_reclaim_metadata_size(fs_info, space_info);
+ if (!to_reclaim) {
+- space_info->flush = 0;
++ space_info->flush = false;
+ spin_unlock(&space_info->lock);
+ return;
+ }
+@@ -1158,7 +1158,7 @@ static void do_async_reclaim_metadata_space(struct btrfs_space_info *space_info)
+ flush_space(fs_info, space_info, to_reclaim, flush_state, false);
+ spin_lock(&space_info->lock);
+ if (list_empty(&space_info->tickets)) {
+- space_info->flush = 0;
++ space_info->flush = false;
+ spin_unlock(&space_info->lock);
+ return;
+ }
+@@ -1201,7 +1201,7 @@ static void do_async_reclaim_metadata_space(struct btrfs_space_info *space_info)
+ flush_state = FLUSH_DELAYED_ITEMS_NR;
+ commit_cycles--;
+ } else {
+- space_info->flush = 0;
++ space_info->flush = false;
+ }
+ } else {
+ flush_state = FLUSH_DELAYED_ITEMS_NR;
+@@ -1383,7 +1383,7 @@ static void do_async_reclaim_data_space(struct btrfs_space_info *space_info)
+
+ spin_lock(&space_info->lock);
+ if (list_empty(&space_info->tickets)) {
+- space_info->flush = 0;
++ space_info->flush = false;
+ spin_unlock(&space_info->lock);
+ return;
+ }
+@@ -1394,7 +1394,7 @@ static void do_async_reclaim_data_space(struct btrfs_space_info *space_info)
+ flush_space(fs_info, space_info, U64_MAX, ALLOC_CHUNK_FORCE, false);
+ spin_lock(&space_info->lock);
+ if (list_empty(&space_info->tickets)) {
+- space_info->flush = 0;
++ space_info->flush = false;
+ spin_unlock(&space_info->lock);
+ return;
+ }
+@@ -1411,7 +1411,7 @@ static void do_async_reclaim_data_space(struct btrfs_space_info *space_info)
+ data_flush_states[flush_state], false);
+ spin_lock(&space_info->lock);
+ if (list_empty(&space_info->tickets)) {
+- space_info->flush = 0;
++ space_info->flush = false;
+ spin_unlock(&space_info->lock);
+ return;
+ }
+@@ -1428,7 +1428,7 @@ static void do_async_reclaim_data_space(struct btrfs_space_info *space_info)
+ if (maybe_fail_all_tickets(fs_info, space_info))
+ flush_state = 0;
+ else
+- space_info->flush = 0;
++ space_info->flush = false;
+ } else {
+ flush_state = 0;
+ }
+@@ -1444,7 +1444,7 @@ static void do_async_reclaim_data_space(struct btrfs_space_info *space_info)
+
+ aborted_fs:
+ maybe_fail_all_tickets(fs_info, space_info);
+- space_info->flush = 0;
++ space_info->flush = false;
+ spin_unlock(&space_info->lock);
+ }
+
+@@ -1825,7 +1825,7 @@ static int __reserve_bytes(struct btrfs_fs_info *fs_info,
+ */
+ maybe_clamp_preempt(fs_info, space_info);
+
+- space_info->flush = 1;
++ space_info->flush = true;
+ trace_btrfs_trigger_flush(fs_info,
+ space_info->flags,
+ orig_bytes, flush,
+diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
+index 679f22efb407..a846f63585c9 100644
+--- a/fs/btrfs/space-info.h
++++ b/fs/btrfs/space-info.h
+@@ -142,11 +142,11 @@ struct btrfs_space_info {
+ flushing. The value is >> clamp, so turns
+ out to be a 2^clamp divisor. */
+
+- unsigned int full:1; /* indicates that we cannot allocate any more
++ bool full; /* indicates that we cannot allocate any more
+ chunks for this space */
+- unsigned int chunk_alloc:1; /* set if we are allocating a chunk */
++ bool chunk_alloc; /* set if we are allocating a chunk */
+
+- unsigned int flush:1; /* set if we are trying to make space */
++ bool flush; /* set if we are trying to make space */
+
+ unsigned int force_alloc; /* set if we need to force a chunk
+ alloc for this space */
+--
+2.50.1
+
+
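A note on the layout that makes the race above possible: the three flags only interfere with each other because they share one underlying word, so the compiler is free to update any of them with a read-modify-write of that whole word. The standalone C sketch below (illustrative only, not btrfs code; the struct and field names merely mirror the commit message) contrasts the two layouts discussed in the patch: with bitfields the flags are packed into a single word with no independently addressable storage, while with bools each flag occupies its own byte, so a store to one flag cannot be compiled into an RMW that clobbers its neighbours.

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Bitfield layout: full, chunk_alloc and flush share one 32-bit word. */
struct flags_bitfield {
        int clamp;
        unsigned int full:1;
        unsigned int chunk_alloc:1;
        unsigned int flush:1;
};

/* Bool layout: each flag gets its own byte, so writes touch distinct memory. */
struct flags_bool {
        int clamp;
        bool full;
        bool chunk_alloc;
        bool flush;
};

int main(void)
{
        printf("bitfield layout: %zu bytes total\n",
               sizeof(struct flags_bitfield));
        printf("bool layout: %zu bytes total, flag offsets %zu/%zu/%zu\n",
               sizeof(struct flags_bool),
               offsetof(struct flags_bool, full),
               offsetof(struct flags_bool, chunk_alloc),
               offsetof(struct flags_bool, flush));
        return 0;
}

offsetof() cannot be applied to a bitfield member at all, which is the point: the bitfields have no storage of their own to address, only a share of a word that every update must rewrite.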
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [gentoo-commits] proj/linux-patches:6.17 commit in: /
@ 2025-10-06 11:42 Arisu Tachibana
0 siblings, 0 replies; 20+ messages in thread
From: Arisu Tachibana @ 2025-10-06 11:42 UTC (permalink / raw
To: gentoo-commits
commit: 09c964d5cc48ac2c3f1cd02b757d1c62067c0341
Author: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 6 11:41:44 2025 +0000
Commit: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
CommitDate: Mon Oct 6 11:41:44 2025 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=09c964d5
Remove patch 2101 upstreamed
blk-mq: fix blk_mq_tags double free while nr_requests grown
Signed-off-by: Arisu Tachibana <alicef <AT> gentoo.org>
0000_README | 4 --
..._tags_double_free_while_nr_requests_grown.patch | 47 ----------------------
2 files changed, 51 deletions(-)
diff --git a/0000_README b/0000_README
index 8b807251..0b0898c5 100644
--- a/0000_README
+++ b/0000_README
@@ -67,10 +67,6 @@ Patch: 2000_BT-Check-key-sizes-only-if-Secure-Simple-Pairing-enabled.patch
From: https://lore.kernel.org/linux-bluetooth/20190522070540.48895-1-marcel@holtmann.org/raw
Desc: Bluetooth: Check key sizes only when Secure Simple Pairing is enabled. See bug #686758
-Patch: 2101_blk-mq_fix_blk_mq_tags_double_free_while_nr_requests_grown.patch
-From: https://lore.kernel.org/all/CAFj5m9K+ct=ioJUz8v78Wr_myC7pjVnB1SAKRXc-CLysHV_5ww@mail.gmail.com/
-Desc: blk-mq: fix blk_mq_tags double free while nr_requests grown
-
Patch: 2901_permit-menuconfig-sorting.patch
From: https://lore.kernel.org/
Desc: menuconfig: Allow sorting the entries alphabetically
diff --git a/2101_blk-mq_fix_blk_mq_tags_double_free_while_nr_requests_grown.patch b/2101_blk-mq_fix_blk_mq_tags_double_free_while_nr_requests_grown.patch
deleted file mode 100644
index e47b4b2a..00000000
--- a/2101_blk-mq_fix_blk_mq_tags_double_free_while_nr_requests_grown.patch
+++ /dev/null
@@ -1,47 +0,0 @@
-From ba28afbd9eff2a6370f23ef4e6a036ab0cfda409 Mon Sep 17 00:00:00 2001
-From: Yu Kuai <yukuai3@huawei.com>
-Date: Thu, 21 Aug 2025 14:06:12 +0800
-Subject: blk-mq: fix blk_mq_tags double free while nr_requests grown
-
-In the case where the user triggers tag growth via the queue sysfs
-attribute nr_requests, hctx->sched_tags will be freed directly and
-replaced with newly allocated tags, see blk_mq_tag_update_depth().
-
-The problem is that hctx->sched_tags comes from elevator->et->tags, while
-et->tags still points to the freed tags, hence a later elevator exit will
-try to free the tags again, causing a kernel panic.
-
-Fix this problem by replacing et->tags with the newly allocated tags as well.
-
-Note that there are still some long-term problems that will require
-some refactoring to be fixed thoroughly [1].
-
-[1] https://lore.kernel.org/all/20250815080216.410665-1-yukuai1@huaweicloud.com/
-Fixes: f5a6604f7a44 ("block: fix lockdep warning caused by lock dependency in elv_iosched_store")
-
-Signed-off-by: Yu Kuai <yukuai3@huawei.com>
-Reviewed-by: Ming Lei <ming.lei@redhat.com>
-Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
-Reviewed-by: Hannes Reinecke <hare@suse.de>
-Reviewed-by: Li Nan <linan122@huawei.com>
-Link: https://lore.kernel.org/r/20250821060612.1729939-3-yukuai1@huaweicloud.com
-Signed-off-by: Jens Axboe <axboe@kernel.dk>
----
- block/blk-mq-tag.c | 1 +
- 1 file changed, 1 insertion(+)
-
-diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
-index d880c50629d612..5cffa5668d0c38 100644
---- a/block/blk-mq-tag.c
-+++ b/block/blk-mq-tag.c
-@@ -622,6 +622,7 @@ int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
- return -ENOMEM;
-
- blk_mq_free_map_and_rqs(set, *tagsptr, hctx->queue_num);
-+ hctx->queue->elevator->et->tags[hctx->queue_num] = new;
- *tagsptr = new;
- } else {
- /*
---
-cgit 1.2.3-korg
-
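For context on the double free the removed patch addressed: the bug pattern is two references to one allocation where only one reference is updated when the buffer is replaced, so the stale reference gets freed a second time later. A minimal standalone C sketch of the corrected pattern follows (illustrative only, not block-layer code; error handling is omitted and the names merely echo hctx->sched_tags and et->tags from the commit message above):

#include <stdlib.h>

struct tagset {
        char *tags[4];  /* stands in for elevator->et->tags[] */
};

/*
 * Replace the buffer behind *slot and keep the owning table in sync,
 * mirroring what the upstream fix adds to blk_mq_tag_update_depth().
 */
static void grow_tags(struct tagset *et, char **slot, unsigned int idx,
                      size_t new_depth)
{
        char *new_tags = malloc(new_depth);

        free(*slot);                    /* old tags are freed and replaced ... */
        et->tags[idx] = new_tags;       /* ... so et->tags[] must follow suit */
        *slot = new_tags;
}

int main(void)
{
        struct tagset et = { { 0 } };
        char *sched_tags;               /* stands in for hctx->sched_tags */

        et.tags[0] = malloc(16);
        sched_tags = et.tags[0];

        grow_tags(&et, &sched_tags, 0, 32);

        /*
         * "Elevator exit": freeing through et.tags[] is safe only because it
         * no longer points at the buffer freed inside grow_tags().
         */
        free(et.tags[0]);
        return 0;
}

Without the et->tags[idx] assignment, the table would still hold the pointer freed inside grow_tags(), and the final free() would be exactly the double free described in the removed patch.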
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [gentoo-commits] proj/linux-patches:6.17 commit in: /
@ 2025-10-13 11:56 Arisu Tachibana
0 siblings, 0 replies; 20+ messages in thread
From: Arisu Tachibana @ 2025-10-13 11:56 UTC (permalink / raw
To: gentoo-commits
commit: e8fad4046f7f600d60c08c1c1472af37aa2ac03c
Author: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 13 11:55:47 2025 +0000
Commit: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
CommitDate: Mon Oct 13 11:55:47 2025 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=e8fad404
Linux patch 6.17.2
Signed-off-by: Arisu Tachibana <alicef <AT> gentoo.org>
0000_README | 4 +
1001_linux-6.17.2.patch | 1181 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 1185 insertions(+)
diff --git a/0000_README b/0000_README
index 0b0898c5..c1ac05f0 100644
--- a/0000_README
+++ b/0000_README
@@ -47,6 +47,10 @@ Patch: 1000_linux-6.17.1.patch
From: https://www.kernel.org
Desc: Linux 6.17.1
+Patch: 1001_linux-6.17.2.patch
+From: https://www.kernel.org
+Desc: Linux 6.17.2
+
Patch: 1510_fs-enable-link-security-restrictions-by-default.patch
From: http://sources.debian.net/src/linux/3.16.7-ckt4-3/debian/patches/debian/fs-enable-link-security-restrictions-by-default.patch/
Desc: Enable link security restrictions by default.
diff --git a/1001_linux-6.17.2.patch b/1001_linux-6.17.2.patch
new file mode 100644
index 00000000..dc135773
--- /dev/null
+++ b/1001_linux-6.17.2.patch
@@ -0,0 +1,1181 @@
+diff --git a/Makefile b/Makefile
+index 389bfac0adaaac..a04d0223dc840a 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 17
+-SUBLEVEL = 1
++SUBLEVEL = 2
+ EXTRAVERSION =
+ NAME = Baby Opossum Posse
+
+diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
+index 1349e278cd2a13..542d3664afa31a 100644
+--- a/arch/x86/kvm/emulate.c
++++ b/arch/x86/kvm/emulate.c
+@@ -5107,12 +5107,11 @@ void init_decode_cache(struct x86_emulate_ctxt *ctxt)
+ ctxt->mem_read.end = 0;
+ }
+
+-int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
++int x86_emulate_insn(struct x86_emulate_ctxt *ctxt, bool check_intercepts)
+ {
+ const struct x86_emulate_ops *ops = ctxt->ops;
+ int rc = X86EMUL_CONTINUE;
+ int saved_dst_type = ctxt->dst.type;
+- bool is_guest_mode = ctxt->ops->is_guest_mode(ctxt);
+
+ ctxt->mem_read.pos = 0;
+
+@@ -5160,7 +5159,7 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
+ fetch_possible_mmx_operand(&ctxt->dst);
+ }
+
+- if (unlikely(is_guest_mode) && ctxt->intercept) {
++ if (unlikely(check_intercepts) && ctxt->intercept) {
+ rc = emulator_check_intercept(ctxt, ctxt->intercept,
+ X86_ICPT_PRE_EXCEPT);
+ if (rc != X86EMUL_CONTINUE)
+@@ -5189,7 +5188,7 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
+ goto done;
+ }
+
+- if (unlikely(is_guest_mode) && (ctxt->d & Intercept)) {
++ if (unlikely(check_intercepts) && (ctxt->d & Intercept)) {
+ rc = emulator_check_intercept(ctxt, ctxt->intercept,
+ X86_ICPT_POST_EXCEPT);
+ if (rc != X86EMUL_CONTINUE)
+@@ -5243,7 +5242,7 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
+
+ special_insn:
+
+- if (unlikely(is_guest_mode) && (ctxt->d & Intercept)) {
++ if (unlikely(check_intercepts) && (ctxt->d & Intercept)) {
+ rc = emulator_check_intercept(ctxt, ctxt->intercept,
+ X86_ICPT_POST_MEMACCESS);
+ if (rc != X86EMUL_CONTINUE)
+diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
+index c1df5acfacaffa..7b5ddb787a251e 100644
+--- a/arch/x86/kvm/kvm_emulate.h
++++ b/arch/x86/kvm/kvm_emulate.h
+@@ -235,7 +235,6 @@ struct x86_emulate_ops {
+ void (*set_nmi_mask)(struct x86_emulate_ctxt *ctxt, bool masked);
+
+ bool (*is_smm)(struct x86_emulate_ctxt *ctxt);
+- bool (*is_guest_mode)(struct x86_emulate_ctxt *ctxt);
+ int (*leave_smm)(struct x86_emulate_ctxt *ctxt);
+ void (*triple_fault)(struct x86_emulate_ctxt *ctxt);
+ int (*set_xcr)(struct x86_emulate_ctxt *ctxt, u32 index, u64 xcr);
+@@ -521,7 +520,7 @@ bool x86_page_table_writing_insn(struct x86_emulate_ctxt *ctxt);
+ #define EMULATION_RESTART 1
+ #define EMULATION_INTERCEPTED 2
+ void init_decode_cache(struct x86_emulate_ctxt *ctxt);
+-int x86_emulate_insn(struct x86_emulate_ctxt *ctxt);
++int x86_emulate_insn(struct x86_emulate_ctxt *ctxt, bool check_intercepts);
+ int emulator_task_switch(struct x86_emulate_ctxt *ctxt,
+ u16 tss_selector, int idt_index, int reason,
+ bool has_error_code, u32 error_code);
+diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
+index 706b6fd56d3c5d..e6ae226704cba5 100644
+--- a/arch/x86/kvm/x86.c
++++ b/arch/x86/kvm/x86.c
+@@ -8470,11 +8470,6 @@ static bool emulator_is_smm(struct x86_emulate_ctxt *ctxt)
+ return is_smm(emul_to_vcpu(ctxt));
+ }
+
+-static bool emulator_is_guest_mode(struct x86_emulate_ctxt *ctxt)
+-{
+- return is_guest_mode(emul_to_vcpu(ctxt));
+-}
+-
+ #ifndef CONFIG_KVM_SMM
+ static int emulator_leave_smm(struct x86_emulate_ctxt *ctxt)
+ {
+@@ -8558,7 +8553,6 @@ static const struct x86_emulate_ops emulate_ops = {
+ .guest_cpuid_is_intel_compatible = emulator_guest_cpuid_is_intel_compatible,
+ .set_nmi_mask = emulator_set_nmi_mask,
+ .is_smm = emulator_is_smm,
+- .is_guest_mode = emulator_is_guest_mode,
+ .leave_smm = emulator_leave_smm,
+ .triple_fault = emulator_triple_fault,
+ .set_xcr = emulator_set_xcr,
+@@ -9143,7 +9137,14 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
+ ctxt->exception.address = 0;
+ }
+
+- r = x86_emulate_insn(ctxt);
++ /*
++ * Check L1's instruction intercepts when emulating instructions for
++ * L2, unless KVM is re-emulating a previously decoded instruction,
++ * e.g. to complete userspace I/O, in which case KVM has already
++ * checked the intercepts.
++ */
++ r = x86_emulate_insn(ctxt, is_guest_mode(vcpu) &&
++ !(emulation_type & EMULTYPE_NO_DECODE));
+
+ if (r == EMULATION_INTERCEPTED)
+ return 1;
+diff --git a/crypto/rng.c b/crypto/rng.c
+index b8ae6ebc091dd5..ee1768c5a4005b 100644
+--- a/crypto/rng.c
++++ b/crypto/rng.c
+@@ -168,6 +168,11 @@ int crypto_del_default_rng(void)
+ EXPORT_SYMBOL_GPL(crypto_del_default_rng);
+ #endif
+
++static void rng_default_set_ent(struct crypto_rng *tfm, const u8 *data,
++ unsigned int len)
++{
++}
++
+ int crypto_register_rng(struct rng_alg *alg)
+ {
+ struct crypto_alg *base = &alg->base;
+@@ -179,6 +184,9 @@ int crypto_register_rng(struct rng_alg *alg)
+ base->cra_flags &= ~CRYPTO_ALG_TYPE_MASK;
+ base->cra_flags |= CRYPTO_ALG_TYPE_RNG;
+
++ if (!alg->set_ent)
++ alg->set_ent = rng_default_set_ent;
++
+ return crypto_register_alg(base);
+ }
+ EXPORT_SYMBOL_GPL(crypto_register_rng);
+diff --git a/crypto/testmgr.c b/crypto/testmgr.c
+index ee33ba21ae2bc0..3e284706152aa4 100644
+--- a/crypto/testmgr.c
++++ b/crypto/testmgr.c
+@@ -4186,6 +4186,7 @@ static const struct alg_test_desc alg_test_descs[] = {
+ .alg = "authenc(hmac(sha1),cbc(aes))",
+ .generic_driver = "authenc(hmac-sha1-lib,cbc(aes-generic))",
+ .test = alg_test_aead,
++ .fips_allowed = 1,
+ .suite = {
+ .aead = __VECS(hmac_sha1_aes_cbc_tv_temp)
+ }
+@@ -4206,6 +4207,7 @@ static const struct alg_test_desc alg_test_descs[] = {
+ }, {
+ .alg = "authenc(hmac(sha1),ctr(aes))",
+ .test = alg_test_null,
++ .fips_allowed = 1,
+ }, {
+ .alg = "authenc(hmac(sha1),ecb(cipher_null))",
+ .generic_driver = "authenc(hmac-sha1-lib,ecb-cipher_null)",
+@@ -4216,6 +4218,7 @@ static const struct alg_test_desc alg_test_descs[] = {
+ }, {
+ .alg = "authenc(hmac(sha1),rfc3686(ctr(aes)))",
+ .test = alg_test_null,
++ .fips_allowed = 1,
+ }, {
+ .alg = "authenc(hmac(sha224),cbc(des))",
+ .generic_driver = "authenc(hmac-sha224-lib,cbc(des-generic))",
+@@ -5078,6 +5081,7 @@ static const struct alg_test_desc alg_test_descs[] = {
+ .alg = "hmac(sha1)",
+ .generic_driver = "hmac-sha1-lib",
+ .test = alg_test_hash,
++ .fips_allowed = 1,
+ .suite = {
+ .hash = __VECS(hmac_sha1_tv_template)
+ }
+@@ -5448,6 +5452,7 @@ static const struct alg_test_desc alg_test_descs[] = {
+ .alg = "sha1",
+ .generic_driver = "sha1-lib",
+ .test = alg_test_hash,
++ .fips_allowed = 1,
+ .suite = {
+ .hash = __VECS(sha1_tv_template)
+ }
+diff --git a/crypto/zstd.c b/crypto/zstd.c
+index c2a19cb0879d60..ac318d333b6847 100644
+--- a/crypto/zstd.c
++++ b/crypto/zstd.c
+@@ -83,7 +83,7 @@ static void zstd_exit(struct crypto_acomp *acomp_tfm)
+ static int zstd_compress_one(struct acomp_req *req, struct zstd_ctx *ctx,
+ const void *src, void *dst, unsigned int *dlen)
+ {
+- unsigned int out_len;
++ size_t out_len;
+
+ ctx->cctx = zstd_init_cctx(ctx->wksp, ctx->wksp_size);
+ if (!ctx->cctx)
+diff --git a/drivers/android/dbitmap.h b/drivers/android/dbitmap.h
+index 956f1bd087d1c5..c7299ce8b37413 100644
+--- a/drivers/android/dbitmap.h
++++ b/drivers/android/dbitmap.h
+@@ -37,6 +37,7 @@ static inline void dbitmap_free(struct dbitmap *dmap)
+ {
+ dmap->nbits = 0;
+ kfree(dmap->map);
++ dmap->map = NULL;
+ }
+
+ /* Returns the nbits that a dbitmap can shrink to, 0 if not possible. */
+diff --git a/drivers/base/faux.c b/drivers/base/faux.c
+index f5fbda0a9a44bd..21dd02124231a9 100644
+--- a/drivers/base/faux.c
++++ b/drivers/base/faux.c
+@@ -155,6 +155,7 @@ struct faux_device *faux_device_create_with_groups(const char *name,
+ dev->parent = &faux_bus_root;
+ dev->bus = &faux_bus_type;
+ dev_set_name(dev, "%s", name);
++ device_set_pm_not_required(dev);
+
+ ret = device_add(dev);
+ if (ret) {
+diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
+index 8085fabadde8ff..3595a8bad6bdfe 100644
+--- a/drivers/bluetooth/btusb.c
++++ b/drivers/bluetooth/btusb.c
+@@ -522,6 +522,8 @@ static const struct usb_device_id quirks_table[] = {
+ /* Realtek 8851BU Bluetooth devices */
+ { USB_DEVICE(0x3625, 0x010b), .driver_info = BTUSB_REALTEK |
+ BTUSB_WIDEBAND_SPEECH },
++ { USB_DEVICE(0x2001, 0x332a), .driver_info = BTUSB_REALTEK |
++ BTUSB_WIDEBAND_SPEECH },
+
+ /* Realtek 8852AE Bluetooth devices */
+ { USB_DEVICE(0x0bda, 0x2852), .driver_info = BTUSB_REALTEK |
+diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
+index 3f6a828cad8ad8..1445da1f53afb4 100644
+--- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
++++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
+@@ -711,6 +711,12 @@ static int mes_v11_0_set_hw_resources(struct amdgpu_mes *mes)
+ mes_set_hw_res_pkt.enable_reg_active_poll = 1;
+ mes_set_hw_res_pkt.enable_level_process_quantum_check = 1;
+ mes_set_hw_res_pkt.oversubscription_timer = 50;
++ if ((mes->adev->mes.sched_version & AMDGPU_MES_VERSION_MASK) >= 0x7f)
++ mes_set_hw_res_pkt.enable_lr_compute_wa = 1;
++ else
++ dev_info_once(mes->adev->dev,
++ "MES FW version must be >= 0x7f to enable LR compute workaround.\n");
++
+ if (amdgpu_mes_log_enable) {
+ mes_set_hw_res_pkt.enable_mes_event_int_logging = 1;
+ mes_set_hw_res_pkt.event_intr_history_gpu_mc_ptr =
+diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
+index 6b222630f3fa1d..39caac14d5fe1c 100644
+--- a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
++++ b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
+@@ -738,6 +738,11 @@ static int mes_v12_0_set_hw_resources(struct amdgpu_mes *mes, int pipe)
+ mes_set_hw_res_pkt.use_different_vmid_compute = 1;
+ mes_set_hw_res_pkt.enable_reg_active_poll = 1;
+ mes_set_hw_res_pkt.enable_level_process_quantum_check = 1;
++ if ((mes->adev->mes.sched_version & AMDGPU_MES_VERSION_MASK) >= 0x82)
++ mes_set_hw_res_pkt.enable_lr_compute_wa = 1;
++ else
++ dev_info_once(adev->dev,
++ "MES FW version must be >= 0x82 to enable LR compute workaround.\n");
+
+ /*
+ * Keep oversubscribe timer for sdma . When we have unmapped doorbell
+diff --git a/drivers/gpu/drm/amd/include/mes_v11_api_def.h b/drivers/gpu/drm/amd/include/mes_v11_api_def.h
+index 15680c3f49704e..ab1cfc92dbeb1b 100644
+--- a/drivers/gpu/drm/amd/include/mes_v11_api_def.h
++++ b/drivers/gpu/drm/amd/include/mes_v11_api_def.h
+@@ -238,7 +238,8 @@ union MESAPI_SET_HW_RESOURCES {
+ uint32_t enable_mes_sch_stb_log : 1;
+ uint32_t limit_single_process : 1;
+ uint32_t is_strix_tmz_wa_enabled :1;
+- uint32_t reserved : 13;
++ uint32_t enable_lr_compute_wa : 1;
++ uint32_t reserved : 12;
+ };
+ uint32_t uint32_t_all;
+ };
+diff --git a/drivers/gpu/drm/amd/include/mes_v12_api_def.h b/drivers/gpu/drm/amd/include/mes_v12_api_def.h
+index d85ffab2aff9de..a402974939d63c 100644
+--- a/drivers/gpu/drm/amd/include/mes_v12_api_def.h
++++ b/drivers/gpu/drm/amd/include/mes_v12_api_def.h
+@@ -286,7 +286,8 @@ union MESAPI_SET_HW_RESOURCES {
+ uint32_t limit_single_process : 1;
+ uint32_t unmapped_doorbell_handling: 2;
+ uint32_t enable_mes_fence_int: 1;
+- uint32_t reserved : 10;
++ uint32_t enable_lr_compute_wa : 1;
++ uint32_t reserved : 9;
+ };
+ uint32_t uint32_all;
+ };
+diff --git a/drivers/misc/amd-sbi/Kconfig b/drivers/misc/amd-sbi/Kconfig
+index 4840831c84ca48..4aae0733d0fc16 100644
+--- a/drivers/misc/amd-sbi/Kconfig
++++ b/drivers/misc/amd-sbi/Kconfig
+@@ -2,6 +2,7 @@
+ config AMD_SBRMI_I2C
+ tristate "AMD side band RMI support"
+ depends on I2C
++ select REGMAP_I2C
+ help
+ Side band RMI over I2C support for AMD out of band management.
+
+diff --git a/drivers/net/wireless/realtek/rtl8xxxu/core.c b/drivers/net/wireless/realtek/rtl8xxxu/core.c
+index 831b5025c63492..018f5afcd50d26 100644
+--- a/drivers/net/wireless/realtek/rtl8xxxu/core.c
++++ b/drivers/net/wireless/realtek/rtl8xxxu/core.c
+@@ -8172,8 +8172,6 @@ static const struct usb_device_id dev_table[] = {
+ .driver_info = (unsigned long)&rtl8192cu_fops},
+ {USB_DEVICE_AND_INTERFACE_INFO(0x06f8, 0xe033, 0xff, 0xff, 0xff),
+ .driver_info = (unsigned long)&rtl8192cu_fops},
+-{USB_DEVICE_AND_INTERFACE_INFO(0x07b8, 0x8188, 0xff, 0xff, 0xff),
+- .driver_info = (unsigned long)&rtl8192cu_fops},
+ {USB_DEVICE_AND_INTERFACE_INFO(0x07b8, 0x8189, 0xff, 0xff, 0xff),
+ .driver_info = (unsigned long)&rtl8192cu_fops},
+ {USB_DEVICE_AND_INTERFACE_INFO(0x0846, 0x9041, 0xff, 0xff, 0xff),
+diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/sw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/sw.c
+index 00a6778df7049f..9480823af838f5 100644
+--- a/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/sw.c
++++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/sw.c
+@@ -291,7 +291,6 @@ static const struct usb_device_id rtl8192c_usb_ids[] = {
+ {RTL_USB_DEVICE(0x050d, 0x1102, rtl92cu_hal_cfg)}, /*Belkin - Edimax*/
+ {RTL_USB_DEVICE(0x050d, 0x11f2, rtl92cu_hal_cfg)}, /*Belkin - ISY*/
+ {RTL_USB_DEVICE(0x06f8, 0xe033, rtl92cu_hal_cfg)}, /*Hercules - Edimax*/
+- {RTL_USB_DEVICE(0x07b8, 0x8188, rtl92cu_hal_cfg)}, /*Abocom - Abocom*/
+ {RTL_USB_DEVICE(0x07b8, 0x8189, rtl92cu_hal_cfg)}, /*Funai - Abocom*/
+ {RTL_USB_DEVICE(0x0846, 0x9041, rtl92cu_hal_cfg)}, /*NetGear WNA1000M*/
+ {RTL_USB_DEVICE(0x0846, 0x9043, rtl92cu_hal_cfg)}, /*NG WNA1000Mv2*/
+diff --git a/drivers/nvmem/layouts.c b/drivers/nvmem/layouts.c
+index 65d39e19f6eca4..f381ce1e84bd37 100644
+--- a/drivers/nvmem/layouts.c
++++ b/drivers/nvmem/layouts.c
+@@ -45,11 +45,24 @@ static void nvmem_layout_bus_remove(struct device *dev)
+ return drv->remove(layout);
+ }
+
++static int nvmem_layout_bus_uevent(const struct device *dev,
++ struct kobj_uevent_env *env)
++{
++ int ret;
++
++ ret = of_device_uevent_modalias(dev, env);
++ if (ret != ENODEV)
++ return ret;
++
++ return 0;
++}
++
+ static const struct bus_type nvmem_layout_bus_type = {
+ .name = "nvmem-layout",
+ .match = nvmem_layout_bus_match,
+ .probe = nvmem_layout_bus_probe,
+ .remove = nvmem_layout_bus_remove,
++ .uevent = nvmem_layout_bus_uevent,
+ };
+
+ int __nvmem_layout_driver_register(struct nvmem_layout_driver *drv,
+diff --git a/drivers/staging/axis-fifo/axis-fifo.c b/drivers/staging/axis-fifo/axis-fifo.c
+index 57ed58065ebac3..b6261b96e4651c 100644
+--- a/drivers/staging/axis-fifo/axis-fifo.c
++++ b/drivers/staging/axis-fifo/axis-fifo.c
+@@ -43,7 +43,6 @@
+ #define DRIVER_NAME "axis_fifo"
+
+ #define READ_BUF_SIZE 128U /* read buffer length in words */
+-#define WRITE_BUF_SIZE 128U /* write buffer length in words */
+
+ #define AXIS_FIFO_DEBUG_REG_NAME_MAX_LEN 4
+
+@@ -228,6 +227,7 @@ static ssize_t axis_fifo_read(struct file *f, char __user *buf,
+ }
+
+ bytes_available = ioread32(fifo->base_addr + XLLF_RLR_OFFSET);
++ words_available = bytes_available / sizeof(u32);
+ if (!bytes_available) {
+ dev_err(fifo->dt_device, "received a packet of length 0\n");
+ ret = -EIO;
+@@ -238,7 +238,7 @@ static ssize_t axis_fifo_read(struct file *f, char __user *buf,
+ dev_err(fifo->dt_device, "user read buffer too small (available bytes=%zu user buffer bytes=%zu)\n",
+ bytes_available, len);
+ ret = -EINVAL;
+- goto end_unlock;
++ goto err_flush_rx;
+ }
+
+ if (bytes_available % sizeof(u32)) {
+@@ -247,11 +247,9 @@ static ssize_t axis_fifo_read(struct file *f, char __user *buf,
+ */
+ dev_err(fifo->dt_device, "received a packet that isn't word-aligned\n");
+ ret = -EIO;
+- goto end_unlock;
++ goto err_flush_rx;
+ }
+
+- words_available = bytes_available / sizeof(u32);
+-
+ /* read data into an intermediate buffer, copying the contents
+ * to userspace when the buffer is full
+ */
+@@ -263,18 +261,23 @@ static ssize_t axis_fifo_read(struct file *f, char __user *buf,
+ tmp_buf[i] = ioread32(fifo->base_addr +
+ XLLF_RDFD_OFFSET);
+ }
++ words_available -= copy;
+
+ if (copy_to_user(buf + copied * sizeof(u32), tmp_buf,
+ copy * sizeof(u32))) {
+ ret = -EFAULT;
+- goto end_unlock;
++ goto err_flush_rx;
+ }
+
+ copied += copy;
+- words_available -= copy;
+ }
++ mutex_unlock(&fifo->read_lock);
++
++ return bytes_available;
+
+- ret = bytes_available;
++err_flush_rx:
++ while (words_available--)
++ ioread32(fifo->base_addr + XLLF_RDFD_OFFSET);
+
+ end_unlock:
+ mutex_unlock(&fifo->read_lock);
+@@ -302,11 +305,8 @@ static ssize_t axis_fifo_write(struct file *f, const char __user *buf,
+ {
+ struct axis_fifo *fifo = (struct axis_fifo *)f->private_data;
+ unsigned int words_to_write;
+- unsigned int copied;
+- unsigned int copy;
+- unsigned int i;
++ u32 *txbuf;
+ int ret;
+- u32 tmp_buf[WRITE_BUF_SIZE];
+
+ if (len % sizeof(u32)) {
+ dev_err(fifo->dt_device,
+@@ -322,11 +322,17 @@ static ssize_t axis_fifo_write(struct file *f, const char __user *buf,
+ return -EINVAL;
+ }
+
+- if (words_to_write > fifo->tx_fifo_depth) {
+- dev_err(fifo->dt_device, "tried to write more words [%u] than slots in the fifo buffer [%u]\n",
+- words_to_write, fifo->tx_fifo_depth);
++ /*
++ * In 'Store-and-Forward' mode, the maximum packet that can be
++ * transmitted is limited by the size of the FIFO, which is
++ * (C_TX_FIFO_DEPTH-4)*(data interface width/8) bytes.
++ *
++ * Do not attempt to send a packet larger than 'tx_fifo_depth - 4',
++ * otherwise a 'Transmit Packet Overrun Error' interrupt will be
++ * raised, which requires a reset of the TX circuit to recover.
++ */
++ if (words_to_write > (fifo->tx_fifo_depth - 4))
+ return -EINVAL;
+- }
+
+ if (fifo->write_flags & O_NONBLOCK) {
+ /*
+@@ -365,32 +371,20 @@ static ssize_t axis_fifo_write(struct file *f, const char __user *buf,
+ }
+ }
+
+- /* write data from an intermediate buffer into the fifo IP, refilling
+- * the buffer with userspace data as needed
+- */
+- copied = 0;
+- while (words_to_write > 0) {
+- copy = min(words_to_write, WRITE_BUF_SIZE);
+-
+- if (copy_from_user(tmp_buf, buf + copied * sizeof(u32),
+- copy * sizeof(u32))) {
+- ret = -EFAULT;
+- goto end_unlock;
+- }
+-
+- for (i = 0; i < copy; i++)
+- iowrite32(tmp_buf[i], fifo->base_addr +
+- XLLF_TDFD_OFFSET);
+-
+- copied += copy;
+- words_to_write -= copy;
++ txbuf = vmemdup_user(buf, len);
++ if (IS_ERR(txbuf)) {
++ ret = PTR_ERR(txbuf);
++ goto end_unlock;
+ }
+
+- ret = copied * sizeof(u32);
++ for (int i = 0; i < words_to_write; ++i)
++ iowrite32(txbuf[i], fifo->base_addr + XLLF_TDFD_OFFSET);
+
+ /* write packet size to fifo */
+- iowrite32(ret, fifo->base_addr + XLLF_TLR_OFFSET);
++ iowrite32(len, fifo->base_addr + XLLF_TLR_OFFSET);
+
++ ret = len;
++ kvfree(txbuf);
+ end_unlock:
+ mutex_unlock(&fifo->write_lock);
+
+diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig
+index 44427415a80d7d..2829950d5bcba0 100644
+--- a/drivers/tty/serial/Kconfig
++++ b/drivers/tty/serial/Kconfig
+@@ -1412,7 +1412,7 @@ config SERIAL_STM32
+
+ config SERIAL_STM32_CONSOLE
+ bool "Support for console on STM32"
+- depends on SERIAL_STM32=y
++ depends on SERIAL_STM32
+ select SERIAL_CORE_CONSOLE
+ select SERIAL_EARLYCON
+
+diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
+index 32ec632fd0807f..81f385d900d061 100644
+--- a/drivers/tty/serial/qcom_geni_serial.c
++++ b/drivers/tty/serial/qcom_geni_serial.c
+@@ -11,7 +11,6 @@
+ #include <linux/irq.h>
+ #include <linux/module.h>
+ #include <linux/of.h>
+-#include <linux/pm_domain.h>
+ #include <linux/pm_opp.h>
+ #include <linux/platform_device.h>
+ #include <linux/pm_runtime.h>
+@@ -100,16 +99,10 @@
+ #define DMA_RX_BUF_SIZE 2048
+
+ static DEFINE_IDA(port_ida);
+-#define DOMAIN_IDX_POWER 0
+-#define DOMAIN_IDX_PERF 1
+
+ struct qcom_geni_device_data {
+ bool console;
+ enum geni_se_xfer_mode mode;
+- struct dev_pm_domain_attach_data pd_data;
+- int (*resources_init)(struct uart_port *uport);
+- int (*set_rate)(struct uart_port *uport, unsigned int baud);
+- int (*power_state)(struct uart_port *uport, bool state);
+ };
+
+ struct qcom_geni_private_data {
+@@ -147,7 +140,6 @@ struct qcom_geni_serial_port {
+
+ struct qcom_geni_private_data private_data;
+ const struct qcom_geni_device_data *dev_data;
+- struct dev_pm_domain_list *pd_list;
+ };
+
+ static const struct uart_ops qcom_geni_console_pops;
+@@ -1370,42 +1362,6 @@ static int geni_serial_set_rate(struct uart_port *uport, unsigned int baud)
+ return 0;
+ }
+
+-static int geni_serial_set_level(struct uart_port *uport, unsigned int baud)
+-{
+- struct qcom_geni_serial_port *port = to_dev_port(uport);
+- struct device *perf_dev = port->pd_list->pd_devs[DOMAIN_IDX_PERF];
+-
+- /*
+- * The performance protocol sets UART communication
+- * speeds by selecting different performance levels
+- * through the OPP framework.
+- *
+- * Supported perf levels for baudrates in firmware are below
+- * +---------------------+--------------------+
+- * | Perf level value | Baudrate values |
+- * +---------------------+--------------------+
+- * | 300 | 300 |
+- * | 1200 | 1200 |
+- * | 2400 | 2400 |
+- * | 4800 | 4800 |
+- * | 9600 | 9600 |
+- * | 19200 | 19200 |
+- * | 38400 | 38400 |
+- * | 57600 | 57600 |
+- * | 115200 | 115200 |
+- * | 230400 | 230400 |
+- * | 460800 | 460800 |
+- * | 921600 | 921600 |
+- * | 2000000 | 2000000 |
+- * | 3000000 | 3000000 |
+- * | 3200000 | 3200000 |
+- * | 4000000 | 4000000 |
+- * +---------------------+--------------------+
+- */
+-
+- return dev_pm_opp_set_level(perf_dev, baud);
+-}
+-
+ static void qcom_geni_serial_set_termios(struct uart_port *uport,
+ struct ktermios *termios,
+ const struct ktermios *old)
+@@ -1424,7 +1380,7 @@ static void qcom_geni_serial_set_termios(struct uart_port *uport,
+ /* baud rate */
+ baud = uart_get_baud_rate(uport, termios, old, 300, 8000000);
+
+- ret = port->dev_data->set_rate(uport, baud);
++ ret = geni_serial_set_rate(uport, baud);
+ if (ret)
+ return;
+
+@@ -1711,27 +1667,8 @@ static int geni_serial_resources_off(struct uart_port *uport)
+ return 0;
+ }
+
+-static int geni_serial_resource_state(struct uart_port *uport, bool power_on)
+-{
+- return power_on ? geni_serial_resources_on(uport) : geni_serial_resources_off(uport);
+-}
+-
+-static int geni_serial_pwr_init(struct uart_port *uport)
+-{
+- struct qcom_geni_serial_port *port = to_dev_port(uport);
+- int ret;
+-
+- ret = dev_pm_domain_attach_list(port->se.dev,
+- &port->dev_data->pd_data, &port->pd_list);
+- if (ret <= 0)
+- return -EINVAL;
+-
+- return 0;
+-}
+-
+-static int geni_serial_resource_init(struct uart_port *uport)
++static int geni_serial_resource_init(struct qcom_geni_serial_port *port)
+ {
+- struct qcom_geni_serial_port *port = to_dev_port(uport);
+ int ret;
+
+ port->se.clk = devm_clk_get(port->se.dev, "se");
+@@ -1776,10 +1713,10 @@ static void qcom_geni_serial_pm(struct uart_port *uport,
+ old_state = UART_PM_STATE_OFF;
+
+ if (new_state == UART_PM_STATE_ON && old_state == UART_PM_STATE_OFF)
+- pm_runtime_resume_and_get(uport->dev);
++ geni_serial_resources_on(uport);
+ else if (new_state == UART_PM_STATE_OFF &&
+ old_state == UART_PM_STATE_ON)
+- pm_runtime_put_sync(uport->dev);
++ geni_serial_resources_off(uport);
+
+ }
+
+@@ -1882,16 +1819,13 @@ static int qcom_geni_serial_probe(struct platform_device *pdev)
+ port->se.dev = &pdev->dev;
+ port->se.wrapper = dev_get_drvdata(pdev->dev.parent);
+
+- ret = port->dev_data->resources_init(uport);
++ ret = geni_serial_resource_init(port);
+ if (ret)
+ return ret;
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+- if (!res) {
+- ret = -EINVAL;
+- goto error;
+- }
+-
++ if (!res)
++ return -EINVAL;
+ uport->mapbase = res->start;
+
+ uport->rs485_config = qcom_geni_rs485_config;
+@@ -1903,26 +1837,19 @@ static int qcom_geni_serial_probe(struct platform_device *pdev)
+ if (!data->console) {
+ port->rx_buf = devm_kzalloc(uport->dev,
+ DMA_RX_BUF_SIZE, GFP_KERNEL);
+- if (!port->rx_buf) {
+- ret = -ENOMEM;
+- goto error;
+- }
++ if (!port->rx_buf)
++ return -ENOMEM;
+ }
+
+ port->name = devm_kasprintf(uport->dev, GFP_KERNEL,
+ "qcom_geni_serial_%s%d",
+ uart_console(uport) ? "console" : "uart", uport->line);
+- if (!port->name) {
+- ret = -ENOMEM;
+- goto error;
+- }
++ if (!port->name)
++ return -ENOMEM;
+
+ irq = platform_get_irq(pdev, 0);
+- if (irq < 0) {
+- ret = irq;
+- goto error;
+- }
+-
++ if (irq < 0)
++ return irq;
+ uport->irq = irq;
+ uport->has_sysrq = IS_ENABLED(CONFIG_SERIAL_QCOM_GENI_CONSOLE);
+
+@@ -1944,18 +1871,16 @@ static int qcom_geni_serial_probe(struct platform_device *pdev)
+ IRQF_TRIGGER_HIGH, port->name, uport);
+ if (ret) {
+ dev_err(uport->dev, "Failed to get IRQ ret %d\n", ret);
+- goto error;
++ return ret;
+ }
+
+ ret = uart_get_rs485_mode(uport);
+ if (ret)
+ return ret;
+
+- devm_pm_runtime_enable(port->se.dev);
+-
+ ret = uart_add_one_port(drv, uport);
+ if (ret)
+- goto error;
++ return ret;
+
+ if (port->wakeup_irq > 0) {
+ device_init_wakeup(&pdev->dev, true);
+@@ -1965,15 +1890,11 @@ static int qcom_geni_serial_probe(struct platform_device *pdev)
+ device_init_wakeup(&pdev->dev, false);
+ ida_free(&port_ida, uport->line);
+ uart_remove_one_port(drv, uport);
+- goto error;
++ return ret;
+ }
+ }
+
+ return 0;
+-
+-error:
+- dev_pm_domain_detach_list(port->pd_list);
+- return ret;
+ }
+
+ static void qcom_geni_serial_remove(struct platform_device *pdev)
+@@ -1986,31 +1907,6 @@ static void qcom_geni_serial_remove(struct platform_device *pdev)
+ device_init_wakeup(&pdev->dev, false);
+ ida_free(&port_ida, uport->line);
+ uart_remove_one_port(drv, &port->uport);
+- dev_pm_domain_detach_list(port->pd_list);
+-}
+-
+-static int __maybe_unused qcom_geni_serial_runtime_suspend(struct device *dev)
+-{
+- struct qcom_geni_serial_port *port = dev_get_drvdata(dev);
+- struct uart_port *uport = &port->uport;
+- int ret = 0;
+-
+- if (port->dev_data->power_state)
+- ret = port->dev_data->power_state(uport, false);
+-
+- return ret;
+-}
+-
+-static int __maybe_unused qcom_geni_serial_runtime_resume(struct device *dev)
+-{
+- struct qcom_geni_serial_port *port = dev_get_drvdata(dev);
+- struct uart_port *uport = &port->uport;
+- int ret = 0;
+-
+- if (port->dev_data->power_state)
+- ret = port->dev_data->power_state(uport, true);
+-
+- return ret;
+ }
+
+ static int qcom_geni_serial_suspend(struct device *dev)
+@@ -2048,46 +1944,14 @@ static int qcom_geni_serial_resume(struct device *dev)
+ static const struct qcom_geni_device_data qcom_geni_console_data = {
+ .console = true,
+ .mode = GENI_SE_FIFO,
+- .resources_init = geni_serial_resource_init,
+- .set_rate = geni_serial_set_rate,
+- .power_state = geni_serial_resource_state,
+ };
+
+ static const struct qcom_geni_device_data qcom_geni_uart_data = {
+ .console = false,
+ .mode = GENI_SE_DMA,
+- .resources_init = geni_serial_resource_init,
+- .set_rate = geni_serial_set_rate,
+- .power_state = geni_serial_resource_state,
+-};
+-
+-static const struct qcom_geni_device_data sa8255p_qcom_geni_console_data = {
+- .console = true,
+- .mode = GENI_SE_FIFO,
+- .pd_data = {
+- .pd_flags = PD_FLAG_DEV_LINK_ON,
+- .pd_names = (const char*[]) { "power", "perf" },
+- .num_pd_names = 2,
+- },
+- .resources_init = geni_serial_pwr_init,
+- .set_rate = geni_serial_set_level,
+-};
+-
+-static const struct qcom_geni_device_data sa8255p_qcom_geni_uart_data = {
+- .console = false,
+- .mode = GENI_SE_DMA,
+- .pd_data = {
+- .pd_flags = PD_FLAG_DEV_LINK_ON,
+- .pd_names = (const char*[]) { "power", "perf" },
+- .num_pd_names = 2,
+- },
+- .resources_init = geni_serial_pwr_init,
+- .set_rate = geni_serial_set_level,
+ };
+
+ static const struct dev_pm_ops qcom_geni_serial_pm_ops = {
+- SET_RUNTIME_PM_OPS(qcom_geni_serial_runtime_suspend,
+- qcom_geni_serial_runtime_resume, NULL)
+ SYSTEM_SLEEP_PM_OPS(qcom_geni_serial_suspend, qcom_geni_serial_resume)
+ };
+
+@@ -2096,18 +1960,10 @@ static const struct of_device_id qcom_geni_serial_match_table[] = {
+ .compatible = "qcom,geni-debug-uart",
+ .data = &qcom_geni_console_data,
+ },
+- {
+- .compatible = "qcom,sa8255p-geni-debug-uart",
+- .data = &sa8255p_qcom_geni_console_data,
+- },
+ {
+ .compatible = "qcom,geni-uart",
+ .data = &qcom_geni_uart_data,
+ },
+- {
+- .compatible = "qcom,sa8255p-geni-uart",
+- .data = &sa8255p_qcom_geni_uart_data,
+- },
+ {}
+ };
+ MODULE_DEVICE_TABLE(of, qcom_geni_serial_match_table);
+diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c
+index fc869b7f803f04..62e984d20e5982 100644
+--- a/drivers/usb/serial/option.c
++++ b/drivers/usb/serial/option.c
+@@ -2114,6 +2114,12 @@ static const struct usb_device_id option_ids[] = {
+ { USB_DEVICE_INTERFACE_CLASS(0x1e0e, 0x9003, 0xff) }, /* Simcom SIM7500/SIM7600 MBIM mode */
+ { USB_DEVICE_INTERFACE_CLASS(0x1e0e, 0x9011, 0xff), /* Simcom SIM7500/SIM7600 RNDIS mode */
+ .driver_info = RSVD(7) },
++ { USB_DEVICE(0x1e0e, 0x9071), /* Simcom SIM8230 RMNET mode */
++ .driver_info = RSVD(3) | RSVD(4) },
++ { USB_DEVICE_INTERFACE_CLASS(0x1e0e, 0x9078, 0xff), /* Simcom SIM8230 ECM mode */
++ .driver_info = RSVD(5) },
++ { USB_DEVICE_INTERFACE_CLASS(0x1e0e, 0x907b, 0xff), /* Simcom SIM8230 RNDIS mode */
++ .driver_info = RSVD(5) },
+ { USB_DEVICE_INTERFACE_CLASS(0x1e0e, 0x9205, 0xff) }, /* Simcom SIM7070/SIM7080/SIM7090 AT+ECM mode */
+ { USB_DEVICE_INTERFACE_CLASS(0x1e0e, 0x9206, 0xff) }, /* Simcom SIM7070/SIM7080/SIM7090 AT-only mode */
+ { USB_DEVICE(ALCATEL_VENDOR_ID, ALCATEL_PRODUCT_X060S_X200),
+diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
+index 46be7560548ce2..b4b62ac46bc640 100644
+--- a/fs/f2fs/f2fs.h
++++ b/fs/f2fs/f2fs.h
+@@ -3764,6 +3764,7 @@ void f2fs_hash_filename(const struct inode *dir, struct f2fs_filename *fname);
+ * node.c
+ */
+ struct node_info;
++enum node_type;
+
+ int f2fs_check_nid_range(struct f2fs_sb_info *sbi, nid_t nid);
+ bool f2fs_available_free_memory(struct f2fs_sb_info *sbi, int type);
+@@ -3786,7 +3787,8 @@ int f2fs_remove_inode_page(struct inode *inode);
+ struct folio *f2fs_new_inode_folio(struct inode *inode);
+ struct folio *f2fs_new_node_folio(struct dnode_of_data *dn, unsigned int ofs);
+ void f2fs_ra_node_page(struct f2fs_sb_info *sbi, nid_t nid);
+-struct folio *f2fs_get_node_folio(struct f2fs_sb_info *sbi, pgoff_t nid);
++struct folio *f2fs_get_node_folio(struct f2fs_sb_info *sbi, pgoff_t nid,
++ enum node_type node_type);
+ struct folio *f2fs_get_inode_folio(struct f2fs_sb_info *sbi, pgoff_t ino);
+ struct folio *f2fs_get_xnode_folio(struct f2fs_sb_info *sbi, pgoff_t xnid);
+ int f2fs_move_node_folio(struct folio *node_folio, int gc_type);
+diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
+index 098e9f71421e2c..c0f209f7468829 100644
+--- a/fs/f2fs/gc.c
++++ b/fs/f2fs/gc.c
+@@ -1071,7 +1071,7 @@ static int gc_node_segment(struct f2fs_sb_info *sbi,
+ }
+
+ /* phase == 2 */
+- node_folio = f2fs_get_node_folio(sbi, nid);
++ node_folio = f2fs_get_node_folio(sbi, nid, NODE_TYPE_REGULAR);
+ if (IS_ERR(node_folio))
+ continue;
+
+@@ -1145,7 +1145,7 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
+ nid = le32_to_cpu(sum->nid);
+ ofs_in_node = le16_to_cpu(sum->ofs_in_node);
+
+- node_folio = f2fs_get_node_folio(sbi, nid);
++ node_folio = f2fs_get_node_folio(sbi, nid, NODE_TYPE_REGULAR);
+ if (IS_ERR(node_folio))
+ return false;
+
+diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
+index 27743b93e18672..92054dcbe20d09 100644
+--- a/fs/f2fs/node.c
++++ b/fs/f2fs/node.c
+@@ -871,7 +871,8 @@ int f2fs_get_dnode_of_data(struct dnode_of_data *dn, pgoff_t index, int mode)
+ }
+
+ if (!done) {
+- nfolio[i] = f2fs_get_node_folio(sbi, nids[i]);
++ nfolio[i] = f2fs_get_node_folio(sbi, nids[i],
++ NODE_TYPE_NON_INODE);
+ if (IS_ERR(nfolio[i])) {
+ err = PTR_ERR(nfolio[i]);
+ f2fs_folio_put(nfolio[0], false);
+@@ -989,7 +990,7 @@ static int truncate_dnode(struct dnode_of_data *dn)
+ return 1;
+
+ /* get direct node */
+- folio = f2fs_get_node_folio(sbi, dn->nid);
++ folio = f2fs_get_node_folio(sbi, dn->nid, NODE_TYPE_NON_INODE);
+ if (PTR_ERR(folio) == -ENOENT)
+ return 1;
+ else if (IS_ERR(folio))
+@@ -1033,7 +1034,8 @@ static int truncate_nodes(struct dnode_of_data *dn, unsigned int nofs,
+
+ trace_f2fs_truncate_nodes_enter(dn->inode, dn->nid, dn->data_blkaddr);
+
+- folio = f2fs_get_node_folio(F2FS_I_SB(dn->inode), dn->nid);
++ folio = f2fs_get_node_folio(F2FS_I_SB(dn->inode), dn->nid,
++ NODE_TYPE_NON_INODE);
+ if (IS_ERR(folio)) {
+ trace_f2fs_truncate_nodes_exit(dn->inode, PTR_ERR(folio));
+ return PTR_ERR(folio);
+@@ -1111,7 +1113,8 @@ static int truncate_partial_nodes(struct dnode_of_data *dn,
+ /* get indirect nodes in the path */
+ for (i = 0; i < idx + 1; i++) {
+ /* reference count'll be increased */
+- folios[i] = f2fs_get_node_folio(F2FS_I_SB(dn->inode), nid[i]);
++ folios[i] = f2fs_get_node_folio(F2FS_I_SB(dn->inode), nid[i],
++ NODE_TYPE_NON_INODE);
+ if (IS_ERR(folios[i])) {
+ err = PTR_ERR(folios[i]);
+ idx = i - 1;
+@@ -1496,21 +1499,37 @@ static int sanity_check_node_footer(struct f2fs_sb_info *sbi,
+ struct folio *folio, pgoff_t nid,
+ enum node_type ntype)
+ {
+- if (unlikely(nid != nid_of_node(folio) ||
+- (ntype == NODE_TYPE_INODE && !IS_INODE(folio)) ||
+- (ntype == NODE_TYPE_XATTR &&
+- !f2fs_has_xattr_block(ofs_of_node(folio))) ||
+- time_to_inject(sbi, FAULT_INCONSISTENT_FOOTER))) {
+- f2fs_warn(sbi, "inconsistent node block, node_type:%d, nid:%lu, "
+- "node_footer[nid:%u,ino:%u,ofs:%u,cpver:%llu,blkaddr:%u]",
+- ntype, nid, nid_of_node(folio), ino_of_node(folio),
+- ofs_of_node(folio), cpver_of_node(folio),
+- next_blkaddr_of_node(folio));
+- set_sbi_flag(sbi, SBI_NEED_FSCK);
+- f2fs_handle_error(sbi, ERROR_INCONSISTENT_FOOTER);
+- return -EFSCORRUPTED;
++ if (unlikely(nid != nid_of_node(folio)))
++ goto out_err;
++
++ switch (ntype) {
++ case NODE_TYPE_INODE:
++ if (!IS_INODE(folio))
++ goto out_err;
++ break;
++ case NODE_TYPE_XATTR:
++ if (!f2fs_has_xattr_block(ofs_of_node(folio)))
++ goto out_err;
++ break;
++ case NODE_TYPE_NON_INODE:
++ if (IS_INODE(folio))
++ goto out_err;
++ break;
++ default:
++ break;
+ }
++ if (time_to_inject(sbi, FAULT_INCONSISTENT_FOOTER))
++ goto out_err;
+ return 0;
++out_err:
++ f2fs_warn(sbi, "inconsistent node block, node_type:%d, nid:%lu, "
++ "node_footer[nid:%u,ino:%u,ofs:%u,cpver:%llu,blkaddr:%u]",
++ ntype, nid, nid_of_node(folio), ino_of_node(folio),
++ ofs_of_node(folio), cpver_of_node(folio),
++ next_blkaddr_of_node(folio));
++ set_sbi_flag(sbi, SBI_NEED_FSCK);
++ f2fs_handle_error(sbi, ERROR_INCONSISTENT_FOOTER);
++ return -EFSCORRUPTED;
+ }
+
+ static struct folio *__get_node_folio(struct f2fs_sb_info *sbi, pgoff_t nid,
+@@ -1567,9 +1586,10 @@ static struct folio *__get_node_folio(struct f2fs_sb_info *sbi, pgoff_t nid,
+ return ERR_PTR(err);
+ }
+
+-struct folio *f2fs_get_node_folio(struct f2fs_sb_info *sbi, pgoff_t nid)
++struct folio *f2fs_get_node_folio(struct f2fs_sb_info *sbi, pgoff_t nid,
++ enum node_type node_type)
+ {
+- return __get_node_folio(sbi, nid, NULL, 0, NODE_TYPE_REGULAR);
++ return __get_node_folio(sbi, nid, NULL, 0, node_type);
+ }
+
+ struct folio *f2fs_get_inode_folio(struct f2fs_sb_info *sbi, pgoff_t ino)
+diff --git a/fs/f2fs/node.h b/fs/f2fs/node.h
+index 030390543b54fb..9cb8dcf8d41760 100644
+--- a/fs/f2fs/node.h
++++ b/fs/f2fs/node.h
+@@ -57,6 +57,7 @@ enum node_type {
+ NODE_TYPE_REGULAR,
+ NODE_TYPE_INODE,
+ NODE_TYPE_XATTR,
++ NODE_TYPE_NON_INODE,
+ };
+
+ /*
+diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
+index 4cb3a91801b4d5..215e442db72c82 100644
+--- a/fs/f2fs/recovery.c
++++ b/fs/f2fs/recovery.c
+@@ -548,7 +548,7 @@ static int check_index_in_prev_nodes(struct f2fs_sb_info *sbi,
+ }
+
+ /* Get the node page */
+- node_folio = f2fs_get_node_folio(sbi, nid);
++ node_folio = f2fs_get_node_folio(sbi, nid, NODE_TYPE_REGULAR);
+ if (IS_ERR(node_folio))
+ return PTR_ERR(node_folio);
+
+diff --git a/include/linux/device.h b/include/linux/device.h
+index 0470d19da7f2ca..b031ff71a5bdfe 100644
+--- a/include/linux/device.h
++++ b/include/linux/device.h
+@@ -851,6 +851,9 @@ static inline bool device_pm_not_required(struct device *dev)
+ static inline void device_set_pm_not_required(struct device *dev)
+ {
+ dev->power.no_pm = true;
++#ifdef CONFIG_PM
++ dev->power.no_callbacks = true;
++#endif
+ }
+
+ static inline void dev_pm_syscore_device(struct device *dev, bool val)
+diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
+index 43460949ad3fda..1244d2c5c384ad 100644
+--- a/kernel/trace/ring_buffer.c
++++ b/kernel/trace/ring_buffer.c
+@@ -7273,7 +7273,7 @@ int ring_buffer_map(struct trace_buffer *buffer, int cpu,
+ atomic_dec(&cpu_buffer->resize_disabled);
+ }
+
+- return 0;
++ return err;
+ }
+
+ int ring_buffer_unmap(struct trace_buffer *buffer, int cpu)
+diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
+index 339ec4e54778f3..8992d8bebbddf7 100644
+--- a/net/9p/trans_fd.c
++++ b/net/9p/trans_fd.c
+@@ -726,10 +726,10 @@ static int p9_fd_cancelled(struct p9_client *client, struct p9_req_t *req)
+ p9_debug(P9_DEBUG_TRANS, "client %p req %p\n", client, req);
+
+ spin_lock(&m->req_lock);
+- /* Ignore cancelled request if message has been received
+- * before lock.
+- */
+- if (req->status == REQ_STATUS_RCVD) {
++ /* Ignore cancelled request if status changed since the request was
++ * processed in p9_client_flush()
++ */
++ if (req->status != REQ_STATUS_SENT) {
+ spin_unlock(&m->req_lock);
+ return 0;
+ }
+diff --git a/rust/kernel/block/mq/gen_disk.rs b/rust/kernel/block/mq/gen_disk.rs
+index cd54cd64ea8878..e1af0fa302a372 100644
+--- a/rust/kernel/block/mq/gen_disk.rs
++++ b/rust/kernel/block/mq/gen_disk.rs
+@@ -3,7 +3,7 @@
+ //! Generic disk abstraction.
+ //!
+ //! C header: [`include/linux/blkdev.h`](srctree/include/linux/blkdev.h)
+-//! C header: [`include/linux/blk_mq.h`](srctree/include/linux/blk_mq.h)
++//! C header: [`include/linux/blk-mq.h`](srctree/include/linux/blk-mq.h)
+
+ use crate::block::mq::{raw_writer::RawWriter, Operations, TagSet};
+ use crate::{bindings, error::from_err_ptr, error::Result, sync::Arc};
+diff --git a/rust/kernel/drm/device.rs b/rust/kernel/drm/device.rs
+index d29c477e89a87d..f8f1db5eeb0f6f 100644
+--- a/rust/kernel/drm/device.rs
++++ b/rust/kernel/drm/device.rs
+@@ -2,7 +2,7 @@
+
+ //! DRM device.
+ //!
+-//! C header: [`include/linux/drm/drm_device.h`](srctree/include/linux/drm/drm_device.h)
++//! C header: [`include/drm/drm_device.h`](srctree/include/drm/drm_device.h)
+
+ use crate::{
+ alloc::allocator::Kmalloc,
+diff --git a/rust/kernel/drm/driver.rs b/rust/kernel/drm/driver.rs
+index fe7e8d06961aa5..d2dad77274c4ca 100644
+--- a/rust/kernel/drm/driver.rs
++++ b/rust/kernel/drm/driver.rs
+@@ -2,7 +2,7 @@
+
+ //! DRM driver core.
+ //!
+-//! C header: [`include/linux/drm/drm_drv.h`](srctree/include/linux/drm/drm_drv.h)
++//! C header: [`include/drm/drm_drv.h`](srctree/include/drm/drm_drv.h)
+
+ use crate::{
+ bindings, device, devres, drm,
+diff --git a/rust/kernel/drm/file.rs b/rust/kernel/drm/file.rs
+index e8789c9110d654..8c46f8d519516a 100644
+--- a/rust/kernel/drm/file.rs
++++ b/rust/kernel/drm/file.rs
+@@ -2,7 +2,7 @@
+
+ //! DRM File objects.
+ //!
+-//! C header: [`include/linux/drm/drm_file.h`](srctree/include/linux/drm/drm_file.h)
++//! C header: [`include/drm/drm_file.h`](srctree/include/drm/drm_file.h)
+
+ use crate::{bindings, drm, error::Result, prelude::*, types::Opaque};
+ use core::marker::PhantomData;
+diff --git a/rust/kernel/drm/gem/mod.rs b/rust/kernel/drm/gem/mod.rs
+index b71821cfb5eaa0..b9f3248876baa3 100644
+--- a/rust/kernel/drm/gem/mod.rs
++++ b/rust/kernel/drm/gem/mod.rs
+@@ -2,7 +2,7 @@
+
+ //! DRM GEM API
+ //!
+-//! C header: [`include/linux/drm/drm_gem.h`](srctree/include/linux/drm/drm_gem.h)
++//! C header: [`include/drm/drm_gem.h`](srctree/include/drm/drm_gem.h)
+
+ use crate::{
+ alloc::flags::*,
+diff --git a/rust/kernel/drm/ioctl.rs b/rust/kernel/drm/ioctl.rs
+index fdec01c371687c..8431cdcd3ae0ef 100644
+--- a/rust/kernel/drm/ioctl.rs
++++ b/rust/kernel/drm/ioctl.rs
+@@ -2,7 +2,7 @@
+
+ //! DRM IOCTL definitions.
+ //!
+-//! C header: [`include/linux/drm/drm_ioctl.h`](srctree/include/linux/drm/drm_ioctl.h)
++//! C header: [`include/drm/drm_ioctl.h`](srctree/include/drm/drm_ioctl.h)
+
+ use crate::ioctl;
+
+diff --git a/rust/kernel/pci.rs b/rust/kernel/pci.rs
+index 887ee611b55310..658e806a5da757 100644
+--- a/rust/kernel/pci.rs
++++ b/rust/kernel/pci.rs
+@@ -240,11 +240,11 @@ pub trait Driver: Send {
+
+ /// PCI driver probe.
+ ///
+- /// Called when a new platform device is added or discovered.
+- /// Implementers should attempt to initialize the device here.
++ /// Called when a new pci device is added or discovered. Implementers should
++ /// attempt to initialize the device here.
+ fn probe(dev: &Device<device::Core>, id_info: &Self::IdInfo) -> Result<Pin<KBox<Self>>>;
+
+- /// Platform driver unbind.
++ /// PCI driver unbind.
+ ///
+ /// Called when a [`Device`] is unbound from its bound [`Driver`]. Implementing this callback
+ /// is optional.
* [gentoo-commits] proj/linux-patches:6.17 commit in: /
@ 2025-10-15 17:33 Arisu Tachibana
0 siblings, 0 replies; 20+ messages in thread
From: Arisu Tachibana @ 2025-10-15 17:33 UTC (permalink / raw
To: gentoo-commits
commit: 56f2a097c24705387ea56a26845a90eb4b448879
Author: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
AuthorDate: Wed Oct 15 12:46:20 2025 +0000
Commit: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
CommitDate: Wed Oct 15 12:46:20 2025 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=56f2a097
Linux patch 6.17.3
Signed-off-by: Arisu Tachibana <alicef <AT> gentoo.org>
0000_README | 4 +
1002_linux-6.17.3.patch | 26612 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 26616 insertions(+)
diff --git a/0000_README b/0000_README
index c1ac05f0..0aa228a9 100644
--- a/0000_README
+++ b/0000_README
@@ -51,6 +51,10 @@ Patch: 1001_linux-6.17.2.patch
From: https://www.kernel.org
Desc: Linux 6.17.2
+Patch: 1002_linux-6.17.3.patch
+From: https://www.kernel.org
+Desc: Linux 6.17.3
+
Patch: 1510_fs-enable-link-security-restrictions-by-default.patch
From: http://sources.debian.net/src/linux/3.16.7-ckt4-3/debian/patches/debian/fs-enable-link-security-restrictions-by-default.patch/
Desc: Enable link security restrictions by default.
diff --git a/1002_linux-6.17.3.patch b/1002_linux-6.17.3.patch
new file mode 100644
index 00000000..b14ab696
--- /dev/null
+++ b/1002_linux-6.17.3.patch
@@ -0,0 +1,26612 @@
+diff --git a/Documentation/devicetree/bindings/vendor-prefixes.yaml b/Documentation/devicetree/bindings/vendor-prefixes.yaml
+index 9ec8947dfcad2f..ed7fec614473dc 100644
+--- a/Documentation/devicetree/bindings/vendor-prefixes.yaml
++++ b/Documentation/devicetree/bindings/vendor-prefixes.yaml
+@@ -86,6 +86,8 @@ patternProperties:
+ description: Allegro DVT
+ "^allegromicro,.*":
+ description: Allegro MicroSystems, Inc.
++ "^alliedtelesis,.*":
++ description: Allied Telesis, Inc.
+ "^alliedvision,.*":
+ description: Allied Vision Technologies GmbH
+ "^allo,.*":
+@@ -229,6 +231,8 @@ patternProperties:
+ description: Bitmain Technologies
+ "^blaize,.*":
+ description: Blaize, Inc.
++ "^bluegiga,.*":
++ description: Bluegiga Technologies Ltd.
+ "^blutek,.*":
+ description: BluTek Power
+ "^boe,.*":
+@@ -247,6 +251,8 @@ patternProperties:
+ description: Bticino International
+ "^buffalo,.*":
+ description: Buffalo, Inc.
++ "^buglabs,.*":
++ description: Bug Labs, Inc.
+ "^bur,.*":
+ description: B&R Industrial Automation GmbH
+ "^bytedance,.*":
+@@ -325,6 +331,8 @@ patternProperties:
+ description: Conexant Systems, Inc.
+ "^colorfly,.*":
+ description: Colorful GRP, Shenzhen Xueyushi Technology Ltd.
++ "^compal,.*":
++ description: Compal Electronics, Inc.
+ "^compulab,.*":
+ description: CompuLab Ltd.
+ "^comvetia,.*":
+@@ -353,6 +361,8 @@ patternProperties:
+ description: Guangzhou China Star Optoelectronics Technology Co., Ltd
+ "^csq,.*":
+ description: Shenzen Chuangsiqi Technology Co.,Ltd.
++ "^csr,.*":
++ description: Cambridge Silicon Radio
+ "^ctera,.*":
+ description: CTERA Networks Intl.
+ "^ctu,.*":
+@@ -455,6 +465,8 @@ patternProperties:
+ description: Emtop Embedded Solutions
+ "^eeti,.*":
+ description: eGalax_eMPIA Technology Inc
++ "^egnite,.*":
++ description: egnite GmbH
+ "^einfochips,.*":
+ description: Einfochips
+ "^eink,.*":
+@@ -485,8 +497,12 @@ patternProperties:
+ description: Empire Electronix
+ "^emtrion,.*":
+ description: emtrion GmbH
++ "^enbw,.*":
++ description: Energie Baden-Württemberg AG
+ "^enclustra,.*":
+ description: Enclustra GmbH
++ "^endian,.*":
++ description: Endian SRL
+ "^endless,.*":
+ description: Endless Mobile, Inc.
+ "^ene,.*":
+@@ -554,6 +570,8 @@ patternProperties:
+ description: FocalTech Systems Co.,Ltd
+ "^forlinx,.*":
+ description: Baoding Forlinx Embedded Technology Co., Ltd.
++ "^foxlink,.*":
++ description: Foxlink Group
+ "^freebox,.*":
+ description: Freebox SAS
+ "^freecom,.*":
+@@ -642,6 +660,10 @@ patternProperties:
+ description: Haoyu Microelectronic Co. Ltd.
+ "^hardkernel,.*":
+ description: Hardkernel Co., Ltd
++ "^hce,.*":
++ description: HCE Engineering SRL
++ "^headacoustics,.*":
++ description: HEAD acoustics
+ "^hechuang,.*":
+ description: Shenzhen Hechuang Intelligent Co.
+ "^hideep,.*":
+@@ -725,6 +747,8 @@ patternProperties:
+ description: Shenzhen INANBO Electronic Technology Co., Ltd.
+ "^incircuit,.*":
+ description: In-Circuit GmbH
++ "^incostartec,.*":
++ description: INCOstartec GmbH
+ "^indiedroid,.*":
+ description: Indiedroid
+ "^inet-tek,.*":
+@@ -933,6 +957,8 @@ patternProperties:
+ description: Maxim Integrated Products
+ "^maxlinear,.*":
+ description: MaxLinear Inc.
++ "^maxtor,.*":
++ description: Maxtor Corporation
+ "^mbvl,.*":
+ description: Mobiveil Inc.
+ "^mcube,.*":
+@@ -1096,6 +1122,8 @@ patternProperties:
+ description: Nordic Semiconductor
+ "^nothing,.*":
+ description: Nothing Technology Limited
++ "^novatech,.*":
++ description: NovaTech Automation
+ "^novatek,.*":
+ description: Novatek
+ "^novtech,.*":
+@@ -1191,6 +1219,8 @@ patternProperties:
+ description: Pervasive Displays, Inc.
+ "^phicomm,.*":
+ description: PHICOMM Co., Ltd.
++ "^phontech,.*":
++ description: Phontech
+ "^phytec,.*":
+ description: PHYTEC Messtechnik GmbH
+ "^picochip,.*":
+@@ -1275,6 +1305,8 @@ patternProperties:
+ description: Ramtron International
+ "^raspberrypi,.*":
+ description: Raspberry Pi Foundation
++ "^raumfeld,.*":
++ description: Raumfeld GmbH
+ "^raydium,.*":
+ description: Raydium Semiconductor Corp.
+ "^rda,.*":
+@@ -1313,6 +1345,8 @@ patternProperties:
+ description: ROHM Semiconductor Co., Ltd
+ "^ronbo,.*":
+ description: Ronbo Electronics
++ "^ronetix,.*":
++ description: Ronetix GmbH
+ "^roofull,.*":
+ description: Shenzhen Roofull Technology Co, Ltd
+ "^roseapplepi,.*":
+@@ -1339,8 +1373,12 @@ patternProperties:
+ description: Schindler
+ "^schneider,.*":
+ description: Schneider Electric
++ "^schulercontrol,.*":
++ description: Schuler Group
+ "^sciosense,.*":
+ description: ScioSense B.V.
++ "^sdmc,.*":
++ description: SDMC Technology Co., Ltd
+ "^seagate,.*":
+ description: Seagate Technology PLC
+ "^seeed,.*":
+@@ -1379,6 +1417,8 @@ patternProperties:
+ description: Si-En Technology Ltd.
+ "^si-linux,.*":
+ description: Silicon Linux Corporation
++ "^sielaff,.*":
++ description: Sielaff GmbH & Co.
+ "^siemens,.*":
+ description: Siemens AG
+ "^sifive,.*":
+@@ -1447,6 +1487,8 @@ patternProperties:
+ description: SolidRun
+ "^solomon,.*":
+ description: Solomon Systech Limited
++ "^somfy,.*":
++ description: Somfy Systems Inc.
+ "^sony,.*":
+ description: Sony Corporation
+ "^sophgo,.*":
+@@ -1517,6 +1559,8 @@ patternProperties:
+ "^synopsys,.*":
+ description: Synopsys, Inc. (deprecated, use snps)
+ deprecated: true
++ "^taos,.*":
++ description: Texas Advanced Optoelectronic Solutions Inc.
+ "^tbs,.*":
+ description: TBS Technologies
+ "^tbs-biometrics,.*":
+@@ -1547,6 +1591,8 @@ patternProperties:
+ description: Teltonika Networks
+ "^tempo,.*":
+ description: Tempo Semiconductor
++ "^tenda,.*":
++ description: Shenzhen Tenda Technology Co., Ltd.
+ "^terasic,.*":
+ description: Terasic Inc.
+ "^tesla,.*":
+@@ -1650,6 +1696,8 @@ patternProperties:
+ description: V3 Semiconductor
+ "^vaisala,.*":
+ description: Vaisala
++ "^valve,.*":
++ description: Valve Corporation
+ "^vamrs,.*":
+ description: Vamrs Ltd.
+ "^variscite,.*":
+@@ -1750,6 +1798,8 @@ patternProperties:
+ description: Extreme Engineering Solutions (X-ES)
+ "^xiaomi,.*":
+ description: Xiaomi Technology Co., Ltd.
++ "^xicor,.*":
++ description: Xicor Inc.
+ "^xillybus,.*":
+ description: Xillybus Ltd.
+ "^xingbangda,.*":
+diff --git a/Documentation/iio/ad3552r.rst b/Documentation/iio/ad3552r.rst
+index f5d59e4e86c7ec..4274e35f503d9f 100644
+--- a/Documentation/iio/ad3552r.rst
++++ b/Documentation/iio/ad3552r.rst
+@@ -64,7 +64,8 @@ specific debugfs path ``/sys/kernel/debug/iio/iio:deviceX``.
+ Usage examples
+ --------------
+
+-. code-block:: bash
++.. code-block:: bash
++
+ root:/sys/bus/iio/devices/iio:device0# cat data_source
+ normal
+ root:/sys/bus/iio/devices/iio:device0# echo -n ramp-16bit > data_source
+diff --git a/Documentation/trace/histogram-design.rst b/Documentation/trace/histogram-design.rst
+index 5765eb3e9efa78..a30f4bed11b4ee 100644
+--- a/Documentation/trace/histogram-design.rst
++++ b/Documentation/trace/histogram-design.rst
+@@ -380,7 +380,9 @@ entry, ts0, corresponding to the ts0 variable in the sched_waking
+ trigger above.
+
+ sched_waking histogram
+-----------------------::
++----------------------
++
++.. code-block::
+
+ +------------------+
+ | hist_data |<-------------------------------------------------------+
+diff --git a/Makefile b/Makefile
+index a04d0223dc840a..22ee632f9104aa 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 17
+-SUBLEVEL = 2
++SUBLEVEL = 3
+ EXTRAVERSION =
+ NAME = Baby Opossum Posse
+
+diff --git a/arch/alpha/kernel/process.c b/arch/alpha/kernel/process.c
+index 582d96548385dd..06522451f018f3 100644
+--- a/arch/alpha/kernel/process.c
++++ b/arch/alpha/kernel/process.c
+@@ -231,7 +231,7 @@ flush_thread(void)
+ */
+ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
+ {
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ unsigned long usp = args->stack;
+ unsigned long tls = args->tls;
+ extern void ret_from_fork(void);
+diff --git a/arch/arc/kernel/process.c b/arch/arc/kernel/process.c
+index 186ceab661eb02..8166d090871304 100644
+--- a/arch/arc/kernel/process.c
++++ b/arch/arc/kernel/process.c
+@@ -166,7 +166,7 @@ asmlinkage void ret_from_fork(void);
+ */
+ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
+ {
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ unsigned long usp = args->stack;
+ unsigned long tls = args->tls;
+ struct pt_regs *c_regs; /* child's pt_regs */
+diff --git a/arch/arm/boot/dts/renesas/r8a7791-porter.dts b/arch/arm/boot/dts/renesas/r8a7791-porter.dts
+index f518eadd8b9cda..81b3c5d74e9b3a 100644
+--- a/arch/arm/boot/dts/renesas/r8a7791-porter.dts
++++ b/arch/arm/boot/dts/renesas/r8a7791-porter.dts
+@@ -289,7 +289,7 @@ vin0_pins: vin0 {
+ };
+
+ can0_pins: can0 {
+- groups = "can0_data";
++ groups = "can0_data_b";
+ function = "can0";
+ };
+
+diff --git a/arch/arm/boot/dts/st/stm32mp151c-plyaqm.dts b/arch/arm/boot/dts/st/stm32mp151c-plyaqm.dts
+index 39a3211c613376..55fe916740d7c8 100644
+--- a/arch/arm/boot/dts/st/stm32mp151c-plyaqm.dts
++++ b/arch/arm/boot/dts/st/stm32mp151c-plyaqm.dts
+@@ -239,7 +239,7 @@ &i2s1 {
+
+ i2s1_port: port {
+ i2s1_endpoint: endpoint {
+- format = "i2s";
++ dai-format = "i2s";
+ mclk-fs = <256>;
+ remote-endpoint = <&codec_endpoint>;
+ };
+diff --git a/arch/arm/boot/dts/ti/omap/am335x-baltos.dtsi b/arch/arm/boot/dts/ti/omap/am335x-baltos.dtsi
+index ae2e8dffbe0492..ea47f9960c3566 100644
+--- a/arch/arm/boot/dts/ti/omap/am335x-baltos.dtsi
++++ b/arch/arm/boot/dts/ti/omap/am335x-baltos.dtsi
+@@ -269,7 +269,7 @@ &tps {
+ vcc7-supply = <&vbat>;
+ vccio-supply = <&vbat>;
+
+- ti,en-ck32k-xtal = <1>;
++ ti,en-ck32k-xtal;
+
+ regulators {
+ vrtc_reg: regulator@0 {
+diff --git a/arch/arm/boot/dts/ti/omap/am335x-cm-t335.dts b/arch/arm/boot/dts/ti/omap/am335x-cm-t335.dts
+index 06767ea164b598..ece7f7854f6aae 100644
+--- a/arch/arm/boot/dts/ti/omap/am335x-cm-t335.dts
++++ b/arch/arm/boot/dts/ti/omap/am335x-cm-t335.dts
+@@ -483,8 +483,6 @@ &mcasp1 {
+
+ op-mode = <0>; /* MCASP_IIS_MODE */
+ tdm-slots = <2>;
+- /* 16 serializers */
+- num-serializer = <16>;
+ serial-dir = < /* 0: INACTIVE, 1: TX, 2: RX */
+ 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0
+ >;
+diff --git a/arch/arm/boot/dts/ti/omap/omap3-devkit8000-lcd-common.dtsi b/arch/arm/boot/dts/ti/omap/omap3-devkit8000-lcd-common.dtsi
+index a7f99ae0c1fe9a..78c657429f6410 100644
+--- a/arch/arm/boot/dts/ti/omap/omap3-devkit8000-lcd-common.dtsi
++++ b/arch/arm/boot/dts/ti/omap/omap3-devkit8000-lcd-common.dtsi
+@@ -65,7 +65,7 @@ ads7846@0 {
+ ti,debounce-max = /bits/ 16 <10>;
+ ti,debounce-tol = /bits/ 16 <5>;
+ ti,debounce-rep = /bits/ 16 <1>;
+- ti,keep-vref-on = <1>;
++ ti,keep-vref-on;
+ ti,settle-delay-usec = /bits/ 16 <150>;
+
+ wakeup-source;
+diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
+index e16ed102960cb0..d7aa95225c70bd 100644
+--- a/arch/arm/kernel/process.c
++++ b/arch/arm/kernel/process.c
+@@ -234,7 +234,7 @@ asmlinkage void ret_from_fork(void) __asm__("ret_from_fork");
+
+ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
+ {
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ unsigned long stack_start = args->stack;
+ unsigned long tls = args->tls;
+ struct thread_info *thread = task_thread_info(p);
+diff --git a/arch/arm/mach-at91/pm_suspend.S b/arch/arm/mach-at91/pm_suspend.S
+index e23b8683409656..7e6c94f8edeef9 100644
+--- a/arch/arm/mach-at91/pm_suspend.S
++++ b/arch/arm/mach-at91/pm_suspend.S
+@@ -904,7 +904,7 @@ e_done:
+ /**
+ * at91_mckx_ps_restore: restore MCKx settings
+ *
+- * Side effects: overwrites tmp1, tmp2
++ * Side effects: overwrites tmp1, tmp2 and tmp3
+ */
+ .macro at91_mckx_ps_restore
+ #ifdef CONFIG_SOC_SAMA7
+@@ -980,7 +980,7 @@ r_ps:
+ bic tmp3, tmp3, #AT91_PMC_MCR_V2_ID_MSK
+ orr tmp3, tmp3, tmp1
+ orr tmp3, tmp3, #AT91_PMC_MCR_V2_CMD
+- str tmp2, [pmc, #AT91_PMC_MCR_V2]
++ str tmp3, [pmc, #AT91_PMC_MCR_V2]
+
+ wait_mckrdy tmp1
+
+diff --git a/arch/arm64/boot/dts/allwinner/sun55i-a527-cubie-a5e.dts b/arch/arm64/boot/dts/allwinner/sun55i-a527-cubie-a5e.dts
+index 553ad774ed13d6..514c221a7a866b 100644
+--- a/arch/arm64/boot/dts/allwinner/sun55i-a527-cubie-a5e.dts
++++ b/arch/arm64/boot/dts/allwinner/sun55i-a527-cubie-a5e.dts
+@@ -6,6 +6,7 @@
+ #include "sun55i-a523.dtsi"
+
+ #include <dt-bindings/gpio/gpio.h>
++#include <dt-bindings/leds/common.h>
+
+ / {
+ model = "Radxa Cubie A5E";
+@@ -20,11 +21,22 @@ chosen {
+ stdout-path = "serial0:115200n8";
+ };
+
+- ext_osc32k: ext-osc32k-clk {
+- #clock-cells = <0>;
+- compatible = "fixed-clock";
+- clock-frequency = <32768>;
+- clock-output-names = "ext_osc32k";
++ leds {
++ compatible = "gpio-leds";
++
++ power-led {
++ function = LED_FUNCTION_POWER;
++ color = <LED_COLOR_ID_GREEN>;
++ gpios = <&r_pio 0 4 GPIO_ACTIVE_LOW>; /* PL4 */
++ default-state = "on";
++ linux,default-trigger = "heartbeat";
++ };
++
++ use-led {
++ function = LED_FUNCTION_ACTIVITY;
++ color = <LED_COLOR_ID_BLUE>;
++ gpios = <&r_pio 0 5 GPIO_ACTIVE_LOW>; /* PL5 */
++ };
+ };
+
+ reg_vcc5v: vcc5v {
+@@ -75,6 +87,9 @@ &mdio0 {
+ ext_rgmii_phy: ethernet-phy@1 {
+ compatible = "ethernet-phy-ieee802.3-c22";
+ reg = <1>;
++ reset-gpios = <&pio 7 8 GPIO_ACTIVE_LOW>; /* PH8 */
++ reset-assert-us = <10000>;
++ reset-deassert-us = <150000>;
+ };
+ };
+
+diff --git a/arch/arm64/boot/dts/allwinner/sun55i-t527-avaota-a1.dts b/arch/arm64/boot/dts/allwinner/sun55i-t527-avaota-a1.dts
+index b9eeb6753e9e37..4e71055fbd159d 100644
+--- a/arch/arm64/boot/dts/allwinner/sun55i-t527-avaota-a1.dts
++++ b/arch/arm64/boot/dts/allwinner/sun55i-t527-avaota-a1.dts
+@@ -85,6 +85,9 @@ &mdio0 {
+ ext_rgmii_phy: ethernet-phy@1 {
+ compatible = "ethernet-phy-ieee802.3-c22";
+ reg = <1>;
++ reset-gpios = <&pio 7 8 GPIO_ACTIVE_LOW>; /* PH8 */
++ reset-assert-us = <10000>;
++ reset-deassert-us = <150000>;
+ };
+ };
+
+@@ -306,6 +309,14 @@ &r_pio {
+ vcc-pm-supply = <&reg_aldo3>;
+ };
+
++&rtc {
++ clocks = <&r_ccu CLK_BUS_R_RTC>, <&osc24M>,
++ <&r_ccu CLK_R_AHB>, <&ext_osc32k>;
++ clock-names = "bus", "hosc", "ahb", "ext-osc32k";
++ assigned-clocks = <&rtc CLK_OSC32K>;
++ assigned-clock-rates = <32768>;
++};
++
+ &uart0 {
+ pinctrl-names = "default";
+ pinctrl-0 = <&uart0_pb_pins>;
+diff --git a/arch/arm64/boot/dts/allwinner/sun55i-t527-orangepi-4a.dts b/arch/arm64/boot/dts/allwinner/sun55i-t527-orangepi-4a.dts
+index d07bb9193b4382..b5483bd7b8d5d1 100644
+--- a/arch/arm64/boot/dts/allwinner/sun55i-t527-orangepi-4a.dts
++++ b/arch/arm64/boot/dts/allwinner/sun55i-t527-orangepi-4a.dts
+@@ -346,6 +346,14 @@ &r_pio {
+ vcc-pm-supply = <&reg_bldo2>;
+ };
+
++&rtc {
++ clocks = <&r_ccu CLK_BUS_R_RTC>, <&osc24M>,
++ <&r_ccu CLK_R_AHB>, <&ext_osc32k>;
++ clock-names = "bus", "hosc", "ahb", "ext-osc32k";
++ assigned-clocks = <&rtc CLK_OSC32K>;
++ assigned-clock-rates = <32768>;
++};
++
+ &uart0 {
+ pinctrl-names = "default";
+ pinctrl-0 = <&uart0_pb_pins>;
+diff --git a/arch/arm64/boot/dts/amlogic/amlogic-c3.dtsi b/arch/arm64/boot/dts/amlogic/amlogic-c3.dtsi
+index cb9ea3ca6ee0f9..71b2b3b547f7cb 100644
+--- a/arch/arm64/boot/dts/amlogic/amlogic-c3.dtsi
++++ b/arch/arm64/boot/dts/amlogic/amlogic-c3.dtsi
+@@ -792,7 +792,7 @@ spicc1: spi@52000 {
+ pwm_mn: pwm@54000 {
+ compatible = "amlogic,c3-pwm",
+ "amlogic,meson-s4-pwm";
+- reg = <0x0 54000 0x0 0x24>;
++ reg = <0x0 0x54000 0x0 0x24>;
+ clocks = <&clkc_periphs CLKID_PWM_M>,
+ <&clkc_periphs CLKID_PWM_N>;
+ #pwm-cells = <3>;
+diff --git a/arch/arm64/boot/dts/apple/t6000-j314s.dts b/arch/arm64/boot/dts/apple/t6000-j314s.dts
+index c9e192848fe3f9..1430b91ff1b152 100644
+--- a/arch/arm64/boot/dts/apple/t6000-j314s.dts
++++ b/arch/arm64/boot/dts/apple/t6000-j314s.dts
+@@ -16,3 +16,11 @@ / {
+ compatible = "apple,j314s", "apple,t6000", "apple,arm-platform";
+ model = "Apple MacBook Pro (14-inch, M1 Pro, 2021)";
+ };
++
++&wifi0 {
++ brcm,board-type = "apple,maldives";
++};
++
++&bluetooth0 {
++ brcm,board-type = "apple,maldives";
++};
+diff --git a/arch/arm64/boot/dts/apple/t6000-j316s.dts b/arch/arm64/boot/dts/apple/t6000-j316s.dts
+index ff1803ce23001c..da0cbe7d96736b 100644
+--- a/arch/arm64/boot/dts/apple/t6000-j316s.dts
++++ b/arch/arm64/boot/dts/apple/t6000-j316s.dts
+@@ -16,3 +16,11 @@ / {
+ compatible = "apple,j316s", "apple,t6000", "apple,arm-platform";
+ model = "Apple MacBook Pro (16-inch, M1 Pro, 2021)";
+ };
++
++&wifi0 {
++ brcm,board-type = "apple,madagascar";
++};
++
++&bluetooth0 {
++ brcm,board-type = "apple,madagascar";
++};
+diff --git a/arch/arm64/boot/dts/apple/t6001-j314c.dts b/arch/arm64/boot/dts/apple/t6001-j314c.dts
+index 1761d15b98c12f..c37097dcfdb304 100644
+--- a/arch/arm64/boot/dts/apple/t6001-j314c.dts
++++ b/arch/arm64/boot/dts/apple/t6001-j314c.dts
+@@ -16,3 +16,11 @@ / {
+ compatible = "apple,j314c", "apple,t6001", "apple,arm-platform";
+ model = "Apple MacBook Pro (14-inch, M1 Max, 2021)";
+ };
++
++&wifi0 {
++ brcm,board-type = "apple,maldives";
++};
++
++&bluetooth0 {
++ brcm,board-type = "apple,maldives";
++};
+diff --git a/arch/arm64/boot/dts/apple/t6001-j316c.dts b/arch/arm64/boot/dts/apple/t6001-j316c.dts
+index 750e9beeffc0aa..3bc6e0c3294cf9 100644
+--- a/arch/arm64/boot/dts/apple/t6001-j316c.dts
++++ b/arch/arm64/boot/dts/apple/t6001-j316c.dts
+@@ -16,3 +16,11 @@ / {
+ compatible = "apple,j316c", "apple,t6001", "apple,arm-platform";
+ model = "Apple MacBook Pro (16-inch, M1 Max, 2021)";
+ };
++
++&wifi0 {
++ brcm,board-type = "apple,madagascar";
++};
++
++&bluetooth0 {
++ brcm,board-type = "apple,madagascar";
++};
+diff --git a/arch/arm64/boot/dts/apple/t6001-j375c.dts b/arch/arm64/boot/dts/apple/t6001-j375c.dts
+index 62ea437b58b25c..2e7c23714d4d00 100644
+--- a/arch/arm64/boot/dts/apple/t6001-j375c.dts
++++ b/arch/arm64/boot/dts/apple/t6001-j375c.dts
+@@ -16,3 +16,11 @@ / {
+ compatible = "apple,j375c", "apple,t6001", "apple,arm-platform";
+ model = "Apple Mac Studio (M1 Max, 2022)";
+ };
++
++&wifi0 {
++ brcm,board-type = "apple,okinawa";
++};
++
++&bluetooth0 {
++ brcm,board-type = "apple,okinawa";
++};
+diff --git a/arch/arm64/boot/dts/apple/t6002-j375d.dts b/arch/arm64/boot/dts/apple/t6002-j375d.dts
+index 3365429bdc8be9..2b7f80119618ad 100644
+--- a/arch/arm64/boot/dts/apple/t6002-j375d.dts
++++ b/arch/arm64/boot/dts/apple/t6002-j375d.dts
+@@ -38,6 +38,14 @@ hpm5: usb-pd@3a {
+ };
+ };
+
++&wifi0 {
++ brcm,board-type = "apple,okinawa";
++};
++
++&bluetooth0 {
++ brcm,board-type = "apple,okinawa";
++};
++
+ /* delete unused always-on power-domains on die 1 */
+
+ /delete-node/ &ps_atc2_usb_aon_die1;
+diff --git a/arch/arm64/boot/dts/apple/t600x-j314-j316.dtsi b/arch/arm64/boot/dts/apple/t600x-j314-j316.dtsi
+index 22ebc78e120bf8..c0aac59a6fae4f 100644
+--- a/arch/arm64/boot/dts/apple/t600x-j314-j316.dtsi
++++ b/arch/arm64/boot/dts/apple/t600x-j314-j316.dtsi
+@@ -13,6 +13,7 @@
+
+ / {
+ aliases {
++ bluetooth0 = &bluetooth0;
+ serial0 = &serial0;
+ wifi0 = &wifi0;
+ };
+@@ -99,9 +100,18 @@ &port00 {
+ /* WLAN */
+ bus-range = <1 1>;
+ wifi0: wifi@0,0 {
++ compatible = "pci14e4,4433";
+ reg = <0x10000 0x0 0x0 0x0 0x0>;
+ /* To be filled by the loader */
+ local-mac-address = [00 10 18 00 00 10];
++ apple,antenna-sku = "XX";
++ };
++
++ bluetooth0: bluetooth@0,1 {
++ compatible = "pci14e4,5f71";
++ reg = <0x10100 0x0 0x0 0x0 0x0>;
++ /* To be filled by the loader */
++ local-bd-address = [00 00 00 00 00 00];
+ };
+ };
+
+diff --git a/arch/arm64/boot/dts/apple/t600x-j375.dtsi b/arch/arm64/boot/dts/apple/t600x-j375.dtsi
+index d5b985ad567936..c0fb93ae72f4d4 100644
+--- a/arch/arm64/boot/dts/apple/t600x-j375.dtsi
++++ b/arch/arm64/boot/dts/apple/t600x-j375.dtsi
+@@ -11,6 +11,8 @@
+
+ / {
+ aliases {
++ bluetooth0 = &bluetooth0;
++ ethernet0 = &ethernet0;
+ serial0 = &serial0;
+ wifi0 = &wifi0;
+ };
+@@ -84,9 +86,18 @@ &port00 {
+ /* WLAN */
+ bus-range = <1 1>;
+ wifi0: wifi@0,0 {
++ compatible = "pci14e4,4433";
+ reg = <0x10000 0x0 0x0 0x0 0x0>;
+ /* To be filled by the loader */
+ local-mac-address = [00 10 18 00 00 10];
++ apple,antenna-sku = "XX";
++ };
++
++ bluetooth0: bluetooth@0,1 {
++ compatible = "pci14e4,5f71";
++ reg = <0x10100 0x0 0x0 0x0 0x0>;
++ /* To be filled by the loader */
++ local-bd-address = [00 00 00 00 00 00];
+ };
+ };
+
+diff --git a/arch/arm64/boot/dts/apple/t8103-j457.dts b/arch/arm64/boot/dts/apple/t8103-j457.dts
+index 152f95fd49a211..7089ccf3ce5566 100644
+--- a/arch/arm64/boot/dts/apple/t8103-j457.dts
++++ b/arch/arm64/boot/dts/apple/t8103-j457.dts
+@@ -21,6 +21,14 @@ aliases {
+ };
+ };
+
++/*
++ * Adjust pcie0's iommu-map to account for the disabled port01.
++ */
++&pcie0 {
++ iommu-map = <0x100 &pcie0_dart_0 1 1>,
++ <0x200 &pcie0_dart_2 1 1>;
++};
++
+ &bluetooth0 {
+ brcm,board-type = "apple,santorini";
+ };
+@@ -36,10 +44,10 @@ &wifi0 {
+ */
+
+ &port02 {
+- bus-range = <3 3>;
++ bus-range = <2 2>;
+ status = "okay";
+ ethernet0: ethernet@0,0 {
+- reg = <0x30000 0x0 0x0 0x0 0x0>;
++ reg = <0x20000 0x0 0x0 0x0 0x0>;
+ /* To be filled by the loader */
+ local-mac-address = [00 10 18 00 00 00];
+ };
+diff --git a/arch/arm64/boot/dts/freescale/imx93-kontron-bl-osm-s.dts b/arch/arm64/boot/dts/freescale/imx93-kontron-bl-osm-s.dts
+index 89e97c604bd3e4..c3d2ddd887fdf0 100644
+--- a/arch/arm64/boot/dts/freescale/imx93-kontron-bl-osm-s.dts
++++ b/arch/arm64/boot/dts/freescale/imx93-kontron-bl-osm-s.dts
+@@ -33,7 +33,9 @@ pwm-beeper {
+
+ reg_vcc_panel: regulator-vcc-panel {
+ compatible = "regulator-fixed";
+- gpio = <&gpio4 3 GPIO_ACTIVE_HIGH>;
++ pinctrl-names = "default";
++ pinctrl-0 = <&pinctrl_reg_vcc_panel>;
++ gpio = <&gpio2 21 GPIO_ACTIVE_HIGH>;
+ enable-active-high;
+ regulator-max-microvolt = <3300000>;
+ regulator-min-microvolt = <3300000>;
+@@ -135,6 +137,16 @@ &tpm6 {
+ };
+
+ &usbotg1 {
++ adp-disable;
++ hnp-disable;
++ srp-disable;
++ disable-over-current;
++ dr_mode = "otg";
++ usb-role-switch;
++ status = "okay";
++};
++
++&usbotg2 {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ disable-over-current;
+@@ -147,17 +159,15 @@ usb1@1 {
+ };
+ };
+
+-&usbotg2 {
+- adp-disable;
+- hnp-disable;
+- srp-disable;
+- disable-over-current;
+- dr_mode = "otg";
+- usb-role-switch;
+- status = "okay";
+-};
+-
+ &usdhc2 {
+ vmmc-supply = <&reg_vdd_3v3>;
+ status = "okay";
+ };
++
++&iomuxc {
++ pinctrl_reg_vcc_panel: regvccpanelgrp {
++ fsl,pins = <
++ MX93_PAD_GPIO_IO21__GPIO2_IO21 0x31e /* PWM_2 */
++ >;
++ };
++};
+diff --git a/arch/arm64/boot/dts/freescale/imx95.dtsi b/arch/arm64/boot/dts/freescale/imx95.dtsi
+index 8296888bce5947..4521da02d16959 100644
+--- a/arch/arm64/boot/dts/freescale/imx95.dtsi
++++ b/arch/arm64/boot/dts/freescale/imx95.dtsi
+@@ -913,7 +913,7 @@ lpuart7: serial@42690000 {
+ interrupts = <GIC_SPI 68 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&scmi_clk IMX95_CLK_LPUART7>;
+ clock-names = "ipg";
+- dmas = <&edma2 26 0 FSL_EDMA_RX>, <&edma2 25 0 0>;
++ dmas = <&edma2 88 0 FSL_EDMA_RX>, <&edma2 87 0 0>;
+ dma-names = "rx", "tx";
+ status = "disabled";
+ };
+@@ -925,7 +925,7 @@ lpuart8: serial@426a0000 {
+ interrupts = <GIC_SPI 69 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&scmi_clk IMX95_CLK_LPUART8>;
+ clock-names = "ipg";
+- dmas = <&edma2 28 0 FSL_EDMA_RX>, <&edma2 27 0 0>;
++ dmas = <&edma2 90 0 FSL_EDMA_RX>, <&edma2 89 0 0>;
+ dma-names = "rx", "tx";
+ status = "disabled";
+ };
+diff --git a/arch/arm64/boot/dts/mediatek/mt6331.dtsi b/arch/arm64/boot/dts/mediatek/mt6331.dtsi
+index d89858c73ab1b0..243afbffa21fd7 100644
+--- a/arch/arm64/boot/dts/mediatek/mt6331.dtsi
++++ b/arch/arm64/boot/dts/mediatek/mt6331.dtsi
+@@ -6,12 +6,12 @@
+ #include <dt-bindings/input/input.h>
+
+ &pwrap {
+- pmic: mt6331 {
++ pmic: pmic {
+ compatible = "mediatek,mt6331";
+ interrupt-controller;
+ #interrupt-cells = <2>;
+
+- mt6331regulator: mt6331regulator {
++ mt6331regulator: regulators {
+ compatible = "mediatek,mt6331-regulator";
+
+ mt6331_vdvfs11_reg: buck-vdvfs11 {
+@@ -258,7 +258,7 @@ mt6331_vrtc_reg: ldo-vrtc {
+ };
+
+ mt6331_vdig18_reg: ldo-vdig18 {
+- regulator-name = "dvdd18_dig";
++ regulator-name = "vdig18";
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-ramp-delay = <0>;
+@@ -266,11 +266,11 @@ mt6331_vdig18_reg: ldo-vdig18 {
+ };
+ };
+
+- mt6331rtc: mt6331rtc {
++ mt6331rtc: rtc {
+ compatible = "mediatek,mt6331-rtc";
+ };
+
+- mt6331keys: mt6331keys {
++ mt6331keys: keys {
+ compatible = "mediatek,mt6331-keys";
+ power {
+ linux,keycodes = <KEY_POWER>;
+diff --git a/arch/arm64/boot/dts/mediatek/mt6795-sony-xperia-m5.dts b/arch/arm64/boot/dts/mediatek/mt6795-sony-xperia-m5.dts
+index 91de920c224571..03cc48321a3f48 100644
+--- a/arch/arm64/boot/dts/mediatek/mt6795-sony-xperia-m5.dts
++++ b/arch/arm64/boot/dts/mediatek/mt6795-sony-xperia-m5.dts
+@@ -212,7 +212,7 @@ proximity@48 {
+
+ &mmc0 {
+ /* eMMC controller */
+- mediatek,latch-ck = <0x14>; /* hs400 */
++ mediatek,latch-ck = <4>; /* hs400 */
+ mediatek,hs200-cmd-int-delay = <1>;
+ mediatek,hs400-cmd-int-delay = <1>;
+ mediatek,hs400-ds-dly3 = <0x1a>;
+diff --git a/arch/arm64/boot/dts/mediatek/mt7986a.dtsi b/arch/arm64/boot/dts/mediatek/mt7986a.dtsi
+index 559990dcd1d179..3211905b6f86dc 100644
+--- a/arch/arm64/boot/dts/mediatek/mt7986a.dtsi
++++ b/arch/arm64/boot/dts/mediatek/mt7986a.dtsi
+@@ -428,16 +428,16 @@ pcie_intc: interrupt-controller {
+ };
+ };
+
+- pcie_phy: t-phy {
++ pcie_phy: t-phy@11c00000 {
+ compatible = "mediatek,mt7986-tphy",
+ "mediatek,generic-tphy-v2";
+- ranges;
+- #address-cells = <2>;
+- #size-cells = <2>;
++ ranges = <0 0 0x11c00000 0x20000>;
++ #address-cells = <1>;
++ #size-cells = <1>;
+ status = "disabled";
+
+- pcie_port: pcie-phy@11c00000 {
+- reg = <0 0x11c00000 0 0x20000>;
++ pcie_port: pcie-phy@0 {
++ reg = <0 0x20000>;
+ clocks = <&clk40m>;
+ clock-names = "ref";
+ #phy-cells = <1>;
+diff --git a/arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi b/arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi
+index 400c61d1103561..fff93e26eb7604 100644
+--- a/arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi
++++ b/arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi
+@@ -580,7 +580,7 @@ pins-cmd-dat {
+ pins-clk {
+ pinmux = <PINMUX_GPIO124__FUNC_MSDC0_CLK>;
+ drive-strength = <MTK_DRIVE_14mA>;
+- mediatek,pull-down-adv = <10>;
++ mediatek,pull-down-adv = <2>;
+ };
+
+ pins-rst {
+@@ -609,13 +609,13 @@ pins-cmd-dat {
+ pins-clk {
+ pinmux = <PINMUX_GPIO124__FUNC_MSDC0_CLK>;
+ drive-strength = <MTK_DRIVE_14mA>;
+- mediatek,pull-down-adv = <10>;
++ mediatek,pull-down-adv = <2>;
+ };
+
+ pins-ds {
+ pinmux = <PINMUX_GPIO131__FUNC_MSDC0_DSL>;
+ drive-strength = <MTK_DRIVE_14mA>;
+- mediatek,pull-down-adv = <10>;
++ mediatek,pull-down-adv = <2>;
+ };
+
+ pins-rst {
+@@ -633,13 +633,13 @@ pins-cmd-dat {
+ <PINMUX_GPIO33__FUNC_MSDC1_DAT2>,
+ <PINMUX_GPIO30__FUNC_MSDC1_DAT3>;
+ input-enable;
+- mediatek,pull-up-adv = <10>;
++ mediatek,pull-up-adv = <2>;
+ };
+
+ pins-clk {
+ pinmux = <PINMUX_GPIO29__FUNC_MSDC1_CLK>;
+ input-enable;
+- mediatek,pull-down-adv = <10>;
++ mediatek,pull-down-adv = <2>;
+ };
+ };
+
+@@ -652,13 +652,13 @@ pins-cmd-dat {
+ <PINMUX_GPIO30__FUNC_MSDC1_DAT3>;
+ drive-strength = <6>;
+ input-enable;
+- mediatek,pull-up-adv = <10>;
++ mediatek,pull-up-adv = <2>;
+ };
+
+ pins-clk {
+ pinmux = <PINMUX_GPIO29__FUNC_MSDC1_CLK>;
+ drive-strength = <8>;
+- mediatek,pull-down-adv = <10>;
++ mediatek,pull-down-adv = <2>;
+ input-enable;
+ };
+ };
+diff --git a/arch/arm64/boot/dts/mediatek/mt8183-pumpkin.dts b/arch/arm64/boot/dts/mediatek/mt8183-pumpkin.dts
+index dbdee604edab43..7c3010889ae737 100644
+--- a/arch/arm64/boot/dts/mediatek/mt8183-pumpkin.dts
++++ b/arch/arm64/boot/dts/mediatek/mt8183-pumpkin.dts
+@@ -324,7 +324,7 @@ pins_cmd_dat {
+ pins_clk {
+ pinmux = <PINMUX_GPIO124__FUNC_MSDC0_CLK>;
+ drive-strength = <MTK_DRIVE_14mA>;
+- mediatek,pull-down-adv = <10>;
++ mediatek,pull-down-adv = <2>;
+ };
+
+ pins_rst {
+@@ -353,13 +353,13 @@ pins_cmd_dat {
+ pins_clk {
+ pinmux = <PINMUX_GPIO124__FUNC_MSDC0_CLK>;
+ drive-strength = <MTK_DRIVE_14mA>;
+- mediatek,pull-down-adv = <10>;
++ mediatek,pull-down-adv = <2>;
+ };
+
+ pins_ds {
+ pinmux = <PINMUX_GPIO131__FUNC_MSDC0_DSL>;
+ drive-strength = <MTK_DRIVE_14mA>;
+- mediatek,pull-down-adv = <10>;
++ mediatek,pull-down-adv = <2>;
+ };
+
+ pins_rst {
+@@ -377,13 +377,13 @@ pins_cmd_dat {
+ <PINMUX_GPIO33__FUNC_MSDC1_DAT2>,
+ <PINMUX_GPIO30__FUNC_MSDC1_DAT3>;
+ input-enable;
+- mediatek,pull-up-adv = <10>;
++ mediatek,pull-up-adv = <2>;
+ };
+
+ pins_clk {
+ pinmux = <PINMUX_GPIO29__FUNC_MSDC1_CLK>;
+ input-enable;
+- mediatek,pull-down-adv = <10>;
++ mediatek,pull-down-adv = <2>;
+ };
+
+ pins_pmu {
+@@ -401,13 +401,13 @@ pins_cmd_dat {
+ <PINMUX_GPIO30__FUNC_MSDC1_DAT3>;
+ drive-strength = <6>;
+ input-enable;
+- mediatek,pull-up-adv = <10>;
++ mediatek,pull-up-adv = <2>;
+ };
+
+ pins_clk {
+ pinmux = <PINMUX_GPIO29__FUNC_MSDC1_CLK>;
+ drive-strength = <8>;
+- mediatek,pull-down-adv = <10>;
++ mediatek,pull-down-adv = <2>;
+ input-enable;
+ };
+ };
+diff --git a/arch/arm64/boot/dts/mediatek/mt8186-corsola-krabby.dtsi b/arch/arm64/boot/dts/mediatek/mt8186-corsola-krabby.dtsi
+index 7c971198fa9561..72a2a2bff0a93f 100644
+--- a/arch/arm64/boot/dts/mediatek/mt8186-corsola-krabby.dtsi
++++ b/arch/arm64/boot/dts/mediatek/mt8186-corsola-krabby.dtsi
+@@ -71,14 +71,14 @@ &i2c1 {
+ i2c-scl-internal-delay-ns = <10000>;
+
+ touchscreen: touchscreen@10 {
+- compatible = "hid-over-i2c";
++ compatible = "elan,ekth6915";
+ reg = <0x10>;
+ interrupts-extended = <&pio 12 IRQ_TYPE_LEVEL_LOW>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&touchscreen_pins>;
+- post-power-on-delay-ms = <10>;
+- hid-descr-addr = <0x0001>;
+- vdd-supply = <&pp3300_s3>;
++ reset-gpios = <&pio 60 GPIO_ACTIVE_LOW>;
++ vcc33-supply = <&pp3300_s3>;
++ no-reset-on-power-off;
+ };
+ };
+
+diff --git a/arch/arm64/boot/dts/mediatek/mt8186-corsola-tentacruel-sku262144.dts b/arch/arm64/boot/dts/mediatek/mt8186-corsola-tentacruel-sku262144.dts
+index 26d3451a5e47c0..24d9ede63eaa21 100644
+--- a/arch/arm64/boot/dts/mediatek/mt8186-corsola-tentacruel-sku262144.dts
++++ b/arch/arm64/boot/dts/mediatek/mt8186-corsola-tentacruel-sku262144.dts
+@@ -42,3 +42,7 @@ MATRIX_KEY(0x00, 0x04, KEY_VOLUMEUP)
+ CROS_STD_MAIN_KEYMAP
+ >;
+ };
++
++&touchscreen {
++ compatible = "elan,ekth6a12nay";
++};
+diff --git a/arch/arm64/boot/dts/mediatek/mt8188.dtsi b/arch/arm64/boot/dts/mediatek/mt8188.dtsi
+index 202478407727e0..90c388f1890f51 100644
+--- a/arch/arm64/boot/dts/mediatek/mt8188.dtsi
++++ b/arch/arm64/boot/dts/mediatek/mt8188.dtsi
+@@ -2183,7 +2183,7 @@ imp_iic_wrap_en: clock-controller@11ec2000 {
+ };
+
+ efuse: efuse@11f20000 {
+- compatible = "mediatek,mt8188-efuse", "mediatek,efuse";
++ compatible = "mediatek,mt8188-efuse", "mediatek,mt8186-efuse";
+ reg = <0 0x11f20000 0 0x1000>;
+ #address-cells = <1>;
+ #size-cells = <1>;
+diff --git a/arch/arm64/boot/dts/mediatek/mt8195.dtsi b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
+index 8877953ce292b6..ab0b2f606eb437 100644
+--- a/arch/arm64/boot/dts/mediatek/mt8195.dtsi
++++ b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
+@@ -1588,9 +1588,6 @@ pcie0: pcie@112f0000 {
+
+ power-domains = <&spm MT8195_POWER_DOMAIN_PCIE_MAC_P0>;
+
+- resets = <&infracfg_ao MT8195_INFRA_RST2_PCIE_P0_SWRST>;
+- reset-names = "mac";
+-
+ #interrupt-cells = <1>;
+ interrupt-map-mask = <0 0 0 7>;
+ interrupt-map = <0 0 0 1 &pcie_intc0 0>,
+diff --git a/arch/arm64/boot/dts/mediatek/mt8395-kontron-3-5-sbc-i1200.dts b/arch/arm64/boot/dts/mediatek/mt8395-kontron-3-5-sbc-i1200.dts
+index 4985b65925a9ed..d16f545cbbb272 100644
+--- a/arch/arm64/boot/dts/mediatek/mt8395-kontron-3-5-sbc-i1200.dts
++++ b/arch/arm64/boot/dts/mediatek/mt8395-kontron-3-5-sbc-i1200.dts
+@@ -352,7 +352,7 @@ regulator {
+ LDO_VIN2-supply = <&vsys>;
+ LDO_VIN3-supply = <&vsys>;
+
+- mt6360_buck1: BUCK1 {
++ mt6360_buck1: buck1 {
+ regulator-name = "emi_vdd2";
+ regulator-min-microvolt = <600000>;
+ regulator-max-microvolt = <1800000>;
+@@ -362,7 +362,7 @@ MT6360_OPMODE_LP
+ regulator-always-on;
+ };
+
+- mt6360_buck2: BUCK2 {
++ mt6360_buck2: buck2 {
+ regulator-name = "emi_vddq";
+ regulator-min-microvolt = <300000>;
+ regulator-max-microvolt = <1300000>;
+@@ -372,7 +372,7 @@ MT6360_OPMODE_LP
+ regulator-always-on;
+ };
+
+- mt6360_ldo1: LDO1 {
++ mt6360_ldo1: ldo1 {
+ regulator-name = "mt6360_ldo1"; /* Test point */
+ regulator-min-microvolt = <1200000>;
+ regulator-max-microvolt = <3600000>;
+@@ -380,7 +380,7 @@ mt6360_ldo1: LDO1 {
+ MT6360_OPMODE_LP>;
+ };
+
+- mt6360_ldo2: LDO2 {
++ mt6360_ldo2: ldo2 {
+ regulator-name = "panel1_p1v8";
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+@@ -388,7 +388,7 @@ mt6360_ldo2: LDO2 {
+ MT6360_OPMODE_LP>;
+ };
+
+- mt6360_ldo3: LDO3 {
++ mt6360_ldo3: ldo3 {
+ regulator-name = "vmc_pmu";
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <3300000>;
+@@ -396,7 +396,7 @@ mt6360_ldo3: LDO3 {
+ MT6360_OPMODE_LP>;
+ };
+
+- mt6360_ldo5: LDO5 {
++ mt6360_ldo5: ldo5 {
+ regulator-name = "vmch_pmu";
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
+@@ -404,7 +404,7 @@ mt6360_ldo5: LDO5 {
+ MT6360_OPMODE_LP>;
+ };
+
+- mt6360_ldo6: LDO6 {
++ mt6360_ldo6: ldo6 {
+ regulator-name = "mt6360_ldo6"; /* Test point */
+ regulator-min-microvolt = <500000>;
+ regulator-max-microvolt = <2100000>;
+@@ -412,7 +412,7 @@ mt6360_ldo6: LDO6 {
+ MT6360_OPMODE_LP>;
+ };
+
+- mt6360_ldo7: LDO7 {
++ mt6360_ldo7: ldo7 {
+ regulator-name = "emi_vmddr_en";
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+diff --git a/arch/arm64/boot/dts/mediatek/mt8516-pumpkin.dts b/arch/arm64/boot/dts/mediatek/mt8516-pumpkin.dts
+index cce642c5381280..3d3db33a64dc66 100644
+--- a/arch/arm64/boot/dts/mediatek/mt8516-pumpkin.dts
++++ b/arch/arm64/boot/dts/mediatek/mt8516-pumpkin.dts
+@@ -11,7 +11,7 @@
+
+ / {
+ model = "Pumpkin MT8516";
+- compatible = "mediatek,mt8516";
++ compatible = "mediatek,mt8516-pumpkin", "mediatek,mt8516";
+
+ memory@40000000 {
+ device_type = "memory";
+diff --git a/arch/arm64/boot/dts/qcom/qcm2290.dtsi b/arch/arm64/boot/dts/qcom/qcm2290.dtsi
+index fa24b77a31a750..6b7070dad3df94 100644
+--- a/arch/arm64/boot/dts/qcom/qcm2290.dtsi
++++ b/arch/arm64/boot/dts/qcom/qcm2290.dtsi
+@@ -1454,6 +1454,7 @@ usb_dwc3: usb@4e00000 {
+ snps,has-lpm-erratum;
+ snps,hird-threshold = /bits/ 8 <0x10>;
+ snps,usb3_lpm_capable;
++ snps,parkmode-disable-ss-quirk;
+ maximum-speed = "super-speed";
+ dr_mode = "otg";
+ usb-role-switch;
+diff --git a/arch/arm64/boot/dts/renesas/r8a779g3-sparrow-hawk.dts b/arch/arm64/boot/dts/renesas/r8a779g3-sparrow-hawk.dts
+index 9ba23129e65ec3..2c1ab75e4d910c 100644
+--- a/arch/arm64/boot/dts/renesas/r8a779g3-sparrow-hawk.dts
++++ b/arch/arm64/boot/dts/renesas/r8a779g3-sparrow-hawk.dts
+@@ -185,7 +185,7 @@ vcc_sdhi: regulator-vcc-sdhi {
+ regulator-max-microvolt = <3300000>;
+ gpios = <&gpio8 13 GPIO_ACTIVE_HIGH>;
+ gpios-states = <1>;
+- states = <3300000 0>, <1800000 1>;
++ states = <1800000 0>, <3300000 1>;
+ };
+ };
+
+@@ -556,6 +556,10 @@ pins-mii {
+ drive-strength = <21>;
+ };
+
++ pins-vddq18-25-avb {
++ pins = "PIN_VDDQ_AVB0", "PIN_VDDQ_AVB1", "PIN_VDDQ_AVB2", "PIN_VDDQ_TSN0";
++ power-source = <1800>;
++ };
+ };
+
+ /* Page 28 / CANFD_IF */
+diff --git a/arch/arm64/boot/dts/renesas/r9a09g047e57-smarc.dts b/arch/arm64/boot/dts/renesas/r9a09g047e57-smarc.dts
+index 1e67f0a2a945c9..9f6716fa108600 100644
+--- a/arch/arm64/boot/dts/renesas/r9a09g047e57-smarc.dts
++++ b/arch/arm64/boot/dts/renesas/r9a09g047e57-smarc.dts
+@@ -90,10 +90,10 @@ &i2c0 {
+ };
+
+ &keys {
+- key-sleep {
+- pinctrl-0 = <&nmi_pins>;
+- pinctrl-names = "default";
++ pinctrl-0 = <&nmi_pins>;
++ pinctrl-names = "default";
+
++ key-sleep {
+ interrupts-extended = <&icu 0 IRQ_TYPE_EDGE_FALLING>;
+ linux,code = <KEY_SLEEP>;
+ label = "SLEEP";
+diff --git a/arch/arm64/boot/dts/renesas/rzg2lc-smarc.dtsi b/arch/arm64/boot/dts/renesas/rzg2lc-smarc.dtsi
+index 345b779e4f6015..f3d7eff0d2f2a0 100644
+--- a/arch/arm64/boot/dts/renesas/rzg2lc-smarc.dtsi
++++ b/arch/arm64/boot/dts/renesas/rzg2lc-smarc.dtsi
+@@ -48,7 +48,10 @@ sound_card {
+ #if (SW_SCIF_CAN || SW_RSPI_CAN)
+ &canfd {
+ pinctrl-0 = <&can1_pins>;
+- /delete-node/ channel@0;
++
++ channel0 {
++ status = "disabled";
++ };
+ };
+ #else
+ &canfd {
+diff --git a/arch/arm64/boot/dts/rockchip/rk3576-evb1-v10.dts b/arch/arm64/boot/dts/rockchip/rk3576-evb1-v10.dts
+index 56527c56830e3f..012c21b58a5a1b 100644
+--- a/arch/arm64/boot/dts/rockchip/rk3576-evb1-v10.dts
++++ b/arch/arm64/boot/dts/rockchip/rk3576-evb1-v10.dts
+@@ -232,6 +232,20 @@ vcc_ufs_s0: regulator-vcc-ufs-s0 {
+ regulator-max-microvolt = <3300000>;
+ vin-supply = <&vcc_sys>;
+ };
++
++ vcc_wifi_reg_on: regulator-wifi-reg-on {
++ compatible = "regulator-fixed";
++ enable-active-high;
++ gpios = <&gpio1 RK_PC6 GPIO_ACTIVE_HIGH>;
++ pinctrl-0 = <&wifi_reg_on>;
++ pinctrl-names = "default";
++ regulator-name = "wifi_reg_on";
++ regulator-always-on;
++ regulator-boot-on;
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ vin-supply = <&vcc_1v8_s3>;
++ };
+ };
+
+ &cpu_l0 {
+@@ -242,6 +256,10 @@ &cpu_b0 {
+ cpu-supply = <&vdd_cpu_big_s0>;
+ };
+
++&combphy0_ps {
++ status = "okay";
++};
++
+ &combphy1_psu {
+ status = "okay";
+ };
+@@ -257,9 +275,6 @@ &eth0m0_rx_bus2
+ &eth0m0_rgmii_clk
+ &eth0m0_rgmii_bus
+ &ethm0_clk0_25m_out>;
+- snps,reset-gpio = <&gpio2 RK_PB5 GPIO_ACTIVE_LOW>;
+- snps,reset-active-low;
+- snps,reset-delays-us = <0 20000 100000>;
+ tx_delay = <0x21>;
+ status = "okay";
+ };
+@@ -275,9 +290,6 @@ &eth1m0_rx_bus2
+ &eth1m0_rgmii_clk
+ &eth1m0_rgmii_bus
+ &ethm0_clk1_25m_out>;
+- snps,reset-gpio = <&gpio3 RK_PA3 GPIO_ACTIVE_LOW>;
+- snps,reset-active-low;
+- snps,reset-delays-us = <0 20000 100000>;
+ tx_delay = <0x20>;
+ status = "okay";
+ };
+@@ -680,19 +692,73 @@ regulator-state-mem {
+ };
+ };
+
++&i2c2 {
++ status = "okay";
++
++ hym8563: rtc@51 {
++ compatible = "haoyu,hym8563";
++ reg = <0x51>;
++ clock-output-names = "hym8563";
++ interrupt-parent = <&gpio0>;
++ interrupts = <RK_PA0 IRQ_TYPE_LEVEL_LOW>;
++ pinctrl-names = "default";
++ pinctrl-0 = <&rtc_int>;
++ wakeup-source;
++ #clock-cells = <0>;
++ };
++};
++
+ &mdio0 {
+- rgmii_phy0: phy@1 {
+- compatible = "ethernet-phy-ieee802.3-c22";
++ rgmii_phy0: ethernet-phy@1 {
++ compatible = "ethernet-phy-id001c.c916";
+ reg = <0x1>;
+ clocks = <&cru REFCLKO25M_GMAC0_OUT>;
++ assigned-clocks = <&cru REFCLKO25M_GMAC0_OUT>;
++ assigned-clock-rates = <25000000>;
++ pinctrl-names = "default";
++ pinctrl-0 = <&rgmii_phy0_rst>;
++ reset-assert-us = <20000>;
++ reset-deassert-us = <100000>;
++ reset-gpios = <&gpio2 RK_PB5 GPIO_ACTIVE_LOW>;
+ };
+ };
+
+ &mdio1 {
+- rgmii_phy1: phy@1 {
+- compatible = "ethernet-phy-ieee802.3-c22";
++ rgmii_phy1: ethernet-phy@1 {
++ compatible = "ethernet-phy-id001c.c916";
+ reg = <0x1>;
+ clocks = <&cru REFCLKO25M_GMAC1_OUT>;
++ assigned-clocks = <&cru REFCLKO25M_GMAC1_OUT>;
++ assigned-clock-rates = <25000000>;
++ pinctrl-names = "default";
++ pinctrl-0 = <&rgmii_phy1_rst>;
++ reset-assert-us = <20000>;
++ reset-deassert-us = <100000>;
++ reset-gpios = <&gpio3 RK_PA3 GPIO_ACTIVE_LOW>;
++ };
++};
++
++&pcie0 {
++ pinctrl-names = "default";
++ pinctrl-0 = <&pcie0_rst>;
++ reset-gpios = <&gpio2 RK_PB4 GPIO_ACTIVE_HIGH>;
++ vpcie3v3-supply = <&vcc_3v3_s3>;
++ status = "okay";
++
++ pcie@0,0 {
++ reg = <0x0 0 0 0 0>;
++ bus-range = <0x0 0xf>;
++ device_type = "pci";
++ ranges;
++ #address-cells = <3>;
++ #size-cells = <2>;
++
++ wifi: wifi@0,0 {
++ compatible = "pci14e4,449d";
++ reg = <0x10000 0 0 0 0>;
++ clocks = <&hym8563>;
++ clock-names = "lpo";
++ };
+ };
+ };
+
+@@ -708,6 +774,28 @@ &pcie1 {
+ };
+
+ &pinctrl {
++ hym8563 {
++ rtc_int: rtc-int {
++ rockchip,pins = <0 RK_PA0 RK_FUNC_GPIO &pcfg_pull_up>;
++ };
++ };
++
++ network {
++ rgmii_phy0_rst: rgmii-phy0-rst {
++ rockchip,pins = <2 RK_PB5 RK_FUNC_GPIO &pcfg_pull_none>;
++ };
++
++ rgmii_phy1_rst: rgmii-phy1-rst {
++ rockchip,pins = <3 RK_PA3 RK_FUNC_GPIO &pcfg_pull_none>;
++ };
++ };
++
++ pcie0 {
++ pcie0_rst: pcie0-rst {
++ rockchip,pins = <2 RK_PB4 RK_FUNC_GPIO &pcfg_pull_none>;
++ };
++ };
++
+ usb {
+ usb_host_pwren: usb-host-pwren {
+ rockchip,pins = <0 RK_PC7 RK_FUNC_GPIO &pcfg_pull_none>;
+@@ -721,6 +809,16 @@ usbc0_int: usbc0-int {
+ rockchip,pins = <0 RK_PA5 RK_FUNC_GPIO &pcfg_pull_up>;
+ };
+ };
++
++ wifi {
++ wifi_reg_on: wifi-reg-on {
++ rockchip,pins = <1 RK_PC6 RK_FUNC_GPIO &pcfg_pull_up>;
++ };
++
++ wifi_wake_host: wifi-wake-host {
++ rockchip,pins = <0 RK_PB0 RK_FUNC_GPIO &pcfg_pull_down>;
++ };
++ };
+ };
+
+ &sdmmc {
+diff --git a/arch/arm64/boot/dts/ti/k3-am62-phycore-som.dtsi b/arch/arm64/boot/dts/ti/k3-am62-phycore-som.dtsi
+index 10e6b5c08619ec..737ff54c3cd2fd 100644
+--- a/arch/arm64/boot/dts/ti/k3-am62-phycore-som.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-am62-phycore-som.dtsi
+@@ -46,31 +46,31 @@ ramoops@9c700000 {
+ pmsg-size = <0x8000>;
+ };
+
+- rtos_ipc_memory_region: ipc-memories@9c800000 {
++ rtos_ipc_memory_region: memory@9c800000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9c800000 0x00 0x00300000>;
+ no-map;
+ };
+
+- mcu_m4fss_dma_memory_region: m4f-dma-memory@9cb00000 {
++ mcu_m4fss_dma_memory_region: memory@9cb00000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9cb00000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_m4fss_memory_region: m4f-memory@9cc00000 {
++ mcu_m4fss_memory_region: memory@9cc00000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9cc00000 0x00 0xe00000>;
+ no-map;
+ };
+
+- wkup_r5fss0_core0_dma_memory_region: r5f-dma-memory@9da00000 {
++ wkup_r5fss0_core0_dma_memory_region: memory@9da00000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9da00000 0x00 0x100000>;
+ no-map;
+ };
+
+- wkup_r5fss0_core0_memory_region: r5f-memory@9db00000 {
++ wkup_r5fss0_core0_memory_region: memory@9db00000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9db00000 0x00 0xc00000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-am62-pocketbeagle2.dts b/arch/arm64/boot/dts/ti/k3-am62-pocketbeagle2.dts
+index 2e4cf65ee3239f..1c95947430d3e8 100644
+--- a/arch/arm64/boot/dts/ti/k3-am62-pocketbeagle2.dts
++++ b/arch/arm64/boot/dts/ti/k3-am62-pocketbeagle2.dts
+@@ -54,13 +54,13 @@ linux,cma {
+ linux,cma-default;
+ };
+
+- mcu_m4fss_dma_memory_region: m4f-dma-memory@9cb00000 {
++ mcu_m4fss_dma_memory_region: memory@9cb00000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9cb00000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_m4fss_memory_region: m4f-memory@9cc00000 {
++ mcu_m4fss_memory_region: memory@9cc00000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9cc00000 0x00 0xe00000>;
+ no-map;
+@@ -78,7 +78,7 @@ secure_ddr: optee@9e800000 {
+ no-map;
+ };
+
+- wkup_r5fss0_core0_dma_memory_region: r5f-dma-memory@9db00000 {
++ wkup_r5fss0_core0_dma_memory_region: memory@9db00000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9db00000 0x00 0xc00000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-am62-verdin.dtsi b/arch/arm64/boot/dts/ti/k3-am62-verdin.dtsi
+index bc2289d7477457..2b8b2c76e99465 100644
+--- a/arch/arm64/boot/dts/ti/k3-am62-verdin.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-am62-verdin.dtsi
+@@ -206,7 +206,7 @@ secure_ddr: optee@9e800000 {
+ no-map;
+ };
+
+- wkup_r5fss0_core0_dma_memory_region: r5f-dma-memory@9db00000 {
++ wkup_r5fss0_core0_dma_memory_region: memory@9db00000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9db00000 0x00 0xc00000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-am625-beagleplay.dts b/arch/arm64/boot/dts/ti/k3-am625-beagleplay.dts
+index 72b09f9c69d8c8..7028d9835c4a89 100644
+--- a/arch/arm64/boot/dts/ti/k3-am625-beagleplay.dts
++++ b/arch/arm64/boot/dts/ti/k3-am625-beagleplay.dts
+@@ -83,7 +83,7 @@ secure_ddr: optee@9e800000 {
+ no-map;
+ };
+
+- wkup_r5fss0_core0_dma_memory_region: r5f-dma-memory@9db00000 {
++ wkup_r5fss0_core0_dma_memory_region: memory@9db00000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9db00000 0x00 0xc00000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-am62a-phycore-som.dtsi b/arch/arm64/boot/dts/ti/k3-am62a-phycore-som.dtsi
+index 5dc5d2cb20ccdd..175fa5048a0bcd 100644
+--- a/arch/arm64/boot/dts/ti/k3-am62a-phycore-som.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-am62a-phycore-som.dtsi
+@@ -59,37 +59,37 @@ linux,cma {
+ linux,cma-default;
+ };
+
+- c7x_0_dma_memory_region: c7x-dma-memory@99800000 {
++ c7x_0_dma_memory_region: memory@99800000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x99800000 0x00 0x100000>;
+ no-map;
+ };
+
+- c7x_0_memory_region: c7x-memory@99900000 {
++ c7x_0_memory_region: memory@99900000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x99900000 0x00 0xf00000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_dma_memory_region: r5f-dma-memory@9b800000 {
++ mcu_r5fss0_core0_dma_memory_region: memory@9b800000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9b800000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_memory_region: r5f-dma-memory@9b900000 {
++ mcu_r5fss0_core0_memory_region: memory@9b900000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9b900000 0x00 0xf00000>;
+ no-map;
+ };
+
+- wkup_r5fss0_core0_dma_memory_region: r5f-dma-memory@9c800000 {
++ wkup_r5fss0_core0_dma_memory_region: memory@9c800000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9c800000 0x00 0x100000>;
+ no-map;
+ };
+
+- wkup_r5fss0_core0_memory_region: r5f-dma-memory@9c900000 {
++ wkup_r5fss0_core0_memory_region: memory@9c900000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9c900000 0x00 0xf00000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-am62a7-sk.dts b/arch/arm64/boot/dts/ti/k3-am62a7-sk.dts
+index bceead5e288e6d..4761c3dc2d8e66 100644
+--- a/arch/arm64/boot/dts/ti/k3-am62a7-sk.dts
++++ b/arch/arm64/boot/dts/ti/k3-am62a7-sk.dts
+@@ -53,37 +53,37 @@ linux,cma {
+ linux,cma-default;
+ };
+
+- c7x_0_dma_memory_region: c7x-dma-memory@99800000 {
++ c7x_0_dma_memory_region: memory@99800000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x99800000 0x00 0x100000>;
+ no-map;
+ };
+
+- c7x_0_memory_region: c7x-memory@99900000 {
++ c7x_0_memory_region: memory@99900000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x99900000 0x00 0xf00000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_dma_memory_region: r5f-dma-memory@9b800000 {
++ mcu_r5fss0_core0_dma_memory_region: memory@9b800000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9b800000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_memory_region: r5f-dma-memory@9b900000 {
++ mcu_r5fss0_core0_memory_region: memory@9b900000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9b900000 0x00 0xf00000>;
+ no-map;
+ };
+
+- wkup_r5fss0_core0_dma_memory_region: r5f-dma-memory@9c800000 {
++ wkup_r5fss0_core0_dma_memory_region: memory@9c800000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9c800000 0x00 0x100000>;
+ no-map;
+ };
+
+- wkup_r5fss0_core0_memory_region: r5f-dma-memory@9c900000 {
++ wkup_r5fss0_core0_memory_region: memory@9c900000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9c900000 0x00 0xf00000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-am62d2-evm.dts b/arch/arm64/boot/dts/ti/k3-am62d2-evm.dts
+index daea18b0bc61c6..19a7ca7ee173a4 100644
+--- a/arch/arm64/boot/dts/ti/k3-am62d2-evm.dts
++++ b/arch/arm64/boot/dts/ti/k3-am62d2-evm.dts
+@@ -58,37 +58,37 @@ secure_tfa_ddr: tfa@80000000 {
+ no-map;
+ };
+
+- c7x_0_dma_memory_region: c7x-dma-memory@99800000 {
++ c7x_0_dma_memory_region: memory@99800000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x99800000 0x00 0x100000>;
+ no-map;
+ };
+
+- c7x_0_memory_region: c7x-memory@99900000 {
++ c7x_0_memory_region: memory@99900000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x99900000 0x00 0xf00000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_dma_memory_region: r5f-dma-memory@9b800000 {
++ mcu_r5fss0_core0_dma_memory_region: memory@9b800000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9b800000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_memory_region: r5f-dma-memory@9b900000 {
++ mcu_r5fss0_core0_memory_region: memory@9b900000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9b900000 0x00 0xf00000>;
+ no-map;
+ };
+
+- wkup_r5fss0_core0_dma_memory_region: r5f-dma-memory@9c800000 {
++ wkup_r5fss0_core0_dma_memory_region: memory@9c800000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9c800000 0x00 0x100000>;
+ no-map;
+ };
+
+- wkup_r5fss0_core0_memory_region: r5f-dma-memory@9c900000 {
++ wkup_r5fss0_core0_memory_region: memory@9c900000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9c900000 0x00 0xf00000>;
+ no-map;
+@@ -100,7 +100,7 @@ secure_ddr: optee@9e800000 {
+ no-map;
+ };
+
+- rtos_ipc_memory_region: ipc-memories@a0000000 {
++ rtos_ipc_memory_region: memory@a0000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0000000 0x00 0x01000000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-am62p-verdin.dtsi b/arch/arm64/boot/dts/ti/k3-am62p-verdin.dtsi
+index a2fdc6741da2cd..3963dbc1faeff3 100644
+--- a/arch/arm64/boot/dts/ti/k3-am62p-verdin.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-am62p-verdin.dtsi
+@@ -162,7 +162,7 @@ secure_ddr: optee@9e800000 {
+ no-map;
+ };
+
+- wkup_r5fss0_core0_memory_region: r5f-dma-memory@9c900000 {
++ wkup_r5fss0_core0_memory_region: memory@9c900000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9c900000 0x00 0x01e00000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-am62p5-sk.dts b/arch/arm64/boot/dts/ti/k3-am62p5-sk.dts
+index 899da7896563b4..2e081c329d6c21 100644
+--- a/arch/arm64/boot/dts/ti/k3-am62p5-sk.dts
++++ b/arch/arm64/boot/dts/ti/k3-am62p5-sk.dts
+@@ -49,25 +49,25 @@ reserved-memory {
+ #size-cells = <2>;
+ ranges;
+
+- mcu_r5fss0_core0_dma_memory_region: mcu-r5fss-dma-memory-region@9b800000 {
++ mcu_r5fss0_core0_dma_memory_region: memory@9b800000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9b800000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_memory_region: mcu-r5fss-memory-region@9b900000 {
++ mcu_r5fss0_core0_memory_region: memory@9b900000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9b900000 0x00 0xf00000>;
+ no-map;
+ };
+
+- wkup_r5fss0_core0_dma_memory_region: r5f-dma-memory@9c800000 {
++ wkup_r5fss0_core0_dma_memory_region: memory@9c800000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9c800000 0x00 0x100000>;
+ no-map;
+ };
+
+- wkup_r5fss0_core0_memory_region: r5f-memory@9c900000 {
++ wkup_r5fss0_core0_memory_region: memory@9c900000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9c900000 0x00 0xf00000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
+index 13e1d36123d51f..8eed8be2e8bad2 100644
+--- a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
+@@ -58,25 +58,25 @@ linux,cma {
+ linux,cma-default;
+ };
+
+- mcu_m4fss_dma_memory_region: m4f-dma-memory@9cb00000 {
++ mcu_m4fss_dma_memory_region: memory@9cb00000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9cb00000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_m4fss_memory_region: m4f-memory@9cc00000 {
++ mcu_m4fss_memory_region: memory@9cc00000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9cc00000 0x00 0xe00000>;
+ no-map;
+ };
+
+- wkup_r5fss0_core0_dma_memory_region: r5f-dma-memory@9da00000 {
++ wkup_r5fss0_core0_dma_memory_region: memory@9da00000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9da00000 0x00 0x100000>;
+ no-map;
+ };
+
+- wkup_r5fss0_core0_memory_region: r5f-memory@9db00000 {
++ wkup_r5fss0_core0_memory_region: memory@9db00000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0x9db00000 0x00 0xc00000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-am64-phycore-som.dtsi b/arch/arm64/boot/dts/ti/k3-am64-phycore-som.dtsi
+index d9d491b12c33a8..97ad433e49394b 100644
+--- a/arch/arm64/boot/dts/ti/k3-am64-phycore-som.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-am64-phycore-som.dtsi
+@@ -41,67 +41,67 @@ secure_ddr: optee@9e800000 {
+ no-map;
+ };
+
+- main_r5fss0_core0_dma_memory_region: r5f-dma-memory@a0000000 {
++ main_r5fss0_core0_dma_memory_region: memory@a0000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_memory_region: r5f-memory@a0100000 {
++ main_r5fss0_core0_memory_region: memory@a0100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_dma_memory_region: r5f-dma-memory@a1000000 {
++ main_r5fss0_core1_dma_memory_region: memory@a1000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_memory_region: r5f-memory@a1100000 {
++ main_r5fss0_core1_memory_region: memory@a1100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_dma_memory_region: r5f-dma-memory@a2000000 {
++ main_r5fss1_core0_dma_memory_region: memory@a2000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_memory_region: r5f-memory@a2100000 {
++ main_r5fss1_core0_memory_region: memory@a2100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_dma_memory_region: r5f-dma-memory@a3000000 {
++ main_r5fss1_core1_dma_memory_region: memory@a3000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_memory_region: r5f-memory@a3100000 {
++ main_r5fss1_core1_memory_region: memory@a3100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- mcu_m4fss_dma_memory_region: m4f-dma-memory@a4000000 {
++ mcu_m4fss_dma_memory_region: memory@a4000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_m4fss_memory_region: m4f-memory@a4100000 {
++ mcu_m4fss_memory_region: memory@a4100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- rtos_ipc_memory_region: ipc-memories@a5000000 {
++ rtos_ipc_memory_region: memory@a5000000 {
+ reg = <0x00 0xa5000000 0x00 0x00800000>;
+ alignment = <0x1000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-am642-evm.dts b/arch/arm64/boot/dts/ti/k3-am642-evm.dts
+index e01866372293ba..ccb04a3d97c9af 100644
+--- a/arch/arm64/boot/dts/ti/k3-am642-evm.dts
++++ b/arch/arm64/boot/dts/ti/k3-am642-evm.dts
+@@ -53,67 +53,67 @@ secure_ddr: optee@9e800000 {
+ no-map;
+ };
+
+- main_r5fss0_core0_dma_memory_region: r5f-dma-memory@a0000000 {
++ main_r5fss0_core0_dma_memory_region: memory@a0000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_memory_region: r5f-memory@a0100000 {
++ main_r5fss0_core0_memory_region: memory@a0100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_dma_memory_region: r5f-dma-memory@a1000000 {
++ main_r5fss0_core1_dma_memory_region: memory@a1000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_memory_region: r5f-memory@a1100000 {
++ main_r5fss0_core1_memory_region: memory@a1100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_dma_memory_region: r5f-dma-memory@a2000000 {
++ main_r5fss1_core0_dma_memory_region: memory@a2000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_memory_region: r5f-memory@a2100000 {
++ main_r5fss1_core0_memory_region: memory@a2100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_dma_memory_region: r5f-dma-memory@a3000000 {
++ main_r5fss1_core1_dma_memory_region: memory@a3000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_memory_region: r5f-memory@a3100000 {
++ main_r5fss1_core1_memory_region: memory@a3100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- mcu_m4fss_dma_memory_region: m4f-dma-memory@a4000000 {
++ mcu_m4fss_dma_memory_region: memory@a4000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_m4fss_memory_region: m4f-memory@a4100000 {
++ mcu_m4fss_memory_region: memory@a4100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- rtos_ipc_memory_region: ipc-memories@a5000000 {
++ rtos_ipc_memory_region: memory@a5000000 {
+ reg = <0x00 0xa5000000 0x00 0x00800000>;
+ alignment = <0x1000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-am642-sk.dts b/arch/arm64/boot/dts/ti/k3-am642-sk.dts
+index 1deaa0be0085c4..1982608732ee2e 100644
+--- a/arch/arm64/boot/dts/ti/k3-am642-sk.dts
++++ b/arch/arm64/boot/dts/ti/k3-am642-sk.dts
+@@ -51,67 +51,67 @@ secure_ddr: optee@9e800000 {
+ no-map;
+ };
+
+- main_r5fss0_core0_dma_memory_region: r5f-dma-memory@a0000000 {
++ main_r5fss0_core0_dma_memory_region: memory@a0000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_memory_region: r5f-memory@a0100000 {
++ main_r5fss0_core0_memory_region: memory@a0100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_dma_memory_region: r5f-dma-memory@a1000000 {
++ main_r5fss0_core1_dma_memory_region: memory@a1000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_memory_region: r5f-memory@a1100000 {
++ main_r5fss0_core1_memory_region: memory@a1100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_dma_memory_region: r5f-dma-memory@a2000000 {
++ main_r5fss1_core0_dma_memory_region: memory@a2000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_memory_region: r5f-memory@a2100000 {
++ main_r5fss1_core0_memory_region: memory@a2100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_dma_memory_region: r5f-dma-memory@a3000000 {
++ main_r5fss1_core1_dma_memory_region: memory@a3000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_memory_region: r5f-memory@a3100000 {
++ main_r5fss1_core1_memory_region: memory@a3100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- mcu_m4fss_dma_memory_region: m4f-dma-memory@a4000000 {
++ mcu_m4fss_dma_memory_region: memory@a4000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_m4fss_memory_region: m4f-memory@a4100000 {
++ mcu_m4fss_memory_region: memory@a4100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- rtos_ipc_memory_region: ipc-memories@a5000000 {
++ rtos_ipc_memory_region: memory@a5000000 {
+ reg = <0x00 0xa5000000 0x00 0x00800000>;
+ alignment = <0x1000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-am642-sr-som.dtsi b/arch/arm64/boot/dts/ti/k3-am642-sr-som.dtsi
+index a5cec9a075109a..dfe570e0b7071e 100644
+--- a/arch/arm64/boot/dts/ti/k3-am642-sr-som.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-am642-sr-som.dtsi
+@@ -115,49 +115,49 @@ secure_ddr: optee@9e800000 {
+ no-map;
+ };
+
+- main_r5fss0_core0_dma_memory_region: r5f-dma-memory@a0000000 {
++ main_r5fss0_core0_dma_memory_region: memory@a0000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_memory_region: r5f-memory@a0100000 {
++ main_r5fss0_core0_memory_region: memory@a0100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_dma_memory_region: r5f-dma-memory@a1000000 {
++ main_r5fss0_core1_dma_memory_region: memory@a1000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_memory_region: r5f-memory@a1100000 {
++ main_r5fss0_core1_memory_region: memory@a1100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_dma_memory_region: r5f-dma-memory@a2000000 {
++ main_r5fss1_core0_dma_memory_region: memory@a2000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_memory_region: r5f-memory@a2100000 {
++ main_r5fss1_core0_memory_region: memory@a2100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_dma_memory_region: r5f-dma-memory@a3000000 {
++ main_r5fss1_core1_dma_memory_region: memory@a3000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_memory_region: r5f-memory@a3100000 {
++ main_r5fss1_core1_memory_region: memory@a3100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3100000 0x00 0xf00000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-am642-tqma64xxl.dtsi b/arch/arm64/boot/dts/ti/k3-am642-tqma64xxl.dtsi
+index 828d815d6bdfc2..a8d5144ab1b330 100644
+--- a/arch/arm64/boot/dts/ti/k3-am642-tqma64xxl.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-am642-tqma64xxl.dtsi
+@@ -31,55 +31,55 @@ secure_ddr: optee@9e800000 {
+ no-map;
+ };
+
+- main_r5fss0_core0_dma_memory_region: r5f-dma-memory@a0000000 {
++ main_r5fss0_core0_dma_memory_region: memory@a0000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_memory_region: r5f-memory@a0100000 {
++ main_r5fss0_core0_memory_region: memory@a0100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_dma_memory_region: r5f-dma-memory@a1000000 {
++ main_r5fss0_core1_dma_memory_region: memory@a1000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_memory_region: r5f-memory@a1100000 {
++ main_r5fss0_core1_memory_region: memory@a1100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_dma_memory_region: r5f-dma-memory@a2000000 {
++ main_r5fss1_core0_dma_memory_region: memory@a2000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_memory_region: r5f-memory@a2100000 {
++ main_r5fss1_core0_memory_region: memory@a2100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_dma_memory_region: r5f-dma-memory@a3000000 {
++ main_r5fss1_core1_dma_memory_region: memory@a3000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_memory_region: r5f-memory@a3100000 {
++ main_r5fss1_core1_memory_region: memory@a3100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- rtos_ipc_memory_region: ipc-memories@a5000000 {
++ rtos_ipc_memory_region: memory@a5000000 {
+ reg = <0x00 0xa5000000 0x00 0x00800000>;
+ alignment = <0x1000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-am65-iot2050-common.dtsi b/arch/arm64/boot/dts/ti/k3-am65-iot2050-common.dtsi
+index e5136ed9476517..211eb9d93159d1 100644
+--- a/arch/arm64/boot/dts/ti/k3-am65-iot2050-common.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-am65-iot2050-common.dtsi
+@@ -47,31 +47,31 @@ secure_ddr: secure-ddr@9e800000 {
+ no-map;
+ };
+
+- mcu_r5fss0_core0_dma_memory_region: r5f-dma-memory@a0000000 {
++ mcu_r5fss0_core0_dma_memory_region: memory@a0000000 {
+ compatible = "shared-dma-pool";
+ reg = <0 0xa0000000 0 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_memory_region: r5f-memory@a0100000 {
++ mcu_r5fss0_core0_memory_region: memory@a0100000 {
+ compatible = "shared-dma-pool";
+ reg = <0 0xa0100000 0 0xf00000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_dma_memory_region: r5f-dma-memory@a1000000 {
++ mcu_r5fss0_core1_dma_memory_region: memory@a1000000 {
+ compatible = "shared-dma-pool";
+ reg = <0 0xa1000000 0 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_memory_region: r5f-memory@a1100000 {
++ mcu_r5fss0_core1_memory_region: memory@a1100000 {
+ compatible = "shared-dma-pool";
+ reg = <0 0xa1100000 0 0xf00000>;
+ no-map;
+ };
+
+- rtos_ipc_memory_region: ipc-memories@a2000000 {
++ rtos_ipc_memory_region: memory@a2000000 {
+ reg = <0x00 0xa2000000 0x00 0x00200000>;
+ alignment = <0x1000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-am654-base-board.dts b/arch/arm64/boot/dts/ti/k3-am654-base-board.dts
+index e589690c7c8213..dac36ca77a30e6 100644
+--- a/arch/arm64/boot/dts/ti/k3-am654-base-board.dts
++++ b/arch/arm64/boot/dts/ti/k3-am654-base-board.dts
+@@ -50,31 +50,31 @@ secure_ddr: secure-ddr@9e800000 {
+ no-map;
+ };
+
+- mcu_r5fss0_core0_dma_memory_region: r5f-dma-memory@a0000000 {
++ mcu_r5fss0_core0_dma_memory_region: memory@a0000000 {
+ compatible = "shared-dma-pool";
+ reg = <0 0xa0000000 0 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_memory_region: r5f-memory@a0100000 {
++ mcu_r5fss0_core0_memory_region: memory@a0100000 {
+ compatible = "shared-dma-pool";
+ reg = <0 0xa0100000 0 0xf00000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_dma_memory_region: r5f-dma-memory@a1000000 {
++ mcu_r5fss0_core1_dma_memory_region: memory@a1000000 {
+ compatible = "shared-dma-pool";
+ reg = <0 0xa1000000 0 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_memory_region: r5f-memory@a1100000 {
++ mcu_r5fss0_core1_memory_region: memory@a1100000 {
+ compatible = "shared-dma-pool";
+ reg = <0 0xa1100000 0 0xf00000>;
+ no-map;
+ };
+
+- rtos_ipc_memory_region: ipc-memories@a2000000 {
++ rtos_ipc_memory_region: memory@a2000000 {
+ reg = <0x00 0xa2000000 0x00 0x00100000>;
+ alignment = <0x1000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-am67a-beagley-ai.dts b/arch/arm64/boot/dts/ti/k3-am67a-beagley-ai.dts
+index bf9b23df1da2ab..859294b9a2f316 100644
+--- a/arch/arm64/boot/dts/ti/k3-am67a-beagley-ai.dts
++++ b/arch/arm64/boot/dts/ti/k3-am67a-beagley-ai.dts
+@@ -50,67 +50,67 @@ secure_ddr: optee@9e800000 {
+ no-map;
+ };
+
+- wkup_r5fss0_core0_dma_memory_region: r5f-dma-memory@a0000000 {
++ wkup_r5fss0_core0_dma_memory_region: memory@a0000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0000000 0x00 0x100000>;
+ no-map;
+ };
+
+- wkup_r5fss0_core0_memory_region: r5f-memory@a0100000 {
++ wkup_r5fss0_core0_memory_region: memory@a0100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_dma_memory_region: mcu-r5fss-dma-memory-region@a1000000 {
++ mcu_r5fss0_core0_dma_memory_region: memory@a1000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_memory_region: mcu-r5fss-memory-region@a1100000 {
++ mcu_r5fss0_core0_memory_region: memory@a1100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_dma_memory_region: main-r5fss-dma-memory-region@a2000000 {
++ main_r5fss0_core0_dma_memory_region: memory@a2000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_memory_region: main-r5fss-memory-region@a2100000 {
++ main_r5fss0_core0_memory_region: memory@a2100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c7x_0_dma_memory_region: c7x-dma-memory@a3000000 {
++ c7x_0_dma_memory_region: memory@a3000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c7x_0_memory_region: c7x-memory@a3100000 {
++ c7x_0_memory_region: memory@a3100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c7x_1_dma_memory_region: c7x-dma-memory@a4000000 {
++ c7x_1_dma_memory_region: memory@a4000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c7x_1_memory_region: c7x-memory@a4100000 {
++ c7x_1_memory_region: memory@a4100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- rtos_ipc_memory_region: ipc-memories@a5000000 {
++ rtos_ipc_memory_region: memory@a5000000 {
+ reg = <0x00 0xa5000000 0x00 0x1c00000>;
+ alignment = <0x1000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-am68-phycore-som.dtsi b/arch/arm64/boot/dts/ti/k3-am68-phycore-som.dtsi
+index fd715fee8170e0..71f56f0f5363c7 100644
+--- a/arch/arm64/boot/dts/ti/k3-am68-phycore-som.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-am68-phycore-som.dtsi
+@@ -49,103 +49,103 @@ secure_ddr: optee@9e800000 {
+ no-map;
+ };
+
+- mcu_r5fss0_core0_dma_memory_region: r5f-dma-memory@a0000000 {
++ mcu_r5fss0_core0_dma_memory_region: memory@a0000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_memory_region: r5f-memory@a0100000 {
++ mcu_r5fss0_core0_memory_region: memory@a0100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_dma_memory_region: r5f-dma-memory@a1000000 {
++ mcu_r5fss0_core1_dma_memory_region: memory@a1000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_memory_region: r5f-memory@a1100000 {
++ mcu_r5fss0_core1_memory_region: memory@a1100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_dma_memory_region: r5f-dma-memory@a2000000 {
++ main_r5fss0_core0_dma_memory_region: memory@a2000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_memory_region: r5f-memory@a2100000 {
++ main_r5fss0_core0_memory_region: memory@a2100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_dma_memory_region: r5f-dma-memory@a3000000 {
++ main_r5fss0_core1_dma_memory_region: memory@a3000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_memory_region: r5f-memory@a3100000 {
++ main_r5fss0_core1_memory_region: memory@a3100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_dma_memory_region: r5f-dma-memory@a4000000 {
++ main_r5fss1_core0_dma_memory_region: memory@a4000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_memory_region: r5f-memory@a4100000 {
++ main_r5fss1_core0_memory_region: memory@a4100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_dma_memory_region: r5f-dma-memory@a5000000 {
++ main_r5fss1_core1_dma_memory_region: memory@a5000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa5000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_memory_region: r5f-memory@a5100000 {
++ main_r5fss1_core1_memory_region: memory@a5100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa5100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c71_0_dma_memory_region: c71-dma-memory@a6000000 {
++ c71_0_dma_memory_region: memory@a6000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa6000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c71_0_memory_region: c71-memory@a6100000 {
++ c71_0_memory_region: memory@a6100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa6100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c71_1_dma_memory_region: c71-dma-memory@a7000000 {
++ c71_1_dma_memory_region: memory@a7000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa7000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c71_1_memory_region: c71-memory@a7100000 {
++ c71_1_memory_region: memory@a7100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa7100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- rtos_ipc_memory_region: ipc-memories@a8000000 {
++ rtos_ipc_memory_region: memory@a8000000 {
+ reg = <0x00 0xa8000000 0x00 0x01c00000>;
+ alignment = <0x1000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-am68-sk-som.dtsi b/arch/arm64/boot/dts/ti/k3-am68-sk-som.dtsi
+index 4ca2d4e2fb9b06..ecc7b3a100d004 100644
+--- a/arch/arm64/boot/dts/ti/k3-am68-sk-som.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-am68-sk-som.dtsi
+@@ -27,103 +27,103 @@ secure_ddr: optee@9e800000 {
+ no-map;
+ };
+
+- mcu_r5fss0_core0_dma_memory_region: r5f-dma-memory@a0000000 {
++ mcu_r5fss0_core0_dma_memory_region: memory@a0000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_memory_region: r5f-memory@a0100000 {
++ mcu_r5fss0_core0_memory_region: memory@a0100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_dma_memory_region: r5f-dma-memory@a1000000 {
++ mcu_r5fss0_core1_dma_memory_region: memory@a1000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_memory_region: r5f-memory@a1100000 {
++ mcu_r5fss0_core1_memory_region: memory@a1100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_dma_memory_region: r5f-dma-memory@a2000000 {
++ main_r5fss0_core0_dma_memory_region: memory@a2000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_memory_region: r5f-memory@a2100000 {
++ main_r5fss0_core0_memory_region: memory@a2100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_dma_memory_region: r5f-dma-memory@a3000000 {
++ main_r5fss0_core1_dma_memory_region: memory@a3000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_memory_region: r5f-memory@a3100000 {
++ main_r5fss0_core1_memory_region: memory@a3100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_dma_memory_region: r5f-dma-memory@a4000000 {
++ main_r5fss1_core0_dma_memory_region: memory@a4000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_memory_region: r5f-memory@a4100000 {
++ main_r5fss1_core0_memory_region: memory@a4100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_dma_memory_region: r5f-dma-memory@a5000000 {
++ main_r5fss1_core1_dma_memory_region: memory@a5000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa5000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_memory_region: r5f-memory@a5100000 {
++ main_r5fss1_core1_memory_region: memory@a5100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa5100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c71_0_dma_memory_region: c71-dma-memory@a6000000 {
++ c71_0_dma_memory_region: memory@a6000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa6000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c71_0_memory_region: c71-memory@a6100000 {
++ c71_0_memory_region: memory@a6100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa6100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c71_1_dma_memory_region: c71-dma-memory@a7000000 {
++ c71_1_dma_memory_region: memory@a7000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa7000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c71_1_memory_region: c71-memory@a7100000 {
++ c71_1_memory_region: memory@a7100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa7100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- rtos_ipc_memory_region: ipc-memories@a8000000 {
++ rtos_ipc_memory_region: memory@a8000000 {
+ reg = <0x00 0xa8000000 0x00 0x01c00000>;
+ alignment = <0x1000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-am69-sk.dts b/arch/arm64/boot/dts/ti/k3-am69-sk.dts
+index 612ac27643d2ce..922866b96e66a3 100644
+--- a/arch/arm64/boot/dts/ti/k3-am69-sk.dts
++++ b/arch/arm64/boot/dts/ti/k3-am69-sk.dts
+@@ -49,145 +49,145 @@ secure_ddr: optee@9e800000 {
+ no-map;
+ };
+
+- mcu_r5fss0_core0_dma_memory_region: r5f-dma-memory@a0000000 {
++ mcu_r5fss0_core0_dma_memory_region: memory@a0000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_memory_region: r5f-memory@a0100000 {
++ mcu_r5fss0_core0_memory_region: memory@a0100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_dma_memory_region: r5f-dma-memory@a1000000 {
++ mcu_r5fss0_core1_dma_memory_region: memory@a1000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_memory_region: r5f-memory@a1100000 {
++ mcu_r5fss0_core1_memory_region: memory@a1100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_dma_memory_region: r5f-dma-memory@a2000000 {
++ main_r5fss0_core0_dma_memory_region: memory@a2000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_memory_region: r5f-memory@a2100000 {
++ main_r5fss0_core0_memory_region: memory@a2100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_dma_memory_region: r5f-dma-memory@a3000000 {
++ main_r5fss0_core1_dma_memory_region: memory@a3000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_memory_region: r5f-memory@a3100000 {
++ main_r5fss0_core1_memory_region: memory@a3100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_dma_memory_region: r5f-dma-memory@a4000000 {
++ main_r5fss1_core0_dma_memory_region: memory@a4000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_memory_region: r5f-memory@a4100000 {
++ main_r5fss1_core0_memory_region: memory@a4100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_dma_memory_region: r5f-dma-memory@a5000000 {
++ main_r5fss1_core1_dma_memory_region: memory@a5000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa5000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_memory_region: r5f-memory@a5100000 {
++ main_r5fss1_core1_memory_region: memory@a5100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa5100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss2_core0_dma_memory_region: r5f-dma-memory@a6000000 {
++ main_r5fss2_core0_dma_memory_region: memory@a6000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa6000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss2_core0_memory_region: r5f-memory@a6100000 {
++ main_r5fss2_core0_memory_region: memory@a6100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa6100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss2_core1_dma_memory_region: r5f-dma-memory@a7000000 {
++ main_r5fss2_core1_dma_memory_region: memory@a7000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa7000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss2_core1_memory_region: r5f-memory@a7100000 {
++ main_r5fss2_core1_memory_region: memory@a7100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa7100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c71_0_dma_memory_region: c71-dma-memory@a8000000 {
++ c71_0_dma_memory_region: memory@a8000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa8000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c71_0_memory_region: c71-memory@a8100000 {
++ c71_0_memory_region: memory@a8100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa8100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c71_1_dma_memory_region: c71-dma-memory@a9000000 {
++ c71_1_dma_memory_region: memory@a9000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa9000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c71_1_memory_region: c71-memory@a9100000 {
++ c71_1_memory_region: memory@a9100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa9100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c71_2_dma_memory_region: c71-dma-memory@aa000000 {
++ c71_2_dma_memory_region: memory@aa000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xaa000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c71_2_memory_region: c71-memory@aa100000 {
++ c71_2_memory_region: memory@aa100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xaa100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c71_3_dma_memory_region: c71-dma-memory@ab000000 {
++ c71_3_dma_memory_region: memory@ab000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xab000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c71_3_memory_region: c71-memory@ab100000 {
++ c71_3_memory_region: memory@ab100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xab100000 0x00 0xf00000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-j7200-som-p0.dtsi b/arch/arm64/boot/dts/ti/k3-j7200-som-p0.dtsi
+index 291ab9bb414d78..e8cec315e381b3 100644
+--- a/arch/arm64/boot/dts/ti/k3-j7200-som-p0.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-j7200-som-p0.dtsi
+@@ -29,55 +29,55 @@ secure_ddr: optee@9e800000 {
+ no-map;
+ };
+
+- mcu_r5fss0_core0_dma_memory_region: r5f-dma-memory@a0000000 {
++ mcu_r5fss0_core0_dma_memory_region: memory@a0000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_memory_region: r5f-memory@a0100000 {
++ mcu_r5fss0_core0_memory_region: memory@a0100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_dma_memory_region: r5f-dma-memory@a1000000 {
++ mcu_r5fss0_core1_dma_memory_region: memory@a1000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_memory_region: r5f-memory@a1100000 {
++ mcu_r5fss0_core1_memory_region: memory@a1100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_dma_memory_region: r5f-dma-memory@a2000000 {
++ main_r5fss0_core0_dma_memory_region: memory@a2000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_memory_region: r5f-memory@a2100000 {
++ main_r5fss0_core0_memory_region: memory@a2100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_dma_memory_region: r5f-dma-memory@a3000000 {
++ main_r5fss0_core1_dma_memory_region: memory@a3000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_memory_region: r5f-memory@a3100000 {
++ main_r5fss0_core1_memory_region: memory@a3100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- rtos_ipc_memory_region: ipc-memories@a4000000 {
++ rtos_ipc_memory_region: memory@a4000000 {
+ reg = <0x00 0xa4000000 0x00 0x00800000>;
+ alignment = <0x1000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-j721e-beagleboneai64.dts b/arch/arm64/boot/dts/ti/k3-j721e-beagleboneai64.dts
+index fb899c99753ecd..bb771ce823ec18 100644
+--- a/arch/arm64/boot/dts/ti/k3-j721e-beagleboneai64.dts
++++ b/arch/arm64/boot/dts/ti/k3-j721e-beagleboneai64.dts
+@@ -51,115 +51,117 @@ secure_ddr: optee@9e800000 {
+ no-map;
+ };
+
+- mcu_r5fss0_core0_dma_memory_region: r5f-dma-memory@a0000000 {
++ mcu_r5fss0_core0_dma_memory_region: memory@a0000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_memory_region: r5f-memory@a0100000 {
++ mcu_r5fss0_core0_memory_region: memory@a0100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_dma_memory_region: r5f-dma-memory@a1000000 {
++ mcu_r5fss0_core1_dma_memory_region: memory@a1000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_memory_region: r5f-memory@a1100000 {
++ mcu_r5fss0_core1_memory_region: memory@a1100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_dma_memory_region: r5f-dma-memory@a2000000 {
++ main_r5fss0_core0_dma_memory_region: memory@a2000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_memory_region: r5f-memory@a2100000 {
++ main_r5fss0_core0_memory_region: memory@a2100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_dma_memory_region: r5f-dma-memory@a3000000 {
++ main_r5fss0_core1_dma_memory_region: memory@a3000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_memory_region: r5f-memory@a3100000 {
++ main_r5fss0_core1_memory_region: memory@a3100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_dma_memory_region: r5f-dma-memory@a4000000 {
++ main_r5fss1_core0_dma_memory_region: memory@a4000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_memory_region: r5f-memory@a4100000 {
++ main_r5fss1_core0_memory_region: memory@a4100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_dma_memory_region: r5f-dma-memory@a5000000 {
++ main_r5fss1_core1_dma_memory_region: memory@a5000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa5000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_memory_region: r5f-memory@a5100000 {
++ main_r5fss1_core1_memory_region: memory@a5100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa5100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c66_0_dma_memory_region: c66-dma-memory@a6000000 {
++ /* Carveout locations are flipped due to caching */
++ c66_1_dma_memory_region: memory@a6000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa6000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c66_0_memory_region: c66-memory@a6100000 {
++ c66_0_memory_region: memory@a6100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa6100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c66_1_dma_memory_region: c66-dma-memory@a7000000 {
++ /* Carveout locations are flipped due to caching */
++ c66_0_dma_memory_region: memory@a7000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa7000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c66_1_memory_region: c66-memory@a7100000 {
++ c66_1_memory_region: memory@a7100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa7100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c71_0_dma_memory_region: c71-dma-memory@a8000000 {
++ c71_0_dma_memory_region: memory@a8000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa8000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c71_0_memory_region: c71-memory@a8100000 {
++ c71_0_memory_region: memory@a8100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa8100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- rtos_ipc_memory_region: ipc-memories@aa000000 {
++ rtos_ipc_memory_region: memory@aa000000 {
+ reg = <0x00 0xaa000000 0x00 0x01c00000>;
+ alignment = <0x1000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-j721e-sk.dts b/arch/arm64/boot/dts/ti/k3-j721e-sk.dts
+index ffef3d1cfd5532..488c5ebe9e272a 100644
+--- a/arch/arm64/boot/dts/ti/k3-j721e-sk.dts
++++ b/arch/arm64/boot/dts/ti/k3-j721e-sk.dts
+@@ -48,115 +48,117 @@ secure_ddr: optee@9e800000 {
+ no-map;
+ };
+
+- mcu_r5fss0_core0_dma_memory_region: r5f-dma-memory@a0000000 {
++ mcu_r5fss0_core0_dma_memory_region: memory@a0000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_memory_region: r5f-memory@a0100000 {
++ mcu_r5fss0_core0_memory_region: memory@a0100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_dma_memory_region: r5f-dma-memory@a1000000 {
++ mcu_r5fss0_core1_dma_memory_region: memory@a1000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_memory_region: r5f-memory@a1100000 {
++ mcu_r5fss0_core1_memory_region: memory@a1100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_dma_memory_region: r5f-dma-memory@a2000000 {
++ main_r5fss0_core0_dma_memory_region: memory@a2000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_memory_region: r5f-memory@a2100000 {
++ main_r5fss0_core0_memory_region: memory@a2100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_dma_memory_region: r5f-dma-memory@a3000000 {
++ main_r5fss0_core1_dma_memory_region: memory@a3000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_memory_region: r5f-memory@a3100000 {
++ main_r5fss0_core1_memory_region: memory@a3100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_dma_memory_region: r5f-dma-memory@a4000000 {
++ main_r5fss1_core0_dma_memory_region: memory@a4000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_memory_region: r5f-memory@a4100000 {
++ main_r5fss1_core0_memory_region: memory@a4100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_dma_memory_region: r5f-dma-memory@a5000000 {
++ main_r5fss1_core1_dma_memory_region: memory@a5000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa5000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_memory_region: r5f-memory@a5100000 {
++ main_r5fss1_core1_memory_region: memory@a5100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa5100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c66_0_dma_memory_region: c66-dma-memory@a6000000 {
++ /* Carveout locations are flipped due to caching */
++ c66_1_dma_memory_region: memory@a6000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa6000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c66_0_memory_region: c66-memory@a6100000 {
++ c66_0_memory_region: memory@a6100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa6100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c66_1_dma_memory_region: c66-dma-memory@a7000000 {
++ /* Carveout locations are flipped due to caching */
++ c66_0_dma_memory_region: memory@a7000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa7000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c66_1_memory_region: c66-memory@a7100000 {
++ c66_1_memory_region: memory@a7100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa7100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c71_0_dma_memory_region: c71-dma-memory@a8000000 {
++ c71_0_dma_memory_region: memory@a8000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa8000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c71_0_memory_region: c71-memory@a8100000 {
++ c71_0_memory_region: memory@a8100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa8100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- rtos_ipc_memory_region: ipc-memories@aa000000 {
++ rtos_ipc_memory_region: memory@aa000000 {
+ reg = <0x00 0xaa000000 0x00 0x01c00000>;
+ alignment = <0x1000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-j721e-som-p0.dtsi b/arch/arm64/boot/dts/ti/k3-j721e-som-p0.dtsi
+index 0722f6361cc8b0..ef11a5fb6ad56b 100644
+--- a/arch/arm64/boot/dts/ti/k3-j721e-som-p0.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-j721e-som-p0.dtsi
+@@ -29,115 +29,115 @@ secure_ddr: optee@9e800000 {
+ no-map;
+ };
+
+- mcu_r5fss0_core0_dma_memory_region: r5f-dma-memory@a0000000 {
++ mcu_r5fss0_core0_dma_memory_region: memory@a0000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_memory_region: r5f-memory@a0100000 {
++ mcu_r5fss0_core0_memory_region: memory@a0100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_dma_memory_region: r5f-dma-memory@a1000000 {
++ mcu_r5fss0_core1_dma_memory_region: memory@a1000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_memory_region: r5f-memory@a1100000 {
++ mcu_r5fss0_core1_memory_region: memory@a1100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_dma_memory_region: r5f-dma-memory@a2000000 {
++ main_r5fss0_core0_dma_memory_region: memory@a2000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_memory_region: r5f-memory@a2100000 {
++ main_r5fss0_core0_memory_region: memory@a2100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_dma_memory_region: r5f-dma-memory@a3000000 {
++ main_r5fss0_core1_dma_memory_region: memory@a3000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_memory_region: r5f-memory@a3100000 {
++ main_r5fss0_core1_memory_region: memory@a3100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_dma_memory_region: r5f-dma-memory@a4000000 {
++ main_r5fss1_core0_dma_memory_region: memory@a4000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_memory_region: r5f-memory@a4100000 {
++ main_r5fss1_core0_memory_region: memory@a4100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_dma_memory_region: r5f-dma-memory@a5000000 {
++ main_r5fss1_core1_dma_memory_region: memory@a5000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa5000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_memory_region: r5f-memory@a5100000 {
++ main_r5fss1_core1_memory_region: memory@a5100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa5100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c66_1_dma_memory_region: c66-dma-memory@a6000000 {
++ c66_1_dma_memory_region: memory@a6000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa6000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c66_0_memory_region: c66-memory@a6100000 {
++ c66_0_memory_region: memory@a6100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa6100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c66_0_dma_memory_region: c66-dma-memory@a7000000 {
++ c66_0_dma_memory_region: memory@a7000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa7000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c66_1_memory_region: c66-memory@a7100000 {
++ c66_1_memory_region: memory@a7100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa7100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c71_0_dma_memory_region: c71-dma-memory@a8000000 {
++ c71_0_dma_memory_region: memory@a8000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa8000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c71_0_memory_region: c71-memory@a8100000 {
++ c71_0_memory_region: memory@a8100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa8100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- rtos_ipc_memory_region: ipc-memories@aa000000 {
++ rtos_ipc_memory_region: memory@aa000000 {
+ reg = <0x00 0xaa000000 0x00 0x01c00000>;
+ alignment = <0x1000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-j721s2-som-p0.dtsi b/arch/arm64/boot/dts/ti/k3-j721s2-som-p0.dtsi
+index 54fc5c4f8c3f52..391e8e3ac26801 100644
+--- a/arch/arm64/boot/dts/ti/k3-j721s2-som-p0.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-j721s2-som-p0.dtsi
+@@ -31,103 +31,103 @@ secure_ddr: optee@9e800000 {
+ no-map;
+ };
+
+- mcu_r5fss0_core0_dma_memory_region: r5f-dma-memory@a0000000 {
++ mcu_r5fss0_core0_dma_memory_region: memory@a0000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_memory_region: r5f-memory@a0100000 {
++ mcu_r5fss0_core0_memory_region: memory@a0100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_dma_memory_region: r5f-dma-memory@a1000000 {
++ mcu_r5fss0_core1_dma_memory_region: memory@a1000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_memory_region: r5f-memory@a1100000 {
++ mcu_r5fss0_core1_memory_region: memory@a1100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_dma_memory_region: r5f-dma-memory@a2000000 {
++ main_r5fss0_core0_dma_memory_region: memory@a2000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_memory_region: r5f-memory@a2100000 {
++ main_r5fss0_core0_memory_region: memory@a2100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_dma_memory_region: r5f-dma-memory@a3000000 {
++ main_r5fss0_core1_dma_memory_region: memory@a3000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_memory_region: r5f-memory@a3100000 {
++ main_r5fss0_core1_memory_region: memory@a3100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_dma_memory_region: r5f-dma-memory@a4000000 {
++ main_r5fss1_core0_dma_memory_region: memory@a4000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_memory_region: r5f-memory@a4100000 {
++ main_r5fss1_core0_memory_region: memory@a4100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_dma_memory_region: r5f-dma-memory@a5000000 {
++ main_r5fss1_core1_dma_memory_region: memory@a5000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa5000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_memory_region: r5f-memory@a5100000 {
++ main_r5fss1_core1_memory_region: memory@a5100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa5100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c71_0_dma_memory_region: c71-dma-memory@a6000000 {
++ c71_0_dma_memory_region: memory@a6000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa6000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c71_0_memory_region: c71-memory@a6100000 {
++ c71_0_memory_region: memory@a6100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa6100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c71_1_dma_memory_region: c71-dma-memory@a7000000 {
++ c71_1_dma_memory_region: memory@a7000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa7000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c71_1_memory_region: c71-memory@a7100000 {
++ c71_1_memory_region: memory@a7100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa7100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- rtos_ipc_memory_region: ipc-memories@a8000000 {
++ rtos_ipc_memory_region: memory@a8000000 {
+ reg = <0x00 0xa8000000 0x00 0x01c00000>;
+ alignment = <0x1000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-j722s-evm.dts b/arch/arm64/boot/dts/ti/k3-j722s-evm.dts
+index 9d8abfa9afd274..4cfe5c88e48f59 100644
+--- a/arch/arm64/boot/dts/ti/k3-j722s-evm.dts
++++ b/arch/arm64/boot/dts/ti/k3-j722s-evm.dts
+@@ -52,67 +52,67 @@ secure_ddr: optee@9e800000 {
+ no-map;
+ };
+
+- wkup_r5fss0_core0_dma_memory_region: r5f-dma-memory@a0000000 {
++ wkup_r5fss0_core0_dma_memory_region: memory@a0000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0000000 0x00 0x100000>;
+ no-map;
+ };
+
+- wkup_r5fss0_core0_memory_region: r5f-memory@a0100000 {
++ wkup_r5fss0_core0_memory_region: memory@a0100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_dma_memory_region: mcu-r5fss-dma-memory-region@a1000000 {
++ mcu_r5fss0_core0_dma_memory_region: memory@a1000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_memory_region: mcu-r5fss-memory-region@a1100000 {
++ mcu_r5fss0_core0_memory_region: memory@a1100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_dma_memory_region: main-r5fss-dma-memory-region@a2000000 {
++ main_r5fss0_core0_dma_memory_region: memory@a2000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_memory_region: main-r5fss-memory-region@a2100000 {
++ main_r5fss0_core0_memory_region: memory@a2100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c7x_0_dma_memory_region: c7x-dma-memory@a3000000 {
++ c7x_0_dma_memory_region: memory@a3000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c7x_0_memory_region: c7x-memory@a3100000 {
++ c7x_0_memory_region: memory@a3100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c7x_1_dma_memory_region: c7x-dma-memory@a4000000 {
++ c7x_1_dma_memory_region: memory@a4000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c7x_1_memory_region: c7x-memory@a4100000 {
++ c7x_1_memory_region: memory@a4100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- rtos_ipc_memory_region: ipc-memories@a5000000 {
++ rtos_ipc_memory_region: memory@a5000000 {
+ reg = <0x00 0xa5000000 0x00 0x1c00000>;
+ alignment = <0x1000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-j742s2-mcu-wakeup.dtsi b/arch/arm64/boot/dts/ti/k3-j742s2-mcu-wakeup.dtsi
+new file mode 100644
+index 00000000000000..61db2348d6a475
+--- /dev/null
++++ b/arch/arm64/boot/dts/ti/k3-j742s2-mcu-wakeup.dtsi
+@@ -0,0 +1,17 @@
++// SPDX-License-Identifier: GPL-2.0-only OR MIT
++/*
++ * Device Tree Source for J742S2 SoC Family
++ *
++ * TRM: https://www.ti.com/lit/pdf/spruje3
++ *
++ * Copyright (C) 2025 Texas Instruments Incorporated - https://www.ti.com/
++ *
++ */
++
++&mcu_r5fss0_core0 {
++ firmware-name = "j742s2-mcu-r5f0_0-fw";
++};
++
++&mcu_r5fss0_core1 {
++ firmware-name = "j742s2-mcu-r5f0_1-fw";
++};
+diff --git a/arch/arm64/boot/dts/ti/k3-j742s2.dtsi b/arch/arm64/boot/dts/ti/k3-j742s2.dtsi
+index 7a72f82f56d688..d265df1abade13 100644
+--- a/arch/arm64/boot/dts/ti/k3-j742s2.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-j742s2.dtsi
+@@ -96,3 +96,4 @@ cpu3: cpu@3 {
+ };
+
+ #include "k3-j742s2-main.dtsi"
++#include "k3-j742s2-mcu-wakeup.dtsi"
+diff --git a/arch/arm64/boot/dts/ti/k3-j784s4-evm.dts b/arch/arm64/boot/dts/ti/k3-j784s4-evm.dts
+index a84bde08f85e4a..2ed1ec6d53c880 100644
+--- a/arch/arm64/boot/dts/ti/k3-j784s4-evm.dts
++++ b/arch/arm64/boot/dts/ti/k3-j784s4-evm.dts
+@@ -28,13 +28,13 @@ reserved_memory: reserved-memory {
+ #address-cells = <2>;
+ #size-cells = <2>;
+
+- c71_3_dma_memory_region: c71-dma-memory@ab000000 {
++ c71_3_dma_memory_region: memory@ab000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xab000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c71_3_memory_region: c71-memory@ab100000 {
++ c71_3_memory_region: memory@ab100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xab100000 0x00 0xf00000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-j784s4-j742s2-evm-common.dtsi b/arch/arm64/boot/dts/ti/k3-j784s4-j742s2-evm-common.dtsi
+index fa656b7b13a1d6..877b50991ee692 100644
+--- a/arch/arm64/boot/dts/ti/k3-j784s4-j742s2-evm-common.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-j784s4-j742s2-evm-common.dtsi
+@@ -35,133 +35,133 @@ secure_ddr: optee@9e800000 {
+ no-map;
+ };
+
+- mcu_r5fss0_core0_dma_memory_region: r5f-dma-memory@a0000000 {
++ mcu_r5fss0_core0_dma_memory_region: memory@a0000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core0_memory_region: r5f-memory@a0100000 {
++ mcu_r5fss0_core0_memory_region: memory@a0100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa0100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_dma_memory_region: r5f-dma-memory@a1000000 {
++ mcu_r5fss0_core1_dma_memory_region: memory@a1000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1000000 0x00 0x100000>;
+ no-map;
+ };
+
+- mcu_r5fss0_core1_memory_region: r5f-memory@a1100000 {
++ mcu_r5fss0_core1_memory_region: memory@a1100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa1100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_dma_memory_region: r5f-dma-memory@a2000000 {
++ main_r5fss0_core0_dma_memory_region: memory@a2000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core0_memory_region: r5f-memory@a2100000 {
++ main_r5fss0_core0_memory_region: memory@a2100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa2100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_dma_memory_region: r5f-dma-memory@a3000000 {
++ main_r5fss0_core1_dma_memory_region: memory@a3000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss0_core1_memory_region: r5f-memory@a3100000 {
++ main_r5fss0_core1_memory_region: memory@a3100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa3100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_dma_memory_region: r5f-dma-memory@a4000000 {
++ main_r5fss1_core0_dma_memory_region: memory@a4000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core0_memory_region: r5f-memory@a4100000 {
++ main_r5fss1_core0_memory_region: memory@a4100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa4100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_dma_memory_region: r5f-dma-memory@a5000000 {
++ main_r5fss1_core1_dma_memory_region: memory@a5000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa5000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss1_core1_memory_region: r5f-memory@a5100000 {
++ main_r5fss1_core1_memory_region: memory@a5100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa5100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss2_core0_dma_memory_region: r5f-dma-memory@a6000000 {
++ main_r5fss2_core0_dma_memory_region: memory@a6000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa6000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss2_core0_memory_region: r5f-memory@a6100000 {
++ main_r5fss2_core0_memory_region: memory@a6100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa6100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- main_r5fss2_core1_dma_memory_region: r5f-dma-memory@a7000000 {
++ main_r5fss2_core1_dma_memory_region: memory@a7000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa7000000 0x00 0x100000>;
+ no-map;
+ };
+
+- main_r5fss2_core1_memory_region: r5f-memory@a7100000 {
++ main_r5fss2_core1_memory_region: memory@a7100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa7100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c71_0_dma_memory_region: c71-dma-memory@a8000000 {
++ c71_0_dma_memory_region: memory@a8000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa8000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c71_0_memory_region: c71-memory@a8100000 {
++ c71_0_memory_region: memory@a8100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa8100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c71_1_dma_memory_region: c71-dma-memory@a9000000 {
++ c71_1_dma_memory_region: memory@a9000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa9000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c71_1_memory_region: c71-memory@a9100000 {
++ c71_1_memory_region: memory@a9100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xa9100000 0x00 0xf00000>;
+ no-map;
+ };
+
+- c71_2_dma_memory_region: c71-dma-memory@aa000000 {
++ c71_2_dma_memory_region: memory@aa000000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xaa000000 0x00 0x100000>;
+ no-map;
+ };
+
+- c71_2_memory_region: c71-memory@aa100000 {
++ c71_2_memory_region: memory@aa100000 {
+ compatible = "shared-dma-pool";
+ reg = <0x00 0xaa100000 0x00 0xf00000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/ti/k3-pinctrl.h b/arch/arm64/boot/dts/ti/k3-pinctrl.h
+index c0f09be8d3f94a..146b780f3bd4ad 100644
+--- a/arch/arm64/boot/dts/ti/k3-pinctrl.h
++++ b/arch/arm64/boot/dts/ti/k3-pinctrl.h
+@@ -55,8 +55,8 @@
+
+ #define PIN_DS_FORCE_DISABLE (0 << FORCE_DS_EN_SHIFT)
+ #define PIN_DS_FORCE_ENABLE (1 << FORCE_DS_EN_SHIFT)
+-#define PIN_DS_IO_OVERRIDE_DISABLE (0 << DS_IO_OVERRIDE_EN_SHIFT)
+-#define PIN_DS_IO_OVERRIDE_ENABLE (1 << DS_IO_OVERRIDE_EN_SHIFT)
++#define PIN_DS_ISO_OVERRIDE_DISABLE (0 << ISO_OVERRIDE_EN_SHIFT)
++#define PIN_DS_ISO_OVERRIDE_ENABLE (1 << ISO_OVERRIDE_EN_SHIFT)
+ #define PIN_DS_OUT_ENABLE (0 << DS_OUT_DIS_SHIFT)
+ #define PIN_DS_OUT_DISABLE (1 << DS_OUT_DIS_SHIFT)
+ #define PIN_DS_OUT_VALUE_ZERO (0 << DS_OUT_VAL_SHIFT)
+diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
+index 96482a1412c6aa..fba7ca102a8c42 100644
+--- a/arch/arm64/kernel/process.c
++++ b/arch/arm64/kernel/process.c
+@@ -409,7 +409,7 @@ asmlinkage void ret_from_fork(void) asm("ret_from_fork");
+
+ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
+ {
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ unsigned long stack_start = args->stack;
+ unsigned long tls = args->tls;
+ struct pt_regs *childregs = task_pt_regs(p);
+diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
+index 52ffe115a8c47c..4ef9b7b8fb404a 100644
+--- a/arch/arm64/net/bpf_jit_comp.c
++++ b/arch/arm64/net/bpf_jit_comp.c
+@@ -3064,8 +3064,7 @@ void bpf_jit_free(struct bpf_prog *prog)
+ * before freeing it.
+ */
+ if (jit_data) {
+- bpf_arch_text_copy(&jit_data->ro_header->size, &jit_data->header->size,
+- sizeof(jit_data->header->size));
++ bpf_jit_binary_pack_finalize(jit_data->ro_header, jit_data->header);
+ kfree(jit_data);
+ }
+ prog->bpf_func -= cfi_get_offset();
+diff --git a/arch/csky/kernel/process.c b/arch/csky/kernel/process.c
+index 0c6e4b17fe00fd..a7a90340042a5f 100644
+--- a/arch/csky/kernel/process.c
++++ b/arch/csky/kernel/process.c
+@@ -32,7 +32,7 @@ void flush_thread(void){}
+
+ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
+ {
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ unsigned long usp = args->stack;
+ unsigned long tls = args->tls;
+ struct switch_stack *childstack;
+diff --git a/arch/hexagon/kernel/process.c b/arch/hexagon/kernel/process.c
+index 2a77bfd7569450..15b4992bfa298a 100644
+--- a/arch/hexagon/kernel/process.c
++++ b/arch/hexagon/kernel/process.c
+@@ -52,7 +52,7 @@ void arch_cpu_idle(void)
+ */
+ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
+ {
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ unsigned long usp = args->stack;
+ unsigned long tls = args->tls;
+ struct thread_info *ti = task_thread_info(p);
+diff --git a/arch/loongarch/kernel/process.c b/arch/loongarch/kernel/process.c
+index 3582f591bab286..efd9edf65603cc 100644
+--- a/arch/loongarch/kernel/process.c
++++ b/arch/loongarch/kernel/process.c
+@@ -167,7 +167,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
+ unsigned long childksp;
+ unsigned long tls = args->tls;
+ unsigned long usp = args->stack;
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ struct pt_regs *childregs, *regs = current_pt_regs();
+
+ childksp = (unsigned long)task_stack_page(p) + THREAD_SIZE;
+diff --git a/arch/loongarch/kernel/relocate.c b/arch/loongarch/kernel/relocate.c
+index 50c469067f3aa3..b5e2312a2fca51 100644
+--- a/arch/loongarch/kernel/relocate.c
++++ b/arch/loongarch/kernel/relocate.c
+@@ -166,6 +166,10 @@ static inline __init bool kaslr_disabled(void)
+ return true;
+ #endif
+
++ str = strstr(boot_command_line, "kexec_file");
++ if (str == boot_command_line || (str > boot_command_line && *(str - 1) == ' '))
++ return true;
++
+ return false;
+ }
+
+diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
+index abfdb6bb5c3825..99f55c0bb5e000 100644
+--- a/arch/loongarch/net/bpf_jit.c
++++ b/arch/loongarch/net/bpf_jit.c
+@@ -1294,8 +1294,10 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
+ u32 old_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
+ u32 new_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
+
+- if (!is_kernel_text((unsigned long)ip) &&
+- !is_bpf_text_address((unsigned long)ip))
++ /* Only poking bpf text is supported. Since kernel function entry
++ * is set up by ftrace, we rely on ftrace to poke kernel functions.
++ */
++ if (!is_bpf_text_address((unsigned long)ip))
+ return -ENOTSUPP;
+
+ ret = emit_jump_or_nops(old_addr, ip, old_insns, is_call);
+@@ -1448,12 +1450,43 @@ void arch_free_bpf_trampoline(void *image, unsigned int size)
+ bpf_prog_pack_free(image, size);
+ }
+
++/*
++ * Sign-extend the register if necessary
++ */
++static void sign_extend(struct jit_ctx *ctx, int rd, int rj, u8 size, bool sign)
++{
++ /* ABI requires unsigned char/short to be zero-extended */
++ if (!sign && (size == 1 || size == 2)) {
++ if (rd != rj)
++ move_reg(ctx, rd, rj);
++ return;
++ }
++
++ switch (size) {
++ case 1:
++ emit_insn(ctx, extwb, rd, rj);
++ break;
++ case 2:
++ emit_insn(ctx, extwh, rd, rj);
++ break;
++ case 4:
++ emit_insn(ctx, addiw, rd, rj, 0);
++ break;
++ case 8:
++ if (rd != rj)
++ move_reg(ctx, rd, rj);
++ break;
++ default:
++ pr_warn("bpf_jit: invalid size %d for sign_extend\n", size);
++ }
++}
++
+ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
+ const struct btf_func_model *m, struct bpf_tramp_links *tlinks,
+ void *func_addr, u32 flags)
+ {
+ int i, ret, save_ret;
+- int stack_size = 0, nargs = 0;
++ int stack_size, nargs;
+ int retval_off, args_off, nargs_off, ip_off, run_ctx_off, sreg_off, tcc_ptr_off;
+ bool is_struct_ops = flags & BPF_TRAMP_F_INDIRECT;
+ void *orig_call = func_addr;
+@@ -1462,9 +1495,6 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
+ struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
+ u32 **branches = NULL;
+
+- if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
+- return -ENOTSUPP;
+-
+ /*
+ * FP + 8 [ RA to parent func ] return address to parent
+ * function
+@@ -1495,20 +1525,23 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
+ if (m->nr_args > LOONGARCH_MAX_REG_ARGS)
+ return -ENOTSUPP;
+
++ /* FIXME: No support of struct argument */
++ for (i = 0; i < m->nr_args; i++) {
++ if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)
++ return -ENOTSUPP;
++ }
++
+ if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
+ return -ENOTSUPP;
+
+- stack_size = 0;
+-
+ /* Room of trampoline frame to store return address and frame pointer */
+- stack_size += 16;
++ stack_size = 16;
+
+ save_ret = flags & (BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_RET_FENTRY_RET);
+- if (save_ret) {
+- /* Save BPF R0 and A0 */
+- stack_size += 16;
+- retval_off = stack_size;
+- }
++ if (save_ret)
++ stack_size += 16; /* Save BPF R0 and A0 */
++
++ retval_off = stack_size;
+
+ /* Room of trampoline frame to store args */
+ nargs = m->nr_args;
+@@ -1595,7 +1628,7 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
+ orig_call += LOONGARCH_BPF_FENTRY_NBYTES;
+
+ if (flags & BPF_TRAMP_F_CALL_ORIG) {
+- move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
++ move_addr(ctx, LOONGARCH_GPR_A0, (const u64)im);
+ ret = emit_call(ctx, (const u64)__bpf_tramp_enter);
+ if (ret)
+ return ret;
+@@ -1645,7 +1678,7 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
+
+ if (flags & BPF_TRAMP_F_CALL_ORIG) {
+ im->ip_epilogue = ctx->ro_image + ctx->idx;
+- move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
++ move_addr(ctx, LOONGARCH_GPR_A0, (const u64)im);
+ ret = emit_call(ctx, (const u64)__bpf_tramp_exit);
+ if (ret)
+ goto out;
+@@ -1655,8 +1688,12 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
+ restore_args(ctx, m->nr_args, args_off);
+
+ if (save_ret) {
+- emit_insn(ctx, ldd, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
+ emit_insn(ctx, ldd, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
++ if (is_struct_ops)
++ sign_extend(ctx, LOONGARCH_GPR_A0, regmap[BPF_REG_0],
++ m->ret_size, m->ret_flags & BTF_FMODEL_SIGNED_ARG);
++ else
++ emit_insn(ctx, ldd, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
+ }
+
+ emit_insn(ctx, ldd, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
+@@ -1715,7 +1752,10 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image,
+
+ jit_fill_hole(image, (unsigned int)(ro_image_end - ro_image));
+ ret = __arch_prepare_bpf_trampoline(&ctx, im, m, tlinks, func_addr, flags);
+- if (ret > 0 && validate_code(&ctx) < 0) {
++ if (ret < 0)
++ goto out;
++
++ if (validate_code(&ctx) < 0) {
+ ret = -EINVAL;
+ goto out;
+ }
+@@ -1726,7 +1766,6 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image,
+ goto out;
+ }
+
+- bpf_flush_icache(ro_image, ro_image_end);
+ out:
+ kvfree(image);
+ return ret < 0 ? ret : size;
+@@ -1744,8 +1783,7 @@ int arch_bpf_trampoline_size(const struct btf_func_model *m, u32 flags,
+
+ ret = __arch_prepare_bpf_trampoline(&ctx, &im, m, tlinks, func_addr, flags);
+
+- /* Page align */
+- return ret < 0 ? ret : round_up(ret * LOONGARCH_INSN_SIZE, PAGE_SIZE);
++ return ret < 0 ? ret : ret * LOONGARCH_INSN_SIZE;
+ }
+
+ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
+diff --git a/arch/m68k/kernel/process.c b/arch/m68k/kernel/process.c
+index fda7eac23f872d..f5a07a70e9385a 100644
+--- a/arch/m68k/kernel/process.c
++++ b/arch/m68k/kernel/process.c
+@@ -141,7 +141,7 @@ asmlinkage int m68k_clone3(struct pt_regs *regs)
+
+ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
+ {
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ unsigned long usp = args->stack;
+ unsigned long tls = args->tls;
+ struct fork_frame {
+diff --git a/arch/microblaze/kernel/process.c b/arch/microblaze/kernel/process.c
+index 56342e11442d2a..6cbf642d7b801d 100644
+--- a/arch/microblaze/kernel/process.c
++++ b/arch/microblaze/kernel/process.c
+@@ -54,7 +54,7 @@ void flush_thread(void)
+
+ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
+ {
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ unsigned long usp = args->stack;
+ unsigned long tls = args->tls;
+ struct pt_regs *childregs = task_pt_regs(p);
+diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c
+index 02aa6a04a21da4..29191fa1801e2a 100644
+--- a/arch/mips/kernel/process.c
++++ b/arch/mips/kernel/process.c
+@@ -107,7 +107,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
+ */
+ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
+ {
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ unsigned long usp = args->stack;
+ unsigned long tls = args->tls;
+ struct thread_info *ti = task_thread_info(p);
+diff --git a/arch/nios2/kernel/process.c b/arch/nios2/kernel/process.c
+index f84021303f6a82..151404139085cf 100644
+--- a/arch/nios2/kernel/process.c
++++ b/arch/nios2/kernel/process.c
+@@ -101,7 +101,7 @@ void flush_thread(void)
+
+ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
+ {
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ unsigned long usp = args->stack;
+ unsigned long tls = args->tls;
+ struct pt_regs *childregs = task_pt_regs(p);
+diff --git a/arch/openrisc/kernel/process.c b/arch/openrisc/kernel/process.c
+index eef99fee2110cb..73ffb9fa3118bb 100644
+--- a/arch/openrisc/kernel/process.c
++++ b/arch/openrisc/kernel/process.c
+@@ -165,7 +165,7 @@ extern asmlinkage void ret_from_fork(void);
+ int
+ copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
+ {
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ unsigned long usp = args->stack;
+ unsigned long tls = args->tls;
+ struct pt_regs *userregs;
+diff --git a/arch/parisc/kernel/process.c b/arch/parisc/kernel/process.c
+index ed93bd8c154533..e64ab5d2a40d61 100644
+--- a/arch/parisc/kernel/process.c
++++ b/arch/parisc/kernel/process.c
+@@ -201,7 +201,7 @@ arch_initcall(parisc_idle_init);
+ int
+ copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
+ {
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ unsigned long usp = args->stack;
+ unsigned long tls = args->tls;
+ struct pt_regs *cregs = &(p->thread.regs);
+diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
+index 93402a1d9c9fc6..e51a595a06228a 100644
+--- a/arch/powerpc/Kconfig
++++ b/arch/powerpc/Kconfig
+@@ -971,6 +971,10 @@ config SCHED_SMT
+ when dealing with POWER5 cpus at a cost of slightly increased
+ overhead in some places. If unsure say N here.
+
++config SCHED_MC
++ def_bool y
++ depends on SMP
++
+ config PPC_DENORMALISATION
+ bool "PowerPC denormalisation exception handling"
+ depends on PPC_BOOK3S_64
+diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
+index 9753fb87217c35..a58b1029592ce2 100644
+--- a/arch/powerpc/Makefile
++++ b/arch/powerpc/Makefile
+@@ -58,7 +58,7 @@ ifeq ($(CONFIG_PPC64)$(CONFIG_LD_IS_BFD),yy)
+ # There is a corresponding test in arch/powerpc/lib/Makefile
+ KBUILD_LDFLAGS_MODULE += --save-restore-funcs
+ else
+-KBUILD_LDFLAGS_MODULE += arch/powerpc/lib/crtsavres.o
++KBUILD_LDFLAGS_MODULE += $(objtree)/arch/powerpc/lib/crtsavres.o
+ endif
+
+ ifdef CONFIG_CPU_LITTLE_ENDIAN
+diff --git a/arch/powerpc/include/asm/book3s/32/pgalloc.h b/arch/powerpc/include/asm/book3s/32/pgalloc.h
+index dd4eb306317581..f4390704d5ba29 100644
+--- a/arch/powerpc/include/asm/book3s/32/pgalloc.h
++++ b/arch/powerpc/include/asm/book3s/32/pgalloc.h
+@@ -7,8 +7,14 @@
+
+ static inline pgd_t *pgd_alloc(struct mm_struct *mm)
+ {
+- return kmem_cache_alloc(PGT_CACHE(PGD_INDEX_SIZE),
+- pgtable_gfp_flags(mm, GFP_KERNEL));
++ pgd_t *pgd = kmem_cache_alloc(PGT_CACHE(PGD_INDEX_SIZE),
++ pgtable_gfp_flags(mm, GFP_KERNEL));
++
++#ifdef CONFIG_PPC_BOOK3S_603
++ memcpy(pgd + USER_PTRS_PER_PGD, swapper_pg_dir + USER_PTRS_PER_PGD,
++ (MAX_PTRS_PER_PGD - USER_PTRS_PER_PGD) * sizeof(pgd_t));
++#endif
++ return pgd;
+ }
+
+ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
+diff --git a/arch/powerpc/include/asm/nohash/pgalloc.h b/arch/powerpc/include/asm/nohash/pgalloc.h
+index bb5f3e8ea912df..4ef780b291bc31 100644
+--- a/arch/powerpc/include/asm/nohash/pgalloc.h
++++ b/arch/powerpc/include/asm/nohash/pgalloc.h
+@@ -22,7 +22,7 @@ static inline pgd_t *pgd_alloc(struct mm_struct *mm)
+ pgd_t *pgd = kmem_cache_alloc(PGT_CACHE(PGD_INDEX_SIZE),
+ pgtable_gfp_flags(mm, GFP_KERNEL));
+
+-#if defined(CONFIG_PPC_8xx) || defined(CONFIG_PPC_BOOK3S_603)
++#ifdef CONFIG_PPC_8xx
+ memcpy(pgd + USER_PTRS_PER_PGD, swapper_pg_dir + USER_PTRS_PER_PGD,
+ (MAX_PTRS_PER_PGD - USER_PTRS_PER_PGD) * sizeof(pgd_t));
+ #endif
+diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
+index da15b5efe8071a..f19ca44512d1e8 100644
+--- a/arch/powerpc/include/asm/topology.h
++++ b/arch/powerpc/include/asm/topology.h
+@@ -131,6 +131,8 @@ static inline int cpu_to_coregroup_id(int cpu)
+ #ifdef CONFIG_SMP
+ #include <asm/cputable.h>
+
++struct cpumask *cpu_coregroup_mask(int cpu);
++
+ #ifdef CONFIG_PPC64
+ #include <asm/smp.h>
+
+diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
+index 56c5ebe21b99a4..613606400ee999 100644
+--- a/arch/powerpc/kernel/head_8xx.S
++++ b/arch/powerpc/kernel/head_8xx.S
+@@ -162,7 +162,7 @@ instruction_counter:
+ * For the MPC8xx, this is a software tablewalk to load the instruction
+ * TLB. The task switch loads the M_TWB register with the pointer to the first
+ * level table.
+- * If we discover there is no second level table (value is zero) or if there
++ * If there is no second level table (value is zero) or if there
+ * is an invalid pte, we load that into the TLB, which causes another fault
+ * into the TLB Error interrupt where we can handle such problems.
+ * We have to use the MD_xxx registers for the tablewalk because the
+@@ -183,9 +183,6 @@ instruction_counter:
+ mtspr SPRN_SPRG_SCRATCH2, r10
+ mtspr SPRN_M_TW, r11
+
+- /* If we are faulting a kernel address, we have to use the
+- * kernel page tables.
+- */
+ mfspr r10, SPRN_SRR0 /* Get effective address of fault */
+ INVALIDATE_ADJACENT_PAGES_CPU15(r10, r11)
+ mtspr SPRN_MD_EPN, r10
+@@ -228,10 +225,6 @@ instruction_counter:
+ mtspr SPRN_SPRG_SCRATCH2, r10
+ mtspr SPRN_M_TW, r11
+
+- /* If we are faulting a kernel address, we have to use the
+- * kernel page tables.
+- */
+- mfspr r10, SPRN_MD_EPN
+ mfspr r10, SPRN_M_TWB /* Get level 1 table */
+ lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r10) /* Get level 1 entry */
+
+diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
+index 126bf3b06ab7e2..0e45cac4de76b6 100644
+--- a/arch/powerpc/kernel/module_64.c
++++ b/arch/powerpc/kernel/module_64.c
+@@ -1139,7 +1139,7 @@ static int setup_ftrace_ool_stubs(const Elf64_Shdr *sechdrs, unsigned long addr,
+
+ /* reserve stubs */
+ for (i = 0; i < num_stubs; i++)
+- if (patch_u32((void *)&stub->funcdata, PPC_RAW_NOP()))
++ if (patch_u32((void *)&stub[i].funcdata, PPC_RAW_NOP()))
+ return -1;
+ #endif
+
+diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
+index 855e0988650326..eb23966ac0a9f0 100644
+--- a/arch/powerpc/kernel/process.c
++++ b/arch/powerpc/kernel/process.c
+@@ -1805,7 +1805,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
+ f = ret_from_kernel_user_thread;
+ } else {
+ struct pt_regs *regs = current_pt_regs();
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ unsigned long usp = args->stack;
+
+ /* Copy registers */
+diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
+index f59e4b9cc20743..68edb66c2964ba 100644
+--- a/arch/powerpc/kernel/smp.c
++++ b/arch/powerpc/kernel/smp.c
+@@ -1028,19 +1028,19 @@ static int powerpc_shared_proc_flags(void)
+ * We can't just pass cpu_l2_cache_mask() directly because
+ * returns a non-const pointer and the compiler barfs on that.
+ */
+-static const struct cpumask *shared_cache_mask(int cpu)
++static const struct cpumask *tl_cache_mask(struct sched_domain_topology_level *tl, int cpu)
+ {
+ return per_cpu(cpu_l2_cache_map, cpu);
+ }
+
+ #ifdef CONFIG_SCHED_SMT
+-static const struct cpumask *smallcore_smt_mask(int cpu)
++static const struct cpumask *tl_smallcore_smt_mask(struct sched_domain_topology_level *tl, int cpu)
+ {
+ return cpu_smallcore_mask(cpu);
+ }
+ #endif
+
+-static struct cpumask *cpu_coregroup_mask(int cpu)
++struct cpumask *cpu_coregroup_mask(int cpu)
+ {
+ return per_cpu(cpu_coregroup_map, cpu);
+ }
+@@ -1054,11 +1054,6 @@ static bool has_coregroup_support(void)
+ return coregroup_enabled;
+ }
+
+-static const struct cpumask *cpu_mc_mask(int cpu)
+-{
+- return cpu_coregroup_mask(cpu);
+-}
+-
+ static int __init init_big_cores(void)
+ {
+ int cpu;
+@@ -1448,7 +1443,7 @@ static bool update_mask_by_l2(int cpu, cpumask_var_t *mask)
+ return false;
+ }
+
+- cpumask_and(*mask, cpu_online_mask, cpu_cpu_mask(cpu));
++ cpumask_and(*mask, cpu_online_mask, cpu_node_mask(cpu));
+
+ /* Update l2-cache mask with all the CPUs that are part of submask */
+ or_cpumasks_related(cpu, cpu, submask_fn, cpu_l2_cache_mask);
+@@ -1538,7 +1533,7 @@ static void update_coregroup_mask(int cpu, cpumask_var_t *mask)
+ return;
+ }
+
+- cpumask_and(*mask, cpu_online_mask, cpu_cpu_mask(cpu));
++ cpumask_and(*mask, cpu_online_mask, cpu_node_mask(cpu));
+
+ /* Update coregroup mask with all the CPUs that are part of submask */
+ or_cpumasks_related(cpu, cpu, submask_fn, cpu_coregroup_mask);
+@@ -1601,7 +1596,7 @@ static void add_cpu_to_masks(int cpu)
+
+ /* If chip_id is -1; limit the cpu_core_mask to within PKG */
+ if (chip_id == -1)
+- cpumask_and(mask, mask, cpu_cpu_mask(cpu));
++ cpumask_and(mask, mask, cpu_node_mask(cpu));
+
+ for_each_cpu(i, mask) {
+ if (chip_id == cpu_to_chip_id(i)) {
+@@ -1701,22 +1696,22 @@ static void __init build_sched_topology(void)
+ if (has_big_cores) {
+ pr_info("Big cores detected but using small core scheduling\n");
+ powerpc_topology[i++] =
+- SDTL_INIT(smallcore_smt_mask, powerpc_smt_flags, SMT);
++ SDTL_INIT(tl_smallcore_smt_mask, powerpc_smt_flags, SMT);
+ } else {
+- powerpc_topology[i++] = SDTL_INIT(cpu_smt_mask, powerpc_smt_flags, SMT);
++ powerpc_topology[i++] = SDTL_INIT(tl_smt_mask, powerpc_smt_flags, SMT);
+ }
+ #endif
+ if (shared_caches) {
+ powerpc_topology[i++] =
+- SDTL_INIT(shared_cache_mask, powerpc_shared_cache_flags, CACHE);
++ SDTL_INIT(tl_cache_mask, powerpc_shared_cache_flags, CACHE);
+ }
+
+ if (has_coregroup_support()) {
+ powerpc_topology[i++] =
+- SDTL_INIT(cpu_mc_mask, powerpc_shared_proc_flags, MC);
++ SDTL_INIT(tl_mc_mask, powerpc_shared_proc_flags, MC);
+ }
+
+- powerpc_topology[i++] = SDTL_INIT(cpu_cpu_mask, powerpc_shared_proc_flags, PKG);
++ powerpc_topology[i++] = SDTL_INIT(tl_pkg_mask, powerpc_shared_proc_flags, PKG);
+
+ /* There must be one trailing NULL entry left. */
+ BUG_ON(i >= ARRAY_SIZE(powerpc_topology) - 1);
+diff --git a/arch/powerpc/kernel/trace/ftrace.c b/arch/powerpc/kernel/trace/ftrace.c
+index 6dca92d5a6e822..841d077e28251a 100644
+--- a/arch/powerpc/kernel/trace/ftrace.c
++++ b/arch/powerpc/kernel/trace/ftrace.c
+@@ -488,8 +488,10 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec)
+ return ret;
+
+ /* Set up out-of-line stub */
+- if (IS_ENABLED(CONFIG_PPC_FTRACE_OUT_OF_LINE))
+- return ftrace_init_ool_stub(mod, rec);
++ if (IS_ENABLED(CONFIG_PPC_FTRACE_OUT_OF_LINE)) {
++ ret = ftrace_init_ool_stub(mod, rec);
++ goto out;
++ }
+
+ /* Nop-out the ftrace location */
+ new = ppc_inst(PPC_RAW_NOP());
+@@ -520,6 +522,10 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec)
+ return -EINVAL;
+ }
+
++out:
++ if (!ret)
++ ret = ftrace_rec_set_nop_ops(rec);
++
+ return ret;
+ }
+
+diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
+index a0a40889d79a53..31a392993cb452 100644
+--- a/arch/riscv/kernel/process.c
++++ b/arch/riscv/kernel/process.c
+@@ -223,7 +223,7 @@ asmlinkage void ret_from_fork_user(struct pt_regs *regs)
+
+ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
+ {
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ unsigned long usp = args->stack;
+ unsigned long tls = args->tls;
+ struct pt_regs *childregs = task_pt_regs(p);
+diff --git a/arch/riscv/kvm/vmid.c b/arch/riscv/kvm/vmid.c
+index 3b426c800480c8..5f33625f407065 100644
+--- a/arch/riscv/kvm/vmid.c
++++ b/arch/riscv/kvm/vmid.c
+@@ -14,6 +14,7 @@
+ #include <linux/smp.h>
+ #include <linux/kvm_host.h>
+ #include <asm/csr.h>
++#include <asm/kvm_mmu.h>
+ #include <asm/kvm_tlb.h>
+ #include <asm/kvm_vmid.h>
+
+@@ -28,7 +29,7 @@ void __init kvm_riscv_gstage_vmid_detect(void)
+
+ /* Figure-out number of VMID bits in HW */
+ old = csr_read(CSR_HGATP);
+- csr_write(CSR_HGATP, old | HGATP_VMID);
++ csr_write(CSR_HGATP, (kvm_riscv_gstage_mode << HGATP_MODE_SHIFT) | HGATP_VMID);
+ vmid_bits = csr_read(CSR_HGATP);
+ vmid_bits = (vmid_bits & HGATP_VMID) >> HGATP_VMID_SHIFT;
+ vmid_bits = fls_long(vmid_bits);
+diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
+index 9883a55d61b5b9..f1efa4d6b27f3a 100644
+--- a/arch/riscv/net/bpf_jit_comp64.c
++++ b/arch/riscv/net/bpf_jit_comp64.c
+@@ -765,6 +765,39 @@ static int emit_atomic_rmw(u8 rd, u8 rs, const struct bpf_insn *insn,
+ return 0;
+ }
+
++/*
++ * Sign-extend the register if necessary
++ */
++static int sign_extend(u8 rd, u8 rs, u8 sz, bool sign, struct rv_jit_context *ctx)
++{
++ if (!sign && (sz == 1 || sz == 2)) {
++ if (rd != rs)
++ emit_mv(rd, rs, ctx);
++ return 0;
++ }
++
++ switch (sz) {
++ case 1:
++ emit_sextb(rd, rs, ctx);
++ break;
++ case 2:
++ emit_sexth(rd, rs, ctx);
++ break;
++ case 4:
++ emit_sextw(rd, rs, ctx);
++ break;
++ case 8:
++ if (rd != rs)
++ emit_mv(rd, rs, ctx);
++ break;
++ default:
++ pr_err("bpf-jit: invalid size %d for sign_extend\n", sz);
++ return -EINVAL;
++ }
++
++ return 0;
++}
++
+ #define BPF_FIXUP_OFFSET_MASK GENMASK(26, 0)
+ #define BPF_FIXUP_REG_MASK GENMASK(31, 27)
+ #define REG_DONT_CLEAR_MARKER 0 /* RV_REG_ZERO unused in pt_regmap */
+@@ -1226,8 +1259,15 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
+ restore_args(min_t(int, nr_arg_slots, RV_MAX_REG_ARGS), args_off, ctx);
+
+ if (save_ret) {
+- emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx);
+ emit_ld(regmap[BPF_REG_0], -(retval_off - 8), RV_REG_FP, ctx);
++ if (is_struct_ops) {
++ ret = sign_extend(RV_REG_A0, regmap[BPF_REG_0], m->ret_size,
++ m->ret_flags & BTF_FMODEL_SIGNED_ARG, ctx);
++ if (ret)
++ goto out;
++ } else {
++ emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx);
++ }
+ }
+
+ emit_ld(RV_REG_S1, -sreg_off, RV_REG_FP, ctx);
+diff --git a/arch/s390/kernel/process.c b/arch/s390/kernel/process.c
+index f55f09cda6f889..b107dbca4ed7df 100644
+--- a/arch/s390/kernel/process.c
++++ b/arch/s390/kernel/process.c
+@@ -106,7 +106,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
+
+ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
+ {
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ unsigned long new_stackp = args->stack;
+ unsigned long tls = args->tls;
+ struct fake_frame
+diff --git a/arch/s390/kernel/topology.c b/arch/s390/kernel/topology.c
+index 46569b8e47dde3..1594c80e9bc4db 100644
+--- a/arch/s390/kernel/topology.c
++++ b/arch/s390/kernel/topology.c
+@@ -509,33 +509,27 @@ int topology_cpu_init(struct cpu *cpu)
+ return rc;
+ }
+
+-static const struct cpumask *cpu_thread_mask(int cpu)
+-{
+- return &cpu_topology[cpu].thread_mask;
+-}
+-
+-
+ const struct cpumask *cpu_coregroup_mask(int cpu)
+ {
+ return &cpu_topology[cpu].core_mask;
+ }
+
+-static const struct cpumask *cpu_book_mask(int cpu)
++static const struct cpumask *tl_book_mask(struct sched_domain_topology_level *tl, int cpu)
+ {
+ return &cpu_topology[cpu].book_mask;
+ }
+
+-static const struct cpumask *cpu_drawer_mask(int cpu)
++static const struct cpumask *tl_drawer_mask(struct sched_domain_topology_level *tl, int cpu)
+ {
+ return &cpu_topology[cpu].drawer_mask;
+ }
+
+ static struct sched_domain_topology_level s390_topology[] = {
+- SDTL_INIT(cpu_thread_mask, cpu_smt_flags, SMT),
+- SDTL_INIT(cpu_coregroup_mask, cpu_core_flags, MC),
+- SDTL_INIT(cpu_book_mask, NULL, BOOK),
+- SDTL_INIT(cpu_drawer_mask, NULL, DRAWER),
+- SDTL_INIT(cpu_cpu_mask, NULL, PKG),
++ SDTL_INIT(tl_smt_mask, cpu_smt_flags, SMT),
++ SDTL_INIT(tl_mc_mask, cpu_core_flags, MC),
++ SDTL_INIT(tl_book_mask, NULL, BOOK),
++ SDTL_INIT(tl_drawer_mask, NULL, DRAWER),
++ SDTL_INIT(tl_pkg_mask, NULL, PKG),
+ { NULL, },
+ };
+
+diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
+index bb17efe29d6570..b2b8eb62b82e02 100644
+--- a/arch/s390/net/bpf_jit_comp.c
++++ b/arch/s390/net/bpf_jit_comp.c
+@@ -1790,20 +1790,21 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp,
+
+ REG_SET_SEEN(BPF_REG_5);
+ jit->seen |= SEEN_FUNC;
++
+ /*
+ * Copy the tail call counter to where the callee expects it.
+- *
+- * Note 1: The callee can increment the tail call counter, but
+- * we do not load it back, since the x86 JIT does not do this
+- * either.
+- *
+- * Note 2: We assume that the verifier does not let us call the
+- * main program, which clears the tail call counter on entry.
+ */
+- /* mvc tail_call_cnt(4,%r15),frame_off+tail_call_cnt(%r15) */
+- _EMIT6(0xd203f000 | offsetof(struct prog_frame, tail_call_cnt),
+- 0xf000 | (jit->frame_off +
+- offsetof(struct prog_frame, tail_call_cnt)));
++
++ if (insn->src_reg == BPF_PSEUDO_CALL)
++ /*
++ * mvc tail_call_cnt(4,%r15),
++ * frame_off+tail_call_cnt(%r15)
++ */
++ _EMIT6(0xd203f000 | offsetof(struct prog_frame,
++ tail_call_cnt),
++ 0xf000 | (jit->frame_off +
++ offsetof(struct prog_frame,
++ tail_call_cnt)));
+
+ /* Sign-extend the kfunc arguments. */
+ if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
+@@ -1825,6 +1826,22 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp,
+ call_r1(jit);
+ /* lgr %b0,%r2: load return value into %b0 */
+ EMIT4(0xb9040000, BPF_REG_0, REG_2);
++
++ /*
++ * Copy the potentially updated tail call counter back.
++ */
++
++ if (insn->src_reg == BPF_PSEUDO_CALL)
++ /*
++ * mvc frame_off+tail_call_cnt(%r15),
++ * tail_call_cnt(4,%r15)
++ */
++ _EMIT6(0xd203f000 | (jit->frame_off +
++ offsetof(struct prog_frame,
++ tail_call_cnt)),
++ 0xf000 | offsetof(struct prog_frame,
++ tail_call_cnt));
++
+ break;
+ }
+ case BPF_JMP | BPF_TAIL_CALL: {
+@@ -2822,6 +2839,9 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
+ /* stg %r2,retval_off(%r15) */
+ EMIT6_DISP_LH(0xe3000000, 0x0024, REG_2, REG_0, REG_15,
+ tjit->retval_off);
++ /* mvc tccnt_off(%r15),tail_call_cnt(4,%r15) */
++ _EMIT6(0xd203f000 | tjit->tccnt_off,
++ 0xf000 | offsetof(struct prog_frame, tail_call_cnt));
+
+ im->ip_after_call = jit->prg_buf + jit->prg;
+
+diff --git a/arch/sh/kernel/process_32.c b/arch/sh/kernel/process_32.c
+index 92b6649d492952..62f753a85b89c7 100644
+--- a/arch/sh/kernel/process_32.c
++++ b/arch/sh/kernel/process_32.c
+@@ -89,7 +89,7 @@ asmlinkage void ret_from_kernel_thread(void);
+
+ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
+ {
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ unsigned long usp = args->stack;
+ unsigned long tls = args->tls;
+ struct thread_info *ti = task_thread_info(p);
+diff --git a/arch/sparc/kernel/process_32.c b/arch/sparc/kernel/process_32.c
+index 9c7c662cb5659e..5a28c0e91bf15f 100644
+--- a/arch/sparc/kernel/process_32.c
++++ b/arch/sparc/kernel/process_32.c
+@@ -260,7 +260,7 @@ extern void ret_from_kernel_thread(void);
+
+ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
+ {
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ unsigned long sp = args->stack;
+ unsigned long tls = args->tls;
+ struct thread_info *ti = task_thread_info(p);
+diff --git a/arch/sparc/kernel/process_64.c b/arch/sparc/kernel/process_64.c
+index 529adfecd58ca1..25781923788a03 100644
+--- a/arch/sparc/kernel/process_64.c
++++ b/arch/sparc/kernel/process_64.c
+@@ -567,7 +567,7 @@ void fault_in_user_windows(struct pt_regs *regs)
+ */
+ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
+ {
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ unsigned long sp = args->stack;
+ unsigned long tls = args->tls;
+ struct thread_info *t = task_thread_info(p);
+diff --git a/arch/sparc/lib/M7memcpy.S b/arch/sparc/lib/M7memcpy.S
+index cbd42ea7c3f7c2..99357bfa8e82ad 100644
+--- a/arch/sparc/lib/M7memcpy.S
++++ b/arch/sparc/lib/M7memcpy.S
+@@ -696,16 +696,16 @@ FUNC_NAME:
+ EX_LD_FP(LOAD(ldd, %o4+40, %f26), memcpy_retl_o2_plus_o5_plus_40)
+ faligndata %f24, %f26, %f10
+ EX_ST_FP(STORE(std, %f6, %o0+24), memcpy_retl_o2_plus_o5_plus_40)
+- EX_LD_FP(LOAD(ldd, %o4+48, %f28), memcpy_retl_o2_plus_o5_plus_40)
++ EX_LD_FP(LOAD(ldd, %o4+48, %f28), memcpy_retl_o2_plus_o5_plus_32)
+ faligndata %f26, %f28, %f12
+- EX_ST_FP(STORE(std, %f8, %o0+32), memcpy_retl_o2_plus_o5_plus_40)
++ EX_ST_FP(STORE(std, %f8, %o0+32), memcpy_retl_o2_plus_o5_plus_32)
+ add %o4, 64, %o4
+- EX_LD_FP(LOAD(ldd, %o4-8, %f30), memcpy_retl_o2_plus_o5_plus_40)
++ EX_LD_FP(LOAD(ldd, %o4-8, %f30), memcpy_retl_o2_plus_o5_plus_24)
+ faligndata %f28, %f30, %f14
+- EX_ST_FP(STORE(std, %f10, %o0+40), memcpy_retl_o2_plus_o5_plus_40)
+- EX_ST_FP(STORE(std, %f12, %o0+48), memcpy_retl_o2_plus_o5_plus_40)
++ EX_ST_FP(STORE(std, %f10, %o0+40), memcpy_retl_o2_plus_o5_plus_24)
++ EX_ST_FP(STORE(std, %f12, %o0+48), memcpy_retl_o2_plus_o5_plus_16)
+ add %o0, 64, %o0
+- EX_ST_FP(STORE(std, %f14, %o0-8), memcpy_retl_o2_plus_o5_plus_40)
++ EX_ST_FP(STORE(std, %f14, %o0-8), memcpy_retl_o2_plus_o5_plus_8)
+ fsrc2 %f30, %f14
+ bgu,pt %xcc, .Lunalign_sloop
+ prefetch [%o4 + (8 * BLOCK_SIZE)], 20
+@@ -728,7 +728,7 @@ FUNC_NAME:
+ add %o4, 8, %o4
+ faligndata %f0, %f2, %f16
+ subcc %o5, 8, %o5
+- EX_ST_FP(STORE(std, %f16, %o0), memcpy_retl_o2_plus_o5)
++ EX_ST_FP(STORE(std, %f16, %o0), memcpy_retl_o2_plus_o5_plus_8)
+ fsrc2 %f2, %f0
+ bgu,pt %xcc, .Lunalign_by8
+ add %o0, 8, %o0
+@@ -772,7 +772,7 @@ FUNC_NAME:
+ subcc %o5, 0x20, %o5
+ EX_ST(STORE(stx, %o3, %o0 + 0x00), memcpy_retl_o2_plus_o5_plus_32)
+ EX_ST(STORE(stx, %g2, %o0 + 0x08), memcpy_retl_o2_plus_o5_plus_24)
+- EX_ST(STORE(stx, %g7, %o0 + 0x10), memcpy_retl_o2_plus_o5_plus_24)
++ EX_ST(STORE(stx, %g7, %o0 + 0x10), memcpy_retl_o2_plus_o5_plus_16)
+ EX_ST(STORE(stx, %o4, %o0 + 0x18), memcpy_retl_o2_plus_o5_plus_8)
+ bne,pt %xcc, 1b
+ add %o0, 0x20, %o0
+@@ -804,12 +804,12 @@ FUNC_NAME:
+ brz,pt %o3, 2f
+ sub %o2, %o3, %o2
+
+-1: EX_LD(LOAD(ldub, %o1 + 0x00, %g2), memcpy_retl_o2_plus_g1)
++1: EX_LD(LOAD(ldub, %o1 + 0x00, %g2), memcpy_retl_o2_plus_o3)
+ add %o1, 1, %o1
+ subcc %o3, 1, %o3
+ add %o0, 1, %o0
+ bne,pt %xcc, 1b
+- EX_ST(STORE(stb, %g2, %o0 - 0x01), memcpy_retl_o2_plus_g1_plus_1)
++ EX_ST(STORE(stb, %g2, %o0 - 0x01), memcpy_retl_o2_plus_o3_plus_1)
+ 2:
+ and %o1, 0x7, %o3
+ brz,pn %o3, .Lmedium_noprefetch_cp
+diff --git a/arch/sparc/lib/Memcpy_utils.S b/arch/sparc/lib/Memcpy_utils.S
+index 64fbac28b3db18..207343367bb2da 100644
+--- a/arch/sparc/lib/Memcpy_utils.S
++++ b/arch/sparc/lib/Memcpy_utils.S
+@@ -137,6 +137,15 @@ ENTRY(memcpy_retl_o2_plus_63_8)
+ ba,pt %xcc, __restore_asi
+ add %o2, 8, %o0
+ ENDPROC(memcpy_retl_o2_plus_63_8)
++ENTRY(memcpy_retl_o2_plus_o3)
++ ba,pt %xcc, __restore_asi
++ add %o2, %o3, %o0
++ENDPROC(memcpy_retl_o2_plus_o3)
++ENTRY(memcpy_retl_o2_plus_o3_plus_1)
++ add %o3, 1, %o3
++ ba,pt %xcc, __restore_asi
++ add %o2, %o3, %o0
++ENDPROC(memcpy_retl_o2_plus_o3_plus_1)
+ ENTRY(memcpy_retl_o2_plus_o5)
+ ba,pt %xcc, __restore_asi
+ add %o2, %o5, %o0
+diff --git a/arch/sparc/lib/NG4memcpy.S b/arch/sparc/lib/NG4memcpy.S
+index 7ad58ebe0d0096..df0ec1bd194892 100644
+--- a/arch/sparc/lib/NG4memcpy.S
++++ b/arch/sparc/lib/NG4memcpy.S
+@@ -281,7 +281,7 @@ FUNC_NAME: /* %o0=dst, %o1=src, %o2=len */
+ subcc %o5, 0x20, %o5
+ EX_ST(STORE(stx, %g1, %o0 + 0x00), memcpy_retl_o2_plus_o5_plus_32)
+ EX_ST(STORE(stx, %g2, %o0 + 0x08), memcpy_retl_o2_plus_o5_plus_24)
+- EX_ST(STORE(stx, GLOBAL_SPARE, %o0 + 0x10), memcpy_retl_o2_plus_o5_plus_24)
++ EX_ST(STORE(stx, GLOBAL_SPARE, %o0 + 0x10), memcpy_retl_o2_plus_o5_plus_16)
+ EX_ST(STORE(stx, %o4, %o0 + 0x18), memcpy_retl_o2_plus_o5_plus_8)
+ bne,pt %icc, 1b
+ add %o0, 0x20, %o0
+diff --git a/arch/sparc/lib/NGmemcpy.S b/arch/sparc/lib/NGmemcpy.S
+index ee51c12306894e..bbd3ea0a64822c 100644
+--- a/arch/sparc/lib/NGmemcpy.S
++++ b/arch/sparc/lib/NGmemcpy.S
+@@ -79,8 +79,8 @@
+ #ifndef EX_RETVAL
+ #define EX_RETVAL(x) x
+ __restore_asi:
+- ret
+ wr %g0, ASI_AIUS, %asi
++ ret
+ restore
+ ENTRY(NG_ret_i2_plus_i4_plus_1)
+ ba,pt %xcc, __restore_asi
+@@ -125,15 +125,16 @@ ENTRY(NG_ret_i2_plus_g1_minus_56)
+ ba,pt %xcc, __restore_asi
+ add %i2, %g1, %i0
+ ENDPROC(NG_ret_i2_plus_g1_minus_56)
+-ENTRY(NG_ret_i2_plus_i4)
++ENTRY(NG_ret_i2_plus_i4_plus_16)
++ add %i4, 16, %i4
+ ba,pt %xcc, __restore_asi
+ add %i2, %i4, %i0
+-ENDPROC(NG_ret_i2_plus_i4)
+-ENTRY(NG_ret_i2_plus_i4_minus_8)
+- sub %i4, 8, %i4
++ENDPROC(NG_ret_i2_plus_i4_plus_16)
++ENTRY(NG_ret_i2_plus_i4_plus_8)
++ add %i4, 8, %i4
+ ba,pt %xcc, __restore_asi
+ add %i2, %i4, %i0
+-ENDPROC(NG_ret_i2_plus_i4_minus_8)
++ENDPROC(NG_ret_i2_plus_i4_plus_8)
+ ENTRY(NG_ret_i2_plus_8)
+ ba,pt %xcc, __restore_asi
+ add %i2, 8, %i0
+@@ -160,6 +161,12 @@ ENTRY(NG_ret_i2_and_7_plus_i4)
+ ba,pt %xcc, __restore_asi
+ add %i2, %i4, %i0
+ ENDPROC(NG_ret_i2_and_7_plus_i4)
++ENTRY(NG_ret_i2_and_7_plus_i4_plus_8)
++ and %i2, 7, %i2
++ add %i4, 8, %i4
++ ba,pt %xcc, __restore_asi
++ add %i2, %i4, %i0
++ENDPROC(NG_ret_i2_and_7_plus_i4)
+ #endif
+
+ .align 64
+@@ -405,13 +412,13 @@ FUNC_NAME: /* %i0=dst, %i1=src, %i2=len */
+ andn %i2, 0xf, %i4
+ and %i2, 0xf, %i2
+ 1: subcc %i4, 0x10, %i4
+- EX_LD(LOAD(ldx, %i1, %o4), NG_ret_i2_plus_i4)
++ EX_LD(LOAD(ldx, %i1, %o4), NG_ret_i2_plus_i4_plus_16)
+ add %i1, 0x08, %i1
+- EX_LD(LOAD(ldx, %i1, %g1), NG_ret_i2_plus_i4)
++ EX_LD(LOAD(ldx, %i1, %g1), NG_ret_i2_plus_i4_plus_16)
+ sub %i1, 0x08, %i1
+- EX_ST(STORE(stx, %o4, %i1 + %i3), NG_ret_i2_plus_i4)
++ EX_ST(STORE(stx, %o4, %i1 + %i3), NG_ret_i2_plus_i4_plus_16)
+ add %i1, 0x8, %i1
+- EX_ST(STORE(stx, %g1, %i1 + %i3), NG_ret_i2_plus_i4_minus_8)
++ EX_ST(STORE(stx, %g1, %i1 + %i3), NG_ret_i2_plus_i4_plus_8)
+ bgu,pt %XCC, 1b
+ add %i1, 0x8, %i1
+ 73: andcc %i2, 0x8, %g0
+@@ -468,7 +475,7 @@ FUNC_NAME: /* %i0=dst, %i1=src, %i2=len */
+ subcc %i4, 0x8, %i4
+ srlx %g3, %i3, %i5
+ or %i5, %g2, %i5
+- EX_ST(STORE(stx, %i5, %o0), NG_ret_i2_and_7_plus_i4)
++ EX_ST(STORE(stx, %i5, %o0), NG_ret_i2_and_7_plus_i4_plus_8)
+ add %o0, 0x8, %o0
+ bgu,pt %icc, 1b
+ sllx %g3, %g1, %g2
+diff --git a/arch/sparc/lib/U1memcpy.S b/arch/sparc/lib/U1memcpy.S
+index 635398ec7540ee..154fbd35400ca8 100644
+--- a/arch/sparc/lib/U1memcpy.S
++++ b/arch/sparc/lib/U1memcpy.S
+@@ -164,17 +164,18 @@ ENTRY(U1_gs_40_fp)
+ retl
+ add %o0, %o2, %o0
+ ENDPROC(U1_gs_40_fp)
+-ENTRY(U1_g3_0_fp)
+- VISExitHalf
+- retl
+- add %g3, %o2, %o0
+-ENDPROC(U1_g3_0_fp)
+ ENTRY(U1_g3_8_fp)
+ VISExitHalf
+ add %g3, 8, %g3
+ retl
+ add %g3, %o2, %o0
+ ENDPROC(U1_g3_8_fp)
++ENTRY(U1_g3_16_fp)
++ VISExitHalf
++ add %g3, 16, %g3
++ retl
++ add %g3, %o2, %o0
++ENDPROC(U1_g3_16_fp)
+ ENTRY(U1_o2_0_fp)
+ VISExitHalf
+ retl
+@@ -547,18 +548,18 @@ FUNC_NAME: /* %o0=dst, %o1=src, %o2=len */
+ 62: FINISH_VISCHUNK(o0, f44, f46)
+ 63: UNEVEN_VISCHUNK_LAST(o0, f46, f0)
+
+-93: EX_LD_FP(LOAD(ldd, %o1, %f2), U1_g3_0_fp)
++93: EX_LD_FP(LOAD(ldd, %o1, %f2), U1_g3_8_fp)
+ add %o1, 8, %o1
+ subcc %g3, 8, %g3
+ faligndata %f0, %f2, %f8
+- EX_ST_FP(STORE(std, %f8, %o0), U1_g3_8_fp)
++ EX_ST_FP(STORE(std, %f8, %o0), U1_g3_16_fp)
+ bl,pn %xcc, 95f
+ add %o0, 8, %o0
+- EX_LD_FP(LOAD(ldd, %o1, %f0), U1_g3_0_fp)
++ EX_LD_FP(LOAD(ldd, %o1, %f0), U1_g3_8_fp)
+ add %o1, 8, %o1
+ subcc %g3, 8, %g3
+ faligndata %f2, %f0, %f8
+- EX_ST_FP(STORE(std, %f8, %o0), U1_g3_8_fp)
++ EX_ST_FP(STORE(std, %f8, %o0), U1_g3_16_fp)
+ bge,pt %xcc, 93b
+ add %o0, 8, %o0
+
+diff --git a/arch/sparc/lib/U3memcpy.S b/arch/sparc/lib/U3memcpy.S
+index 9248d59c734ce2..bace3a18f836f1 100644
+--- a/arch/sparc/lib/U3memcpy.S
++++ b/arch/sparc/lib/U3memcpy.S
+@@ -267,6 +267,7 @@ FUNC_NAME: /* %o0=dst, %o1=src, %o2=len */
+ faligndata %f10, %f12, %f26
+ EX_LD_FP(LOAD(ldd, %o1 + 0x040, %f0), U3_retl_o2)
+
++ and %o2, 0x3f, %o2
+ subcc GLOBAL_SPARE, 0x80, GLOBAL_SPARE
+ add %o1, 0x40, %o1
+ bgu,pt %XCC, 1f
+@@ -336,7 +337,6 @@ FUNC_NAME: /* %o0=dst, %o1=src, %o2=len */
+ * Also notice how this code is careful not to perform a
+ * load past the end of the src buffer.
+ */
+- and %o2, 0x3f, %o2
+ andcc %o2, 0x38, %g2
+ be,pn %XCC, 2f
+ subcc %g2, 0x8, %g2
+diff --git a/arch/um/kernel/process.c b/arch/um/kernel/process.c
+index 1be644de9e41ec..9c9c66dc45f054 100644
+--- a/arch/um/kernel/process.c
++++ b/arch/um/kernel/process.c
+@@ -143,7 +143,7 @@ static void fork_handler(void)
+
+ int copy_thread(struct task_struct * p, const struct kernel_clone_args *args)
+ {
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ unsigned long sp = args->stack;
+ unsigned long tls = args->tls;
+ void (*handler)(void);
+diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
+index 61da6b8a3d519f..cbac54cb3a9ec5 100644
+--- a/arch/x86/events/intel/bts.c
++++ b/arch/x86/events/intel/bts.c
+@@ -643,4 +643,4 @@ static __init int bts_init(void)
+
+ return perf_pmu_register(&bts_pmu, "intel_bts", -1);
+ }
+-arch_initcall(bts_init);
++early_initcall(bts_init);
+diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
+index c2fb729c270ec4..15da60cf69f20c 100644
+--- a/arch/x86/events/intel/core.c
++++ b/arch/x86/events/intel/core.c
+@@ -2997,7 +2997,8 @@ static void intel_pmu_acr_late_setup(struct cpu_hw_events *cpuc)
+ if (event->group_leader != leader->group_leader)
+ break;
+ for_each_set_bit(idx, (unsigned long *)&event->attr.config2, X86_PMC_IDX_MAX) {
+- if (WARN_ON_ONCE(i + idx > cpuc->n_events))
++ if (i + idx >= cpuc->n_events ||
++ !is_acr_event_group(cpuc->event_list[i + idx]))
+ return;
+ __set_bit(cpuc->assign[i + idx], (unsigned long *)&event->hw.config1);
+ }
+diff --git a/arch/x86/include/asm/fpu/sched.h b/arch/x86/include/asm/fpu/sched.h
+index c060549c6c9407..89004f4ca208da 100644
+--- a/arch/x86/include/asm/fpu/sched.h
++++ b/arch/x86/include/asm/fpu/sched.h
+@@ -11,7 +11,7 @@
+
+ extern void save_fpregs_to_fpstate(struct fpu *fpu);
+ extern void fpu__drop(struct task_struct *tsk);
+-extern int fpu_clone(struct task_struct *dst, unsigned long clone_flags, bool minimal,
++extern int fpu_clone(struct task_struct *dst, u64 clone_flags, bool minimal,
+ unsigned long shstk_addr);
+ extern void fpu_flush_thread(void);
+
+diff --git a/arch/x86/include/asm/segment.h b/arch/x86/include/asm/segment.h
+index 77d8f49b92bdd0..f59ae7186940a9 100644
+--- a/arch/x86/include/asm/segment.h
++++ b/arch/x86/include/asm/segment.h
+@@ -244,7 +244,7 @@ static inline unsigned long vdso_encode_cpunode(int cpu, unsigned long node)
+
+ static inline void vdso_read_cpunode(unsigned *cpu, unsigned *node)
+ {
+- unsigned int p;
++ unsigned long p;
+
+ /*
+ * Load CPU and node number from the GDT. LSL is faster than RDTSCP
+@@ -254,10 +254,10 @@ static inline void vdso_read_cpunode(unsigned *cpu, unsigned *node)
+ *
+ * If RDPID is available, use it.
+ */
+- alternative_io ("lsl %[seg],%[p]",
+- ".byte 0xf3,0x0f,0xc7,0xf8", /* RDPID %eax/rax */
++ alternative_io ("lsl %[seg],%k[p]",
++ "rdpid %[p]",
+ X86_FEATURE_RDPID,
+- [p] "=a" (p), [seg] "r" (__CPUNODE_SEG));
++ [p] "=r" (p), [seg] "r" (__CPUNODE_SEG));
+
+ if (cpu)
+ *cpu = (p & VDSO_CPUNODE_MASK);
+diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h
+index ba6f2fe438488d..0f50e01259430c 100644
+--- a/arch/x86/include/asm/shstk.h
++++ b/arch/x86/include/asm/shstk.h
+@@ -16,7 +16,7 @@ struct thread_shstk {
+
+ long shstk_prctl(struct task_struct *task, int option, unsigned long arg2);
+ void reset_thread_features(void);
+-unsigned long shstk_alloc_thread_stack(struct task_struct *p, unsigned long clone_flags,
++unsigned long shstk_alloc_thread_stack(struct task_struct *p, u64 clone_flags,
+ unsigned long stack_size);
+ void shstk_free(struct task_struct *p);
+ int setup_signal_shadow_stack(struct ksignal *ksig);
+@@ -28,7 +28,7 @@ static inline long shstk_prctl(struct task_struct *task, int option,
+ unsigned long arg2) { return -EINVAL; }
+ static inline void reset_thread_features(void) {}
+ static inline unsigned long shstk_alloc_thread_stack(struct task_struct *p,
+- unsigned long clone_flags,
++ u64 clone_flags,
+ unsigned long stack_size) { return 0; }
+ static inline void shstk_free(struct task_struct *p) {}
+ static inline int setup_signal_shadow_stack(struct ksignal *ksig) { return 0; }
+diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
+index aefd412a23dc24..1f71cc135e9ade 100644
+--- a/arch/x86/kernel/fpu/core.c
++++ b/arch/x86/kernel/fpu/core.c
+@@ -631,7 +631,7 @@ static int update_fpu_shstk(struct task_struct *dst, unsigned long ssp)
+ }
+
+ /* Clone current's FPU state on fork */
+-int fpu_clone(struct task_struct *dst, unsigned long clone_flags, bool minimal,
++int fpu_clone(struct task_struct *dst, u64 clone_flags, bool minimal,
+ unsigned long ssp)
+ {
+ /*
+diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
+index 1b7960cf6eb0c1..e3a3987b0c4fb6 100644
+--- a/arch/x86/kernel/process.c
++++ b/arch/x86/kernel/process.c
+@@ -159,7 +159,7 @@ __visible void ret_from_fork(struct task_struct *prev, struct pt_regs *regs,
+
+ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
+ {
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ unsigned long sp = args->stack;
+ unsigned long tls = args->tls;
+ struct inactive_task_frame *frame;
+diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
+index 2ddf23387c7ef7..5eba6c5a67757d 100644
+--- a/arch/x86/kernel/shstk.c
++++ b/arch/x86/kernel/shstk.c
+@@ -191,7 +191,7 @@ void reset_thread_features(void)
+ current->thread.features_locked = 0;
+ }
+
+-unsigned long shstk_alloc_thread_stack(struct task_struct *tsk, unsigned long clone_flags,
++unsigned long shstk_alloc_thread_stack(struct task_struct *tsk, u64 clone_flags,
+ unsigned long stack_size)
+ {
+ struct thread_shstk *shstk = &tsk->thread.shstk;
+diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
+index 33e166f6ab1224..eb289abece2370 100644
+--- a/arch/x86/kernel/smpboot.c
++++ b/arch/x86/kernel/smpboot.c
+@@ -479,14 +479,14 @@ static int x86_cluster_flags(void)
+ static bool x86_has_numa_in_package;
+
+ static struct sched_domain_topology_level x86_topology[] = {
+- SDTL_INIT(cpu_smt_mask, cpu_smt_flags, SMT),
++ SDTL_INIT(tl_smt_mask, cpu_smt_flags, SMT),
+ #ifdef CONFIG_SCHED_CLUSTER
+- SDTL_INIT(cpu_clustergroup_mask, x86_cluster_flags, CLS),
++ SDTL_INIT(tl_cls_mask, x86_cluster_flags, CLS),
+ #endif
+ #ifdef CONFIG_SCHED_MC
+- SDTL_INIT(cpu_coregroup_mask, x86_core_flags, MC),
++ SDTL_INIT(tl_mc_mask, x86_core_flags, MC),
+ #endif
+- SDTL_INIT(cpu_cpu_mask, x86_sched_itmt_flags, PKG),
++ SDTL_INIT(tl_pkg_mask, x86_sched_itmt_flags, PKG),
+ { NULL },
+ };
+
+diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
+index 1bfebe40854f49..c813d6cce69ff3 100644
+--- a/arch/x86/kvm/svm/svm.c
++++ b/arch/x86/kvm/svm/svm.c
+@@ -4180,13 +4180,21 @@ static int svm_vcpu_pre_run(struct kvm_vcpu *vcpu)
+ static fastpath_t svm_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
+ {
+ struct vcpu_svm *svm = to_svm(vcpu);
++ struct vmcb_control_area *control = &svm->vmcb->control;
++
++ /*
++ * Next RIP must be provided as IRQs are disabled, and accessing guest
++ * memory to decode the instruction might fault, i.e. might sleep.
++ */
++ if (!nrips || !control->next_rip)
++ return EXIT_FASTPATH_NONE;
+
+ if (is_guest_mode(vcpu))
+ return EXIT_FASTPATH_NONE;
+
+- switch (svm->vmcb->control.exit_code) {
++ switch (control->exit_code) {
+ case SVM_EXIT_MSR:
+- if (!svm->vmcb->control.exit_info_1)
++ if (!control->exit_info_1)
+ break;
+ return handle_fastpath_set_msr_irqoff(vcpu);
+ case SVM_EXIT_HLT:
+diff --git a/arch/xtensa/kernel/process.c b/arch/xtensa/kernel/process.c
+index 7bd66677f7b6de..94d43f44be1315 100644
+--- a/arch/xtensa/kernel/process.c
++++ b/arch/xtensa/kernel/process.c
+@@ -267,7 +267,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
+
+ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
+ {
+- unsigned long clone_flags = args->flags;
++ u64 clone_flags = args->flags;
+ unsigned long usp_thread_fn = args->stack;
+ unsigned long tls = args->tls;
+ struct pt_regs *childregs = task_pt_regs(p);
+diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
+index 50e51047e1fe56..4a8d3d96bfe492 100644
+--- a/block/bfq-iosched.c
++++ b/block/bfq-iosched.c
+@@ -7109,9 +7109,10 @@ void bfq_put_async_queues(struct bfq_data *bfqd, struct bfq_group *bfqg)
+ * See the comments on bfq_limit_depth for the purpose of
+ * the depths set in the function. Return minimum shallow depth we'll use.
+ */
+-static void bfq_update_depths(struct bfq_data *bfqd, struct sbitmap_queue *bt)
++static void bfq_depth_updated(struct request_queue *q)
+ {
+- unsigned int nr_requests = bfqd->queue->nr_requests;
++ struct bfq_data *bfqd = q->elevator->elevator_data;
++ unsigned int nr_requests = q->nr_requests;
+
+ /*
+ * In-word depths if no bfq_queue is being weight-raised:
+@@ -7143,21 +7144,8 @@ static void bfq_update_depths(struct bfq_data *bfqd, struct sbitmap_queue *bt)
+ bfqd->async_depths[1][0] = max((nr_requests * 3) >> 4, 1U);
+ /* no more than ~37% of tags for sync writes (~20% extra tags) */
+ bfqd->async_depths[1][1] = max((nr_requests * 6) >> 4, 1U);
+-}
+-
+-static void bfq_depth_updated(struct blk_mq_hw_ctx *hctx)
+-{
+- struct bfq_data *bfqd = hctx->queue->elevator->elevator_data;
+- struct blk_mq_tags *tags = hctx->sched_tags;
+
+- bfq_update_depths(bfqd, &tags->bitmap_tags);
+- sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, 1);
+-}
+-
+-static int bfq_init_hctx(struct blk_mq_hw_ctx *hctx, unsigned int index)
+-{
+- bfq_depth_updated(hctx);
+- return 0;
++ blk_mq_set_min_shallow_depth(q, 1);
+ }
+
+ static void bfq_exit_queue(struct elevator_queue *e)
+@@ -7369,6 +7357,7 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_queue *eq)
+ goto out_free;
+ bfq_init_root_group(bfqd->root_group, bfqd);
+ bfq_init_entity(&bfqd->oom_bfqq.entity, bfqd->root_group);
++ bfq_depth_updated(q);
+
+ /* We dispatch from request queue wide instead of hw queue */
+ blk_queue_flag_set(QUEUE_FLAG_SQ_SCHED, q);
+@@ -7628,7 +7617,6 @@ static struct elevator_type iosched_bfq_mq = {
+ .request_merged = bfq_request_merged,
+ .has_work = bfq_has_work,
+ .depth_updated = bfq_depth_updated,
+- .init_hctx = bfq_init_hctx,
+ .init_sched = bfq_init_queue,
+ .exit_sched = bfq_exit_queue,
+ },
+diff --git a/block/bio.c b/block/bio.c
+index 3b371a5da159e9..1904683f7ab054 100644
+--- a/block/bio.c
++++ b/block/bio.c
+@@ -261,7 +261,7 @@ void bio_init(struct bio *bio, struct block_device *bdev, struct bio_vec *table,
+ bio->bi_private = NULL;
+ #ifdef CONFIG_BLK_CGROUP
+ bio->bi_blkg = NULL;
+- bio->bi_issue.value = 0;
++ bio->issue_time_ns = 0;
+ if (bdev)
+ bio_associate_blkg(bio);
+ #ifdef CONFIG_BLK_CGROUP_IOCOST
+diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
+index fe9ebd6a2e14d1..7246fc2563152c 100644
+--- a/block/blk-cgroup.c
++++ b/block/blk-cgroup.c
+@@ -110,12 +110,6 @@ static struct cgroup_subsys_state *blkcg_css(void)
+ return task_css(current, io_cgrp_id);
+ }
+
+-static bool blkcg_policy_enabled(struct request_queue *q,
+- const struct blkcg_policy *pol)
+-{
+- return pol && test_bit(pol->plid, q->blkcg_pols);
+-}
+-
+ static void blkg_free_workfn(struct work_struct *work)
+ {
+ struct blkcg_gq *blkg = container_of(work, struct blkcg_gq,
+diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
+index 81868ad86330cf..1cce3294634d18 100644
+--- a/block/blk-cgroup.h
++++ b/block/blk-cgroup.h
+@@ -370,11 +370,6 @@ static inline void blkg_put(struct blkcg_gq *blkg)
+ if (((d_blkg) = blkg_lookup(css_to_blkcg(pos_css), \
+ (p_blkg)->q)))
+
+-static inline void blkcg_bio_issue_init(struct bio *bio)
+-{
+- bio_issue_init(&bio->bi_issue, bio_sectors(bio));
+-}
+-
+ static inline void blkcg_use_delay(struct blkcg_gq *blkg)
+ {
+ if (WARN_ON_ONCE(atomic_read(&blkg->use_delay) < 0))
+@@ -459,6 +454,12 @@ static inline bool blk_cgroup_mergeable(struct request *rq, struct bio *bio)
+ bio_issue_as_root_blkg(rq->bio) == bio_issue_as_root_blkg(bio);
+ }
+
++static inline bool blkcg_policy_enabled(struct request_queue *q,
++ const struct blkcg_policy *pol)
++{
++ return pol && test_bit(pol->plid, q->blkcg_pols);
++}
++
+ void blk_cgroup_bio_start(struct bio *bio);
+ void blkcg_add_delay(struct blkcg_gq *blkg, u64 now, u64 delta);
+ #else /* CONFIG_BLK_CGROUP */
+@@ -491,7 +492,6 @@ static inline struct blkg_policy_data *blkg_to_pd(struct blkcg_gq *blkg,
+ static inline struct blkcg_gq *pd_to_blkg(struct blkg_policy_data *pd) { return NULL; }
+ static inline void blkg_get(struct blkcg_gq *blkg) { }
+ static inline void blkg_put(struct blkcg_gq *blkg) { }
+-static inline void blkcg_bio_issue_init(struct bio *bio) { }
+ static inline void blk_cgroup_bio_start(struct bio *bio) { }
+ static inline bool blk_cgroup_mergeable(struct request *rq, struct bio *bio) { return true; }
+
+diff --git a/block/blk-core.c b/block/blk-core.c
+index a27185cd8edead..14ae73eebe0d7b 100644
+--- a/block/blk-core.c
++++ b/block/blk-core.c
+@@ -539,7 +539,7 @@ static inline void bio_check_ro(struct bio *bio)
+ }
+ }
+
+-static noinline int should_fail_bio(struct bio *bio)
++int should_fail_bio(struct bio *bio)
+ {
+ if (should_fail_request(bdev_whole(bio->bi_bdev), bio->bi_iter.bi_size))
+ return -EIO;
+@@ -727,10 +727,9 @@ static void __submit_bio_noacct_mq(struct bio *bio)
+ current->bio_list = NULL;
+ }
+
+-void submit_bio_noacct_nocheck(struct bio *bio)
++void submit_bio_noacct_nocheck(struct bio *bio, bool split)
+ {
+ blk_cgroup_bio_start(bio);
+- blkcg_bio_issue_init(bio);
+
+ if (!bio_flagged(bio, BIO_TRACE_COMPLETION)) {
+ trace_block_bio_queue(bio);
+@@ -747,12 +746,16 @@ void submit_bio_noacct_nocheck(struct bio *bio)
+ * to collect a list of requests submited by a ->submit_bio method while
+ * it is active, and then process them after it returned.
+ */
+- if (current->bio_list)
+-		bio_list_add(&current->bio_list[0], bio);
+- else if (!bdev_test_flag(bio->bi_bdev, BD_HAS_SUBMIT_BIO))
++ if (current->bio_list) {
++ if (split)
++			bio_list_add_head(&current->bio_list[0], bio);
++ else
++			bio_list_add(&current->bio_list[0], bio);
++ } else if (!bdev_test_flag(bio->bi_bdev, BD_HAS_SUBMIT_BIO)) {
+ __submit_bio_noacct_mq(bio);
+- else
++ } else {
+ __submit_bio_noacct(bio);
++ }
+ }
+
+ static blk_status_t blk_validate_atomic_write_op_size(struct request_queue *q,
+@@ -873,7 +876,7 @@ void submit_bio_noacct(struct bio *bio)
+
+ if (blk_throtl_bio(bio))
+ return;
+- submit_bio_noacct_nocheck(bio);
++ submit_bio_noacct_nocheck(bio, false);
+ return;
+
+ not_supported:
+diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c
+index 2f8fdecdd7a9b1..554b191a68921c 100644
+--- a/block/blk-iolatency.c
++++ b/block/blk-iolatency.c
+@@ -485,19 +485,11 @@ static void blkcg_iolatency_throttle(struct rq_qos *rqos, struct bio *bio)
+ mod_timer(&blkiolat->timer, jiffies + HZ);
+ }
+
+-static void iolatency_record_time(struct iolatency_grp *iolat,
+- struct bio_issue *issue, u64 now,
+- bool issue_as_root)
++static void iolatency_record_time(struct iolatency_grp *iolat, u64 start,
++ u64 now, bool issue_as_root)
+ {
+- u64 start = bio_issue_time(issue);
+ u64 req_time;
+
+- /*
+- * Have to do this so we are truncated to the correct time that our
+- * issue is truncated to.
+- */
+- now = __bio_issue_time(now);
+-
+ if (now <= start)
+ return;
+
+@@ -625,7 +617,7 @@ static void blkcg_iolatency_done_bio(struct rq_qos *rqos, struct bio *bio)
+ * submitted, so do not account for it.
+ */
+ if (iolat->min_lat_nsec && bio->bi_status != BLK_STS_AGAIN) {
+- iolatency_record_time(iolat, &bio->bi_issue, now,
++ iolatency_record_time(iolat, bio->issue_time_ns, now,
+ issue_as_root);
+ window_start = atomic64_read(&iolat->window_start);
+ if (now > window_start &&
+diff --git a/block/blk-merge.c b/block/blk-merge.c
+index 70d704615be520..77488f11a94418 100644
+--- a/block/blk-merge.c
++++ b/block/blk-merge.c
+@@ -104,34 +104,58 @@ static unsigned int bio_allowed_max_sectors(const struct queue_limits *lim)
+ return round_down(UINT_MAX, lim->logical_block_size) >> SECTOR_SHIFT;
+ }
+
++/*
++ * bio_submit_split_bioset - Submit a bio, splitting it at a designated sector
++ * @bio: the original bio to be submitted and split
++ * @split_sectors: the sector count at which to split
++ * @bs: the bio set used for allocating the new split bio
++ *
++ * The original bio is modified to contain the remaining sectors and submitted.
++ * The caller is responsible for submitting the returned bio.
++ *
++ * On success, the newly allocated bio representing the initial part is
++ * returned; on failure, NULL is returned and the original bio fails.
++ */
++struct bio *bio_submit_split_bioset(struct bio *bio, unsigned int split_sectors,
++ struct bio_set *bs)
++{
++ struct bio *split = bio_split(bio, split_sectors, GFP_NOIO, bs);
++
++ if (IS_ERR(split)) {
++ bio->bi_status = errno_to_blk_status(PTR_ERR(split));
++ bio_endio(bio);
++ return NULL;
++ }
++
++ bio_chain(split, bio);
++ trace_block_split(split, bio->bi_iter.bi_sector);
++ WARN_ON_ONCE(bio_zone_write_plugging(bio));
++
++ if (should_fail_bio(bio))
++ bio_io_error(bio);
++ else if (!blk_throtl_bio(bio))
++ submit_bio_noacct_nocheck(bio, true);
++
++ return split;
++}
++EXPORT_SYMBOL_GPL(bio_submit_split_bioset);
++
+ static struct bio *bio_submit_split(struct bio *bio, int split_sectors)
+ {
+- if (unlikely(split_sectors < 0))
+- goto error;
++ if (unlikely(split_sectors < 0)) {
++ bio->bi_status = errno_to_blk_status(split_sectors);
++ bio_endio(bio);
++ return NULL;
++ }
+
+ if (split_sectors) {
+- struct bio *split;
+-
+- split = bio_split(bio, split_sectors, GFP_NOIO,
++ bio = bio_submit_split_bioset(bio, split_sectors,
+ &bio->bi_bdev->bd_disk->bio_split);
+- if (IS_ERR(split)) {
+- split_sectors = PTR_ERR(split);
+- goto error;
+- }
+- split->bi_opf |= REQ_NOMERGE;
+- blkcg_bio_issue_init(split);
+- bio_chain(split, bio);
+- trace_block_split(split, bio->bi_iter.bi_sector);
+- WARN_ON_ONCE(bio_zone_write_plugging(bio));
+- submit_bio_noacct(bio);
+- return split;
++ if (bio)
++ bio->bi_opf |= REQ_NOMERGE;
+ }
+
+ return bio;
+-error:
+- bio->bi_status = errno_to_blk_status(split_sectors);
+- bio_endio(bio);
+- return NULL;
+ }
+
+ struct bio *bio_split_discard(struct bio *bio, const struct queue_limits *lim,
+diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
+index e2ce4a28e6c9e0..d06bb137a74377 100644
+--- a/block/blk-mq-sched.c
++++ b/block/blk-mq-sched.c
+@@ -454,7 +454,7 @@ void blk_mq_free_sched_tags_batch(struct xarray *et_table,
+ }
+
+ struct elevator_tags *blk_mq_alloc_sched_tags(struct blk_mq_tag_set *set,
+- unsigned int nr_hw_queues)
++ unsigned int nr_hw_queues, unsigned int nr_requests)
+ {
+ unsigned int nr_tags;
+ int i;
+@@ -470,13 +470,8 @@ struct elevator_tags *blk_mq_alloc_sched_tags(struct blk_mq_tag_set *set,
+ nr_tags * sizeof(struct blk_mq_tags *), gfp);
+ if (!et)
+ return NULL;
+- /*
+- * Default to double of smaller one between hw queue_depth and
+- * 128, since we don't split into sync/async like the old code
+- * did. Additionally, this is a per-hw queue depth.
+- */
+- et->nr_requests = 2 * min_t(unsigned int, set->queue_depth,
+- BLKDEV_DEFAULT_RQ);
++
++ et->nr_requests = nr_requests;
+ et->nr_hw_queues = nr_hw_queues;
+
+ if (blk_mq_is_shared_tags(set->flags)) {
+@@ -521,7 +516,8 @@ int blk_mq_alloc_sched_tags_batch(struct xarray *et_table,
+ * concurrently.
+ */
+ if (q->elevator) {
+- et = blk_mq_alloc_sched_tags(set, nr_hw_queues);
++ et = blk_mq_alloc_sched_tags(set, nr_hw_queues,
++ blk_mq_default_nr_requests(set));
+ if (!et)
+ goto out_unwind;
+ if (xa_insert(et_table, q->id, et, gfp))
+diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h
+index b554e1d559508c..8e21a6b1415d9d 100644
+--- a/block/blk-mq-sched.h
++++ b/block/blk-mq-sched.h
+@@ -24,7 +24,7 @@ void blk_mq_exit_sched(struct request_queue *q, struct elevator_queue *e);
+ void blk_mq_sched_free_rqs(struct request_queue *q);
+
+ struct elevator_tags *blk_mq_alloc_sched_tags(struct blk_mq_tag_set *set,
+- unsigned int nr_hw_queues);
++ unsigned int nr_hw_queues, unsigned int nr_requests);
+ int blk_mq_alloc_sched_tags_batch(struct xarray *et_table,
+ struct blk_mq_tag_set *set, unsigned int nr_hw_queues);
+ void blk_mq_free_sched_tags(struct elevator_tags *et,
+@@ -92,4 +92,15 @@ static inline bool blk_mq_sched_needs_restart(struct blk_mq_hw_ctx *hctx)
+ return test_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state);
+ }
+
++static inline void blk_mq_set_min_shallow_depth(struct request_queue *q,
++ unsigned int depth)
++{
++ struct blk_mq_hw_ctx *hctx;
++ unsigned long i;
++
++ queue_for_each_hw_ctx(q, hctx, i)
++ sbitmap_queue_min_shallow_depth(&hctx->sched_tags->bitmap_tags,
++ depth);
++}
++
+ #endif
+diff --git a/block/blk-mq-sysfs.c b/block/blk-mq-sysfs.c
+index 24656980f4431a..5c399ac562eae9 100644
+--- a/block/blk-mq-sysfs.c
++++ b/block/blk-mq-sysfs.c
+@@ -150,9 +150,11 @@ static void blk_mq_unregister_hctx(struct blk_mq_hw_ctx *hctx)
+ return;
+
+ hctx_for_each_ctx(hctx, ctx, i)
+- kobject_del(&ctx->kobj);
++ if (ctx->kobj.state_in_sysfs)
++ kobject_del(&ctx->kobj);
+
+- kobject_del(&hctx->kobj);
++ if (hctx->kobj.state_in_sysfs)
++ kobject_del(&hctx->kobj);
+ }
+
+ static int blk_mq_register_hctx(struct blk_mq_hw_ctx *hctx)
+diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
+index 5cffa5668d0c38..aed84c5d5c2b22 100644
+--- a/block/blk-mq-tag.c
++++ b/block/blk-mq-tag.c
+@@ -584,14 +584,10 @@ void blk_mq_free_tags(struct blk_mq_tags *tags)
+ }
+
+ int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
+- struct blk_mq_tags **tagsptr, unsigned int tdepth,
+- bool can_grow)
++ struct blk_mq_tags **tagsptr, unsigned int tdepth)
+ {
+ struct blk_mq_tags *tags = *tagsptr;
+
+- if (tdepth <= tags->nr_reserved_tags)
+- return -EINVAL;
+-
+ /*
+ * If we are allowed to grow beyond the original size, allocate
+ * a new set of tags before freeing the old one.
+@@ -600,23 +596,6 @@ int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
+ struct blk_mq_tag_set *set = hctx->queue->tag_set;
+ struct blk_mq_tags *new;
+
+- if (!can_grow)
+- return -EINVAL;
+-
+- /*
+- * We need some sort of upper limit, set it high enough that
+- * no valid use cases should require more.
+- */
+- if (tdepth > MAX_SCHED_RQ)
+- return -EINVAL;
+-
+- /*
+- * Only the sbitmap needs resizing since we allocated the max
+- * initially.
+- */
+- if (blk_mq_is_shared_tags(set->flags))
+- return 0;
+-
+ new = blk_mq_alloc_map_and_rqs(set, hctx->queue_num, tdepth);
+ if (!new)
+ return -ENOMEM;
+diff --git a/block/blk-mq.c b/block/blk-mq.c
+index ba3a4b77f5786e..f8a8a23b904023 100644
+--- a/block/blk-mq.c
++++ b/block/blk-mq.c
+@@ -396,6 +396,13 @@ static inline void blk_mq_rq_time_init(struct request *rq, u64 alloc_time_ns)
+ #endif
+ }
+
++static inline void blk_mq_bio_issue_init(struct bio *bio)
++{
++#ifdef CONFIG_BLK_CGROUP
++ bio->issue_time_ns = blk_time_get_ns();
++#endif
++}
++
+ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
+ struct blk_mq_tags *tags, unsigned int tag)
+ {
+@@ -3168,6 +3175,7 @@ void blk_mq_submit_bio(struct bio *bio)
+ if (!bio_integrity_prep(bio))
+ goto queue_exit;
+
++ blk_mq_bio_issue_init(bio);
+ if (blk_mq_attempt_bio_merge(q, bio, nr_segs))
+ goto queue_exit;
+
+@@ -4917,57 +4925,59 @@ void blk_mq_free_tag_set(struct blk_mq_tag_set *set)
+ }
+ EXPORT_SYMBOL(blk_mq_free_tag_set);
+
+-int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
++struct elevator_tags *blk_mq_update_nr_requests(struct request_queue *q,
++ struct elevator_tags *et,
++ unsigned int nr)
+ {
+ struct blk_mq_tag_set *set = q->tag_set;
++ struct elevator_tags *old_et = NULL;
+ struct blk_mq_hw_ctx *hctx;
+- int ret;
+ unsigned long i;
+
+- if (WARN_ON_ONCE(!q->mq_freeze_depth))
+- return -EINVAL;
+-
+- if (!set)
+- return -EINVAL;
+-
+- if (q->nr_requests == nr)
+- return 0;
+-
+ blk_mq_quiesce_queue(q);
+
+- ret = 0;
+- queue_for_each_hw_ctx(q, hctx, i) {
+- if (!hctx->tags)
+- continue;
++ if (blk_mq_is_shared_tags(set->flags)) {
+ /*
+- * If we're using an MQ scheduler, just update the scheduler
+- * queue depth. This is similar to what the old code would do.
++ * Shared tags, for sched tags, we allocate max initially hence
++ * tags can't grow, see blk_mq_alloc_sched_tags().
+ */
+- if (hctx->sched_tags) {
+- ret = blk_mq_tag_update_depth(hctx, &hctx->sched_tags,
+- nr, true);
+- } else {
+- ret = blk_mq_tag_update_depth(hctx, &hctx->tags, nr,
+- false);
++ if (q->elevator)
++ blk_mq_tag_update_sched_shared_tags(q);
++ else
++ blk_mq_tag_resize_shared_tags(set, nr);
++ } else if (!q->elevator) {
++ /*
++ * Non-shared hardware tags, nr is already checked from
++ * queue_requests_store() and tags can't grow.
++ */
++ queue_for_each_hw_ctx(q, hctx, i) {
++ if (!hctx->tags)
++ continue;
++ sbitmap_queue_resize(&hctx->tags->bitmap_tags,
++ nr - hctx->tags->nr_reserved_tags);
+ }
+- if (ret)
+- break;
+- if (q->elevator && q->elevator->type->ops.depth_updated)
+- q->elevator->type->ops.depth_updated(hctx);
+- }
+- if (!ret) {
+- q->nr_requests = nr;
+- if (blk_mq_is_shared_tags(set->flags)) {
+- if (q->elevator)
+- blk_mq_tag_update_sched_shared_tags(q);
+- else
+- blk_mq_tag_resize_shared_tags(set, nr);
++ } else if (nr <= q->elevator->et->nr_requests) {
++ /* Non-shared sched tags, and tags don't grow. */
++ queue_for_each_hw_ctx(q, hctx, i) {
++ if (!hctx->sched_tags)
++ continue;
++ sbitmap_queue_resize(&hctx->sched_tags->bitmap_tags,
++ nr - hctx->sched_tags->nr_reserved_tags);
+ }
++ } else {
++ /* Non-shared sched tags, and tags grow */
++ queue_for_each_hw_ctx(q, hctx, i)
++ hctx->sched_tags = et->tags[i];
++ old_et = q->elevator->et;
++ q->elevator->et = et;
+ }
+
+- blk_mq_unquiesce_queue(q);
++ q->nr_requests = nr;
++ if (q->elevator && q->elevator->type->ops.depth_updated)
++ q->elevator->type->ops.depth_updated(q);
+
+- return ret;
++ blk_mq_unquiesce_queue(q);
++ return old_et;
+ }
+
+ /*
+diff --git a/block/blk-mq.h b/block/blk-mq.h
+index affb2e14b56e3a..6c9d03625ba124 100644
+--- a/block/blk-mq.h
++++ b/block/blk-mq.h
+@@ -6,6 +6,7 @@
+ #include "blk-stat.h"
+
+ struct blk_mq_tag_set;
++struct elevator_tags;
+
+ struct blk_mq_ctxs {
+ struct kobject kobj;
+@@ -45,7 +46,9 @@ void blk_mq_submit_bio(struct bio *bio);
+ int blk_mq_poll(struct request_queue *q, blk_qc_t cookie, struct io_comp_batch *iob,
+ unsigned int flags);
+ void blk_mq_exit_queue(struct request_queue *q);
+-int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr);
++struct elevator_tags *blk_mq_update_nr_requests(struct request_queue *q,
++ struct elevator_tags *tags,
++ unsigned int nr);
+ void blk_mq_wake_waiters(struct request_queue *q);
+ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *,
+ bool);
+@@ -109,6 +112,17 @@ static inline struct blk_mq_hw_ctx *blk_mq_map_queue(blk_opf_t opf,
+ return ctx->hctxs[blk_mq_get_hctx_type(opf)];
+ }
+
++/*
++ * Default to double of smaller one between hw queue_depth and
++ * 128, since we don't split into sync/async like the old code
++ * did. Additionally, this is a per-hw queue depth.
++ */
++static inline unsigned int blk_mq_default_nr_requests(
++ struct blk_mq_tag_set *set)
++{
++ return 2 * min_t(unsigned int, set->queue_depth, BLKDEV_DEFAULT_RQ);
++}
++
+ /*
+ * sysfs helpers
+ */
+@@ -171,7 +185,7 @@ void blk_mq_put_tag(struct blk_mq_tags *tags, struct blk_mq_ctx *ctx,
+ unsigned int tag);
+ void blk_mq_put_tags(struct blk_mq_tags *tags, int *tag_array, int nr_tags);
+ int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
+- struct blk_mq_tags **tags, unsigned int depth, bool can_grow);
++ struct blk_mq_tags **tags, unsigned int depth);
+ void blk_mq_tag_resize_shared_tags(struct blk_mq_tag_set *set,
+ unsigned int size);
+ void blk_mq_tag_update_sched_shared_tags(struct request_queue *q);
+diff --git a/block/blk-settings.c b/block/blk-settings.c
+index d6438e6c276dcc..8fa52914e16b02 100644
+--- a/block/blk-settings.c
++++ b/block/blk-settings.c
+@@ -56,6 +56,7 @@ void blk_set_stacking_limits(struct queue_limits *lim)
+ lim->max_user_wzeroes_unmap_sectors = UINT_MAX;
+ lim->max_hw_zone_append_sectors = UINT_MAX;
+ lim->max_user_discard_sectors = UINT_MAX;
++ lim->atomic_write_hw_max = UINT_MAX;
+ }
+ EXPORT_SYMBOL(blk_set_stacking_limits);
+
+@@ -232,6 +233,10 @@ static void blk_validate_atomic_write_limits(struct queue_limits *lim)
+ if (!(lim->features & BLK_FEAT_ATOMIC_WRITES))
+ goto unsupported;
+
++ /* UINT_MAX indicates stacked limits in initial state */
++ if (lim->atomic_write_hw_max == UINT_MAX)
++ goto unsupported;
++
+ if (!lim->atomic_write_hw_max)
+ goto unsupported;
+
+@@ -643,18 +648,24 @@ static bool blk_stack_atomic_writes_tail(struct queue_limits *t,
+ static bool blk_stack_atomic_writes_boundary_head(struct queue_limits *t,
+ struct queue_limits *b)
+ {
++ unsigned int boundary_sectors;
++
++ if (!b->atomic_write_hw_boundary || !t->chunk_sectors)
++ return true;
++
++ boundary_sectors = b->atomic_write_hw_boundary >> SECTOR_SHIFT;
++
+ /*
+ * Ensure atomic write boundary is aligned with chunk sectors. Stacked
+- * devices store chunk sectors in t->io_min.
++ * devices store any stripe size in t->chunk_sectors.
+ */
+- if (b->atomic_write_hw_boundary > t->io_min &&
+- b->atomic_write_hw_boundary % t->io_min)
++ if (boundary_sectors > t->chunk_sectors &&
++ boundary_sectors % t->chunk_sectors)
+ return false;
+- if (t->io_min > b->atomic_write_hw_boundary &&
+- t->io_min % b->atomic_write_hw_boundary)
++ if (t->chunk_sectors > boundary_sectors &&
++ t->chunk_sectors % boundary_sectors)
+ return false;
+
+- t->atomic_write_hw_boundary = b->atomic_write_hw_boundary;
+ return true;
+ }
+
+@@ -695,13 +706,13 @@ static void blk_stack_atomic_writes_chunk_sectors(struct queue_limits *t)
+ static bool blk_stack_atomic_writes_head(struct queue_limits *t,
+ struct queue_limits *b)
+ {
+- if (b->atomic_write_hw_boundary &&
+- !blk_stack_atomic_writes_boundary_head(t, b))
++ if (!blk_stack_atomic_writes_boundary_head(t, b))
+ return false;
+
+ t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
+ t->atomic_write_hw_unit_min = b->atomic_write_hw_unit_min;
+ t->atomic_write_hw_max = b->atomic_write_hw_max;
++ t->atomic_write_hw_boundary = b->atomic_write_hw_boundary;
+ return true;
+ }
+
+@@ -717,18 +728,14 @@ static void blk_stack_atomic_writes_limits(struct queue_limits *t,
+ if (!blk_atomic_write_start_sect_aligned(start, b))
+ goto unsupported;
+
+- /*
+- * If atomic_write_hw_max is set, we have already stacked 1x bottom
+- * device, so check for compliance.
+- */
+- if (t->atomic_write_hw_max) {
++ /* UINT_MAX indicates no stacking of bottom devices yet */
++ if (t->atomic_write_hw_max == UINT_MAX) {
++ if (!blk_stack_atomic_writes_head(t, b))
++ goto unsupported;
++ } else {
+ if (!blk_stack_atomic_writes_tail(t, b))
+ goto unsupported;
+- return;
+ }
+-
+- if (!blk_stack_atomic_writes_head(t, b))
+- goto unsupported;
+ blk_stack_atomic_writes_chunk_sectors(t);
+ return;
+
+@@ -763,7 +770,8 @@ static void blk_stack_atomic_writes_limits(struct queue_limits *t,
+ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
+ sector_t start)
+ {
+- unsigned int top, bottom, alignment, ret = 0;
++ unsigned int top, bottom, alignment;
++ int ret = 0;
+
+ t->features |= (b->features & BLK_FEAT_INHERIT_MASK);
+
+diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
+index 4a7f1a349998ba..9b03261b3e042e 100644
+--- a/block/blk-sysfs.c
++++ b/block/blk-sysfs.c
+@@ -64,10 +64,12 @@ static ssize_t queue_requests_show(struct gendisk *disk, char *page)
+ static ssize_t
+ queue_requests_store(struct gendisk *disk, const char *page, size_t count)
+ {
+- unsigned long nr;
+- int ret, err;
+- unsigned int memflags;
+ struct request_queue *q = disk->queue;
++ struct blk_mq_tag_set *set = q->tag_set;
++ struct elevator_tags *et = NULL;
++ unsigned int memflags;
++ unsigned long nr;
++ int ret;
+
+ if (!queue_is_mq(q))
+ return -EINVAL;
+@@ -76,16 +78,55 @@ queue_requests_store(struct gendisk *disk, const char *page, size_t count)
+ if (ret < 0)
+ return ret;
+
+- memflags = blk_mq_freeze_queue(q);
+- mutex_lock(&q->elevator_lock);
++ /*
++ * Serialize updating nr_requests with concurrent queue_requests_store()
++ * and switching elevator.
++ */
++ down_write(&set->update_nr_hwq_lock);
++
++ if (nr == q->nr_requests)
++ goto unlock;
++
+ if (nr < BLKDEV_MIN_RQ)
+ nr = BLKDEV_MIN_RQ;
+
+- err = blk_mq_update_nr_requests(disk->queue, nr);
+- if (err)
+- ret = err;
++ /*
++ * Switching elevator is protected by update_nr_hwq_lock:
++ * - read lock is held from elevator sysfs attribute;
++ * - write lock is held from updating nr_hw_queues;
++ * Hence it's safe to access q->elevator here with write lock held.
++ */
++ if (nr <= set->reserved_tags ||
++ (q->elevator && nr > MAX_SCHED_RQ) ||
++ (!q->elevator && nr > set->queue_depth)) {
++ ret = -EINVAL;
++ goto unlock;
++ }
++
++ if (!blk_mq_is_shared_tags(set->flags) && q->elevator &&
++ nr > q->elevator->et->nr_requests) {
++ /*
++ * Tags will grow, allocate memory before freezing queue to
++ * prevent deadlock.
++ */
++ et = blk_mq_alloc_sched_tags(set, q->nr_hw_queues, nr);
++ if (!et) {
++ ret = -ENOMEM;
++ goto unlock;
++ }
++ }
++
++ memflags = blk_mq_freeze_queue(q);
++ mutex_lock(&q->elevator_lock);
++ et = blk_mq_update_nr_requests(q, et, nr);
+ mutex_unlock(&q->elevator_lock);
+ blk_mq_unfreeze_queue(q, memflags);
++
++ if (et)
++ blk_mq_free_sched_tags(et, set);
++
++unlock:
++ up_write(&set->update_nr_hwq_lock);
+ return ret;
+ }
+
+diff --git a/block/blk-throttle.c b/block/blk-throttle.c
+index 397b6a410f9e50..2c5b64b1a724a7 100644
+--- a/block/blk-throttle.c
++++ b/block/blk-throttle.c
+@@ -1224,7 +1224,7 @@ static void blk_throtl_dispatch_work_fn(struct work_struct *work)
+ if (!bio_list_empty(&bio_list_on_stack)) {
+ blk_start_plug(&plug);
+ while ((bio = bio_list_pop(&bio_list_on_stack)))
+- submit_bio_noacct_nocheck(bio);
++ submit_bio_noacct_nocheck(bio, false);
+ blk_finish_plug(&plug);
+ }
+ }
+@@ -1327,17 +1327,13 @@ static int blk_throtl_init(struct gendisk *disk)
+ INIT_WORK(&td->dispatch_work, blk_throtl_dispatch_work_fn);
+ throtl_service_queue_init(&td->service_queue);
+
+- /*
+- * Freeze queue before activating policy, to synchronize with IO path,
+- * which is protected by 'q_usage_counter'.
+- */
+ memflags = blk_mq_freeze_queue(disk->queue);
+ blk_mq_quiesce_queue(disk->queue);
+
+ q->td = td;
+ td->queue = q;
+
+- /* activate policy */
++ /* activate policy, blk_throtl_activated() will return true */
+ ret = blkcg_activate_policy(disk, &blkcg_policy_throtl);
+ if (ret) {
+ q->td = NULL;
+@@ -1846,12 +1842,15 @@ void blk_throtl_exit(struct gendisk *disk)
+ {
+ struct request_queue *q = disk->queue;
+
+- if (!blk_throtl_activated(q))
++ /*
++	 * blkg_destroy_all() has already deactivated the throtl policy; just
++	 * check and free the throtl data.
++ */
++ if (!q->td)
+ return;
+
+ timer_delete_sync(&q->td->service_queue.pending_timer);
+ throtl_shutdown_wq(q);
+- blkcg_deactivate_policy(disk, &blkcg_policy_throtl);
+ kfree(q->td);
+ }
+
+diff --git a/block/blk-throttle.h b/block/blk-throttle.h
+index 3b27755bfbff1d..9d7a42c039a15e 100644
+--- a/block/blk-throttle.h
++++ b/block/blk-throttle.h
+@@ -156,7 +156,13 @@ void blk_throtl_cancel_bios(struct gendisk *disk);
+
+ static inline bool blk_throtl_activated(struct request_queue *q)
+ {
+- return q->td != NULL;
++ /*
++ * q->td guarantees that the blk-throttle module is already loaded,
++ * and the plid of blk-throttle is assigned.
++ * blkcg_policy_enabled() guarantees that the policy is activated
++ * in the request_queue.
++ */
++ return q->td != NULL && blkcg_policy_enabled(q, &blkcg_policy_throtl);
+ }
+
+ static inline bool blk_should_throtl(struct bio *bio)
+@@ -164,11 +170,6 @@ static inline bool blk_should_throtl(struct bio *bio)
+ struct throtl_grp *tg;
+ int rw = bio_data_dir(bio);
+
+- /*
+- * This is called under bio_queue_enter(), and it's synchronized with
+- * the activation of blk-throtl, which is protected by
+- * blk_mq_freeze_queue().
+- */
+ if (!blk_throtl_activated(bio->bi_bdev->bd_queue))
+ return false;
+
+@@ -194,7 +195,10 @@ static inline bool blk_should_throtl(struct bio *bio)
+
+ static inline bool blk_throtl_bio(struct bio *bio)
+ {
+-
++ /*
++ * block throttling takes effect if the policy is activated
++ * in the bio's request_queue.
++ */
+ if (!blk_should_throtl(bio))
+ return false;
+
+diff --git a/block/blk.h b/block/blk.h
+index 46f566f9b1266c..d9efc8693aa489 100644
+--- a/block/blk.h
++++ b/block/blk.h
+@@ -54,7 +54,7 @@ bool blk_queue_start_drain(struct request_queue *q);
+ bool __blk_freeze_queue_start(struct request_queue *q,
+ struct task_struct *owner);
+ int __bio_queue_enter(struct request_queue *q, struct bio *bio);
+-void submit_bio_noacct_nocheck(struct bio *bio);
++void submit_bio_noacct_nocheck(struct bio *bio, bool split);
+ void bio_await_chain(struct bio *bio);
+
+ static inline bool blk_try_enter_queue(struct request_queue *q, bool pm)
+@@ -615,6 +615,7 @@ extern const struct address_space_operations def_blk_aops;
+ int disk_register_independent_access_ranges(struct gendisk *disk);
+ void disk_unregister_independent_access_ranges(struct gendisk *disk);
+
++int should_fail_bio(struct bio *bio);
+ #ifdef CONFIG_FAIL_MAKE_REQUEST
+ bool should_fail_request(struct block_device *part, unsigned int bytes);
+ #else /* CONFIG_FAIL_MAKE_REQUEST */
+@@ -680,48 +681,6 @@ static inline ktime_t blk_time_get(void)
+ return ns_to_ktime(blk_time_get_ns());
+ }
+
+-/*
+- * From most significant bit:
+- * 1 bit: reserved for other usage, see below
+- * 12 bits: original size of bio
+- * 51 bits: issue time of bio
+- */
+-#define BIO_ISSUE_RES_BITS 1
+-#define BIO_ISSUE_SIZE_BITS 12
+-#define BIO_ISSUE_RES_SHIFT (64 - BIO_ISSUE_RES_BITS)
+-#define BIO_ISSUE_SIZE_SHIFT (BIO_ISSUE_RES_SHIFT - BIO_ISSUE_SIZE_BITS)
+-#define BIO_ISSUE_TIME_MASK ((1ULL << BIO_ISSUE_SIZE_SHIFT) - 1)
+-#define BIO_ISSUE_SIZE_MASK \
+- (((1ULL << BIO_ISSUE_SIZE_BITS) - 1) << BIO_ISSUE_SIZE_SHIFT)
+-#define BIO_ISSUE_RES_MASK (~((1ULL << BIO_ISSUE_RES_SHIFT) - 1))
+-
+-/* Reserved bit for blk-throtl */
+-#define BIO_ISSUE_THROTL_SKIP_LATENCY (1ULL << 63)
+-
+-static inline u64 __bio_issue_time(u64 time)
+-{
+- return time & BIO_ISSUE_TIME_MASK;
+-}
+-
+-static inline u64 bio_issue_time(struct bio_issue *issue)
+-{
+- return __bio_issue_time(issue->value);
+-}
+-
+-static inline sector_t bio_issue_size(struct bio_issue *issue)
+-{
+- return ((issue->value & BIO_ISSUE_SIZE_MASK) >> BIO_ISSUE_SIZE_SHIFT);
+-}
+-
+-static inline void bio_issue_init(struct bio_issue *issue,
+- sector_t size)
+-{
+- size &= (1ULL << BIO_ISSUE_SIZE_BITS) - 1;
+- issue->value = ((issue->value & BIO_ISSUE_RES_MASK) |
+- (blk_time_get_ns() & BIO_ISSUE_TIME_MASK) |
+- ((u64)size << BIO_ISSUE_SIZE_SHIFT));
+-}
+-
+ void bdev_release(struct file *bdev_file);
+ int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
+ const struct blk_holder_ops *hops, struct file *bdev_file);
+diff --git a/block/elevator.c b/block/elevator.c
+index fe96c6f4753ca2..e2ebfbf107b3af 100644
+--- a/block/elevator.c
++++ b/block/elevator.c
+@@ -669,7 +669,8 @@ static int elevator_change(struct request_queue *q, struct elv_change_ctx *ctx)
+ lockdep_assert_held(&set->update_nr_hwq_lock);
+
+ if (strncmp(ctx->name, "none", 4)) {
+- ctx->et = blk_mq_alloc_sched_tags(set, set->nr_hw_queues);
++ ctx->et = blk_mq_alloc_sched_tags(set, set->nr_hw_queues,
++ blk_mq_default_nr_requests(set));
+ if (!ctx->et)
+ return -ENOMEM;
+ }
+diff --git a/block/elevator.h b/block/elevator.h
+index adc5c157e17e51..c4d20155065e80 100644
+--- a/block/elevator.h
++++ b/block/elevator.h
+@@ -37,7 +37,7 @@ struct elevator_mq_ops {
+ void (*exit_sched)(struct elevator_queue *);
+ int (*init_hctx)(struct blk_mq_hw_ctx *, unsigned int);
+ void (*exit_hctx)(struct blk_mq_hw_ctx *, unsigned int);
+- void (*depth_updated)(struct blk_mq_hw_ctx *);
++ void (*depth_updated)(struct request_queue *);
+
+ bool (*allow_merge)(struct request_queue *, struct request *, struct bio *);
+ bool (*bio_merge)(struct request_queue *, struct bio *, unsigned int);
+diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c
+index 70cbc7b2deb40b..18efd6ef2a2b94 100644
+--- a/block/kyber-iosched.c
++++ b/block/kyber-iosched.c
+@@ -399,6 +399,14 @@ static struct kyber_queue_data *kyber_queue_data_alloc(struct request_queue *q)
+ return ERR_PTR(ret);
+ }
+
++static void kyber_depth_updated(struct request_queue *q)
++{
++ struct kyber_queue_data *kqd = q->elevator->elevator_data;
++
++ kqd->async_depth = q->nr_requests * KYBER_ASYNC_PERCENT / 100U;
++ blk_mq_set_min_shallow_depth(q, kqd->async_depth);
++}
++
+ static int kyber_init_sched(struct request_queue *q, struct elevator_queue *eq)
+ {
+ struct kyber_queue_data *kqd;
+@@ -413,6 +421,7 @@ static int kyber_init_sched(struct request_queue *q, struct elevator_queue *eq)
+
+ eq->elevator_data = kqd;
+ q->elevator = eq;
++ kyber_depth_updated(q);
+
+ return 0;
+ }
+@@ -440,15 +449,6 @@ static void kyber_ctx_queue_init(struct kyber_ctx_queue *kcq)
+ INIT_LIST_HEAD(&kcq->rq_list[i]);
+ }
+
+-static void kyber_depth_updated(struct blk_mq_hw_ctx *hctx)
+-{
+- struct kyber_queue_data *kqd = hctx->queue->elevator->elevator_data;
+- struct blk_mq_tags *tags = hctx->sched_tags;
+-
+- kqd->async_depth = hctx->queue->nr_requests * KYBER_ASYNC_PERCENT / 100U;
+- sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, kqd->async_depth);
+-}
+-
+ static int kyber_init_hctx(struct blk_mq_hw_ctx *hctx, unsigned int hctx_idx)
+ {
+ struct kyber_hctx_data *khd;
+@@ -493,7 +493,6 @@ static int kyber_init_hctx(struct blk_mq_hw_ctx *hctx, unsigned int hctx_idx)
+ khd->batching = 0;
+
+ hctx->sched_data = khd;
+- kyber_depth_updated(hctx);
+
+ return 0;
+
+diff --git a/block/mq-deadline.c b/block/mq-deadline.c
+index b9b7cdf1d3c980..2e689b2c40213a 100644
+--- a/block/mq-deadline.c
++++ b/block/mq-deadline.c
+@@ -507,22 +507,12 @@ static void dd_limit_depth(blk_opf_t opf, struct blk_mq_alloc_data *data)
+ }
+
+ /* Called by blk_mq_update_nr_requests(). */
+-static void dd_depth_updated(struct blk_mq_hw_ctx *hctx)
++static void dd_depth_updated(struct request_queue *q)
+ {
+- struct request_queue *q = hctx->queue;
+ struct deadline_data *dd = q->elevator->elevator_data;
+- struct blk_mq_tags *tags = hctx->sched_tags;
+
+ dd->async_depth = q->nr_requests;
+-
+- sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, 1);
+-}
+-
+-/* Called by blk_mq_init_hctx() and blk_mq_init_sched(). */
+-static int dd_init_hctx(struct blk_mq_hw_ctx *hctx, unsigned int hctx_idx)
+-{
+- dd_depth_updated(hctx);
+- return 0;
++ blk_mq_set_min_shallow_depth(q, 1);
+ }
+
+ static void dd_exit_sched(struct elevator_queue *e)
+@@ -587,6 +577,7 @@ static int dd_init_sched(struct request_queue *q, struct elevator_queue *eq)
+ blk_queue_flag_set(QUEUE_FLAG_SQ_SCHED, q);
+
+ q->elevator = eq;
++ dd_depth_updated(q);
+ return 0;
+ }
+
+@@ -1048,7 +1039,6 @@ static struct elevator_type mq_deadline = {
+ .has_work = dd_has_work,
+ .init_sched = dd_init_sched,
+ .exit_sched = dd_exit_sched,
+- .init_hctx = dd_init_hctx,
+ },
+
+ #ifdef CONFIG_BLK_DEBUG_FS
+diff --git a/crypto/842.c b/crypto/842.c
+index 8c257c40e2b901..4007e87bed806d 100644
+--- a/crypto/842.c
++++ b/crypto/842.c
+@@ -54,8 +54,10 @@ static int crypto842_sdecompress(struct crypto_scomp *tfm,
+ }
+
+ static struct scomp_alg scomp = {
+- .alloc_ctx = crypto842_alloc_ctx,
+- .free_ctx = crypto842_free_ctx,
++ .streams = {
++ .alloc_ctx = crypto842_alloc_ctx,
++ .free_ctx = crypto842_free_ctx,
++ },
+ .compress = crypto842_scompress,
+ .decompress = crypto842_sdecompress,
+ .base = {
+diff --git a/crypto/asymmetric_keys/x509_cert_parser.c b/crypto/asymmetric_keys/x509_cert_parser.c
+index 2ffe4ae90bea0e..8df3fa60a44f80 100644
+--- a/crypto/asymmetric_keys/x509_cert_parser.c
++++ b/crypto/asymmetric_keys/x509_cert_parser.c
+@@ -610,11 +610,14 @@ int x509_process_extension(void *context, size_t hdrlen,
+ /*
+ * Get hold of the basicConstraints
+ * v[1] is the encoding size
+- * (Expect 0x2 or greater, making it 1 or more bytes)
++ * (Expect 0x00 for empty SEQUENCE with CA:FALSE, or
++ * 0x03 or greater for non-empty SEQUENCE)
+ * v[2] is the encoding type
+ * (Expect an ASN1_BOOL for the CA)
+- * v[3] is the contents of the ASN1_BOOL
+- * (Expect 1 if the CA is TRUE)
++ * v[3] is the length of the ASN1_BOOL
++ * (Expect 1 for a single byte boolean)
++ * v[4] is the contents of the ASN1_BOOL
++ * (Expect 0xFF if the CA is TRUE)
+ * vlen should match the entire extension size
+ */
+ if (v[0] != (ASN1_CONS_BIT | ASN1_SEQ))
+@@ -623,8 +626,13 @@ int x509_process_extension(void *context, size_t hdrlen,
+ return -EBADMSG;
+ if (v[1] != vlen - 2)
+ return -EBADMSG;
+- if (vlen >= 4 && v[1] != 0 && v[2] == ASN1_BOOL && v[3] == 1)
++ /* Empty SEQUENCE means CA:FALSE (default value omitted per DER) */
++ if (v[1] == 0)
++ return 0;
++ if (vlen >= 5 && v[2] == ASN1_BOOL && v[3] == 1 && v[4] == 0xFF)
+ ctx->cert->pub->key_eflags |= 1 << KEY_EFLAG_CA;
++ else
++ return -EBADMSG;
+ return 0;
+ }
+
+diff --git a/crypto/lz4.c b/crypto/lz4.c
+index 7a984ae5ae52ea..57b713516aefac 100644
+--- a/crypto/lz4.c
++++ b/crypto/lz4.c
+@@ -68,8 +68,10 @@ static int lz4_sdecompress(struct crypto_scomp *tfm, const u8 *src,
+ }
+
+ static struct scomp_alg scomp = {
+- .alloc_ctx = lz4_alloc_ctx,
+- .free_ctx = lz4_free_ctx,
++ .streams = {
++ .alloc_ctx = lz4_alloc_ctx,
++ .free_ctx = lz4_free_ctx,
++ },
+ .compress = lz4_scompress,
+ .decompress = lz4_sdecompress,
+ .base = {
+diff --git a/crypto/lz4hc.c b/crypto/lz4hc.c
+index 9c61d05b621429..bb84f8a68cb58f 100644
+--- a/crypto/lz4hc.c
++++ b/crypto/lz4hc.c
+@@ -66,8 +66,10 @@ static int lz4hc_sdecompress(struct crypto_scomp *tfm, const u8 *src,
+ }
+
+ static struct scomp_alg scomp = {
+- .alloc_ctx = lz4hc_alloc_ctx,
+- .free_ctx = lz4hc_free_ctx,
++ .streams = {
++ .alloc_ctx = lz4hc_alloc_ctx,
++ .free_ctx = lz4hc_free_ctx,
++ },
+ .compress = lz4hc_scompress,
+ .decompress = lz4hc_sdecompress,
+ .base = {
+diff --git a/crypto/lzo-rle.c b/crypto/lzo-rle.c
+index ba013f2d5090d5..794e7ec49536b0 100644
+--- a/crypto/lzo-rle.c
++++ b/crypto/lzo-rle.c
+@@ -70,8 +70,10 @@ static int lzorle_sdecompress(struct crypto_scomp *tfm, const u8 *src,
+ }
+
+ static struct scomp_alg scomp = {
+- .alloc_ctx = lzorle_alloc_ctx,
+- .free_ctx = lzorle_free_ctx,
++ .streams = {
++ .alloc_ctx = lzorle_alloc_ctx,
++ .free_ctx = lzorle_free_ctx,
++ },
+ .compress = lzorle_scompress,
+ .decompress = lzorle_sdecompress,
+ .base = {
+diff --git a/crypto/lzo.c b/crypto/lzo.c
+index 7867e2c67c4ed1..d43242b24b4e83 100644
+--- a/crypto/lzo.c
++++ b/crypto/lzo.c
+@@ -70,8 +70,10 @@ static int lzo_sdecompress(struct crypto_scomp *tfm, const u8 *src,
+ }
+
+ static struct scomp_alg scomp = {
+- .alloc_ctx = lzo_alloc_ctx,
+- .free_ctx = lzo_free_ctx,
++ .streams = {
++ .alloc_ctx = lzo_alloc_ctx,
++ .free_ctx = lzo_free_ctx,
++ },
+ .compress = lzo_scompress,
+ .decompress = lzo_sdecompress,
+ .base = {
+diff --git a/drivers/accel/amdxdna/aie2_ctx.c b/drivers/accel/amdxdna/aie2_ctx.c
+index 2cff5419bd2fac..cda964ba33cd7c 100644
+--- a/drivers/accel/amdxdna/aie2_ctx.c
++++ b/drivers/accel/amdxdna/aie2_ctx.c
+@@ -192,7 +192,7 @@ aie2_sched_resp_handler(void *handle, void __iomem *data, size_t size)
+ {
+ struct amdxdna_sched_job *job = handle;
+ struct amdxdna_gem_obj *cmd_abo;
+- u32 ret = 0;
++ int ret = 0;
+ u32 status;
+
+ cmd_abo = job->cmd_bo;
+@@ -222,7 +222,7 @@ static int
+ aie2_sched_nocmd_resp_handler(void *handle, void __iomem *data, size_t size)
+ {
+ struct amdxdna_sched_job *job = handle;
+- u32 ret = 0;
++ int ret = 0;
+ u32 status;
+
+ if (unlikely(!data))
+@@ -250,7 +250,7 @@ aie2_sched_cmdlist_resp_handler(void *handle, void __iomem *data, size_t size)
+ u32 fail_cmd_status;
+ u32 fail_cmd_idx;
+ u32 cmd_status;
+- u32 ret = 0;
++ int ret = 0;
+
+ cmd_abo = job->cmd_bo;
+ if (unlikely(!data) || unlikely(size != sizeof(u32) * 3)) {
+diff --git a/drivers/acpi/acpica/aclocal.h b/drivers/acpi/acpica/aclocal.h
+index 0c41f0097e8d71..f98640086f4ef3 100644
+--- a/drivers/acpi/acpica/aclocal.h
++++ b/drivers/acpi/acpica/aclocal.h
+@@ -1141,7 +1141,7 @@ struct acpi_port_info {
+ #define ACPI_RESOURCE_NAME_PIN_GROUP_FUNCTION 0x91
+ #define ACPI_RESOURCE_NAME_PIN_GROUP_CONFIG 0x92
+ #define ACPI_RESOURCE_NAME_CLOCK_INPUT 0x93
+-#define ACPI_RESOURCE_NAME_LARGE_MAX 0x94
++#define ACPI_RESOURCE_NAME_LARGE_MAX 0x93
+
+ /*****************************************************************************
+ *
+diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
+index ae035b93da0878..3eb56b77cb6d93 100644
+--- a/drivers/acpi/nfit/core.c
++++ b/drivers/acpi/nfit/core.c
+@@ -2637,7 +2637,7 @@ static int acpi_nfit_register_region(struct acpi_nfit_desc *acpi_desc,
+ if (ndr_desc->target_node == NUMA_NO_NODE) {
+ ndr_desc->target_node = phys_to_target_node(spa->address);
+ dev_info(acpi_desc->dev, "changing target node from %d to %d for nfit region [%pa-%pa]",
+- NUMA_NO_NODE, ndr_desc->numa_node, &res.start, &res.end);
++ NUMA_NO_NODE, ndr_desc->target_node, &res.start, &res.end);
+ }
+
+ /*
+diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
+index 2c2dc559e0f8de..d0fc045a8d310c 100644
+--- a/drivers/acpi/processor_idle.c
++++ b/drivers/acpi/processor_idle.c
+@@ -1405,6 +1405,9 @@ int acpi_processor_power_init(struct acpi_processor *pr)
+ if (retval) {
+ if (acpi_processor_registered == 0)
+ cpuidle_unregister_driver(&acpi_idle_driver);
++
++ per_cpu(acpi_cpuidle_device, pr->id) = NULL;
++ kfree(dev);
+ return retval;
+ }
+ acpi_processor_registered++;
+diff --git a/drivers/base/node.c b/drivers/base/node.c
+index 3399594136b2a1..67b01d57973774 100644
+--- a/drivers/base/node.c
++++ b/drivers/base/node.c
+@@ -885,6 +885,10 @@ int register_one_node(int nid)
+ node_devices[nid] = node;
+
+ error = register_node(node_devices[nid], nid);
++ if (error) {
++ node_devices[nid] = NULL;
++ return error;
++ }
+
+ /* link cpu under this node */
+ for_each_present_cpu(cpu) {
+diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
+index 2ea6e05e6ec90e..c883b01ffbddc4 100644
+--- a/drivers/base/power/main.c
++++ b/drivers/base/power/main.c
+@@ -724,8 +724,20 @@ static void device_resume_noirq(struct device *dev, pm_message_t state, bool asy
+ if (dev->power.syscore || dev->power.direct_complete)
+ goto Out;
+
+- if (!dev->power.is_noirq_suspended)
++ if (!dev->power.is_noirq_suspended) {
++ /*
++ * This means that system suspend has been aborted in the noirq
++ * phase before invoking the noirq suspend callback for the
++ * device, so if device_suspend_late() has left it in suspend,
++ * device_resume_early() should leave it in suspend either in
++ * case the early resume of it depends on the noirq resume that
++ * has not run.
++ */
++ if (dev_pm_skip_suspend(dev))
++ dev->power.must_resume = false;
++
+ goto Out;
++ }
+
+ if (!dpm_wait_for_superior(dev, async))
+ goto Out;
+diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c
+index 1f3f782a04ba23..6883e1a43fe5d7 100644
+--- a/drivers/base/regmap/regmap.c
++++ b/drivers/base/regmap/regmap.c
+@@ -827,7 +827,7 @@ struct regmap *__regmap_init(struct device *dev,
+ map->read_flag_mask = bus->read_flag_mask;
+ }
+
+- if (config && config->read && config->write) {
++ if (config->read && config->write) {
+ map->reg_read = _regmap_bus_read;
+ if (config->reg_update_bits)
+ map->reg_update_bits = config->reg_update_bits;
+diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
+index 6463d0e8d0cef7..87b0b78249da33 100644
+--- a/drivers/block/nbd.c
++++ b/drivers/block/nbd.c
+@@ -1217,6 +1217,14 @@ static struct socket *nbd_get_socket(struct nbd_device *nbd, unsigned long fd,
+ if (!sock)
+ return NULL;
+
++ if (!sk_is_tcp(sock->sk) &&
++ !sk_is_stream_unix(sock->sk)) {
++ dev_err(disk_to_dev(nbd->disk), "Unsupported socket: should be TCP or UNIX.\n");
++ *err = -EINVAL;
++ sockfd_put(sock);
++ return NULL;
++ }
++
+ if (sock->ops->shutdown == sock_no_shutdown) {
+ dev_err(disk_to_dev(nbd->disk), "Unsupported socket: shutdown callout must be supported.\n");
+ *err = -EINVAL;
+diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
+index 91642c9a3b2935..f982027e8c8582 100644
+--- a/drivers/block/null_blk/main.c
++++ b/drivers/block/null_blk/main.c
+@@ -223,7 +223,7 @@ MODULE_PARM_DESC(discard, "Support discard operations (requires memory-backed nu
+
+ static unsigned long g_cache_size;
+ module_param_named(cache_size, g_cache_size, ulong, 0444);
+-MODULE_PARM_DESC(mbps, "Cache size in MiB for memory-backed device. Default: 0 (none)");
++MODULE_PARM_DESC(cache_size, "Cache size in MiB for memory-backed device. Default: 0 (none)");
+
+ static bool g_fua = true;
+ module_param_named(fua, g_fua, bool, 0444);
+diff --git a/drivers/bluetooth/btintel_pcie.c b/drivers/bluetooth/btintel_pcie.c
+index 6e7bbbd35279f8..585de143ab2555 100644
+--- a/drivers/bluetooth/btintel_pcie.c
++++ b/drivers/bluetooth/btintel_pcie.c
+@@ -15,6 +15,7 @@
+ #include <linux/interrupt.h>
+
+ #include <linux/unaligned.h>
++#include <linux/devcoredump.h>
+
+ #include <net/bluetooth/bluetooth.h>
+ #include <net/bluetooth/hci_core.h>
+@@ -554,25 +555,6 @@ static void btintel_pcie_mac_init(struct btintel_pcie_data *data)
+ btintel_pcie_wr_reg32(data, BTINTEL_PCIE_CSR_FUNC_CTRL_REG, reg);
+ }
+
+-static int btintel_pcie_add_dmp_data(struct hci_dev *hdev, const void *data, int size)
+-{
+- struct sk_buff *skb;
+- int err;
+-
+- skb = alloc_skb(size, GFP_ATOMIC);
+- if (!skb)
+- return -ENOMEM;
+-
+- skb_put_data(skb, data, size);
+- err = hci_devcd_append(hdev, skb);
+- if (err) {
+- bt_dev_err(hdev, "Failed to append data in the coredump");
+- return err;
+- }
+-
+- return 0;
+-}
+-
+ static int btintel_pcie_get_mac_access(struct btintel_pcie_data *data)
+ {
+ u32 reg;
+@@ -617,30 +599,35 @@ static void btintel_pcie_release_mac_access(struct btintel_pcie_data *data)
+ btintel_pcie_wr_reg32(data, BTINTEL_PCIE_CSR_FUNC_CTRL_REG, reg);
+ }
+
+-static void btintel_pcie_copy_tlv(struct sk_buff *skb, enum btintel_pcie_tlv_type type,
+- void *data, int size)
++static void *btintel_pcie_copy_tlv(void *dest, enum btintel_pcie_tlv_type type,
++ void *data, size_t size)
+ {
+ struct intel_tlv *tlv;
+
+- tlv = skb_put(skb, sizeof(*tlv) + size);
++ tlv = dest;
+ tlv->type = type;
+ tlv->len = size;
+ memcpy(tlv->val, data, tlv->len);
++ return dest + sizeof(*tlv) + size;
+ }
+
+ static int btintel_pcie_read_dram_buffers(struct btintel_pcie_data *data)
+ {
+- u32 offset, prev_size, wr_ptr_status, dump_size, i;
++ u32 offset, prev_size, wr_ptr_status, dump_size, data_len;
+ struct btintel_pcie_dbgc *dbgc = &data->dbgc;
+- u8 buf_idx, dump_time_len, fw_build;
+ struct hci_dev *hdev = data->hdev;
++ u8 *pdata, *p, buf_idx;
+ struct intel_tlv *tlv;
+ struct timespec64 now;
+- struct sk_buff *skb;
+ struct tm tm_now;
+- char buf[256];
+- u16 hdr_len;
+- int ret;
++ char fw_build[128];
++ char ts[128];
++ char vendor[64];
++ char driver[64];
++
++ if (!IS_ENABLED(CONFIG_DEV_COREDUMP))
++ return -EOPNOTSUPP;
++
+
+ wr_ptr_status = btintel_pcie_rd_dev_mem(data, BTINTEL_PCIE_DBGC_CUR_DBGBUFF_STATUS);
+ offset = wr_ptr_status & BTINTEL_PCIE_DBG_OFFSET_BIT_MASK;
+@@ -657,88 +644,84 @@ static int btintel_pcie_read_dram_buffers(struct btintel_pcie_data *data)
+ else
+ return -EINVAL;
+
++ snprintf(vendor, sizeof(vendor), "Vendor: Intel\n");
++ snprintf(driver, sizeof(driver), "Driver: %s\n",
++ data->dmp_hdr.driver_name);
++
+ ktime_get_real_ts64(&now);
+ time64_to_tm(now.tv_sec, 0, &tm_now);
+- dump_time_len = snprintf(buf, sizeof(buf), "Dump Time: %02d-%02d-%04ld %02d:%02d:%02d",
++ snprintf(ts, sizeof(ts), "Dump Time: %02d-%02d-%04ld %02d:%02d:%02d",
+ tm_now.tm_mday, tm_now.tm_mon + 1, tm_now.tm_year + 1900,
+ tm_now.tm_hour, tm_now.tm_min, tm_now.tm_sec);
+
+- fw_build = snprintf(buf + dump_time_len, sizeof(buf) - dump_time_len,
++ snprintf(fw_build, sizeof(fw_build),
+ "Firmware Timestamp: Year %u WW %02u buildtype %u build %u",
+ 2000 + (data->dmp_hdr.fw_timestamp >> 8),
+ data->dmp_hdr.fw_timestamp & 0xff, data->dmp_hdr.fw_build_type,
+ data->dmp_hdr.fw_build_num);
+
+- hdr_len = sizeof(*tlv) + sizeof(data->dmp_hdr.cnvi_bt) +
+- sizeof(*tlv) + sizeof(data->dmp_hdr.write_ptr) +
+- sizeof(*tlv) + sizeof(data->dmp_hdr.wrap_ctr) +
+- sizeof(*tlv) + sizeof(data->dmp_hdr.trigger_reason) +
+- sizeof(*tlv) + sizeof(data->dmp_hdr.fw_git_sha1) +
+- sizeof(*tlv) + sizeof(data->dmp_hdr.cnvr_top) +
+- sizeof(*tlv) + sizeof(data->dmp_hdr.cnvi_top) +
+- sizeof(*tlv) + dump_time_len +
+- sizeof(*tlv) + fw_build;
++ data_len = sizeof(*tlv) + sizeof(data->dmp_hdr.cnvi_bt) +
++ sizeof(*tlv) + sizeof(data->dmp_hdr.write_ptr) +
++ sizeof(*tlv) + sizeof(data->dmp_hdr.wrap_ctr) +
++ sizeof(*tlv) + sizeof(data->dmp_hdr.trigger_reason) +
++ sizeof(*tlv) + sizeof(data->dmp_hdr.fw_git_sha1) +
++ sizeof(*tlv) + sizeof(data->dmp_hdr.cnvr_top) +
++ sizeof(*tlv) + sizeof(data->dmp_hdr.cnvi_top) +
++ sizeof(*tlv) + strlen(ts) +
++ sizeof(*tlv) + strlen(fw_build) +
++ sizeof(*tlv) + strlen(vendor) +
++ sizeof(*tlv) + strlen(driver);
+
+- dump_size = hdr_len + sizeof(hdr_len);
++ /*
++ * sizeof(u32) - signature
++ * sizeof(data_len) - to store tlv data size
++ * data_len - TLV data
++ */
++ dump_size = sizeof(u32) + sizeof(data_len) + data_len;
+
+- skb = alloc_skb(dump_size, GFP_KERNEL);
+- if (!skb)
+- return -ENOMEM;
+
+ /* Add debug buffers data length to dump size */
+ dump_size += BTINTEL_PCIE_DBGC_BUFFER_SIZE * dbgc->count;
+
+- ret = hci_devcd_init(hdev, dump_size);
+- if (ret) {
+- bt_dev_err(hdev, "Failed to init devcoredump, err %d", ret);
+- kfree_skb(skb);
+- return ret;
+- }
++ pdata = vmalloc(dump_size);
++ if (!pdata)
++ return -ENOMEM;
++ p = pdata;
++
++ *(u32 *)p = BTINTEL_PCIE_MAGIC_NUM;
++ p += sizeof(u32);
+
+- skb_put_data(skb, &hdr_len, sizeof(hdr_len));
++ *(u32 *)p = data_len;
++ p += sizeof(u32);
+
+- btintel_pcie_copy_tlv(skb, BTINTEL_CNVI_BT, &data->dmp_hdr.cnvi_bt,
+- sizeof(data->dmp_hdr.cnvi_bt));
+
+- btintel_pcie_copy_tlv(skb, BTINTEL_WRITE_PTR, &data->dmp_hdr.write_ptr,
+- sizeof(data->dmp_hdr.write_ptr));
++ p = btintel_pcie_copy_tlv(p, BTINTEL_VENDOR, vendor, strlen(vendor));
++ p = btintel_pcie_copy_tlv(p, BTINTEL_DRIVER, driver, strlen(driver));
++ p = btintel_pcie_copy_tlv(p, BTINTEL_DUMP_TIME, ts, strlen(ts));
++ p = btintel_pcie_copy_tlv(p, BTINTEL_FW_BUILD, fw_build,
++ strlen(fw_build));
++ p = btintel_pcie_copy_tlv(p, BTINTEL_CNVI_BT, &data->dmp_hdr.cnvi_bt,
++ sizeof(data->dmp_hdr.cnvi_bt));
++ p = btintel_pcie_copy_tlv(p, BTINTEL_WRITE_PTR, &data->dmp_hdr.write_ptr,
++ sizeof(data->dmp_hdr.write_ptr));
++ p = btintel_pcie_copy_tlv(p, BTINTEL_WRAP_CTR, &data->dmp_hdr.wrap_ctr,
++ sizeof(data->dmp_hdr.wrap_ctr));
+
+ data->dmp_hdr.wrap_ctr = btintel_pcie_rd_dev_mem(data,
+ BTINTEL_PCIE_DBGC_DBGBUFF_WRAP_ARND);
+
+- btintel_pcie_copy_tlv(skb, BTINTEL_WRAP_CTR, &data->dmp_hdr.wrap_ctr,
+- sizeof(data->dmp_hdr.wrap_ctr));
+-
+- btintel_pcie_copy_tlv(skb, BTINTEL_TRIGGER_REASON, &data->dmp_hdr.trigger_reason,
+- sizeof(data->dmp_hdr.trigger_reason));
+-
+- btintel_pcie_copy_tlv(skb, BTINTEL_FW_SHA, &data->dmp_hdr.fw_git_sha1,
+- sizeof(data->dmp_hdr.fw_git_sha1));
+-
+- btintel_pcie_copy_tlv(skb, BTINTEL_CNVR_TOP, &data->dmp_hdr.cnvr_top,
+- sizeof(data->dmp_hdr.cnvr_top));
+-
+- btintel_pcie_copy_tlv(skb, BTINTEL_CNVI_TOP, &data->dmp_hdr.cnvi_top,
+- sizeof(data->dmp_hdr.cnvi_top));
+-
+- btintel_pcie_copy_tlv(skb, BTINTEL_DUMP_TIME, buf, dump_time_len);
+-
+- btintel_pcie_copy_tlv(skb, BTINTEL_FW_BUILD, buf + dump_time_len, fw_build);
+-
+- ret = hci_devcd_append(hdev, skb);
+- if (ret)
+- goto exit_err;
+-
+- for (i = 0; i < dbgc->count; i++) {
+- ret = btintel_pcie_add_dmp_data(hdev, dbgc->bufs[i].data,
+- BTINTEL_PCIE_DBGC_BUFFER_SIZE);
+- if (ret)
+- break;
+- }
+-
+-exit_err:
+- hci_devcd_complete(hdev);
+- return ret;
++ p = btintel_pcie_copy_tlv(p, BTINTEL_TRIGGER_REASON, &data->dmp_hdr.trigger_reason,
++ sizeof(data->dmp_hdr.trigger_reason));
++ p = btintel_pcie_copy_tlv(p, BTINTEL_FW_SHA, &data->dmp_hdr.fw_git_sha1,
++ sizeof(data->dmp_hdr.fw_git_sha1));
++ p = btintel_pcie_copy_tlv(p, BTINTEL_CNVR_TOP, &data->dmp_hdr.cnvr_top,
++ sizeof(data->dmp_hdr.cnvr_top));
++ p = btintel_pcie_copy_tlv(p, BTINTEL_CNVI_TOP, &data->dmp_hdr.cnvi_top,
++ sizeof(data->dmp_hdr.cnvi_top));
++
++ memcpy(p, dbgc->bufs[0].data, dbgc->count * BTINTEL_PCIE_DBGC_BUFFER_SIZE);
++ dev_coredumpv(&hdev->dev, pdata, dump_size, GFP_KERNEL);
++ return 0;
+ }
+
+ static void btintel_pcie_dump_traces(struct hci_dev *hdev)
+@@ -760,51 +743,6 @@ static void btintel_pcie_dump_traces(struct hci_dev *hdev)
+ bt_dev_err(hdev, "Failed to dump traces: (%d)", ret);
+ }
+
+-static void btintel_pcie_dump_hdr(struct hci_dev *hdev, struct sk_buff *skb)
+-{
+- struct btintel_pcie_data *data = hci_get_drvdata(hdev);
+- u16 len = skb->len;
+- u16 *hdrlen_ptr;
+- char buf[80];
+-
+- hdrlen_ptr = skb_put_zero(skb, sizeof(len));
+-
+- snprintf(buf, sizeof(buf), "Controller Name: 0x%X\n",
+- INTEL_HW_VARIANT(data->dmp_hdr.cnvi_bt));
+- skb_put_data(skb, buf, strlen(buf));
+-
+- snprintf(buf, sizeof(buf), "Firmware Build Number: %u\n",
+- data->dmp_hdr.fw_build_num);
+- skb_put_data(skb, buf, strlen(buf));
+-
+- snprintf(buf, sizeof(buf), "Driver: %s\n", data->dmp_hdr.driver_name);
+- skb_put_data(skb, buf, strlen(buf));
+-
+- snprintf(buf, sizeof(buf), "Vendor: Intel\n");
+- skb_put_data(skb, buf, strlen(buf));
+-
+- *hdrlen_ptr = skb->len - len;
+-}
+-
+-static void btintel_pcie_dump_notify(struct hci_dev *hdev, int state)
+-{
+- struct btintel_pcie_data *data = hci_get_drvdata(hdev);
+-
+- switch (state) {
+- case HCI_DEVCOREDUMP_IDLE:
+- data->dmp_hdr.state = HCI_DEVCOREDUMP_IDLE;
+- break;
+- case HCI_DEVCOREDUMP_ACTIVE:
+- data->dmp_hdr.state = HCI_DEVCOREDUMP_ACTIVE;
+- break;
+- case HCI_DEVCOREDUMP_TIMEOUT:
+- case HCI_DEVCOREDUMP_ABORT:
+- case HCI_DEVCOREDUMP_DONE:
+- data->dmp_hdr.state = HCI_DEVCOREDUMP_IDLE;
+- break;
+- }
+-}
+-
+ /* This function enables BT function by setting BTINTEL_PCIE_CSR_FUNC_CTRL_MAC_INIT bit in
+ * BTINTEL_PCIE_CSR_FUNC_CTRL_REG register and wait for MSI-X with
+ * BTINTEL_PCIE_MSIX_HW_INT_CAUSES_GP0.
+@@ -1378,6 +1316,11 @@ static void btintel_pcie_rx_work(struct work_struct *work)
+ struct btintel_pcie_data, rx_work);
+ struct sk_buff *skb;
+
++ if (test_bit(BTINTEL_PCIE_COREDUMP_INPROGRESS, &data->flags)) {
++ btintel_pcie_dump_traces(data->hdev);
++ clear_bit(BTINTEL_PCIE_COREDUMP_INPROGRESS, &data->flags);
++ }
++
+ if (test_bit(BTINTEL_PCIE_HWEXP_INPROGRESS, &data->flags)) {
+ /* Unlike usb products, controller will not send hardware
+ * exception event on exception. Instead controller writes the
+@@ -1390,11 +1333,6 @@ static void btintel_pcie_rx_work(struct work_struct *work)
+ clear_bit(BTINTEL_PCIE_HWEXP_INPROGRESS, &data->flags);
+ }
+
+- if (test_bit(BTINTEL_PCIE_COREDUMP_INPROGRESS, &data->flags)) {
+- btintel_pcie_dump_traces(data->hdev);
+- clear_bit(BTINTEL_PCIE_COREDUMP_INPROGRESS, &data->flags);
+- }
+-
+ /* Process the sk_buf in queue and send to the HCI layer */
+ while ((skb = skb_dequeue(&data->rx_skb_q))) {
+ btintel_pcie_recv_frame(data, skb);
+@@ -2184,13 +2122,6 @@ static int btintel_pcie_setup_internal(struct hci_dev *hdev)
+ if (ver_tlv.img_type == 0x02 || ver_tlv.img_type == 0x03)
+ data->dmp_hdr.fw_git_sha1 = ver_tlv.git_sha1;
+
+- err = hci_devcd_register(hdev, btintel_pcie_dump_traces, btintel_pcie_dump_hdr,
+- btintel_pcie_dump_notify);
+- if (err) {
+- bt_dev_err(hdev, "Failed to register coredump (%d)", err);
+- goto exit_error;
+- }
+-
+ btintel_print_fseq_info(hdev);
+ exit_error:
+ kfree_skb(skb);
+@@ -2319,7 +2250,6 @@ static void btintel_pcie_removal_work(struct work_struct *wk)
+ btintel_pcie_synchronize_irqs(data);
+
+ flush_work(&data->rx_work);
+- flush_work(&data->hdev->dump.dump_rx);
+
+ bt_dev_dbg(data->hdev, "Release bluetooth interface");
+ btintel_pcie_release_hdev(data);
+diff --git a/drivers/bluetooth/btintel_pcie.h b/drivers/bluetooth/btintel_pcie.h
+index 0fa876c5b954a0..04b21f968ad30f 100644
+--- a/drivers/bluetooth/btintel_pcie.h
++++ b/drivers/bluetooth/btintel_pcie.h
+@@ -132,6 +132,8 @@ enum btintel_pcie_tlv_type {
+ BTINTEL_CNVI_TOP,
+ BTINTEL_DUMP_TIME,
+ BTINTEL_FW_BUILD,
++ BTINTEL_VENDOR,
++ BTINTEL_DRIVER
+ };
+
+ /* causes for the MBOX interrupts */
+diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
+index c1c0a4759c7e4f..5027da143728eb 100644
+--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
++++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
+@@ -1104,6 +1104,9 @@ static int fsl_mc_bus_probe(struct platform_device *pdev)
+ * Get physical address of MC portal for the root DPRC:
+ */
+ plat_res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
++ if (!plat_res)
++ return -EINVAL;
++
+ mc_portal_phys_addr = plat_res->start;
+ mc_portal_size = resource_size(plat_res);
+ mc_portal_base_phys_addr = mc_portal_phys_addr & ~0x3ffffff;
+diff --git a/drivers/cdx/Kconfig b/drivers/cdx/Kconfig
+index 3af41f51cf38bc..1f1e360507d7d5 100644
+--- a/drivers/cdx/Kconfig
++++ b/drivers/cdx/Kconfig
+@@ -8,7 +8,6 @@
+ config CDX_BUS
+ bool "CDX Bus driver"
+ depends on OF && ARM64 || COMPILE_TEST
+- select GENERIC_MSI_IRQ
+ help
+ Driver to enable Composable DMA Transfer(CDX) Bus. CDX bus
+ exposes Fabric devices which uses composable DMA IP to the
+diff --git a/drivers/cdx/cdx.c b/drivers/cdx/cdx.c
+index 092306ca2541cc..3d50f8cd9c0bd7 100644
+--- a/drivers/cdx/cdx.c
++++ b/drivers/cdx/cdx.c
+@@ -310,7 +310,7 @@ static int cdx_probe(struct device *dev)
+ * Setup MSI device data so that generic MSI alloc/free can
+ * be used by the device driver.
+ */
+- if (cdx->msi_domain) {
++ if (IS_ENABLED(CONFIG_GENERIC_MSI_IRQ) && cdx->msi_domain) {
+ error = msi_setup_device_data(&cdx_dev->dev);
+ if (error)
+ return error;
+@@ -833,7 +833,7 @@ int cdx_device_add(struct cdx_dev_params *dev_params)
+ ((cdx->id << CDX_CONTROLLER_ID_SHIFT) | (cdx_dev->bus_num & CDX_BUS_NUM_MASK)),
+ cdx_dev->dev_num);
+
+- if (cdx->msi_domain) {
++ if (IS_ENABLED(CONFIG_GENERIC_MSI_IRQ) && cdx->msi_domain) {
+ cdx_dev->num_msi = dev_params->num_msi;
+ dev_set_msi_domain(&cdx_dev->dev, cdx->msi_domain);
+ }
+diff --git a/drivers/cdx/controller/Kconfig b/drivers/cdx/controller/Kconfig
+index 0641a4c21e6608..a480b62cbd1f74 100644
+--- a/drivers/cdx/controller/Kconfig
++++ b/drivers/cdx/controller/Kconfig
+@@ -10,7 +10,6 @@ if CDX_BUS
+ config CDX_CONTROLLER
+ tristate "CDX bus controller"
+ depends on HAS_DMA
+- select GENERIC_MSI_IRQ
+ select REMOTEPROC
+ select RPMSG
+ help
+diff --git a/drivers/cdx/controller/cdx_controller.c b/drivers/cdx/controller/cdx_controller.c
+index fca83141e3e66e..5e3fd89b6b561b 100644
+--- a/drivers/cdx/controller/cdx_controller.c
++++ b/drivers/cdx/controller/cdx_controller.c
+@@ -193,7 +193,8 @@ static int xlnx_cdx_probe(struct platform_device *pdev)
+ cdx->ops = &cdx_ops;
+
+ /* Create MSI domain */
+- cdx->msi_domain = cdx_msi_domain_init(&pdev->dev);
++ if (IS_ENABLED(CONFIG_GENERIC_MSI_IRQ))
++ cdx->msi_domain = cdx_msi_domain_init(&pdev->dev);
+ if (!cdx->msi_domain) {
+ ret = dev_err_probe(&pdev->dev, -ENODEV, "cdx_msi_domain_init() failed");
+ goto cdx_msi_fail;
+diff --git a/drivers/char/hw_random/Kconfig b/drivers/char/hw_random/Kconfig
+index c858278434475b..7826fd7c4603f2 100644
+--- a/drivers/char/hw_random/Kconfig
++++ b/drivers/char/hw_random/Kconfig
+@@ -312,6 +312,7 @@ config HW_RANDOM_INGENIC_TRNG
+ config HW_RANDOM_NOMADIK
+ tristate "ST-Ericsson Nomadik Random Number Generator support"
+ depends on ARCH_NOMADIK || COMPILE_TEST
++ depends on ARM_AMBA
+ default HW_RANDOM
+ help
+ This driver provides kernel-side support for the Random Number
+diff --git a/drivers/char/hw_random/ks-sa-rng.c b/drivers/char/hw_random/ks-sa-rng.c
+index d8fd8a3544828a..9e408144a10c1e 100644
+--- a/drivers/char/hw_random/ks-sa-rng.c
++++ b/drivers/char/hw_random/ks-sa-rng.c
+@@ -231,6 +231,10 @@ static int ks_sa_rng_probe(struct platform_device *pdev)
+ if (IS_ERR(ks_sa_rng->regmap_cfg))
+ return dev_err_probe(dev, -EINVAL, "syscon_node_to_regmap failed\n");
+
++ ks_sa_rng->clk = devm_clk_get_enabled(dev, NULL);
++ if (IS_ERR(ks_sa_rng->clk))
++ return dev_err_probe(dev, PTR_ERR(ks_sa_rng->clk), "Failed to get clock\n");
++
+ pm_runtime_enable(dev);
+ ret = pm_runtime_resume_and_get(dev);
+ if (ret < 0) {
+diff --git a/drivers/char/tpm/Kconfig b/drivers/char/tpm/Kconfig
+index dddd702b2454a6..3e4684f6b4afaa 100644
+--- a/drivers/char/tpm/Kconfig
++++ b/drivers/char/tpm/Kconfig
+@@ -29,7 +29,7 @@ if TCG_TPM
+
+ config TCG_TPM2_HMAC
+ bool "Use HMAC and encrypted transactions on the TPM bus"
+- default X86_64
++ default n
+ select CRYPTO_ECDH
+ select CRYPTO_LIB_AESCFB
+ select CRYPTO_LIB_SHA256
+diff --git a/drivers/clocksource/timer-tegra186.c b/drivers/clocksource/timer-tegra186.c
+index e5394f98a02e66..47bdb1e320af97 100644
+--- a/drivers/clocksource/timer-tegra186.c
++++ b/drivers/clocksource/timer-tegra186.c
+@@ -159,7 +159,7 @@ static void tegra186_wdt_enable(struct tegra186_wdt *wdt)
+ tmr_writel(wdt->tmr, TMRCSSR_SRC_USEC, TMRCSSR);
+
+ /* configure timer (system reset happens on the fifth expiration) */
+- value = TMRCR_PTV(wdt->base.timeout * USEC_PER_SEC / 5) |
++ value = TMRCR_PTV(wdt->base.timeout * (USEC_PER_SEC / 5)) |
+ TMRCR_PERIODIC | TMRCR_ENABLE;
+ tmr_writel(wdt->tmr, value, TMRCR);
+
+@@ -267,7 +267,7 @@ static unsigned int tegra186_wdt_get_timeleft(struct watchdog_device *wdd)
+ * counter value to the time of the counter expirations that
+ * remain.
+ */
+- timeleft += (((u64)wdt->base.timeout * USEC_PER_SEC) / 5) * (4 - expiration);
++ timeleft += ((u64)wdt->base.timeout * (USEC_PER_SEC / 5)) * (4 - expiration);
+
+ /*
+ * Convert the current counter value to seconds,
+diff --git a/drivers/cpufreq/scmi-cpufreq.c b/drivers/cpufreq/scmi-cpufreq.c
+index ef078426bfd51a..38c165d526d144 100644
+--- a/drivers/cpufreq/scmi-cpufreq.c
++++ b/drivers/cpufreq/scmi-cpufreq.c
+@@ -15,6 +15,7 @@
+ #include <linux/energy_model.h>
+ #include <linux/export.h>
+ #include <linux/module.h>
++#include <linux/of.h>
+ #include <linux/pm_opp.h>
+ #include <linux/pm_qos.h>
+ #include <linux/slab.h>
+@@ -424,6 +425,15 @@ static bool scmi_dev_used_by_cpus(struct device *scmi_dev)
+ return true;
+ }
+
++ /*
++ * Older Broadcom STB chips had a "clocks" property for CPU node(s)
++ * that did not match the SCMI performance protocol node. If we get
++ * here, it means we have such an older Device Tree, so return true
++ * to preserve backwards compatibility.
++ */
++ if (of_machine_is_compatible("brcm,brcmstb"))
++ return true;
++
+ return false;
+ }
+
+diff --git a/drivers/cpuidle/cpuidle-qcom-spm.c b/drivers/cpuidle/cpuidle-qcom-spm.c
+index 5f386761b1562a..f60a4cf5364237 100644
+--- a/drivers/cpuidle/cpuidle-qcom-spm.c
++++ b/drivers/cpuidle/cpuidle-qcom-spm.c
+@@ -96,20 +96,23 @@ static int spm_cpuidle_register(struct device *cpuidle_dev, int cpu)
+ return -ENODEV;
+
+ saw_node = of_parse_phandle(cpu_node, "qcom,saw", 0);
++ of_node_put(cpu_node);
+ if (!saw_node)
+ return -ENODEV;
+
+ pdev = of_find_device_by_node(saw_node);
+ of_node_put(saw_node);
+- of_node_put(cpu_node);
+ if (!pdev)
+ return -ENODEV;
+
+ data = devm_kzalloc(cpuidle_dev, sizeof(*data), GFP_KERNEL);
+- if (!data)
++ if (!data) {
++ put_device(&pdev->dev);
+ return -ENOMEM;
++ }
+
+ data->spm = dev_get_drvdata(&pdev->dev);
++ put_device(&pdev->dev);
+ if (!data->spm)
+ return -EINVAL;
+
+diff --git a/drivers/crypto/hisilicon/debugfs.c b/drivers/crypto/hisilicon/debugfs.c
+index 45e130b901eb5e..17eb236e9ee4d5 100644
+--- a/drivers/crypto/hisilicon/debugfs.c
++++ b/drivers/crypto/hisilicon/debugfs.c
+@@ -888,6 +888,7 @@ static int qm_diff_regs_init(struct hisi_qm *qm,
+ dfx_regs_uninit(qm, qm->debug.qm_diff_regs, ARRAY_SIZE(qm_diff_regs));
+ ret = PTR_ERR(qm->debug.acc_diff_regs);
+ qm->debug.acc_diff_regs = NULL;
++ qm->debug.qm_diff_regs = NULL;
+ return ret;
+ }
+
+diff --git a/drivers/crypto/hisilicon/hpre/hpre_main.c b/drivers/crypto/hisilicon/hpre/hpre_main.c
+index f5b47e5ff48a42..7b60e89015bdf1 100644
+--- a/drivers/crypto/hisilicon/hpre/hpre_main.c
++++ b/drivers/crypto/hisilicon/hpre/hpre_main.c
+@@ -78,6 +78,11 @@
+ #define HPRE_PREFETCH_ENABLE (~(BIT(0) | BIT(30)))
+ #define HPRE_PREFETCH_DISABLE BIT(30)
+ #define HPRE_SVA_DISABLE_READY (BIT(4) | BIT(8))
++#define HPRE_SVA_PREFTCH_DFX4 0x301144
++#define HPRE_WAIT_SVA_READY 500000
++#define HPRE_READ_SVA_STATUS_TIMES 3
++#define HPRE_WAIT_US_MIN 10
++#define HPRE_WAIT_US_MAX 20
+
+ /* clock gate */
+ #define HPRE_CLKGATE_CTL 0x301a10
+@@ -466,6 +471,33 @@ struct hisi_qp *hpre_create_qp(u8 type)
+ return NULL;
+ }
+
++static int hpre_wait_sva_ready(struct hisi_qm *qm)
++{
++ u32 val, try_times = 0;
++ u8 count = 0;
++
++ /*
++ * Read the register value every 10-20us. If the value is 0 for three
++ * consecutive times, the SVA module is ready.
++ */
++ do {
++ val = readl(qm->io_base + HPRE_SVA_PREFTCH_DFX4);
++ if (val)
++ count = 0;
++ else if (++count == HPRE_READ_SVA_STATUS_TIMES)
++ break;
++
++ usleep_range(HPRE_WAIT_US_MIN, HPRE_WAIT_US_MAX);
++ } while (++try_times < HPRE_WAIT_SVA_READY);
++
++ if (try_times == HPRE_WAIT_SVA_READY) {
++ pci_err(qm->pdev, "failed to wait sva prefetch ready\n");
++ return -ETIMEDOUT;
++ }
++
++ return 0;
++}
++
+ static void hpre_config_pasid(struct hisi_qm *qm)
+ {
+ u32 val1, val2;
+@@ -563,7 +595,7 @@ static void disable_flr_of_bme(struct hisi_qm *qm)
+ writel(PEH_AXUSER_CFG_ENABLE, qm->io_base + QM_PEH_AXUSER_CFG_ENABLE);
+ }
+
+-static void hpre_open_sva_prefetch(struct hisi_qm *qm)
++static void hpre_close_sva_prefetch(struct hisi_qm *qm)
+ {
+ u32 val;
+ int ret;
+@@ -571,20 +603,21 @@ static void hpre_open_sva_prefetch(struct hisi_qm *qm)
+ if (!test_bit(QM_SUPPORT_SVA_PREFETCH, &qm->caps))
+ return;
+
+- /* Enable prefetch */
+ val = readl_relaxed(qm->io_base + HPRE_PREFETCH_CFG);
+- val &= HPRE_PREFETCH_ENABLE;
++ val |= HPRE_PREFETCH_DISABLE;
+ writel(val, qm->io_base + HPRE_PREFETCH_CFG);
+
+- ret = readl_relaxed_poll_timeout(qm->io_base + HPRE_PREFETCH_CFG,
+- val, !(val & HPRE_PREFETCH_DISABLE),
++ ret = readl_relaxed_poll_timeout(qm->io_base + HPRE_SVA_PREFTCH_DFX,
++ val, !(val & HPRE_SVA_DISABLE_READY),
+ HPRE_REG_RD_INTVRL_US,
+ HPRE_REG_RD_TMOUT_US);
+ if (ret)
+- pci_err(qm->pdev, "failed to open sva prefetch\n");
++ pci_err(qm->pdev, "failed to close sva prefetch\n");
++
++ (void)hpre_wait_sva_ready(qm);
+ }
+
+-static void hpre_close_sva_prefetch(struct hisi_qm *qm)
++static void hpre_open_sva_prefetch(struct hisi_qm *qm)
+ {
+ u32 val;
+ int ret;
+@@ -592,16 +625,24 @@ static void hpre_close_sva_prefetch(struct hisi_qm *qm)
+ if (!test_bit(QM_SUPPORT_SVA_PREFETCH, &qm->caps))
+ return;
+
++ /* Enable prefetch */
+ val = readl_relaxed(qm->io_base + HPRE_PREFETCH_CFG);
+- val |= HPRE_PREFETCH_DISABLE;
++ val &= HPRE_PREFETCH_ENABLE;
+ writel(val, qm->io_base + HPRE_PREFETCH_CFG);
+
+- ret = readl_relaxed_poll_timeout(qm->io_base + HPRE_SVA_PREFTCH_DFX,
+- val, !(val & HPRE_SVA_DISABLE_READY),
++ ret = readl_relaxed_poll_timeout(qm->io_base + HPRE_PREFETCH_CFG,
++ val, !(val & HPRE_PREFETCH_DISABLE),
+ HPRE_REG_RD_INTVRL_US,
+ HPRE_REG_RD_TMOUT_US);
++ if (ret) {
++ pci_err(qm->pdev, "failed to open sva prefetch\n");
++ hpre_close_sva_prefetch(qm);
++ return;
++ }
++
++ ret = hpre_wait_sva_ready(qm);
+ if (ret)
+- pci_err(qm->pdev, "failed to close sva prefetch\n");
++ hpre_close_sva_prefetch(qm);
+ }
+
+ static void hpre_enable_clock_gate(struct hisi_qm *qm)
+@@ -721,6 +762,7 @@ static int hpre_set_user_domain_and_cache(struct hisi_qm *qm)
+
+ /* Config data buffer pasid needed by Kunpeng 920 */
+ hpre_config_pasid(qm);
++ hpre_open_sva_prefetch(qm);
+
+ hpre_enable_clock_gate(qm);
+
+@@ -1450,8 +1492,6 @@ static int hpre_pf_probe_init(struct hpre *hpre)
+ if (ret)
+ return ret;
+
+- hpre_open_sva_prefetch(qm);
+-
+ hisi_qm_dev_err_init(qm);
+ ret = hpre_show_last_regs_init(qm);
+ if (ret)
+diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c
+index 2e4ee7ecfdfbb6..102aff9ea19a03 100644
+--- a/drivers/crypto/hisilicon/qm.c
++++ b/drivers/crypto/hisilicon/qm.c
+@@ -3826,6 +3826,10 @@ static ssize_t qm_get_qos_value(struct hisi_qm *qm, const char *buf,
+ }
+
+ pdev = container_of(dev, struct pci_dev, dev);
++ if (pci_physfn(pdev) != qm->pdev) {
++ pci_err(qm->pdev, "the pdev input does not match the pf!\n");
++ return -EINVAL;
++ }
+
+ *fun_index = pdev->devfn;
+
+@@ -4447,9 +4451,6 @@ static void qm_restart_prepare(struct hisi_qm *qm)
+ {
+ u32 value;
+
+- if (qm->err_ini->open_sva_prefetch)
+- qm->err_ini->open_sva_prefetch(qm);
+-
+ if (qm->ver >= QM_HW_V3)
+ return;
+
+@@ -4731,6 +4732,15 @@ void hisi_qm_reset_done(struct pci_dev *pdev)
+ }
+ EXPORT_SYMBOL_GPL(hisi_qm_reset_done);
+
++static irqreturn_t qm_rsvd_irq(int irq, void *data)
++{
++ struct hisi_qm *qm = data;
++
++ dev_info(&qm->pdev->dev, "Reserved interrupt, ignore!\n");
++
++ return IRQ_HANDLED;
++}
++
+ static irqreturn_t qm_abnormal_irq(int irq, void *data)
+ {
+ struct hisi_qm *qm = data;
+@@ -5014,7 +5024,7 @@ static void qm_unregister_abnormal_irq(struct hisi_qm *qm)
+ struct pci_dev *pdev = qm->pdev;
+ u32 irq_vector, val;
+
+- if (qm->fun_type == QM_HW_VF)
++ if (qm->fun_type == QM_HW_VF && qm->ver < QM_HW_V3)
+ return;
+
+ val = qm->cap_tables.qm_cap_table[QM_ABNORMAL_IRQ].cap_val;
+@@ -5031,17 +5041,28 @@ static int qm_register_abnormal_irq(struct hisi_qm *qm)
+ u32 irq_vector, val;
+ int ret;
+
+- if (qm->fun_type == QM_HW_VF)
+- return 0;
+-
+ val = qm->cap_tables.qm_cap_table[QM_ABNORMAL_IRQ].cap_val;
+ if (!((val >> QM_IRQ_TYPE_SHIFT) & QM_ABN_IRQ_TYPE_MASK))
+ return 0;
+-
+ irq_vector = val & QM_IRQ_VECTOR_MASK;
++
++ /* For VF, this is a reserved interrupt in V3 version. */
++ if (qm->fun_type == QM_HW_VF) {
++ if (qm->ver < QM_HW_V3)
++ return 0;
++
++ ret = request_irq(pci_irq_vector(pdev, irq_vector), qm_rsvd_irq,
++ IRQF_NO_AUTOEN, qm->dev_name, qm);
++ if (ret) {
++ dev_err(&pdev->dev, "failed to request reserved irq, ret = %d!\n", ret);
++ return ret;
++ }
++ return 0;
++ }
++
+ ret = request_irq(pci_irq_vector(pdev, irq_vector), qm_abnormal_irq, 0, qm->dev_name, qm);
+ if (ret)
+- dev_err(&qm->pdev->dev, "failed to request abnormal irq, ret = %d", ret);
++ dev_err(&qm->pdev->dev, "failed to request abnormal irq, ret = %d!\n", ret);
+
+ return ret;
+ }
+@@ -5407,6 +5428,12 @@ static int hisi_qm_pci_init(struct hisi_qm *qm)
+ pci_set_master(pdev);
+
+ num_vec = qm_get_irq_num(qm);
++ if (!num_vec) {
++ dev_err(dev, "Device irq num is zero!\n");
++ ret = -EINVAL;
++ goto err_get_pci_res;
++ }
++ num_vec = roundup_pow_of_two(num_vec);
+ ret = pci_alloc_irq_vectors(pdev, num_vec, num_vec, PCI_IRQ_MSI);
+ if (ret < 0) {
+ dev_err(dev, "Failed to enable MSI vectors!\n");
+diff --git a/drivers/crypto/hisilicon/sec2/sec_main.c b/drivers/crypto/hisilicon/sec2/sec_main.c
+index 72cf48d1f3ab86..348f1f52956dcb 100644
+--- a/drivers/crypto/hisilicon/sec2/sec_main.c
++++ b/drivers/crypto/hisilicon/sec2/sec_main.c
+@@ -93,6 +93,16 @@
+ #define SEC_PREFETCH_ENABLE (~(BIT(0) | BIT(1) | BIT(11)))
+ #define SEC_PREFETCH_DISABLE BIT(1)
+ #define SEC_SVA_DISABLE_READY (BIT(7) | BIT(11))
++#define SEC_SVA_PREFETCH_INFO 0x301ED4
++#define SEC_SVA_STALL_NUM GENMASK(23, 8)
++#define SEC_SVA_PREFETCH_NUM GENMASK(2, 0)
++#define SEC_WAIT_SVA_READY 500000
++#define SEC_READ_SVA_STATUS_TIMES 3
++#define SEC_WAIT_US_MIN 10
++#define SEC_WAIT_US_MAX 20
++#define SEC_WAIT_QP_US_MIN 1000
++#define SEC_WAIT_QP_US_MAX 2000
++#define SEC_MAX_WAIT_TIMES 2000
+
+ #define SEC_DELAY_10_US 10
+ #define SEC_POLL_TIMEOUT_US 1000
+@@ -464,6 +474,81 @@ static void sec_set_endian(struct hisi_qm *qm)
+ writel_relaxed(reg, qm->io_base + SEC_CONTROL_REG);
+ }
+
++static int sec_wait_sva_ready(struct hisi_qm *qm, __u32 offset, __u32 mask)
++{
++ u32 val, try_times = 0;
++ u8 count = 0;
++
++ /*
++ * Read the register value every 10-20us. If the value is 0 for three
++ * consecutive times, the SVA module is ready.
++ */
++ do {
++ val = readl(qm->io_base + offset);
++ if (val & mask)
++ count = 0;
++ else if (++count == SEC_READ_SVA_STATUS_TIMES)
++ break;
++
++ usleep_range(SEC_WAIT_US_MIN, SEC_WAIT_US_MAX);
++ } while (++try_times < SEC_WAIT_SVA_READY);
++
++ if (try_times == SEC_WAIT_SVA_READY) {
++ pci_err(qm->pdev, "failed to wait sva prefetch ready\n");
++ return -ETIMEDOUT;
++ }
++
++ return 0;
++}
++
++static void sec_close_sva_prefetch(struct hisi_qm *qm)
++{
++ u32 val;
++ int ret;
++
++ if (!test_bit(QM_SUPPORT_SVA_PREFETCH, &qm->caps))
++ return;
++
++ val = readl_relaxed(qm->io_base + SEC_PREFETCH_CFG);
++ val |= SEC_PREFETCH_DISABLE;
++ writel(val, qm->io_base + SEC_PREFETCH_CFG);
++
++ ret = readl_relaxed_poll_timeout(qm->io_base + SEC_SVA_TRANS,
++ val, !(val & SEC_SVA_DISABLE_READY),
++ SEC_DELAY_10_US, SEC_POLL_TIMEOUT_US);
++ if (ret)
++ pci_err(qm->pdev, "failed to close sva prefetch\n");
++
++ (void)sec_wait_sva_ready(qm, SEC_SVA_PREFETCH_INFO, SEC_SVA_STALL_NUM);
++}
++
++static void sec_open_sva_prefetch(struct hisi_qm *qm)
++{
++ u32 val;
++ int ret;
++
++ if (!test_bit(QM_SUPPORT_SVA_PREFETCH, &qm->caps))
++ return;
++
++ /* Enable prefetch */
++ val = readl_relaxed(qm->io_base + SEC_PREFETCH_CFG);
++ val &= SEC_PREFETCH_ENABLE;
++ writel(val, qm->io_base + SEC_PREFETCH_CFG);
++
++ ret = readl_relaxed_poll_timeout(qm->io_base + SEC_PREFETCH_CFG,
++ val, !(val & SEC_PREFETCH_DISABLE),
++ SEC_DELAY_10_US, SEC_POLL_TIMEOUT_US);
++ if (ret) {
++ pci_err(qm->pdev, "failed to open sva prefetch\n");
++ sec_close_sva_prefetch(qm);
++ return;
++ }
++
++ ret = sec_wait_sva_ready(qm, SEC_SVA_TRANS, SEC_SVA_PREFETCH_NUM);
++ if (ret)
++ sec_close_sva_prefetch(qm);
++}
++
+ static void sec_engine_sva_config(struct hisi_qm *qm)
+ {
+ u32 reg;
+@@ -497,45 +582,7 @@ static void sec_engine_sva_config(struct hisi_qm *qm)
+ writel_relaxed(reg, qm->io_base +
+ SEC_INTERFACE_USER_CTRL1_REG);
+ }
+-}
+-
+-static void sec_open_sva_prefetch(struct hisi_qm *qm)
+-{
+- u32 val;
+- int ret;
+-
+- if (!test_bit(QM_SUPPORT_SVA_PREFETCH, &qm->caps))
+- return;
+-
+- /* Enable prefetch */
+- val = readl_relaxed(qm->io_base + SEC_PREFETCH_CFG);
+- val &= SEC_PREFETCH_ENABLE;
+- writel(val, qm->io_base + SEC_PREFETCH_CFG);
+-
+- ret = readl_relaxed_poll_timeout(qm->io_base + SEC_PREFETCH_CFG,
+- val, !(val & SEC_PREFETCH_DISABLE),
+- SEC_DELAY_10_US, SEC_POLL_TIMEOUT_US);
+- if (ret)
+- pci_err(qm->pdev, "failed to open sva prefetch\n");
+-}
+-
+-static void sec_close_sva_prefetch(struct hisi_qm *qm)
+-{
+- u32 val;
+- int ret;
+-
+- if (!test_bit(QM_SUPPORT_SVA_PREFETCH, &qm->caps))
+- return;
+-
+- val = readl_relaxed(qm->io_base + SEC_PREFETCH_CFG);
+- val |= SEC_PREFETCH_DISABLE;
+- writel(val, qm->io_base + SEC_PREFETCH_CFG);
+-
+- ret = readl_relaxed_poll_timeout(qm->io_base + SEC_SVA_TRANS,
+- val, !(val & SEC_SVA_DISABLE_READY),
+- SEC_DELAY_10_US, SEC_POLL_TIMEOUT_US);
+- if (ret)
+- pci_err(qm->pdev, "failed to close sva prefetch\n");
++ sec_open_sva_prefetch(qm);
+ }
+
+ static void sec_enable_clock_gate(struct hisi_qm *qm)
+@@ -1152,7 +1199,6 @@ static int sec_pf_probe_init(struct sec_dev *sec)
+ if (ret)
+ return ret;
+
+- sec_open_sva_prefetch(qm);
+ hisi_qm_dev_err_init(qm);
+ sec_debug_regs_clear(qm);
+ ret = sec_show_last_regs_init(qm);
+diff --git a/drivers/crypto/hisilicon/zip/zip_main.c b/drivers/crypto/hisilicon/zip/zip_main.c
+index d8ba23b7cc7ddf..341c4564e21aa1 100644
+--- a/drivers/crypto/hisilicon/zip/zip_main.c
++++ b/drivers/crypto/hisilicon/zip/zip_main.c
+@@ -95,10 +95,16 @@
+ #define HZIP_PREFETCH_ENABLE (~(BIT(26) | BIT(17) | BIT(0)))
+ #define HZIP_SVA_PREFETCH_DISABLE BIT(26)
+ #define HZIP_SVA_DISABLE_READY (BIT(26) | BIT(30))
++#define HZIP_SVA_PREFETCH_NUM GENMASK(18, 16)
++#define HZIP_SVA_STALL_NUM GENMASK(15, 0)
+ #define HZIP_SHAPER_RATE_COMPRESS 750
+ #define HZIP_SHAPER_RATE_DECOMPRESS 140
+-#define HZIP_DELAY_1_US 1
+-#define HZIP_POLL_TIMEOUT_US 1000
++#define HZIP_DELAY_1_US 1
++#define HZIP_POLL_TIMEOUT_US 1000
++#define HZIP_WAIT_SVA_READY 500000
++#define HZIP_READ_SVA_STATUS_TIMES 3
++#define HZIP_WAIT_US_MIN 10
++#define HZIP_WAIT_US_MAX 20
+
+ /* clock gating */
+ #define HZIP_PEH_CFG_AUTO_GATE 0x3011A8
+@@ -448,10 +454,9 @@ bool hisi_zip_alg_support(struct hisi_qm *qm, u32 alg)
+ return false;
+ }
+
+-static int hisi_zip_set_high_perf(struct hisi_qm *qm)
++static void hisi_zip_set_high_perf(struct hisi_qm *qm)
+ {
+ u32 val;
+- int ret;
+
+ val = readl_relaxed(qm->io_base + HZIP_HIGH_PERF_OFFSET);
+ if (perf_mode == HZIP_HIGH_COMP_PERF)
+@@ -461,16 +466,36 @@ static int hisi_zip_set_high_perf(struct hisi_qm *qm)
+
+ /* Set perf mode */
+ writel(val, qm->io_base + HZIP_HIGH_PERF_OFFSET);
+- ret = readl_relaxed_poll_timeout(qm->io_base + HZIP_HIGH_PERF_OFFSET,
+- val, val == perf_mode, HZIP_DELAY_1_US,
+- HZIP_POLL_TIMEOUT_US);
+- if (ret)
+- pci_err(qm->pdev, "failed to set perf mode\n");
++}
+
+- return ret;
++static int hisi_zip_wait_sva_ready(struct hisi_qm *qm, __u32 offset, __u32 mask)
++{
++ u32 val, try_times = 0;
++ u8 count = 0;
++
++ /*
++ * Read the register value every 10-20us. If the value is 0 for three
++ * consecutive times, the SVA module is ready.
++ */
++ do {
++ val = readl(qm->io_base + offset);
++ if (val & mask)
++ count = 0;
++ else if (++count == HZIP_READ_SVA_STATUS_TIMES)
++ break;
++
++ usleep_range(HZIP_WAIT_US_MIN, HZIP_WAIT_US_MAX);
++ } while (++try_times < HZIP_WAIT_SVA_READY);
++
++ if (try_times == HZIP_WAIT_SVA_READY) {
++ pci_err(qm->pdev, "failed to wait sva prefetch ready\n");
++ return -ETIMEDOUT;
++ }
++
++ return 0;
+ }
+
+-static void hisi_zip_open_sva_prefetch(struct hisi_qm *qm)
++static void hisi_zip_close_sva_prefetch(struct hisi_qm *qm)
+ {
+ u32 val;
+ int ret;
+@@ -478,19 +503,20 @@ static void hisi_zip_open_sva_prefetch(struct hisi_qm *qm)
+ if (!test_bit(QM_SUPPORT_SVA_PREFETCH, &qm->caps))
+ return;
+
+- /* Enable prefetch */
+ val = readl_relaxed(qm->io_base + HZIP_PREFETCH_CFG);
+- val &= HZIP_PREFETCH_ENABLE;
++ val |= HZIP_SVA_PREFETCH_DISABLE;
+ writel(val, qm->io_base + HZIP_PREFETCH_CFG);
+
+- ret = readl_relaxed_poll_timeout(qm->io_base + HZIP_PREFETCH_CFG,
+- val, !(val & HZIP_SVA_PREFETCH_DISABLE),
++ ret = readl_relaxed_poll_timeout(qm->io_base + HZIP_SVA_TRANS,
++ val, !(val & HZIP_SVA_DISABLE_READY),
+ HZIP_DELAY_1_US, HZIP_POLL_TIMEOUT_US);
+ if (ret)
+- pci_err(qm->pdev, "failed to open sva prefetch\n");
++ pci_err(qm->pdev, "failed to close sva prefetch\n");
++
++ (void)hisi_zip_wait_sva_ready(qm, HZIP_SVA_TRANS, HZIP_SVA_STALL_NUM);
+ }
+
+-static void hisi_zip_close_sva_prefetch(struct hisi_qm *qm)
++static void hisi_zip_open_sva_prefetch(struct hisi_qm *qm)
+ {
+ u32 val;
+ int ret;
+@@ -498,15 +524,23 @@ static void hisi_zip_close_sva_prefetch(struct hisi_qm *qm)
+ if (!test_bit(QM_SUPPORT_SVA_PREFETCH, &qm->caps))
+ return;
+
++ /* Enable prefetch */
+ val = readl_relaxed(qm->io_base + HZIP_PREFETCH_CFG);
+- val |= HZIP_SVA_PREFETCH_DISABLE;
++ val &= HZIP_PREFETCH_ENABLE;
+ writel(val, qm->io_base + HZIP_PREFETCH_CFG);
+
+- ret = readl_relaxed_poll_timeout(qm->io_base + HZIP_SVA_TRANS,
+- val, !(val & HZIP_SVA_DISABLE_READY),
++ ret = readl_relaxed_poll_timeout(qm->io_base + HZIP_PREFETCH_CFG,
++ val, !(val & HZIP_SVA_PREFETCH_DISABLE),
+ HZIP_DELAY_1_US, HZIP_POLL_TIMEOUT_US);
++ if (ret) {
++ pci_err(qm->pdev, "failed to open sva prefetch\n");
++ hisi_zip_close_sva_prefetch(qm);
++ return;
++ }
++
++ ret = hisi_zip_wait_sva_ready(qm, HZIP_SVA_TRANS, HZIP_SVA_PREFETCH_NUM);
+ if (ret)
+- pci_err(qm->pdev, "failed to close sva prefetch\n");
++ hisi_zip_close_sva_prefetch(qm);
+ }
+
+ static void hisi_zip_enable_clock_gate(struct hisi_qm *qm)
+@@ -530,6 +564,7 @@ static int hisi_zip_set_user_domain_and_cache(struct hisi_qm *qm)
+ void __iomem *base = qm->io_base;
+ u32 dcomp_bm, comp_bm;
+ u32 zip_core_en;
++ int ret;
+
+ /* qm user domain */
+ writel(AXUSER_BASE, base + QM_ARUSER_M_CFG_1);
+@@ -565,6 +600,7 @@ static int hisi_zip_set_user_domain_and_cache(struct hisi_qm *qm)
+ writel(AXUSER_BASE, base + HZIP_DATA_WUSER_32_63);
+ writel(AXUSER_BASE, base + HZIP_SGL_RUSER_32_63);
+ }
++ hisi_zip_open_sva_prefetch(qm);
+
+ /* let's open all compression/decompression cores */
+
+@@ -580,9 +616,18 @@ static int hisi_zip_set_user_domain_and_cache(struct hisi_qm *qm)
+ CQC_CACHE_WB_ENABLE | FIELD_PREP(SQC_CACHE_WB_THRD, 1) |
+ FIELD_PREP(CQC_CACHE_WB_THRD, 1), base + QM_CACHE_CTL);
+
++ hisi_zip_set_high_perf(qm);
+ hisi_zip_enable_clock_gate(qm);
+
+- return hisi_dae_set_user_domain(qm);
++ ret = hisi_dae_set_user_domain(qm);
++ if (ret)
++ goto close_sva_prefetch;
++
++ return 0;
++
++close_sva_prefetch:
++ hisi_zip_close_sva_prefetch(qm);
++ return ret;
+ }
+
+ static void hisi_zip_master_ooo_ctrl(struct hisi_qm *qm, bool enable)
+@@ -1251,11 +1296,6 @@ static int hisi_zip_pf_probe_init(struct hisi_zip *hisi_zip)
+ if (ret)
+ return ret;
+
+- ret = hisi_zip_set_high_perf(qm);
+- if (ret)
+- return ret;
+-
+- hisi_zip_open_sva_prefetch(qm);
+ hisi_qm_dev_err_init(qm);
+ hisi_zip_debug_regs_clear(qm);
+
+diff --git a/drivers/crypto/intel/keembay/keembay-ocs-hcu-core.c b/drivers/crypto/intel/keembay/keembay-ocs-hcu-core.c
+index 8f9e21ced0fe1e..48281d88226038 100644
+--- a/drivers/crypto/intel/keembay/keembay-ocs-hcu-core.c
++++ b/drivers/crypto/intel/keembay/keembay-ocs-hcu-core.c
+@@ -232,7 +232,7 @@ static int kmb_ocs_dma_prepare(struct ahash_request *req)
+ struct device *dev = rctx->hcu_dev->dev;
+ unsigned int remainder = 0;
+ unsigned int total;
+- size_t nents;
++ int nents;
+ size_t count;
+ int rc;
+ int i;
+@@ -253,6 +253,9 @@ static int kmb_ocs_dma_prepare(struct ahash_request *req)
+ /* Determine the number of scatter gather list entries to process. */
+ nents = sg_nents_for_len(req->src, rctx->sg_data_total - remainder);
+
++ if (nents < 0)
++ return nents;
++
+ /* If there are entries to process, map them. */
+ if (nents) {
+ rctx->sg_dma_nents = dma_map_sg(dev, req->src, nents,
+diff --git a/drivers/crypto/marvell/octeontx2/otx2_cptpf_ucode.c b/drivers/crypto/marvell/octeontx2/otx2_cptpf_ucode.c
+index cc47e361089a05..ebdf4efa09d4d7 100644
+--- a/drivers/crypto/marvell/octeontx2/otx2_cptpf_ucode.c
++++ b/drivers/crypto/marvell/octeontx2/otx2_cptpf_ucode.c
+@@ -1615,7 +1615,7 @@ int otx2_cpt_dl_custom_egrp_create(struct otx2_cptpf_dev *cptpf,
+ return -EINVAL;
+ }
+ err_msg = "Invalid engine group format";
+- strscpy(tmp_buf, ctx->val.vstr, strlen(ctx->val.vstr) + 1);
++ strscpy(tmp_buf, ctx->val.vstr);
+ start = tmp_buf;
+
+ has_se = has_ie = has_ae = false;
+diff --git a/drivers/crypto/nx/nx-common-powernv.c b/drivers/crypto/nx/nx-common-powernv.c
+index fd0a98b2fb1b25..0493041ea08851 100644
+--- a/drivers/crypto/nx/nx-common-powernv.c
++++ b/drivers/crypto/nx/nx-common-powernv.c
+@@ -1043,8 +1043,10 @@ static struct scomp_alg nx842_powernv_alg = {
+ .base.cra_priority = 300,
+ .base.cra_module = THIS_MODULE,
+
+- .alloc_ctx = nx842_powernv_crypto_alloc_ctx,
+- .free_ctx = nx842_crypto_free_ctx,
++ .streams = {
++ .alloc_ctx = nx842_powernv_crypto_alloc_ctx,
++ .free_ctx = nx842_crypto_free_ctx,
++ },
+ .compress = nx842_crypto_compress,
+ .decompress = nx842_crypto_decompress,
+ };
+diff --git a/drivers/crypto/nx/nx-common-pseries.c b/drivers/crypto/nx/nx-common-pseries.c
+index f528e072494a2e..fc0222ebe80721 100644
+--- a/drivers/crypto/nx/nx-common-pseries.c
++++ b/drivers/crypto/nx/nx-common-pseries.c
+@@ -1020,8 +1020,10 @@ static struct scomp_alg nx842_pseries_alg = {
+ .base.cra_priority = 300,
+ .base.cra_module = THIS_MODULE,
+
+- .alloc_ctx = nx842_pseries_crypto_alloc_ctx,
+- .free_ctx = nx842_crypto_free_ctx,
++ .streams = {
++ .alloc_ctx = nx842_pseries_crypto_alloc_ctx,
++ .free_ctx = nx842_crypto_free_ctx,
++ },
+ .compress = nx842_crypto_compress,
+ .decompress = nx842_crypto_decompress,
+ };
+diff --git a/drivers/devfreq/event/rockchip-dfi.c b/drivers/devfreq/event/rockchip-dfi.c
+index 0470d7c175f4f6..54effb63519653 100644
+--- a/drivers/devfreq/event/rockchip-dfi.c
++++ b/drivers/devfreq/event/rockchip-dfi.c
+@@ -116,6 +116,7 @@ struct rockchip_dfi {
+ int buswidth[DMC_MAX_CHANNELS];
+ int ddrmon_stride;
+ bool ddrmon_ctrl_single;
++ unsigned int count_multiplier; /* number of data clocks per count */
+ };
+
+ static int rockchip_dfi_enable(struct rockchip_dfi *dfi)
+@@ -435,7 +436,7 @@ static u64 rockchip_ddr_perf_event_get_count(struct perf_event *event)
+
+ switch (event->attr.config) {
+ case PERF_EVENT_CYCLES:
+- count = total.c[0].clock_cycles;
++ count = total.c[0].clock_cycles * dfi->count_multiplier;
+ break;
+ case PERF_EVENT_READ_BYTES:
+ for (i = 0; i < dfi->max_channels; i++)
+@@ -655,6 +656,9 @@ static int rockchip_ddr_perf_init(struct rockchip_dfi *dfi)
+ break;
+ }
+
++ if (!dfi->count_multiplier)
++ dfi->count_multiplier = 1;
++
+ ret = perf_pmu_register(pmu, "rockchip_ddr", -1);
+ if (ret)
+ return ret;
+@@ -751,6 +755,7 @@ static int rk3588_dfi_init(struct rockchip_dfi *dfi)
+ dfi->max_channels = 4;
+
+ dfi->ddrmon_stride = 0x4000;
++ dfi->count_multiplier = 2;
+
+ return 0;
+ };
+diff --git a/drivers/devfreq/mtk-cci-devfreq.c b/drivers/devfreq/mtk-cci-devfreq.c
+index 22fe9e631f8aaf..5730076846e1be 100644
+--- a/drivers/devfreq/mtk-cci-devfreq.c
++++ b/drivers/devfreq/mtk-cci-devfreq.c
+@@ -386,7 +386,8 @@ static int mtk_ccifreq_probe(struct platform_device *pdev)
+ out_free_resources:
+ if (regulator_is_enabled(drv->proc_reg))
+ regulator_disable(drv->proc_reg);
+- if (drv->sram_reg && regulator_is_enabled(drv->sram_reg))
++ if (!IS_ERR_OR_NULL(drv->sram_reg) &&
++ regulator_is_enabled(drv->sram_reg))
+ regulator_disable(drv->sram_reg);
+
+ return ret;
+diff --git a/drivers/edac/i10nm_base.c b/drivers/edac/i10nm_base.c
+index bf4171ac191d3f..9d00f247f4e0ea 100644
+--- a/drivers/edac/i10nm_base.c
++++ b/drivers/edac/i10nm_base.c
+@@ -1057,6 +1057,15 @@ static bool i10nm_check_ecc(struct skx_imc *imc, int chan)
+ return !!GET_BITFIELD(mcmtr, 2, 2);
+ }
+
++static bool i10nm_channel_disabled(struct skx_imc *imc, int chan)
++{
++ u32 mcmtr = I10NM_GET_MCMTR(imc, chan);
++
++ edac_dbg(1, "mc%d ch%d mcmtr reg %x\n", imc->mc, chan, mcmtr);
++
++ return (mcmtr == ~0 || GET_BITFIELD(mcmtr, 18, 18));
++}
++
+ static int i10nm_get_dimm_config(struct mem_ctl_info *mci,
+ struct res_config *cfg)
+ {
+@@ -1070,6 +1079,11 @@ static int i10nm_get_dimm_config(struct mem_ctl_info *mci,
+ if (!imc->mbase)
+ continue;
+
++ if (i10nm_channel_disabled(imc, i)) {
++ edac_dbg(1, "mc%d ch%d is disabled.\n", imc->mc, i);
++ continue;
++ }
++
+ ndimms = 0;
+
+ if (res_cfg->type != GNR)
+diff --git a/drivers/firmware/arm_scmi/transports/virtio.c b/drivers/firmware/arm_scmi/transports/virtio.c
+index cb934db9b2b4a2..326c4a93e44b91 100644
+--- a/drivers/firmware/arm_scmi/transports/virtio.c
++++ b/drivers/firmware/arm_scmi/transports/virtio.c
+@@ -871,6 +871,9 @@ static int scmi_vio_probe(struct virtio_device *vdev)
+ /* Ensure initialized scmi_vdev is visible */
+ smp_store_mb(scmi_vdev, vdev);
+
++ /* Set device ready */
++ virtio_device_ready(vdev);
++
+ ret = platform_driver_register(&scmi_virtio_driver);
+ if (ret) {
+ vdev->priv = NULL;
+diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig
+index d528c94c5859b5..29e0729299f5bd 100644
+--- a/drivers/firmware/efi/Kconfig
++++ b/drivers/firmware/efi/Kconfig
+@@ -267,9 +267,10 @@ config OVMF_DEBUG_LOG
+ bool "Expose OVMF firmware debug log via sysfs"
+ depends on EFI
+ help
+- Recent OVMF versions (edk2-stable202508 + newer) can write
+- their debug log to a memory buffer. This driver exposes the
+- log content via sysfs (/sys/firmware/efi/ovmf_debug_log).
++ Recent versions of the Open Virtual Machine Firmware
++ (edk2-stable202508 + newer) can write their debug log to a memory
++ buffer. This driver exposes the log content via sysfs
++ (/sys/firmware/efi/ovmf_debug_log).
+
+ config UNACCEPTED_MEMORY
+ bool
+diff --git a/drivers/firmware/meson/Kconfig b/drivers/firmware/meson/Kconfig
+index f2fdd375664822..179f5d46d8ddff 100644
+--- a/drivers/firmware/meson/Kconfig
++++ b/drivers/firmware/meson/Kconfig
+@@ -5,7 +5,7 @@
+ config MESON_SM
+ tristate "Amlogic Secure Monitor driver"
+ depends on ARCH_MESON || COMPILE_TEST
+- default y
++ default ARCH_MESON
+ depends on ARM64_4K_PAGES
+ help
+ Say y here to enable the Amlogic secure monitor driver
+diff --git a/drivers/fwctl/mlx5/main.c b/drivers/fwctl/mlx5/main.c
+index f93aa0cecdb978..4b379f695eb73d 100644
+--- a/drivers/fwctl/mlx5/main.c
++++ b/drivers/fwctl/mlx5/main.c
+@@ -345,7 +345,7 @@ static void *mlx5ctl_fw_rpc(struct fwctl_uctx *uctx, enum fwctl_rpc_scope scope,
+ */
+ if (ret && ret != -EREMOTEIO) {
+ if (rpc_out != rpc_in)
+- kfree(rpc_out);
++ kvfree(rpc_out);
+ return ERR_PTR(ret);
+ }
+ return rpc_out;
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+index 395c6be901ce7a..dbbb3407fa13ba 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+@@ -2964,15 +2964,15 @@ long amdgpu_drm_ioctl(struct file *filp,
+ }
+
+ static const struct dev_pm_ops amdgpu_pm_ops = {
+- .prepare = amdgpu_pmops_prepare,
+- .complete = amdgpu_pmops_complete,
+- .suspend = amdgpu_pmops_suspend,
+- .suspend_noirq = amdgpu_pmops_suspend_noirq,
+- .resume = amdgpu_pmops_resume,
+- .freeze = amdgpu_pmops_freeze,
+- .thaw = amdgpu_pmops_thaw,
+- .poweroff = amdgpu_pmops_poweroff,
+- .restore = amdgpu_pmops_restore,
++ .prepare = pm_sleep_ptr(amdgpu_pmops_prepare),
++ .complete = pm_sleep_ptr(amdgpu_pmops_complete),
++ .suspend = pm_sleep_ptr(amdgpu_pmops_suspend),
++ .suspend_noirq = pm_sleep_ptr(amdgpu_pmops_suspend_noirq),
++ .resume = pm_sleep_ptr(amdgpu_pmops_resume),
++ .freeze = pm_sleep_ptr(amdgpu_pmops_freeze),
++ .thaw = pm_sleep_ptr(amdgpu_pmops_thaw),
++ .poweroff = pm_sleep_ptr(amdgpu_pmops_poweroff),
++ .restore = pm_sleep_ptr(amdgpu_pmops_restore),
+ .runtime_suspend = amdgpu_pmops_runtime_suspend,
+ .runtime_resume = amdgpu_pmops_runtime_resume,
+ .runtime_idle = amdgpu_pmops_runtime_idle,
+@@ -3117,7 +3117,7 @@ static struct pci_driver amdgpu_kms_pci_driver = {
+ .probe = amdgpu_pci_probe,
+ .remove = amdgpu_pci_remove,
+ .shutdown = amdgpu_pci_shutdown,
+- .driver.pm = &amdgpu_pm_ops,
++ .driver.pm = pm_ptr(&amdgpu_pm_ops),
+ .err_handler = &amdgpu_pci_err_handler,
+ .dev_groups = amdgpu_sysfs_groups,
+ };
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+index 486c3646710cc4..8f6ce948c6841d 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+@@ -364,7 +364,8 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
+
+ /* Allocate ring buffer */
+ if (ring->ring_obj == NULL) {
+- r = amdgpu_bo_create_kernel(adev, ring->ring_size + ring->funcs->extra_dw, PAGE_SIZE,
++ r = amdgpu_bo_create_kernel(adev, ring->ring_size + ring->funcs->extra_bytes,
++ PAGE_SIZE,
+ AMDGPU_GEM_DOMAIN_GTT,
+ &ring->ring_obj,
+ &ring->gpu_addr,
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+index 7670f5d82b9e46..12783ea3ba0f18 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+@@ -211,7 +211,18 @@ struct amdgpu_ring_funcs {
+ bool support_64bit_ptrs;
+ bool no_user_fence;
+ bool secure_submission_supported;
+- unsigned extra_dw;
++
++ /**
++ * @extra_bytes:
++ *
++ * Optional extra space in bytes that is added to the ring size
++ * when allocating the BO that holds the contents of the ring.
++ * This space isn't used for command submission to the ring,
++ * but is just there to satisfy some hardware requirements or
++ * implement workarounds. It's up to the implementation of each
++ * specific ring to initialize this space.
++ */
++ unsigned extra_bytes;
+
+ /* ring read/write ptr handling */
+ u64 (*get_rptr)(struct amdgpu_ring *ring);
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+index f1f67521c29cab..affb68eabc4e1c 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+@@ -92,6 +92,7 @@ MODULE_FIRMWARE(FIRMWARE_VCN5_0_0);
+ MODULE_FIRMWARE(FIRMWARE_VCN5_0_1);
+
+ static void amdgpu_vcn_idle_work_handler(struct work_struct *work);
++static void amdgpu_vcn_reg_dump_fini(struct amdgpu_device *adev);
+
+ int amdgpu_vcn_early_init(struct amdgpu_device *adev, int i)
+ {
+@@ -285,6 +286,10 @@ int amdgpu_vcn_sw_fini(struct amdgpu_device *adev, int i)
+ amdgpu_ucode_release(&adev->vcn.inst[0].fw);
+ adev->vcn.inst[i].fw = NULL;
+ }
++
++ if (adev->vcn.reg_list)
++ amdgpu_vcn_reg_dump_fini(adev);
++
+ mutex_destroy(&adev->vcn.inst[i].vcn_pg_lock);
+ mutex_destroy(&adev->vcn.inst[i].vcn1_jpeg1_workaround);
+
+@@ -405,6 +410,54 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev, int i)
+ return 0;
+ }
+
++void amdgpu_vcn_get_profile(struct amdgpu_device *adev)
++{
++ int r;
++
++ mutex_lock(&adev->vcn.workload_profile_mutex);
++
++ if (adev->vcn.workload_profile_active) {
++ mutex_unlock(&adev->vcn.workload_profile_mutex);
++ return;
++ }
++ r = amdgpu_dpm_switch_power_profile(adev, PP_SMC_POWER_PROFILE_VIDEO,
++ true);
++ if (r)
++ dev_warn(adev->dev,
++ "(%d) failed to enable video power profile mode\n", r);
++ else
++ adev->vcn.workload_profile_active = true;
++ mutex_unlock(&adev->vcn.workload_profile_mutex);
++}
++
++void amdgpu_vcn_put_profile(struct amdgpu_device *adev)
++{
++ bool pg = true;
++ int r, i;
++
++ mutex_lock(&adev->vcn.workload_profile_mutex);
++ for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
++ if (adev->vcn.inst[i].cur_state != AMD_PG_STATE_GATE) {
++ pg = false;
++ break;
++ }
++ }
++
++ if (pg) {
++ r = amdgpu_dpm_switch_power_profile(
++ adev, PP_SMC_POWER_PROFILE_VIDEO, false);
++ if (r)
++ dev_warn(
++ adev->dev,
++ "(%d) failed to disable video power profile mode\n",
++ r);
++ else
++ adev->vcn.workload_profile_active = false;
++ }
++
++ mutex_unlock(&adev->vcn.workload_profile_mutex);
++}
++
+ static void amdgpu_vcn_idle_work_handler(struct work_struct *work)
+ {
+ struct amdgpu_vcn_inst *vcn_inst =
+@@ -412,7 +465,6 @@ static void amdgpu_vcn_idle_work_handler(struct work_struct *work)
+ struct amdgpu_device *adev = vcn_inst->adev;
+ unsigned int fences = 0, fence[AMDGPU_MAX_VCN_INSTANCES] = {0};
+ unsigned int i = vcn_inst->inst, j;
+- int r = 0;
+
+ if (adev->vcn.harvest_config & (1 << i))
+ return;
+@@ -438,16 +490,11 @@ static void amdgpu_vcn_idle_work_handler(struct work_struct *work)
+ fences += fence[i];
+
+ if (!fences && !atomic_read(&vcn_inst->total_submission_cnt)) {
++ mutex_lock(&vcn_inst->vcn_pg_lock);
+ vcn_inst->set_pg_state(vcn_inst, AMD_PG_STATE_GATE);
+- mutex_lock(&adev->vcn.workload_profile_mutex);
+- if (adev->vcn.workload_profile_active) {
+- r = amdgpu_dpm_switch_power_profile(adev, PP_SMC_POWER_PROFILE_VIDEO,
+- false);
+- if (r)
+- dev_warn(adev->dev, "(%d) failed to disable video power profile mode\n", r);
+- adev->vcn.workload_profile_active = false;
+- }
+- mutex_unlock(&adev->vcn.workload_profile_mutex);
++ mutex_unlock(&vcn_inst->vcn_pg_lock);
++ amdgpu_vcn_put_profile(adev);
++
+ } else {
+ schedule_delayed_work(&vcn_inst->idle_work, VCN_IDLE_TIMEOUT);
+ }
+@@ -457,30 +504,11 @@ void amdgpu_vcn_ring_begin_use(struct amdgpu_ring *ring)
+ {
+ struct amdgpu_device *adev = ring->adev;
+ struct amdgpu_vcn_inst *vcn_inst = &adev->vcn.inst[ring->me];
+- int r = 0;
+
+ atomic_inc(&vcn_inst->total_submission_cnt);
+
+ cancel_delayed_work_sync(&vcn_inst->idle_work);
+
+- /* We can safely return early here because we've cancelled the
+- * the delayed work so there is no one else to set it to false
+- * and we don't care if someone else sets it to true.
+- */
+- if (adev->vcn.workload_profile_active)
+- goto pg_lock;
+-
+- mutex_lock(&adev->vcn.workload_profile_mutex);
+- if (!adev->vcn.workload_profile_active) {
+- r = amdgpu_dpm_switch_power_profile(adev, PP_SMC_POWER_PROFILE_VIDEO,
+- true);
+- if (r)
+- dev_warn(adev->dev, "(%d) failed to switch to video power profile mode\n", r);
+- adev->vcn.workload_profile_active = true;
+- }
+- mutex_unlock(&adev->vcn.workload_profile_mutex);
+-
+-pg_lock:
+ mutex_lock(&vcn_inst->vcn_pg_lock);
+ vcn_inst->set_pg_state(vcn_inst, AMD_PG_STATE_UNGATE);
+
+@@ -508,6 +536,7 @@ void amdgpu_vcn_ring_begin_use(struct amdgpu_ring *ring)
+ vcn_inst->pause_dpg_mode(vcn_inst, &new_state);
+ }
+ mutex_unlock(&vcn_inst->vcn_pg_lock);
++ amdgpu_vcn_get_profile(adev);
+ }
+
+ void amdgpu_vcn_ring_end_use(struct amdgpu_ring *ring)
+@@ -1527,3 +1556,86 @@ int amdgpu_vcn_ring_reset(struct amdgpu_ring *ring,
+
+ return amdgpu_vcn_reset_engine(adev, ring->me);
+ }
++
++int amdgpu_vcn_reg_dump_init(struct amdgpu_device *adev,
++ const struct amdgpu_hwip_reg_entry *reg, u32 count)
++{
++ adev->vcn.ip_dump = kcalloc(adev->vcn.num_vcn_inst * count,
++ sizeof(uint32_t), GFP_KERNEL);
++ if (!adev->vcn.ip_dump)
++ return -ENOMEM;
++ adev->vcn.reg_list = reg;
++ adev->vcn.reg_count = count;
++
++ return 0;
++}
++
++static void amdgpu_vcn_reg_dump_fini(struct amdgpu_device *adev)
++{
++ kfree(adev->vcn.ip_dump);
++ adev->vcn.ip_dump = NULL;
++ adev->vcn.reg_list = NULL;
++ adev->vcn.reg_count = 0;
++}
++
++void amdgpu_vcn_dump_ip_state(struct amdgpu_ip_block *ip_block)
++{
++ struct amdgpu_device *adev = ip_block->adev;
++ int i, j;
++ bool is_powered;
++ u32 inst_off;
++
++ if (!adev->vcn.ip_dump)
++ return;
++
++ for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
++ if (adev->vcn.harvest_config & (1 << i))
++ continue;
++
++ inst_off = i * adev->vcn.reg_count;
++ /* mmUVD_POWER_STATUS is always readable and is the first in reg_list */
++ adev->vcn.ip_dump[inst_off] =
++ RREG32(SOC15_REG_ENTRY_OFFSET_INST(adev->vcn.reg_list[0], i));
++ is_powered = (adev->vcn.ip_dump[inst_off] &
++ UVD_POWER_STATUS__UVD_POWER_STATUS_TILES_OFF) !=
++ UVD_POWER_STATUS__UVD_POWER_STATUS_TILES_OFF;
++
++ if (is_powered)
++ for (j = 1; j < adev->vcn.reg_count; j++)
++ adev->vcn.ip_dump[inst_off + j] =
++ RREG32(SOC15_REG_ENTRY_OFFSET_INST(adev->vcn.reg_list[j], i));
++ }
++}
++
++void amdgpu_vcn_print_ip_state(struct amdgpu_ip_block *ip_block, struct drm_printer *p)
++{
++ struct amdgpu_device *adev = ip_block->adev;
++ int i, j;
++ bool is_powered;
++ u32 inst_off;
++
++ if (!adev->vcn.ip_dump)
++ return;
++
++ drm_printf(p, "num_instances:%d\n", adev->vcn.num_vcn_inst);
++ for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
++ if (adev->vcn.harvest_config & (1 << i)) {
++ drm_printf(p, "\nHarvested Instance:VCN%d Skipping dump\n", i);
++ continue;
++ }
++
++ inst_off = i * adev->vcn.reg_count;
++ is_powered = (adev->vcn.ip_dump[inst_off] &
++ UVD_POWER_STATUS__UVD_POWER_STATUS_TILES_OFF) !=
++ UVD_POWER_STATUS__UVD_POWER_STATUS_TILES_OFF;
++
++ if (is_powered) {
++ drm_printf(p, "\nActive Instance:VCN%d\n", i);
++ for (j = 0; j < adev->vcn.reg_count; j++)
++ drm_printf(p, "%-50s \t 0x%08x\n", adev->vcn.reg_list[j].reg_name,
++ adev->vcn.ip_dump[inst_off + j]);
++ } else {
++ drm_printf(p, "\nInactive Instance:VCN%d\n", i);
++ }
++ }
++}
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
+index 0bc0a94d7cf0fb..6d9acd36041d09 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
+@@ -237,6 +237,8 @@
+
+ #define AMDGPU_DRM_KEY_INJECT_WORKAROUND_VCNFW_ASD_HANDSHAKING 2
+
++struct amdgpu_hwip_reg_entry;
++
+ enum amdgpu_vcn_caps {
+ AMDGPU_VCN_RRMT_ENABLED,
+ };
+@@ -362,6 +364,8 @@ struct amdgpu_vcn {
+
+ bool workload_profile_active;
+ struct mutex workload_profile_mutex;
++ u32 reg_count;
++ const struct amdgpu_hwip_reg_entry *reg_list;
+ };
+
+ struct amdgpu_fw_shared_rb_ptrs_struct {
+@@ -557,4 +561,11 @@ int vcn_set_powergating_state(struct amdgpu_ip_block *ip_block,
+ int amdgpu_vcn_ring_reset(struct amdgpu_ring *ring,
+ unsigned int vmid,
+ struct amdgpu_fence *guilty_fence);
++int amdgpu_vcn_reg_dump_init(struct amdgpu_device *adev,
++ const struct amdgpu_hwip_reg_entry *reg, u32 count);
++void amdgpu_vcn_dump_ip_state(struct amdgpu_ip_block *ip_block);
++void amdgpu_vcn_print_ip_state(struct amdgpu_ip_block *ip_block, struct drm_printer *p);
++void amdgpu_vcn_get_profile(struct amdgpu_device *adev);
++void amdgpu_vcn_put_profile(struct amdgpu_device *adev);
++
+ #endif
+diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v1_0.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v1_0.c
+index 9e428e669ada6f..b5bb7f4d607c14 100644
+--- a/drivers/gpu/drm/amd/amdgpu/jpeg_v1_0.c
++++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v1_0.c
+@@ -557,7 +557,7 @@ static const struct amdgpu_ring_funcs jpeg_v1_0_decode_ring_vm_funcs = {
+ .nop = PACKET0(0x81ff, 0),
+ .support_64bit_ptrs = false,
+ .no_user_fence = true,
+- .extra_dw = 64,
++ .extra_bytes = 256,
+ .get_rptr = jpeg_v1_0_decode_ring_get_rptr,
+ .get_wptr = jpeg_v1_0_decode_ring_get_wptr,
+ .set_wptr = jpeg_v1_0_decode_ring_set_wptr,
+diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c
+index b86288a69e7b7b..a78144773fabbe 100644
+--- a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c
++++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c
+@@ -444,7 +444,7 @@ static int jpeg_v4_0_3_hw_fini(struct amdgpu_ip_block *ip_block)
+ ret = jpeg_v4_0_3_set_powergating_state(ip_block, AMD_PG_STATE_GATE);
+ }
+
+- if (amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__JPEG))
++ if (amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__JPEG) && !amdgpu_sriov_vf(adev))
+ amdgpu_irq_put(adev, &adev->jpeg.inst->ras_poison_irq, 0);
+
+ return ret;
+diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
+index 5dbaebb592b304..2e79a3afc7748a 100644
+--- a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
++++ b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
+@@ -623,7 +623,22 @@ static void uvd_v3_1_enable_mgcg(struct amdgpu_device *adev,
+ *
+ * @ip_block: Pointer to the amdgpu_ip_block for this hw instance.
+ *
+- * Initialize the hardware, boot up the VCPU and do some testing
++ * Initialize the hardware, boot up the VCPU and do some testing.
++ *
++ * On SI, the UVD is meant to be used in a specific power state,
++ * or alternatively the driver can manually enable its clock.
++ * In amdgpu we use the dedicated UVD power state when DPM is enabled.
++ * Calling amdgpu_dpm_enable_uvd makes DPM select the UVD power state
++ * for the SMU and afterwards enables the UVD clock.
++ * This is automatically done by amdgpu_uvd_ring_begin_use when work
++ * is submitted to the UVD ring. Here, we have to call it manually
++ * in order to power up UVD before firmware validation.
++ *
++ * Note that we must not disable the UVD clock here, as that would
++ * cause the ring test to fail. However, UVD is powered off
++ * automatically after the ring test: amdgpu_uvd_ring_end_use calls
++ * the UVD idle work handler which will disable the UVD clock when
++ * all fences are signalled.
+ */
+ static int uvd_v3_1_hw_init(struct amdgpu_ip_block *ip_block)
+ {
+@@ -633,6 +648,15 @@ static int uvd_v3_1_hw_init(struct amdgpu_ip_block *ip_block)
+ int r;
+
+ uvd_v3_1_mc_resume(adev);
++ uvd_v3_1_enable_mgcg(adev, true);
++
++ /* Make sure UVD is powered during FW validation.
++ * It's going to be automatically powered off after the ring test.
++ */
++ if (adev->pm.dpm_enabled)
++ amdgpu_dpm_enable_uvd(adev, true);
++ else
++ amdgpu_asic_set_uvd_clocks(adev, 53300, 40000);
+
+ r = uvd_v3_1_fw_validate(adev);
+ if (r) {
+@@ -640,9 +664,6 @@ static int uvd_v3_1_hw_init(struct amdgpu_ip_block *ip_block)
+ return r;
+ }
+
+- uvd_v3_1_enable_mgcg(adev, true);
+- amdgpu_asic_set_uvd_clocks(adev, 53300, 40000);
+-
+ uvd_v3_1_start(adev);
+
+ r = amdgpu_ring_test_helper(ring);
+diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
+index bc30a5326866c3..f13ed3c1e29c2c 100644
+--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
++++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
+@@ -116,7 +116,6 @@ static void vcn_v2_5_idle_work_handler(struct work_struct *work)
+ struct amdgpu_device *adev = vcn_inst->adev;
+ unsigned int fences = 0, fence[AMDGPU_MAX_VCN_INSTANCES] = {0};
+ unsigned int i, j;
+- int r = 0;
+
+ for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
+ struct amdgpu_vcn_inst *v = &adev->vcn.inst[i];
+@@ -149,15 +148,7 @@ static void vcn_v2_5_idle_work_handler(struct work_struct *work)
+ if (!fences && !atomic_read(&adev->vcn.inst[0].total_submission_cnt)) {
+ amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCN,
+ AMD_PG_STATE_GATE);
+- mutex_lock(&adev->vcn.workload_profile_mutex);
+- if (adev->vcn.workload_profile_active) {
+- r = amdgpu_dpm_switch_power_profile(adev, PP_SMC_POWER_PROFILE_VIDEO,
+- false);
+- if (r)
+- dev_warn(adev->dev, "(%d) failed to disable video power profile mode\n", r);
+- adev->vcn.workload_profile_active = false;
+- }
+- mutex_unlock(&adev->vcn.workload_profile_mutex);
++ amdgpu_vcn_put_profile(adev);
+ } else {
+ schedule_delayed_work(&adev->vcn.inst[0].idle_work, VCN_IDLE_TIMEOUT);
+ }
+@@ -167,7 +158,6 @@ static void vcn_v2_5_ring_begin_use(struct amdgpu_ring *ring)
+ {
+ struct amdgpu_device *adev = ring->adev;
+ struct amdgpu_vcn_inst *v = &adev->vcn.inst[ring->me];
+- int r = 0;
+
+ atomic_inc(&adev->vcn.inst[0].total_submission_cnt);
+
+@@ -177,20 +167,6 @@ static void vcn_v2_5_ring_begin_use(struct amdgpu_ring *ring)
+ * the delayed work so there is no one else to set it to false
+ * and we don't care if someone else sets it to true.
+ */
+- if (adev->vcn.workload_profile_active)
+- goto pg_lock;
+-
+- mutex_lock(&adev->vcn.workload_profile_mutex);
+- if (!adev->vcn.workload_profile_active) {
+- r = amdgpu_dpm_switch_power_profile(adev, PP_SMC_POWER_PROFILE_VIDEO,
+- true);
+- if (r)
+- dev_warn(adev->dev, "(%d) failed to switch to video power profile mode\n", r);
+- adev->vcn.workload_profile_active = true;
+- }
+- mutex_unlock(&adev->vcn.workload_profile_mutex);
+-
+-pg_lock:
+ mutex_lock(&adev->vcn.inst[0].vcn_pg_lock);
+ amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCN,
+ AMD_PG_STATE_UNGATE);
+@@ -218,6 +194,7 @@ static void vcn_v2_5_ring_begin_use(struct amdgpu_ring *ring)
+ v->pause_dpg_mode(v, &new_state);
+ }
+ mutex_unlock(&adev->vcn.inst[0].vcn_pg_lock);
++ amdgpu_vcn_get_profile(adev);
+ }
+
+ static void vcn_v2_5_ring_end_use(struct amdgpu_ring *ring)
+diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+index 2811226b0ea5dc..866222fc10a050 100644
+--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
++++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+@@ -361,7 +361,6 @@ static int vcn_v3_0_sw_fini(struct amdgpu_ip_block *ip_block)
+ return r;
+ }
+
+- kfree(adev->vcn.ip_dump);
+ return 0;
+ }
+
+diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
+index 706f3b2f484f7c..ac55549e20be69 100644
+--- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
++++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
+@@ -1984,7 +1984,7 @@ static struct amdgpu_ring_funcs vcn_v4_0_unified_ring_vm_funcs = {
+ .type = AMDGPU_RING_TYPE_VCN_ENC,
+ .align_mask = 0x3f,
+ .nop = VCN_ENC_CMD_NO_OP,
+- .extra_dw = sizeof(struct amdgpu_vcn_rb_metadata),
++ .extra_bytes = sizeof(struct amdgpu_vcn_rb_metadata),
+ .get_rptr = vcn_v4_0_unified_ring_get_rptr,
+ .get_wptr = vcn_v4_0_unified_ring_get_wptr,
+ .set_wptr = vcn_v4_0_unified_ring_set_wptr,
+diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
+index 2a3663b551af94..ba944a96c0707c 100644
+--- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
++++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
+@@ -287,8 +287,6 @@ static int vcn_v4_0_3_sw_fini(struct amdgpu_ip_block *ip_block)
+ return r;
+ }
+
+- kfree(adev->vcn.ip_dump);
+-
+ return 0;
+ }
+
+@@ -391,7 +389,7 @@ static int vcn_v4_0_3_hw_fini(struct amdgpu_ip_block *ip_block)
+ vinst->set_pg_state(vinst, AMD_PG_STATE_GATE);
+ }
+
+- if (amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__VCN))
++ if (amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__VCN) && !amdgpu_sriov_vf(adev))
+ amdgpu_irq_put(adev, &adev->vcn.inst->ras_poison_irq, 0);
+
+ return 0;
+diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c
+index caf2d95a85d433..11fec716e846a2 100644
+--- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c
++++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c
+@@ -284,8 +284,6 @@ static int vcn_v4_0_5_sw_fini(struct amdgpu_ip_block *ip_block)
+ return r;
+ }
+
+- kfree(adev->vcn.ip_dump);
+-
+ return 0;
+ }
+
+diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+index a0f22ea6d15af7..3d8b20828c0688 100644
+--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
++++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+@@ -4239,7 +4239,7 @@ svm_ioctl(struct kfd_process *p, enum kfd_ioctl_svm_op op, uint64_t start,
+ r = svm_range_get_attr(p, mm, start, size, nattrs, attrs);
+ break;
+ default:
+- r = EINVAL;
++ r = -EINVAL;
+ break;
+ }
+
+diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
+index 4d6bc9fd4faa80..9ac2d41f8fcae1 100644
+--- a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
++++ b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
+@@ -316,6 +316,9 @@ bool dc_stream_set_cursor_attributes(
+ {
+ bool result = false;
+
++ if (!stream)
++ return false;
++
+ if (dc_stream_check_cursor_attributes(stream, stream->ctx->dc->current_state, attributes)) {
+ stream->cursor_attributes = *attributes;
+ result = true;
+@@ -331,7 +334,10 @@ bool dc_stream_program_cursor_attributes(
+ struct dc *dc;
+ bool reset_idle_optimizations = false;
+
+- dc = stream ? stream->ctx->dc : NULL;
++ if (!stream)
++ return false;
++
++ dc = stream->ctx->dc;
+
+ if (dc_stream_set_cursor_attributes(stream, attributes)) {
+ dc_z10_restore(dc);
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_rq_dlg_calc_32.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_rq_dlg_calc_32.c
+index 9ba6cb67655f4a..6c75aa82327ac1 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_rq_dlg_calc_32.c
++++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_rq_dlg_calc_32.c
+@@ -139,7 +139,6 @@ void dml32_rq_dlg_get_rq_reg(display_rq_regs_st *rq_regs,
+ if (dual_plane) {
+ unsigned int p1_pte_row_height_linear = get_dpte_row_height_linear_c(mode_lib, e2e_pipe_param,
+ num_pipes, pipe_idx);
+- ;
+ if (src->sw_mode == dm_sw_linear)
+ ASSERT(p1_pte_row_height_linear >= 8);
+
+diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c
+index 4ea13d0bf815e2..c69194e04ff93e 100644
+--- a/drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c
++++ b/drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c
+@@ -1600,19 +1600,17 @@ enum dc_status dce110_apply_single_controller_ctx_to_hw(
+ }
+
+ if (pipe_ctx->stream_res.audio != NULL) {
+- struct audio_output audio_output = {0};
++ build_audio_output(context, pipe_ctx, &pipe_ctx->stream_res.audio_output);
+
+- build_audio_output(context, pipe_ctx, &audio_output);
+-
+- link_hwss->setup_audio_output(pipe_ctx, &audio_output,
++ link_hwss->setup_audio_output(pipe_ctx, &pipe_ctx->stream_res.audio_output,
+ pipe_ctx->stream_res.audio->inst);
+
+ pipe_ctx->stream_res.audio->funcs->az_configure(
+ pipe_ctx->stream_res.audio,
+ pipe_ctx->stream->signal,
+- &audio_output.crtc_info,
++ &pipe_ctx->stream_res.audio_output.crtc_info,
+ &pipe_ctx->stream->audio_info,
+- &audio_output.dp_link_info);
++ &pipe_ctx->stream_res.audio_output.dp_link_info);
+
+ if (dc->config.disable_hbr_audio_dp2)
+ if (pipe_ctx->stream_res.audio->funcs->az_disable_hbr_audio &&
+@@ -2386,9 +2384,7 @@ static void dce110_setup_audio_dto(
+ if (pipe_ctx->stream->signal != SIGNAL_TYPE_HDMI_TYPE_A)
+ continue;
+ if (pipe_ctx->stream_res.audio != NULL) {
+- struct audio_output audio_output;
+-
+- build_audio_output(context, pipe_ctx, &audio_output);
++ build_audio_output(context, pipe_ctx, &pipe_ctx->stream_res.audio_output);
+
+ if (dc->res_pool->dccg && dc->res_pool->dccg->funcs->set_audio_dtbclk_dto) {
+ struct dtbclk_dto_params dto_params = {0};
+@@ -2399,14 +2395,14 @@ static void dce110_setup_audio_dto(
+ pipe_ctx->stream_res.audio->funcs->wall_dto_setup(
+ pipe_ctx->stream_res.audio,
+ pipe_ctx->stream->signal,
+- &audio_output.crtc_info,
+- &audio_output.pll_info);
++ &pipe_ctx->stream_res.audio_output.crtc_info,
++ &pipe_ctx->stream_res.audio_output.pll_info);
+ } else
+ pipe_ctx->stream_res.audio->funcs->wall_dto_setup(
+ pipe_ctx->stream_res.audio,
+ pipe_ctx->stream->signal,
+- &audio_output.crtc_info,
+- &audio_output.pll_info);
++ &pipe_ctx->stream_res.audio_output.crtc_info,
++ &pipe_ctx->stream_res.audio_output.pll_info);
+ break;
+ }
+ }
+@@ -2426,15 +2422,15 @@ static void dce110_setup_audio_dto(
+ continue;
+
+ if (pipe_ctx->stream_res.audio != NULL) {
+- struct audio_output audio_output = {0};
+-
+- build_audio_output(context, pipe_ctx, &audio_output);
++ build_audio_output(context,
++ pipe_ctx,
++ &pipe_ctx->stream_res.audio_output);
+
+ pipe_ctx->stream_res.audio->funcs->wall_dto_setup(
+ pipe_ctx->stream_res.audio,
+ pipe_ctx->stream->signal,
+- &audio_output.crtc_info,
+- &audio_output.pll_info);
++ &pipe_ctx->stream_res.audio_output.crtc_info,
++ &pipe_ctx->stream_res.audio_output.pll_info);
+ break;
+ }
+ }
+diff --git a/drivers/gpu/drm/amd/display/dc/inc/core_types.h b/drivers/gpu/drm/amd/display/dc/inc/core_types.h
+index f0d7185153b2ae..f896cce87b8d45 100644
+--- a/drivers/gpu/drm/amd/display/dc/inc/core_types.h
++++ b/drivers/gpu/drm/amd/display/dc/inc/core_types.h
+@@ -228,8 +228,7 @@ struct resource_funcs {
+ enum dc_status (*update_dc_state_for_encoder_switch)(struct dc_link *link,
+ struct dc_link_settings *link_setting,
+ uint8_t pipe_count,
+- struct pipe_ctx *pipes,
+- struct audio_output *audio_output);
++ struct pipe_ctx *pipes);
+ };
+
+ struct audio_support{
+@@ -361,6 +360,8 @@ struct stream_resource {
+ uint8_t gsl_group;
+
+ struct test_pattern_params test_pattern_params;
++
++ struct audio_output audio_output;
+ };
+
+ struct plane_resource {
+diff --git a/drivers/gpu/drm/amd/display/dc/link/accessories/link_dp_cts.c b/drivers/gpu/drm/amd/display/dc/link/accessories/link_dp_cts.c
+index 2956c2b3ad1aad..b12d61701d4d9f 100644
+--- a/drivers/gpu/drm/amd/display/dc/link/accessories/link_dp_cts.c
++++ b/drivers/gpu/drm/amd/display/dc/link/accessories/link_dp_cts.c
+@@ -75,7 +75,6 @@ static void dp_retrain_link_dp_test(struct dc_link *link,
+ bool is_hpo_acquired;
+ uint8_t count;
+ int i;
+- struct audio_output audio_output[MAX_PIPES];
+
+ needs_divider_update = (link->dc->link_srv->dp_get_encoding_format(link_setting) !=
+ link->dc->link_srv->dp_get_encoding_format((const struct dc_link_settings *) &link->cur_link_settings));
+@@ -99,7 +98,7 @@ static void dp_retrain_link_dp_test(struct dc_link *link,
+ if (needs_divider_update && link->dc->res_pool->funcs->update_dc_state_for_encoder_switch) {
+ link->dc->res_pool->funcs->update_dc_state_for_encoder_switch(link,
+ link_setting, count,
+- *pipes, &audio_output[0]);
++ *pipes);
+ for (i = 0; i < count; i++) {
+ pipes[i]->clock_source->funcs->program_pix_clk(
+ pipes[i]->clock_source,
+@@ -111,15 +110,16 @@ static void dp_retrain_link_dp_test(struct dc_link *link,
+ const struct link_hwss *link_hwss = get_link_hwss(
+ link, &pipes[i]->link_res);
+
+- link_hwss->setup_audio_output(pipes[i], &audio_output[i],
+- pipes[i]->stream_res.audio->inst);
++ link_hwss->setup_audio_output(pipes[i],
++ &pipes[i]->stream_res.audio_output,
++ pipes[i]->stream_res.audio->inst);
+
+ pipes[i]->stream_res.audio->funcs->az_configure(
+ pipes[i]->stream_res.audio,
+ pipes[i]->stream->signal,
+- &audio_output[i].crtc_info,
++ &pipes[i]->stream_res.audio_output.crtc_info,
+ &pipes[i]->stream->audio_info,
+- &audio_output[i].dp_link_info);
++ &pipes[i]->stream_res.audio_output.dp_link_info);
+
+ if (link->dc->config.disable_hbr_audio_dp2 &&
+ pipes[i]->stream_res.audio->funcs->az_disable_hbr_audio &&
+diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn31/dcn31_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn31/dcn31_resource.c
+index 3ed7f50554e21e..ca17e5d8fdc2a4 100644
+--- a/drivers/gpu/drm/amd/display/dc/resource/dcn31/dcn31_resource.c
++++ b/drivers/gpu/drm/amd/display/dc/resource/dcn31/dcn31_resource.c
+@@ -2239,8 +2239,7 @@ struct resource_pool *dcn31_create_resource_pool(
+ enum dc_status dcn31_update_dc_state_for_encoder_switch(struct dc_link *link,
+ struct dc_link_settings *link_setting,
+ uint8_t pipe_count,
+- struct pipe_ctx *pipes,
+- struct audio_output *audio_output)
++ struct pipe_ctx *pipes)
+ {
+ struct dc_state *state = link->dc->current_state;
+ int i;
+@@ -2255,7 +2254,7 @@ enum dc_status dcn31_update_dc_state_for_encoder_switch(struct dc_link *link,
+
+ // Setup audio
+ if (pipes[i].stream_res.audio != NULL)
+- build_audio_output(state, &pipes[i], &audio_output[i]);
++ build_audio_output(state, &pipes[i], &pipes[i].stream_res.audio_output);
+ }
+ #else
+ /* This DCN requires rate divider updates and audio reprogramming to allow DP1<-->DP2 link rate switching,
+diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn31/dcn31_resource.h b/drivers/gpu/drm/amd/display/dc/resource/dcn31/dcn31_resource.h
+index c32c85ef0ba477..7e8fde65528f14 100644
+--- a/drivers/gpu/drm/amd/display/dc/resource/dcn31/dcn31_resource.h
++++ b/drivers/gpu/drm/amd/display/dc/resource/dcn31/dcn31_resource.h
+@@ -69,8 +69,7 @@ unsigned int dcn31_get_det_buffer_size(
+ enum dc_status dcn31_update_dc_state_for_encoder_switch(struct dc_link *link,
+ struct dc_link_settings *link_setting,
+ uint8_t pipe_count,
+- struct pipe_ctx *pipes,
+- struct audio_output *audio_output);
++ struct pipe_ctx *pipes);
+
+ /*temp: B0 specific before switch to dcn313 headers*/
+ #ifndef regPHYPLLF_PIXCLK_RESYNC_CNTL
+diff --git a/drivers/gpu/drm/amd/pm/amdgpu_dpm_internal.c b/drivers/gpu/drm/amd/pm/amdgpu_dpm_internal.c
+index 42efe838fa85c5..2d2d2d5e676341 100644
+--- a/drivers/gpu/drm/amd/pm/amdgpu_dpm_internal.c
++++ b/drivers/gpu/drm/amd/pm/amdgpu_dpm_internal.c
+@@ -66,6 +66,13 @@ u32 amdgpu_dpm_get_vblank_time(struct amdgpu_device *adev)
+ (amdgpu_crtc->v_border * 2));
+
+ vblank_time_us = vblank_in_pixels * 1000 / amdgpu_crtc->hw_mode.clock;
++
++ /* we have issues with mclk switching with
++ * refresh rates over 120 hz on the non-DC code.
++ */
++ if (drm_mode_vrefresh(&amdgpu_crtc->hw_mode) > 120)
++ vblank_time_us = 0;
++
+ break;
+ }
+ }
+diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
+index 52e732be59e36b..4236700fc1ad1e 100644
+--- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
++++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
+@@ -3085,7 +3085,13 @@ static bool si_dpm_vblank_too_short(void *handle)
+ /* we never hit the non-gddr5 limit so disable it */
+ u32 switch_limit = adev->gmc.vram_type == AMDGPU_VRAM_TYPE_GDDR5 ? 450 : 0;
+
+- if (vblank_time < switch_limit)
++ /* Consider zero vblank time too short and disable MCLK switching.
++ * Note that the vblank time is set to maximum when no displays are attached,
++ * so we'll still enable MCLK switching in that case.
++ */
++ if (vblank_time == 0)
++ return true;
++ else if (vblank_time < switch_limit)
+ return true;
+ else
+ return false;
+@@ -3443,12 +3449,14 @@ static void si_apply_state_adjust_rules(struct amdgpu_device *adev,
+ {
+ struct si_ps *ps = si_get_ps(rps);
+ struct amdgpu_clock_and_voltage_limits *max_limits;
++ struct amdgpu_connector *conn;
+ bool disable_mclk_switching = false;
+ bool disable_sclk_switching = false;
+ u32 mclk, sclk;
+ u16 vddc, vddci, min_vce_voltage = 0;
+ u32 max_sclk_vddc, max_mclk_vddci, max_mclk_vddc;
+ u32 max_sclk = 0, max_mclk = 0;
++ u32 high_pixelclock_count = 0;
+ int i;
+
+ if (adev->asic_type == CHIP_HAINAN) {
+@@ -3476,6 +3484,35 @@ static void si_apply_state_adjust_rules(struct amdgpu_device *adev,
+ }
+ }
+
++ /* We define "high pixelclock" for SI as higher than necessary for 4K 30Hz.
++ * For example, 4K 60Hz and 1080p 144Hz fall into this category.
++ * Find number of such displays connected.
++ */
++ for (i = 0; i < adev->mode_info.num_crtc; i++) {
++ if (!(adev->pm.dpm.new_active_crtcs & (1 << i)) ||
++ !adev->mode_info.crtcs[i]->enabled)
++ continue;
++
++ conn = to_amdgpu_connector(adev->mode_info.crtcs[i]->connector);
++
++ if (conn->pixelclock_for_modeset > 297000)
++ high_pixelclock_count++;
++ }
++
++ /* These are some ad-hoc fixes to some issues observed with SI GPUs.
++ * They are necessary because we don't have something like dce_calcs
++ * for these GPUs to calculate bandwidth requirements.
++ */
++ if (high_pixelclock_count) {
++ /* On Oland, we observe some flickering when two 4K 60Hz
++ * displays are connected, possibly because voltage is too low.
++ * Raise the voltage by requiring a higher SCLK.
++ * (Voltage cannot be adjusted independently without also SCLK.)
++ */
++ if (high_pixelclock_count > 1 && adev->asic_type == CHIP_OLAND)
++ disable_sclk_switching = true;
++ }
++
+ if (rps->vce_active) {
+ rps->evclk = adev->pm.dpm.vce_states[adev->pm.dpm.vce_level].evclk;
+ rps->ecclk = adev->pm.dpm.vce_states[adev->pm.dpm.vce_level].ecclk;
+@@ -5637,14 +5674,10 @@ static int si_populate_smc_t(struct amdgpu_device *adev,
+
+ static int si_disable_ulv(struct amdgpu_device *adev)
+ {
+- struct si_power_info *si_pi = si_get_pi(adev);
+- struct si_ulv_param *ulv = &si_pi->ulv;
+-
+- if (ulv->supported)
+- return (amdgpu_si_send_msg_to_smc(adev, PPSMC_MSG_DisableULV) == PPSMC_Result_OK) ?
+- 0 : -EINVAL;
++ PPSMC_Result r;
+
+- return 0;
++ r = amdgpu_si_send_msg_to_smc(adev, PPSMC_MSG_DisableULV);
++ return (r == PPSMC_Result_OK) ? 0 : -EINVAL;
+ }
+
+ static bool si_is_state_ulv_compatible(struct amdgpu_device *adev,
+@@ -5817,9 +5850,9 @@ static int si_upload_smc_data(struct amdgpu_device *adev)
+ {
+ struct amdgpu_crtc *amdgpu_crtc = NULL;
+ int i;
+-
+- if (adev->pm.dpm.new_active_crtc_count == 0)
+- return 0;
++ u32 crtc_index = 0;
++ u32 mclk_change_block_cp_min = 0;
++ u32 mclk_change_block_cp_max = 0;
+
+ for (i = 0; i < adev->mode_info.num_crtc; i++) {
+ if (adev->pm.dpm.new_active_crtcs & (1 << i)) {
+@@ -5828,26 +5861,31 @@ static int si_upload_smc_data(struct amdgpu_device *adev)
+ }
+ }
+
+- if (amdgpu_crtc == NULL)
+- return 0;
++ /* When a display is plugged in, program these so that the SMC
++ * performs MCLK switching when it doesn't cause flickering.
++ * When no display is plugged in, there is no need to restrict
++ * MCLK switching, so program them to zero.
++ */
++ if (adev->pm.dpm.new_active_crtc_count && amdgpu_crtc) {
++ crtc_index = amdgpu_crtc->crtc_id;
+
+- if (amdgpu_crtc->line_time <= 0)
+- return 0;
++ if (amdgpu_crtc->line_time) {
++ mclk_change_block_cp_min = 200 / amdgpu_crtc->line_time;
++ mclk_change_block_cp_max = 100 / amdgpu_crtc->line_time;
++ }
++ }
+
+- if (si_write_smc_soft_register(adev,
+- SI_SMC_SOFT_REGISTER_crtc_index,
+- amdgpu_crtc->crtc_id) != PPSMC_Result_OK)
+- return 0;
++ si_write_smc_soft_register(adev,
++ SI_SMC_SOFT_REGISTER_crtc_index,
++ crtc_index);
+
+- if (si_write_smc_soft_register(adev,
+- SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
+- amdgpu_crtc->wm_high / amdgpu_crtc->line_time) != PPSMC_Result_OK)
+- return 0;
++ si_write_smc_soft_register(adev,
++ SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min,
++ mclk_change_block_cp_min);
+
+- if (si_write_smc_soft_register(adev,
+- SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
+- amdgpu_crtc->wm_low / amdgpu_crtc->line_time) != PPSMC_Result_OK)
+- return 0;
++ si_write_smc_soft_register(adev,
++ SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max,
++ mclk_change_block_cp_max);
+
+ return 0;
+ }
+diff --git a/drivers/gpu/drm/bridge/Kconfig b/drivers/gpu/drm/bridge/Kconfig
+index b9e0ca85226a60..a6d6e62071a0e7 100644
+--- a/drivers/gpu/drm/bridge/Kconfig
++++ b/drivers/gpu/drm/bridge/Kconfig
+@@ -122,6 +122,7 @@ config DRM_ITE_IT6505
+ select EXTCON
+ select CRYPTO
+ select CRYPTO_HASH
++ select REGMAP_I2C
+ help
+ ITE IT6505 DisplayPort bridge chip driver.
+
+diff --git a/drivers/gpu/drm/bridge/cadence/cdns-dsi-core.c b/drivers/gpu/drm/bridge/cadence/cdns-dsi-core.c
+index a57ca8c3bdaea9..695b6246b280f9 100644
+--- a/drivers/gpu/drm/bridge/cadence/cdns-dsi-core.c
++++ b/drivers/gpu/drm/bridge/cadence/cdns-dsi-core.c
+@@ -997,10 +997,10 @@ static int cdns_dsi_bridge_atomic_check(struct drm_bridge *bridge,
+ struct cdns_dsi_input *input = bridge_to_cdns_dsi_input(bridge);
+ struct cdns_dsi *dsi = input_to_dsi(input);
+ struct cdns_dsi_bridge_state *dsi_state = to_cdns_dsi_bridge_state(bridge_state);
+- const struct drm_display_mode *mode = &crtc_state->mode;
++ const struct drm_display_mode *adjusted_mode = &crtc_state->adjusted_mode;
+ struct cdns_dsi_cfg *dsi_cfg = &dsi_state->dsi_cfg;
+
+- return cdns_dsi_check_conf(dsi, mode, dsi_cfg, false);
++ return cdns_dsi_check_conf(dsi, adjusted_mode, dsi_cfg, true);
+ }
+
+ static struct drm_bridge_state *
+diff --git a/drivers/gpu/drm/display/drm_bridge_connector.c b/drivers/gpu/drm/display/drm_bridge_connector.c
+index 5eb7e9bfe36116..8c915427d05384 100644
+--- a/drivers/gpu/drm/display/drm_bridge_connector.c
++++ b/drivers/gpu/drm/display/drm_bridge_connector.c
+@@ -816,6 +816,8 @@ struct drm_connector *drm_bridge_connector_init(struct drm_device *drm,
+
+ if (bridge_connector->bridge_hdmi_cec &&
+ bridge_connector->bridge_hdmi_cec->ops & DRM_BRIDGE_OP_HDMI_CEC_NOTIFIER) {
++ bridge = bridge_connector->bridge_hdmi_cec;
++
+ ret = drmm_connector_hdmi_cec_notifier_register(connector,
+ NULL,
+ bridge->hdmi_cec_dev);
+@@ -825,6 +827,8 @@ struct drm_connector *drm_bridge_connector_init(struct drm_device *drm,
+
+ if (bridge_connector->bridge_hdmi_cec &&
+ bridge_connector->bridge_hdmi_cec->ops & DRM_BRIDGE_OP_HDMI_CEC_ADAPTER) {
++ bridge = bridge_connector->bridge_hdmi_cec;
++
+ ret = drmm_connector_hdmi_cec_register(connector,
+ &drm_bridge_connector_hdmi_cec_funcs,
+ bridge->hdmi_cec_adapter_name,
+diff --git a/drivers/gpu/drm/display/drm_dp_helper.c b/drivers/gpu/drm/display/drm_dp_helper.c
+index 1ecc3df7e3167d..4aaeae4fa03c36 100644
+--- a/drivers/gpu/drm/display/drm_dp_helper.c
++++ b/drivers/gpu/drm/display/drm_dp_helper.c
+@@ -3962,6 +3962,7 @@ int drm_edp_backlight_set_level(struct drm_dp_aux *aux, const struct drm_edp_bac
+ int ret;
+ unsigned int offset = DP_EDP_BACKLIGHT_BRIGHTNESS_MSB;
+ u8 buf[3] = { 0 };
++ size_t len = 2;
+
+ /* The panel uses the PWM for controlling brightness levels */
+ if (!(bl->aux_set || bl->luminance_set))
+@@ -3974,6 +3975,7 @@ int drm_edp_backlight_set_level(struct drm_dp_aux *aux, const struct drm_edp_bac
+ buf[1] = (level & 0x00ff00) >> 8;
+ buf[2] = (level & 0xff0000) >> 16;
+ offset = DP_EDP_PANEL_TARGET_LUMINANCE_VALUE;
++ len = 3;
+ } else if (bl->lsb_reg_used) {
+ buf[0] = (level & 0xff00) >> 8;
+ buf[1] = (level & 0x00ff);
+@@ -3981,7 +3983,7 @@ int drm_edp_backlight_set_level(struct drm_dp_aux *aux, const struct drm_edp_bac
+ buf[0] = level;
+ }
+
+- ret = drm_dp_dpcd_write_data(aux, offset, buf, sizeof(buf));
++ ret = drm_dp_dpcd_write_data(aux, offset, buf, len);
+ if (ret < 0) {
+ drm_err(aux->drm_dev,
+ "%s: Failed to write aux backlight level: %d\n",
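
The drm_dp_helper fix above stops writing the full 3-byte buffer unconditionally and instead sends only as many bytes as the addressed register block expects. The sketch below models the packing and length selection only; the helper name and the sample level are made up, and the DPCD offset handling is left out.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* models the three cases in the patched drm_edp_backlight_set_level() */
static size_t pack_backlight(uint32_t level, bool luminance_set, bool lsb_reg_used,
			     uint8_t buf[3])
{
	if (luminance_set) {
		/* 24-bit target luminance, three bytes */
		buf[0] = level & 0xff;
		buf[1] = (level >> 8) & 0xff;
		buf[2] = (level >> 16) & 0xff;
		return 3;
	}
	if (lsb_reg_used) {
		buf[0] = (level >> 8) & 0xff;	/* brightness MSB register */
		buf[1] = level & 0xff;		/* brightness LSB register */
		return 2;
	}
	buf[0] = level & 0xff;			/* MSB-only panels */
	return 2;				/* default length from the hunk */
}

int main(void)
{
	uint8_t buf[3] = { 0 };
	size_t len = pack_backlight(0x01a2b3, true, false, buf);

	printf("would write %zu byte(s): %02x %02x %02x\n", len, buf[0], buf[1], buf[2]);
	return 0;
}
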
+diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c
+index ecc73d52bfae41..85dbdaa4a2e258 100644
+--- a/drivers/gpu/drm/drm_atomic_uapi.c
++++ b/drivers/gpu/drm/drm_atomic_uapi.c
+@@ -1078,19 +1078,20 @@ int drm_atomic_set_property(struct drm_atomic_state *state,
+ }
+
+ if (async_flip) {
+- /* check if the prop does a nop change */
+- if ((prop != config->prop_fb_id &&
+- prop != config->prop_in_fence_fd &&
+- prop != config->prop_fb_damage_clips)) {
+- ret = drm_atomic_plane_get_property(plane, plane_state,
+- prop, &old_val);
+- ret = drm_atomic_check_prop_changes(ret, old_val, prop_value, prop);
+- }
++ /* no-op changes are always allowed */
++ ret = drm_atomic_plane_get_property(plane, plane_state,
++ prop, &old_val);
++ ret = drm_atomic_check_prop_changes(ret, old_val, prop_value, prop);
+
+- /* ask the driver if this non-primary plane is supported */
+- if (plane->type != DRM_PLANE_TYPE_PRIMARY) {
+- ret = -EINVAL;
++ /* fail everything that isn't no-op or a pure flip */
++ if (ret && prop != config->prop_fb_id &&
++ prop != config->prop_in_fence_fd &&
++ prop != config->prop_fb_damage_clips) {
++ break;
++ }
+
++ if (ret && plane->type != DRM_PLANE_TYPE_PRIMARY) {
++ /* ask the driver if this non-primary plane is supported */
+ if (plane_funcs && plane_funcs->atomic_async_check)
+ ret = plane_funcs->atomic_async_check(plane, state, true);
+
+diff --git a/drivers/gpu/drm/drm_panel.c b/drivers/gpu/drm/drm_panel.c
+index c8bb28dccdc1b3..d1e6598ea3bc02 100644
+--- a/drivers/gpu/drm/drm_panel.c
++++ b/drivers/gpu/drm/drm_panel.c
+@@ -134,6 +134,9 @@ void drm_panel_prepare(struct drm_panel *panel)
+ panel->prepared = true;
+
+ list_for_each_entry(follower, &panel->followers, list) {
++ if (!follower->funcs->panel_prepared)
++ continue;
++
+ ret = follower->funcs->panel_prepared(follower);
+ if (ret < 0)
+ dev_info(panel->dev, "%ps failed: %d\n",
+@@ -179,6 +182,9 @@ void drm_panel_unprepare(struct drm_panel *panel)
+ mutex_lock(&panel->follower_lock);
+
+ list_for_each_entry(follower, &panel->followers, list) {
++ if (!follower->funcs->panel_unpreparing)
++ continue;
++
+ ret = follower->funcs->panel_unpreparing(follower);
+ if (ret < 0)
+ dev_info(panel->dev, "%ps failed: %d\n",
+@@ -209,6 +215,7 @@ EXPORT_SYMBOL(drm_panel_unprepare);
+ */
+ void drm_panel_enable(struct drm_panel *panel)
+ {
++ struct drm_panel_follower *follower;
+ int ret;
+
+ if (!panel)
+@@ -219,10 +226,12 @@ void drm_panel_enable(struct drm_panel *panel)
+ return;
+ }
+
++ mutex_lock(&panel->follower_lock);
++
+ if (panel->funcs && panel->funcs->enable) {
+ ret = panel->funcs->enable(panel);
+ if (ret < 0)
+- return;
++ goto exit;
+ }
+ panel->enabled = true;
+
+@@ -230,6 +239,19 @@ void drm_panel_enable(struct drm_panel *panel)
+ if (ret < 0)
+ DRM_DEV_INFO(panel->dev, "failed to enable backlight: %d\n",
+ ret);
++
++ list_for_each_entry(follower, &panel->followers, list) {
++ if (!follower->funcs->panel_enabled)
++ continue;
++
++ ret = follower->funcs->panel_enabled(follower);
++ if (ret < 0)
++ dev_info(panel->dev, "%ps failed: %d\n",
++ follower->funcs->panel_enabled, ret);
++ }
++
++exit:
++ mutex_unlock(&panel->follower_lock);
+ }
+ EXPORT_SYMBOL(drm_panel_enable);
+
+@@ -243,6 +265,7 @@ EXPORT_SYMBOL(drm_panel_enable);
+ */
+ void drm_panel_disable(struct drm_panel *panel)
+ {
++ struct drm_panel_follower *follower;
+ int ret;
+
+ if (!panel)
+@@ -262,6 +285,18 @@ void drm_panel_disable(struct drm_panel *panel)
+ return;
+ }
+
++ mutex_lock(&panel->follower_lock);
++
++ list_for_each_entry(follower, &panel->followers, list) {
++ if (!follower->funcs->panel_disabling)
++ continue;
++
++ ret = follower->funcs->panel_disabling(follower);
++ if (ret < 0)
++ dev_info(panel->dev, "%ps failed: %d\n",
++ follower->funcs->panel_disabling, ret);
++ }
++
+ ret = backlight_disable(panel->backlight);
+ if (ret < 0)
+ DRM_DEV_INFO(panel->dev, "failed to disable backlight: %d\n",
+@@ -270,9 +305,12 @@ void drm_panel_disable(struct drm_panel *panel)
+ if (panel->funcs && panel->funcs->disable) {
+ ret = panel->funcs->disable(panel);
+ if (ret < 0)
+- return;
++ goto exit;
+ }
+ panel->enabled = false;
++
++exit:
++ mutex_unlock(&panel->follower_lock);
+ }
+ EXPORT_SYMBOL(drm_panel_disable);
+
+@@ -539,13 +577,13 @@ EXPORT_SYMBOL(drm_is_panel_follower);
+ * @follower_dev: The 'struct device' for the follower.
+ * @follower: The panel follower descriptor for the follower.
+ *
+- * A panel follower is called right after preparing the panel and right before
+- * unpreparing the panel. It's primary intention is to power on an associated
+- * touchscreen, though it could be used for any similar devices. Multiple
+- * devices are allowed the follow the same panel.
++ * A panel follower is called right after preparing/enabling the panel and right
++ * before unpreparing/disabling the panel. Its primary intention is to power on
++ * an associated touchscreen, though it could be used for any similar devices.
++ * Multiple devices are allowed to follow the same panel.
+ *
+- * If a follower is added to a panel that's already been turned on, the
+- * follower's prepare callback is called right away.
++ * If a follower is added to a panel that's already been prepared/enabled, the
++ * follower's prepared/enabled callback is called right away.
+ *
+ * The "panel" property of the follower points to the panel to be followed.
+ *
+@@ -569,12 +607,18 @@ int drm_panel_add_follower(struct device *follower_dev,
+ mutex_lock(&panel->follower_lock);
+
+ list_add_tail(&follower->list, &panel->followers);
+- if (panel->prepared) {
++ if (panel->prepared && follower->funcs->panel_prepared) {
+ ret = follower->funcs->panel_prepared(follower);
+ if (ret < 0)
+ dev_info(panel->dev, "%ps failed: %d\n",
+ follower->funcs->panel_prepared, ret);
+ }
++ if (panel->enabled && follower->funcs->panel_enabled) {
++ ret = follower->funcs->panel_enabled(follower);
++ if (ret < 0)
++ dev_info(panel->dev, "%ps failed: %d\n",
++ follower->funcs->panel_enabled, ret);
++ }
+
+ mutex_unlock(&panel->follower_lock);
+
+@@ -587,7 +631,8 @@ EXPORT_SYMBOL(drm_panel_add_follower);
+ * @follower: The panel follower descriptor for the follower.
+ *
+ * Undo drm_panel_add_follower(). This includes calling the follower's
+- * unprepare function if we're removed from a panel that's currently prepared.
++ * unpreparing/disabling function if we're removed from a panel that's currently
++ * prepared/enabled.
+ *
+ * Return: 0 or an error code.
+ */
+@@ -598,7 +643,13 @@ void drm_panel_remove_follower(struct drm_panel_follower *follower)
+
+ mutex_lock(&panel->follower_lock);
+
+- if (panel->prepared) {
++ if (panel->enabled && follower->funcs->panel_disabling) {
++ ret = follower->funcs->panel_disabling(follower);
++ if (ret < 0)
++ dev_info(panel->dev, "%ps failed: %d\n",
++ follower->funcs->panel_disabling, ret);
++ }
++ if (panel->prepared && follower->funcs->panel_unpreparing) {
+ ret = follower->funcs->panel_unpreparing(follower);
+ if (ret < 0)
+ dev_info(panel->dev, "%ps failed: %d\n",
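
With the drm_panel changes above, followers can hook panel enable/disable in addition to prepare/unprepare, and every callback becomes optional. The sketch below shows what a follower using the new hooks could look like: drm_panel_add_follower() and struct drm_panel_follower are existing APIs, while the my_ts_* names are illustrative and the .panel_enabled/.panel_disabling members are assumed to be added to drm_panel_follower_funcs in the header part of this change (not shown in this excerpt).

#include <drm/drm_panel.h>
#include <linux/container_of.h>
#include <linux/device.h>

struct my_ts {
	struct device *dev;
	struct drm_panel_follower follower;
};

static int my_ts_panel_enabled(struct drm_panel_follower *follower)
{
	struct my_ts *ts = container_of(follower, struct my_ts, follower);

	dev_dbg(ts->dev, "panel enabled, powering touchscreen up\n");
	return 0;	/* a real driver would resume its hardware here */
}

static int my_ts_panel_disabling(struct drm_panel_follower *follower)
{
	struct my_ts *ts = container_of(follower, struct my_ts, follower);

	dev_dbg(ts->dev, "panel disabling, powering touchscreen down\n");
	return 0;
}

static const struct drm_panel_follower_funcs my_ts_follower_funcs = {
	/* .panel_prepared / .panel_unpreparing may now be left unset */
	.panel_enabled	 = my_ts_panel_enabled,
	.panel_disabling = my_ts_panel_disabling,
};

static int my_ts_register_follower(struct my_ts *ts)
{
	ts->follower.funcs = &my_ts_follower_funcs;
	return drm_panel_add_follower(ts->dev, &ts->follower);
}
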
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
+index 56a5b596554db8..46f348972a975d 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
+@@ -446,7 +446,7 @@ static void _dpu_encoder_phys_wb_handle_wbdone_timeout(
+ static int dpu_encoder_phys_wb_wait_for_commit_done(
+ struct dpu_encoder_phys *phys_enc)
+ {
+- unsigned long ret;
++ int ret;
+ struct dpu_encoder_wait_info wait_info;
+ struct dpu_encoder_phys_wb *wb_enc = to_dpu_encoder_phys_wb(phys_enc);
+
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
+index 6859e8ef6b0559..f54cf0faa1c7c8 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
+@@ -922,6 +922,9 @@ static int dpu_plane_is_multirect_capable(struct dpu_hw_sspp *sspp,
+ if (MSM_FORMAT_IS_YUV(fmt))
+ return false;
+
++ if (!sspp)
++ return true;
++
+ if (!test_bit(DPU_SSPP_SMART_DMA_V1, &sspp->cap->features) &&
+ !test_bit(DPU_SSPP_SMART_DMA_V2, &sspp->cap->features))
+ return false;
+@@ -1028,6 +1031,7 @@ static int dpu_plane_try_multirect_shared(struct dpu_plane_state *pstate,
+ prev_pipe->multirect_mode != DPU_SSPP_MULTIRECT_NONE)
+ return false;
+
++ /* Do not validate SSPP of current plane when it is not ready */
+ if (!dpu_plane_is_multirect_capable(pipe->sspp, pipe_cfg, fmt) ||
+ !dpu_plane_is_multirect_capable(prev_pipe->sspp, prev_pipe_cfg, prev_fmt))
+ return false;
+diff --git a/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c b/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c
+index 0952c7f18abdca..4d1ea9b2619170 100644
+--- a/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c
++++ b/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c
+@@ -463,9 +463,9 @@ static int mdp4_kms_init(struct drm_device *dev)
+ ret = PTR_ERR(mmu);
+ goto fail;
+ } else if (!mmu) {
+- DRM_DEV_INFO(dev->dev, "no iommu, fallback to phys "
+- "contig buffers for scanout\n");
+- vm = NULL;
++ DRM_DEV_INFO(dev->dev, "no IOMMU, bailing out\n");
++ ret = -ENODEV;
++ goto fail;
+ } else {
+ vm = msm_gem_vm_create(dev, mmu, "mdp4",
+ 0x1000, 0x100000000 - 0x1000,
+diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
+index 9dcc7a596a11d9..7e977fec410079 100644
+--- a/drivers/gpu/drm/msm/msm_drv.c
++++ b/drivers/gpu/drm/msm/msm_drv.c
+@@ -826,6 +826,7 @@ static const struct file_operations fops = {
+
+ #define DRIVER_FEATURES_KMS ( \
+ DRIVER_GEM | \
++ DRIVER_GEM_GPUVA | \
+ DRIVER_ATOMIC | \
+ DRIVER_MODESET | \
+ 0 )
+diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c b/drivers/gpu/drm/msm/msm_gem_vma.c
+index 00d0f3b7ba327d..381a0853c05ba3 100644
+--- a/drivers/gpu/drm/msm/msm_gem_vma.c
++++ b/drivers/gpu/drm/msm/msm_gem_vma.c
+@@ -1023,6 +1023,7 @@ vm_bind_job_lookup_ops(struct msm_vm_bind_job *job, struct drm_msm_vm_bind *args
+ struct drm_device *dev = job->vm->drm;
+ int ret = 0;
+ int cnt = 0;
++ int i = -1;
+
+ if (args->nr_ops == 1) {
+ /* Single op case, the op is inlined: */
+@@ -1056,11 +1057,12 @@ vm_bind_job_lookup_ops(struct msm_vm_bind_job *job, struct drm_msm_vm_bind *args
+
+ spin_lock(&file->table_lock);
+
+- for (unsigned i = 0; i < args->nr_ops; i++) {
++ for (i = 0; i < args->nr_ops; i++) {
++ struct msm_vm_bind_op *op = &job->ops[i];
+ struct drm_gem_object *obj;
+
+- if (!job->ops[i].handle) {
+- job->ops[i].obj = NULL;
++ if (!op->handle) {
++ op->obj = NULL;
+ continue;
+ }
+
+@@ -1068,16 +1070,22 @@ vm_bind_job_lookup_ops(struct msm_vm_bind_job *job, struct drm_msm_vm_bind *args
+ * normally use drm_gem_object_lookup(), but for bulk lookup
+ * all under single table_lock just hit object_idr directly:
+ */
+- obj = idr_find(&file->object_idr, job->ops[i].handle);
++ obj = idr_find(&file->object_idr, op->handle);
+ if (!obj) {
+- ret = UERR(EINVAL, dev, "invalid handle %u at index %u\n", job->ops[i].handle, i);
++ ret = UERR(EINVAL, dev, "invalid handle %u at index %u\n", op->handle, i);
+ goto out_unlock;
+ }
+
+ drm_gem_object_get(obj);
+
+- job->ops[i].obj = obj;
++ op->obj = obj;
+ cnt++;
++
++ if ((op->range + op->obj_offset) > obj->size) {
++ ret = UERR(EINVAL, dev, "invalid range: %016llx + %016llx > %016zx\n",
++ op->range, op->obj_offset, obj->size);
++ goto out_unlock;
++ }
+ }
+
+ *nr_bos = cnt;
+@@ -1085,6 +1093,17 @@ vm_bind_job_lookup_ops(struct msm_vm_bind_job *job, struct drm_msm_vm_bind *args
+ out_unlock:
+ spin_unlock(&file->table_lock);
+
++ if (ret) {
++ for (; i >= 0; i--) {
++ struct msm_vm_bind_op *op = &job->ops[i];
++
++ if (!op->obj)
++ continue;
++
++ drm_gem_object_put(op->obj);
++ op->obj = NULL;
++ }
++ }
+ out:
+ return ret;
+ }
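
The msm_gem_vma change above adds two safety nets to vm_bind op lookup: an op whose offset plus range runs past its GEM object is rejected, and any references already taken are dropped on the error path. Here is a standalone sketch of that shape using stand-in types rather than the real GEM structures.

#include <stdint.h>
#include <stdio.h>

struct fake_obj { uint64_t size; int refs; };
struct fake_op { struct fake_obj *obj; uint64_t obj_offset; uint64_t range; };

static int lookup_ops(struct fake_op *ops, int nr_ops, struct fake_obj *pool)
{
	int i;

	for (i = 0; i < nr_ops; i++) {
		ops[i].obj = &pool[i];
		ops[i].obj->refs++;			/* reference taken on lookup */

		if (ops[i].obj_offset + ops[i].range > ops[i].obj->size)
			goto err;			/* op would run past the object */
	}
	return 0;

err:
	for (; i >= 0; i--) {				/* unwind, including op i */
		if (!ops[i].obj)
			continue;
		ops[i].obj->refs--;
		ops[i].obj = NULL;
	}
	return -1;
}

int main(void)
{
	struct fake_obj pool[2] = { { .size = 4096 }, { .size = 4096 } };
	struct fake_op ops[2] = { { .obj_offset = 0, .range = 4096 },
				  { .obj_offset = 4096, .range = 1 } };

	printf("lookup: %d (refs: %d %d)\n", lookup_ops(ops, 2, pool),
	       pool[0].refs, pool[1].refs);
	return 0;
}
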
+diff --git a/drivers/gpu/drm/msm/msm_kms.c b/drivers/gpu/drm/msm/msm_kms.c
+index 56828d218e88a5..4c4dcb095c4df9 100644
+--- a/drivers/gpu/drm/msm/msm_kms.c
++++ b/drivers/gpu/drm/msm/msm_kms.c
+@@ -195,14 +195,13 @@ struct drm_gpuvm *msm_kms_init_vm(struct drm_device *dev)
+ iommu_dev = mdp_dev;
+ else
+ iommu_dev = mdss_dev;
+-
+ mmu = msm_iommu_disp_new(iommu_dev, 0);
+ if (IS_ERR(mmu))
+ return ERR_CAST(mmu);
+
+ if (!mmu) {
+- drm_info(dev, "no IOMMU, fallback to phys contig buffers for scanout\n");
+- return NULL;
++ drm_info(dev, "no IOMMU, bailing out\n");
++ return ERR_PTR(-ENODEV);
+ }
+
+ vm = msm_gem_vm_create(dev, mmu, "mdp_kms",
+diff --git a/drivers/gpu/drm/panel/panel-edp.c b/drivers/gpu/drm/panel/panel-edp.c
+index 9a56e208cbddbc..d0aa602ecc9de8 100644
+--- a/drivers/gpu/drm/panel/panel-edp.c
++++ b/drivers/gpu/drm/panel/panel-edp.c
+@@ -1736,10 +1736,11 @@ static const struct panel_delay delay_200_500_e50 = {
+ .enable = 50,
+ };
+
+-static const struct panel_delay delay_200_500_e50_p2e200 = {
++static const struct panel_delay delay_200_500_e50_d50_p2e200 = {
+ .hpd_absent = 200,
+ .unprepare = 500,
+ .enable = 50,
++ .disable = 50,
+ .prepare_to_enable = 200,
+ };
+
+@@ -1828,6 +1829,13 @@ static const struct panel_delay delay_50_500_e200_d200_po2e335 = {
+ .powered_on_to_enable = 335,
+ };
+
++static const struct panel_delay delay_200_500_e50_d100 = {
++ .hpd_absent = 200,
++ .unprepare = 500,
++ .enable = 50,
++ .disable = 100,
++};
++
+ #define EDP_PANEL_ENTRY(vend_chr_0, vend_chr_1, vend_chr_2, product_id, _delay, _name) \
+ { \
+ .ident = { \
+@@ -1934,13 +1942,13 @@ static const struct edp_panel_entry edp_panels[] = {
+ EDP_PANEL_ENTRY('B', 'O', 'E', 0x09dd, &delay_200_500_e50, "NT116WHM-N21"),
+ EDP_PANEL_ENTRY('B', 'O', 'E', 0x0a1b, &delay_200_500_e50, "NV133WUM-N63"),
+ EDP_PANEL_ENTRY('B', 'O', 'E', 0x0a36, &delay_200_500_e200, "Unknown"),
+- EDP_PANEL_ENTRY('B', 'O', 'E', 0x0a3e, &delay_200_500_e80, "NV116WHM-N49"),
++ EDP_PANEL_ENTRY('B', 'O', 'E', 0x0a3e, &delay_200_500_e80_d50, "NV116WHM-N49"),
+ EDP_PANEL_ENTRY('B', 'O', 'E', 0x0a5d, &delay_200_500_e50, "NV116WHM-N45"),
+ EDP_PANEL_ENTRY('B', 'O', 'E', 0x0ac5, &delay_200_500_e50, "NV116WHM-N4C"),
+ EDP_PANEL_ENTRY('B', 'O', 'E', 0x0ae8, &delay_200_500_e50_p2e80, "NV140WUM-N41"),
+ EDP_PANEL_ENTRY('B', 'O', 'E', 0x0b09, &delay_200_500_e50_po2e200, "NV140FHM-NZ"),
+ EDP_PANEL_ENTRY('B', 'O', 'E', 0x0b1e, &delay_200_500_e80, "NE140QDM-N6A"),
+- EDP_PANEL_ENTRY('B', 'O', 'E', 0x0b34, &delay_200_500_e80, "NV122WUM-N41"),
++ EDP_PANEL_ENTRY('B', 'O', 'E', 0x0b34, &delay_200_500_e80_d50, "NV122WUM-N41"),
+ EDP_PANEL_ENTRY('B', 'O', 'E', 0x0b43, &delay_200_500_e200, "NV140FHM-T09"),
+ EDP_PANEL_ENTRY('B', 'O', 'E', 0x0b56, &delay_200_500_e80, "NT140FHM-N47"),
+ EDP_PANEL_ENTRY('B', 'O', 'E', 0x0b66, &delay_200_500_e80, "NE140WUM-N6G"),
+@@ -1979,12 +1987,12 @@ static const struct edp_panel_entry edp_panels[] = {
+ EDP_PANEL_ENTRY('C', 'M', 'N', 0x14e5, &delay_200_500_e80_d50, "N140HGA-EA1"),
+ EDP_PANEL_ENTRY('C', 'M', 'N', 0x162b, &delay_200_500_e80_d50, "N160JCE-ELL"),
+
+- EDP_PANEL_ENTRY('C', 'S', 'O', 0x1200, &delay_200_500_e50_p2e200, "MNC207QS1-1"),
+- EDP_PANEL_ENTRY('C', 'S', 'O', 0x1413, &delay_200_500_e50_p2e200, "MNE007JA1-2"),
++ EDP_PANEL_ENTRY('C', 'S', 'O', 0x1200, &delay_200_500_e50_d50_p2e200, "MNC207QS1-1"),
++ EDP_PANEL_ENTRY('C', 'S', 'O', 0x1413, &delay_200_500_e50_d50_p2e200, "MNE007JA1-2"),
+
+ EDP_PANEL_ENTRY('C', 'S', 'W', 0x1100, &delay_200_500_e80_d50, "MNB601LS1-1"),
+ EDP_PANEL_ENTRY('C', 'S', 'W', 0x1103, &delay_200_500_e80_d50, "MNB601LS1-3"),
+- EDP_PANEL_ENTRY('C', 'S', 'W', 0x1104, &delay_200_500_e50, "MNB601LS1-4"),
++ EDP_PANEL_ENTRY('C', 'S', 'W', 0x1104, &delay_200_500_e50_d100, "MNB601LS1-4"),
+ EDP_PANEL_ENTRY('C', 'S', 'W', 0x1448, &delay_200_500_e50, "MNE007QS3-7"),
+ EDP_PANEL_ENTRY('C', 'S', 'W', 0x1457, &delay_80_500_e80_p2e200, "MNE007QS3-8"),
+
+diff --git a/drivers/gpu/drm/panel/panel-novatek-nt35560.c b/drivers/gpu/drm/panel/panel-novatek-nt35560.c
+index 98f0782c841114..17898a29efe87f 100644
+--- a/drivers/gpu/drm/panel/panel-novatek-nt35560.c
++++ b/drivers/gpu/drm/panel/panel-novatek-nt35560.c
+@@ -161,7 +161,7 @@ static int nt35560_set_brightness(struct backlight_device *bl)
+ par = 0x00;
+ ret = mipi_dsi_dcs_write(dsi, MIPI_DCS_WRITE_CONTROL_DISPLAY,
+ &par, 1);
+- if (ret) {
++ if (ret < 0) {
+ dev_err(nt->dev, "failed to disable display backlight (%d)\n", ret);
+ return ret;
+ }
+diff --git a/drivers/gpu/drm/radeon/r600_cs.c b/drivers/gpu/drm/radeon/r600_cs.c
+index ac77d1246b9453..811265648a5828 100644
+--- a/drivers/gpu/drm/radeon/r600_cs.c
++++ b/drivers/gpu/drm/radeon/r600_cs.c
+@@ -1408,7 +1408,7 @@ static void r600_texture_size(unsigned nfaces, unsigned blevel, unsigned llevel,
+ unsigned block_align, unsigned height_align, unsigned base_align,
+ unsigned *l0_size, unsigned *mipmap_size)
+ {
+- unsigned offset, i, level;
++ unsigned offset, i;
+ unsigned width, height, depth, size;
+ unsigned blocksize;
+ unsigned nbx, nby;
+@@ -1420,7 +1420,7 @@ static void r600_texture_size(unsigned nfaces, unsigned blevel, unsigned llevel,
+ w0 = r600_mip_minify(w0, 0);
+ h0 = r600_mip_minify(h0, 0);
+ d0 = r600_mip_minify(d0, 0);
+- for(i = 0, offset = 0, level = blevel; i < nlevels; i++, level++) {
++ for (i = 0, offset = 0; i < nlevels; i++) {
+ width = r600_mip_minify(w0, i);
+ nbx = r600_fmt_get_nblocksx(format, width);
+
+diff --git a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
+index 65acffc3fea828..8e9ae7d980eb2e 100644
+--- a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
++++ b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
+@@ -219,7 +219,7 @@ mock_sched_timedout_job(struct drm_sched_job *sched_job)
+ unsigned long flags;
+
+ if (job->flags & DRM_MOCK_SCHED_JOB_DONT_RESET) {
+- job->flags &= ~DRM_MOCK_SCHED_JOB_DONT_RESET;
++ job->flags |= DRM_MOCK_SCHED_JOB_RESET_SKIPPED;
+ return DRM_GPU_SCHED_STAT_NO_HANG;
+ }
+
+diff --git a/drivers/gpu/drm/scheduler/tests/sched_tests.h b/drivers/gpu/drm/scheduler/tests/sched_tests.h
+index 63d4f2ac707497..5b262126b7760f 100644
+--- a/drivers/gpu/drm/scheduler/tests/sched_tests.h
++++ b/drivers/gpu/drm/scheduler/tests/sched_tests.h
+@@ -95,9 +95,10 @@ struct drm_mock_sched_job {
+
+ struct completion done;
+
+-#define DRM_MOCK_SCHED_JOB_DONE 0x1
+-#define DRM_MOCK_SCHED_JOB_TIMEDOUT 0x2
+-#define DRM_MOCK_SCHED_JOB_DONT_RESET 0x4
++#define DRM_MOCK_SCHED_JOB_DONE 0x1
++#define DRM_MOCK_SCHED_JOB_TIMEDOUT 0x2
++#define DRM_MOCK_SCHED_JOB_DONT_RESET 0x4
++#define DRM_MOCK_SCHED_JOB_RESET_SKIPPED 0x8
+ unsigned long flags;
+
+ struct list_head link;
+diff --git a/drivers/gpu/drm/scheduler/tests/tests_basic.c b/drivers/gpu/drm/scheduler/tests/tests_basic.c
+index 55eb142bd7c5df..82a41a456b0a85 100644
+--- a/drivers/gpu/drm/scheduler/tests/tests_basic.c
++++ b/drivers/gpu/drm/scheduler/tests/tests_basic.c
+@@ -317,8 +317,8 @@ static void drm_sched_skip_reset(struct kunit *test)
+ KUNIT_ASSERT_FALSE(test, done);
+
+ KUNIT_ASSERT_EQ(test,
+- job->flags & DRM_MOCK_SCHED_JOB_DONT_RESET,
+- 0);
++ job->flags & DRM_MOCK_SCHED_JOB_RESET_SKIPPED,
++ DRM_MOCK_SCHED_JOB_RESET_SKIPPED);
+
+ i = drm_mock_sched_advance(sched, 1);
+ KUNIT_ASSERT_EQ(test, i, 1);
+diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c b/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c
+index c2294abbe75344..00be92da55097b 100644
+--- a/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c
++++ b/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c
+@@ -538,7 +538,7 @@ static void vmw_event_fence_action_seq_passed(struct dma_fence *f,
+ if (likely(eaction->tv_sec != NULL)) {
+ struct timespec64 ts;
+
+- ktime_to_timespec64(f->timestamp);
++ ts = ktime_to_timespec64(f->timestamp);
+ /* monotonic time, so no y2038 overflow */
+ *eaction->tv_sec = ts.tv_sec;
+ *eaction->tv_usec = ts.tv_nsec / NSEC_PER_USEC;
+diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h
+index 149798754570d3..ded5348d190c51 100644
+--- a/drivers/hid/hid-ids.h
++++ b/drivers/hid/hid-ids.h
+@@ -1296,6 +1296,8 @@
+
+ #define USB_VENDOR_ID_STEELSERIES 0x1038
+ #define USB_DEVICE_ID_STEELSERIES_SRWS1 0x1410
++#define USB_DEVICE_ID_STEELSERIES_ARCTIS_1 0x12b6
++#define USB_DEVICE_ID_STEELSERIES_ARCTIS_9 0x12c2
+
+ #define USB_VENDOR_ID_SUN 0x0430
+ #define USB_DEVICE_ID_RARITAN_KVM_DONGLE 0xcdab
+diff --git a/drivers/hid/hid-quirks.c b/drivers/hid/hid-quirks.c
+index f619ed10535d74..ffd034566e2e1e 100644
+--- a/drivers/hid/hid-quirks.c
++++ b/drivers/hid/hid-quirks.c
+@@ -695,6 +695,8 @@ static const struct hid_device_id hid_have_special_driver[] = {
+ #endif
+ #if IS_ENABLED(CONFIG_HID_STEELSERIES)
+ { HID_USB_DEVICE(USB_VENDOR_ID_STEELSERIES, USB_DEVICE_ID_STEELSERIES_SRWS1) },
++ { HID_USB_DEVICE(USB_VENDOR_ID_STEELSERIES, USB_DEVICE_ID_STEELSERIES_ARCTIS_1) },
++ { HID_USB_DEVICE(USB_VENDOR_ID_STEELSERIES, USB_DEVICE_ID_STEELSERIES_ARCTIS_9) },
+ #endif
+ #if IS_ENABLED(CONFIG_HID_SUNPLUS)
+ { HID_USB_DEVICE(USB_VENDOR_ID_SUNPLUS, USB_DEVICE_ID_SUNPLUS_WDESKTOP) },
+diff --git a/drivers/hid/hid-steelseries.c b/drivers/hid/hid-steelseries.c
+index d4bd7848b8c665..f98435631aa180 100644
+--- a/drivers/hid/hid-steelseries.c
++++ b/drivers/hid/hid-steelseries.c
+@@ -249,11 +249,11 @@ static int steelseries_srws1_probe(struct hid_device *hdev,
+ {
+ int ret, i;
+ struct led_classdev *led;
++ struct steelseries_srws1_data *drv_data;
+ size_t name_sz;
+ char *name;
+
+- struct steelseries_srws1_data *drv_data = kzalloc(sizeof(*drv_data), GFP_KERNEL);
+-
++ drv_data = devm_kzalloc(&hdev->dev, sizeof(*drv_data), GFP_KERNEL);
+ if (drv_data == NULL) {
+ hid_err(hdev, "can't alloc SRW-S1 memory\n");
+ return -ENOMEM;
+@@ -264,18 +264,18 @@ static int steelseries_srws1_probe(struct hid_device *hdev,
+ ret = hid_parse(hdev);
+ if (ret) {
+ hid_err(hdev, "parse failed\n");
+- goto err_free;
++ goto err;
+ }
+
+ if (!hid_validate_values(hdev, HID_OUTPUT_REPORT, 0, 0, 16)) {
+ ret = -ENODEV;
+- goto err_free;
++ goto err;
+ }
+
+ ret = hid_hw_start(hdev, HID_CONNECT_DEFAULT);
+ if (ret) {
+ hid_err(hdev, "hw start failed\n");
+- goto err_free;
++ goto err;
+ }
+
+ /* register led subsystem */
+@@ -288,10 +288,10 @@ static int steelseries_srws1_probe(struct hid_device *hdev,
+ name_sz = strlen(hdev->uniq) + 16;
+
+ /* 'ALL', for setting all LEDs simultaneously */
+- led = kzalloc(sizeof(struct led_classdev)+name_sz, GFP_KERNEL);
++ led = devm_kzalloc(&hdev->dev, sizeof(struct led_classdev)+name_sz, GFP_KERNEL);
+ if (!led) {
+ hid_err(hdev, "can't allocate memory for LED ALL\n");
+- goto err_led;
++ goto out;
+ }
+
+ name = (void *)(&led[1]);
+@@ -303,16 +303,18 @@ static int steelseries_srws1_probe(struct hid_device *hdev,
+ led->brightness_set = steelseries_srws1_led_all_set_brightness;
+
+ drv_data->led[SRWS1_NUMBER_LEDS] = led;
+- ret = led_classdev_register(&hdev->dev, led);
+- if (ret)
+- goto err_led;
++ ret = devm_led_classdev_register(&hdev->dev, led);
++ if (ret) {
++ hid_err(hdev, "failed to register LED %d. Aborting.\n", SRWS1_NUMBER_LEDS);
++ goto out; /* let the driver continue without LEDs */
++ }
+
+ /* Each individual LED */
+ for (i = 0; i < SRWS1_NUMBER_LEDS; i++) {
+- led = kzalloc(sizeof(struct led_classdev)+name_sz, GFP_KERNEL);
++ led = devm_kzalloc(&hdev->dev, sizeof(struct led_classdev)+name_sz, GFP_KERNEL);
+ if (!led) {
+ hid_err(hdev, "can't allocate memory for LED %d\n", i);
+- goto err_led;
++ break;
+ }
+
+ name = (void *)(&led[1]);
+@@ -324,53 +326,18 @@ static int steelseries_srws1_probe(struct hid_device *hdev,
+ led->brightness_set = steelseries_srws1_led_set_brightness;
+
+ drv_data->led[i] = led;
+- ret = led_classdev_register(&hdev->dev, led);
++ ret = devm_led_classdev_register(&hdev->dev, led);
+
+ if (ret) {
+ hid_err(hdev, "failed to register LED %d. Aborting.\n", i);
+-err_led:
+- /* Deregister all LEDs (if any) */
+- for (i = 0; i < SRWS1_NUMBER_LEDS + 1; i++) {
+- led = drv_data->led[i];
+- drv_data->led[i] = NULL;
+- if (!led)
+- continue;
+- led_classdev_unregister(led);
+- kfree(led);
+- }
+- goto out; /* but let the driver continue without LEDs */
++ break; /* but let the driver continue without LEDs */
+ }
+ }
+ out:
+ return 0;
+-err_free:
+- kfree(drv_data);
++err:
+ return ret;
+ }
+-
+-static void steelseries_srws1_remove(struct hid_device *hdev)
+-{
+- int i;
+- struct led_classdev *led;
+-
+- struct steelseries_srws1_data *drv_data = hid_get_drvdata(hdev);
+-
+- if (drv_data) {
+- /* Deregister LEDs (if any) */
+- for (i = 0; i < SRWS1_NUMBER_LEDS + 1; i++) {
+- led = drv_data->led[i];
+- drv_data->led[i] = NULL;
+- if (!led)
+- continue;
+- led_classdev_unregister(led);
+- kfree(led);
+- }
+-
+- }
+-
+- hid_hw_stop(hdev);
+- kfree(drv_data);
+-}
+ #endif
+
+ #define STEELSERIES_HEADSET_BATTERY_TIMEOUT_MS 3000
+@@ -405,13 +372,12 @@ static int steelseries_headset_request_battery(struct hid_device *hdev,
+
+ static void steelseries_headset_fetch_battery(struct hid_device *hdev)
+ {
+- struct steelseries_device *sd = hid_get_drvdata(hdev);
+ int ret = 0;
+
+- if (sd->quirks & STEELSERIES_ARCTIS_1)
++ if (hdev->product == USB_DEVICE_ID_STEELSERIES_ARCTIS_1)
+ ret = steelseries_headset_request_battery(hdev,
+ arctis_1_battery_request, sizeof(arctis_1_battery_request));
+- else if (sd->quirks & STEELSERIES_ARCTIS_9)
++ else if (hdev->product == USB_DEVICE_ID_STEELSERIES_ARCTIS_9)
+ ret = steelseries_headset_request_battery(hdev,
+ arctis_9_battery_request, sizeof(arctis_9_battery_request));
+
+@@ -567,14 +533,7 @@ static int steelseries_probe(struct hid_device *hdev, const struct hid_device_id
+ struct steelseries_device *sd;
+ int ret;
+
+- sd = devm_kzalloc(&hdev->dev, sizeof(*sd), GFP_KERNEL);
+- if (!sd)
+- return -ENOMEM;
+- hid_set_drvdata(hdev, sd);
+- sd->hdev = hdev;
+- sd->quirks = id->driver_data;
+-
+- if (sd->quirks & STEELSERIES_SRWS1) {
++ if (hdev->product == USB_DEVICE_ID_STEELSERIES_SRWS1) {
+ #if IS_BUILTIN(CONFIG_LEDS_CLASS) || \
+ (IS_MODULE(CONFIG_LEDS_CLASS) && IS_MODULE(CONFIG_HID_STEELSERIES))
+ return steelseries_srws1_probe(hdev, id);
+@@ -583,6 +542,13 @@ static int steelseries_probe(struct hid_device *hdev, const struct hid_device_id
+ #endif
+ }
+
++ sd = devm_kzalloc(&hdev->dev, sizeof(*sd), GFP_KERNEL);
++ if (!sd)
++ return -ENOMEM;
++ hid_set_drvdata(hdev, sd);
++ sd->hdev = hdev;
++ sd->quirks = id->driver_data;
++
+ ret = hid_parse(hdev);
+ if (ret)
+ return ret;
+@@ -610,17 +576,19 @@ static int steelseries_probe(struct hid_device *hdev, const struct hid_device_id
+
+ static void steelseries_remove(struct hid_device *hdev)
+ {
+- struct steelseries_device *sd = hid_get_drvdata(hdev);
++ struct steelseries_device *sd;
+ unsigned long flags;
+
+- if (sd->quirks & STEELSERIES_SRWS1) {
++ if (hdev->product == USB_DEVICE_ID_STEELSERIES_SRWS1) {
+ #if IS_BUILTIN(CONFIG_LEDS_CLASS) || \
+ (IS_MODULE(CONFIG_LEDS_CLASS) && IS_MODULE(CONFIG_HID_STEELSERIES))
+- steelseries_srws1_remove(hdev);
++ hid_hw_stop(hdev);
+ #endif
+ return;
+ }
+
++ sd = hid_get_drvdata(hdev);
++
+ spin_lock_irqsave(&sd->lock, flags);
+ sd->removed = true;
+ spin_unlock_irqrestore(&sd->lock, flags);
+@@ -667,10 +635,10 @@ static int steelseries_headset_raw_event(struct hid_device *hdev,
+ unsigned long flags;
+
+ /* Not a headset */
+- if (sd->quirks & STEELSERIES_SRWS1)
++ if (hdev->product == USB_DEVICE_ID_STEELSERIES_SRWS1)
+ return 0;
+
+- if (sd->quirks & STEELSERIES_ARCTIS_1) {
++ if (hdev->product == USB_DEVICE_ID_STEELSERIES_ARCTIS_1) {
+ hid_dbg(sd->hdev,
+ "Parsing raw event for Arctis 1 headset (%*ph)\n", size, read_buf);
+ if (size < ARCTIS_1_BATTERY_RESPONSE_LEN ||
+@@ -688,7 +656,7 @@ static int steelseries_headset_raw_event(struct hid_device *hdev,
+ }
+ }
+
+- if (sd->quirks & STEELSERIES_ARCTIS_9) {
++ if (hdev->product == USB_DEVICE_ID_STEELSERIES_ARCTIS_9) {
+ hid_dbg(sd->hdev,
+ "Parsing raw event for Arctis 9 headset (%*ph)\n", size, read_buf);
+ if (size < ARCTIS_9_BATTERY_RESPONSE_LEN) {
+@@ -757,11 +725,11 @@ static const struct hid_device_id steelseries_devices[] = {
+ .driver_data = STEELSERIES_SRWS1 },
+
+ { /* SteelSeries Arctis 1 Wireless for XBox */
+- HID_USB_DEVICE(USB_VENDOR_ID_STEELSERIES, 0x12b6),
+- .driver_data = STEELSERIES_ARCTIS_1 },
++ HID_USB_DEVICE(USB_VENDOR_ID_STEELSERIES, USB_DEVICE_ID_STEELSERIES_ARCTIS_1),
++ .driver_data = STEELSERIES_ARCTIS_1 },
+
+ { /* SteelSeries Arctis 9 Wireless for XBox */
+- HID_USB_DEVICE(USB_VENDOR_ID_STEELSERIES, 0x12c2),
++ HID_USB_DEVICE(USB_VENDOR_ID_STEELSERIES, USB_DEVICE_ID_STEELSERIES_ARCTIS_9),
+ .driver_data = STEELSERIES_ARCTIS_9 },
+
+ { }
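
The SRW-S1 portion of the SteelSeries driver now relies on device-managed allocation and LED registration, which is what lets the hand-rolled unwind and remove code above disappear. A minimal sketch of that devm pattern in isolation: my_register_led()/my_led_set() are illustrative names, while devm_kzalloc() and devm_led_classdev_register() are the real managed APIs.

#include <linux/hid.h>
#include <linux/leds.h>

static void my_led_set(struct led_classdev *led, enum led_brightness value)
{
	/* a real driver would send the new brightness to the device here */
}

static int my_register_led(struct hid_device *hdev, const char *name)
{
	struct led_classdev *led;

	/* freed automatically when hdev goes away, so no unwind code is needed */
	led = devm_kzalloc(&hdev->dev, sizeof(*led), GFP_KERNEL);
	if (!led)
		return -ENOMEM;

	led->name = name;
	led->brightness = 0;
	led->max_brightness = 1;
	led->brightness_set = my_led_set;

	/* unregistered automatically as well */
	return devm_led_classdev_register(&hdev->dev, led);
}
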
+diff --git a/drivers/hid/hidraw.c b/drivers/hid/hidraw.c
+index c887f48756f4be..bbd6f23bce7895 100644
+--- a/drivers/hid/hidraw.c
++++ b/drivers/hid/hidraw.c
+@@ -394,27 +394,15 @@ static int hidraw_revoke(struct hidraw_list *list)
+ return 0;
+ }
+
+-static long hidraw_ioctl(struct file *file, unsigned int cmd,
+- unsigned long arg)
++static long hidraw_fixed_size_ioctl(struct file *file, struct hidraw *dev, unsigned int cmd,
++ void __user *arg)
+ {
+- struct inode *inode = file_inode(file);
+- unsigned int minor = iminor(inode);
+- long ret = 0;
+- struct hidraw *dev;
+- struct hidraw_list *list = file->private_data;
+- void __user *user_arg = (void __user*) arg;
+-
+- down_read(&minors_rwsem);
+- dev = hidraw_table[minor];
+- if (!dev || !dev->exist || hidraw_is_revoked(list)) {
+- ret = -ENODEV;
+- goto out;
+- }
++ struct hid_device *hid = dev->hid;
+
+ switch (cmd) {
+ case HIDIOCGRDESCSIZE:
+- if (put_user(dev->hid->rsize, (int __user *)arg))
+- ret = -EFAULT;
++ if (put_user(hid->rsize, (int __user *)arg))
++ return -EFAULT;
+ break;
+
+ case HIDIOCGRDESC:
+@@ -422,113 +410,145 @@ static long hidraw_ioctl(struct file *file, unsigned int cmd,
+ __u32 len;
+
+ if (get_user(len, (int __user *)arg))
+- ret = -EFAULT;
+- else if (len > HID_MAX_DESCRIPTOR_SIZE - 1)
+- ret = -EINVAL;
+- else if (copy_to_user(user_arg + offsetof(
+- struct hidraw_report_descriptor,
+- value[0]),
+- dev->hid->rdesc,
+- min(dev->hid->rsize, len)))
+- ret = -EFAULT;
++ return -EFAULT;
++
++ if (len > HID_MAX_DESCRIPTOR_SIZE - 1)
++ return -EINVAL;
++
++ if (copy_to_user(arg + offsetof(
++ struct hidraw_report_descriptor,
++ value[0]),
++ hid->rdesc,
++ min(hid->rsize, len)))
++ return -EFAULT;
++
+ break;
+ }
+ case HIDIOCGRAWINFO:
+ {
+ struct hidraw_devinfo dinfo;
+
+- dinfo.bustype = dev->hid->bus;
+- dinfo.vendor = dev->hid->vendor;
+- dinfo.product = dev->hid->product;
+- if (copy_to_user(user_arg, &dinfo, sizeof(dinfo)))
+- ret = -EFAULT;
++ dinfo.bustype = hid->bus;
++ dinfo.vendor = hid->vendor;
++ dinfo.product = hid->product;
++ if (copy_to_user(arg, &dinfo, sizeof(dinfo)))
++ return -EFAULT;
+ break;
+ }
+ case HIDIOCREVOKE:
+ {
+- if (user_arg)
+- ret = -EINVAL;
+- else
+- ret = hidraw_revoke(list);
+- break;
++ struct hidraw_list *list = file->private_data;
++
++ if (arg)
++ return -EINVAL;
++
++ return hidraw_revoke(list);
+ }
+ default:
+- {
+- struct hid_device *hid = dev->hid;
+- if (_IOC_TYPE(cmd) != 'H') {
+- ret = -EINVAL;
+- break;
+- }
++ /*
++ * None of the above ioctls can return -EAGAIN, so
++ * use it as a marker that we need to check variable
++ * length ioctls.
++ */
++ return -EAGAIN;
++ }
+
+- if (_IOC_NR(cmd) == _IOC_NR(HIDIOCSFEATURE(0))) {
+- int len = _IOC_SIZE(cmd);
+- ret = hidraw_send_report(file, user_arg, len, HID_FEATURE_REPORT);
+- break;
+- }
+- if (_IOC_NR(cmd) == _IOC_NR(HIDIOCGFEATURE(0))) {
+- int len = _IOC_SIZE(cmd);
+- ret = hidraw_get_report(file, user_arg, len, HID_FEATURE_REPORT);
+- break;
+- }
++ return 0;
++}
+
+- if (_IOC_NR(cmd) == _IOC_NR(HIDIOCSINPUT(0))) {
+- int len = _IOC_SIZE(cmd);
+- ret = hidraw_send_report(file, user_arg, len, HID_INPUT_REPORT);
+- break;
+- }
+- if (_IOC_NR(cmd) == _IOC_NR(HIDIOCGINPUT(0))) {
+- int len = _IOC_SIZE(cmd);
+- ret = hidraw_get_report(file, user_arg, len, HID_INPUT_REPORT);
+- break;
+- }
++static long hidraw_rw_variable_size_ioctl(struct file *file, struct hidraw *dev, unsigned int cmd,
++ void __user *user_arg)
++{
++ int len = _IOC_SIZE(cmd);
++
++ switch (cmd & ~IOCSIZE_MASK) {
++ case HIDIOCSFEATURE(0):
++ return hidraw_send_report(file, user_arg, len, HID_FEATURE_REPORT);
++ case HIDIOCGFEATURE(0):
++ return hidraw_get_report(file, user_arg, len, HID_FEATURE_REPORT);
++ case HIDIOCSINPUT(0):
++ return hidraw_send_report(file, user_arg, len, HID_INPUT_REPORT);
++ case HIDIOCGINPUT(0):
++ return hidraw_get_report(file, user_arg, len, HID_INPUT_REPORT);
++ case HIDIOCSOUTPUT(0):
++ return hidraw_send_report(file, user_arg, len, HID_OUTPUT_REPORT);
++ case HIDIOCGOUTPUT(0):
++ return hidraw_get_report(file, user_arg, len, HID_OUTPUT_REPORT);
++ }
+
+- if (_IOC_NR(cmd) == _IOC_NR(HIDIOCSOUTPUT(0))) {
+- int len = _IOC_SIZE(cmd);
+- ret = hidraw_send_report(file, user_arg, len, HID_OUTPUT_REPORT);
+- break;
+- }
+- if (_IOC_NR(cmd) == _IOC_NR(HIDIOCGOUTPUT(0))) {
+- int len = _IOC_SIZE(cmd);
+- ret = hidraw_get_report(file, user_arg, len, HID_OUTPUT_REPORT);
+- break;
+- }
++ return -EINVAL;
++}
+
+- /* Begin Read-only ioctls. */
+- if (_IOC_DIR(cmd) != _IOC_READ) {
+- ret = -EINVAL;
+- break;
+- }
++static long hidraw_ro_variable_size_ioctl(struct file *file, struct hidraw *dev, unsigned int cmd,
++ void __user *user_arg)
++{
++ struct hid_device *hid = dev->hid;
++ int len = _IOC_SIZE(cmd);
++ int field_len;
++
++ switch (cmd & ~IOCSIZE_MASK) {
++ case HIDIOCGRAWNAME(0):
++ field_len = strlen(hid->name) + 1;
++ if (len > field_len)
++ len = field_len;
++ return copy_to_user(user_arg, hid->name, len) ? -EFAULT : len;
++ case HIDIOCGRAWPHYS(0):
++ field_len = strlen(hid->phys) + 1;
++ if (len > field_len)
++ len = field_len;
++ return copy_to_user(user_arg, hid->phys, len) ? -EFAULT : len;
++ case HIDIOCGRAWUNIQ(0):
++ field_len = strlen(hid->uniq) + 1;
++ if (len > field_len)
++ len = field_len;
++ return copy_to_user(user_arg, hid->uniq, len) ? -EFAULT : len;
++ }
+
+- if (_IOC_NR(cmd) == _IOC_NR(HIDIOCGRAWNAME(0))) {
+- int len = strlen(hid->name) + 1;
+- if (len > _IOC_SIZE(cmd))
+- len = _IOC_SIZE(cmd);
+- ret = copy_to_user(user_arg, hid->name, len) ?
+- -EFAULT : len;
+- break;
+- }
++ return -EINVAL;
++}
+
+- if (_IOC_NR(cmd) == _IOC_NR(HIDIOCGRAWPHYS(0))) {
+- int len = strlen(hid->phys) + 1;
+- if (len > _IOC_SIZE(cmd))
+- len = _IOC_SIZE(cmd);
+- ret = copy_to_user(user_arg, hid->phys, len) ?
+- -EFAULT : len;
+- break;
+- }
++static long hidraw_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
++{
++ struct inode *inode = file_inode(file);
++ unsigned int minor = iminor(inode);
++ struct hidraw *dev;
++ struct hidraw_list *list = file->private_data;
++ void __user *user_arg = (void __user *)arg;
++ int ret;
+
+- if (_IOC_NR(cmd) == _IOC_NR(HIDIOCGRAWUNIQ(0))) {
+- int len = strlen(hid->uniq) + 1;
+- if (len > _IOC_SIZE(cmd))
+- len = _IOC_SIZE(cmd);
+- ret = copy_to_user(user_arg, hid->uniq, len) ?
+- -EFAULT : len;
+- break;
+- }
+- }
++ down_read(&minors_rwsem);
++ dev = hidraw_table[minor];
++ if (!dev || !dev->exist || hidraw_is_revoked(list)) {
++ ret = -ENODEV;
++ goto out;
++ }
++
++ if (_IOC_TYPE(cmd) != 'H') {
++ ret = -EINVAL;
++ goto out;
++ }
+
++ if (_IOC_NR(cmd) > HIDIOCTL_LAST || _IOC_NR(cmd) == 0) {
+ ret = -ENOTTY;
++ goto out;
+ }
++
++ ret = hidraw_fixed_size_ioctl(file, dev, cmd, user_arg);
++ if (ret != -EAGAIN)
++ goto out;
++
++ switch (_IOC_DIR(cmd)) {
++ case (_IOC_READ | _IOC_WRITE):
++ ret = hidraw_rw_variable_size_ioctl(file, dev, cmd, user_arg);
++ break;
++ case _IOC_READ:
++ ret = hidraw_ro_variable_size_ioctl(file, dev, cmd, user_arg);
++ break;
++ default:
++ /* Any other IOC_DIR is wrong */
++ ret = -EINVAL;
++ }
++
+ out:
+ up_read(&minors_rwsem);
+ return ret;
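
The hidraw rework above dispatches the variable-length ioctls by masking the size bits out of the command number, so HIDIOCGFEATURE(len) hits the same case for every len. A small userspace sketch of that decoding, using the same uapi macros the driver matches against:

#include <stdio.h>
#include <linux/hidraw.h>
#include <linux/ioctl.h>

static const char *classify(unsigned int cmd)
{
	switch (cmd & ~IOCSIZE_MASK) {	/* drop the per-call length */
	case HIDIOCSFEATURE(0):  return "set feature report";
	case HIDIOCGFEATURE(0):  return "get feature report";
	case HIDIOCGRAWNAME(0):  return "read device name";
	default:                 return "something else";
	}
}

int main(void)
{
	unsigned int cmd = HIDIOCGFEATURE(64);	/* 64-byte report, arbitrary */

	printf("nr=%u size=%u dir=%u -> %s\n",
	       _IOC_NR(cmd), _IOC_SIZE(cmd), _IOC_DIR(cmd), classify(cmd));
	return 0;
}
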
+diff --git a/drivers/hid/i2c-hid/i2c-hid-core.c b/drivers/hid/i2c-hid/i2c-hid-core.c
+index d3912e3f2f13ae..30ebde1273be33 100644
+--- a/drivers/hid/i2c-hid/i2c-hid-core.c
++++ b/drivers/hid/i2c-hid/i2c-hid-core.c
+@@ -112,9 +112,9 @@ struct i2c_hid {
+
+ struct i2chid_ops *ops;
+ struct drm_panel_follower panel_follower;
+- struct work_struct panel_follower_prepare_work;
++ struct work_struct panel_follower_work;
+ bool is_panel_follower;
+- bool prepare_work_finished;
++ bool panel_follower_work_finished;
+ };
+
+ static const struct i2c_hid_quirks {
+@@ -1110,10 +1110,10 @@ static int i2c_hid_core_probe_panel_follower(struct i2c_hid *ihid)
+ return ret;
+ }
+
+-static void ihid_core_panel_prepare_work(struct work_struct *work)
++static void ihid_core_panel_follower_work(struct work_struct *work)
+ {
+ struct i2c_hid *ihid = container_of(work, struct i2c_hid,
+- panel_follower_prepare_work);
++ panel_follower_work);
+ struct hid_device *hid = ihid->hid;
+ int ret;
+
+@@ -1130,7 +1130,7 @@ static void ihid_core_panel_prepare_work(struct work_struct *work)
+ if (ret)
+ dev_warn(&ihid->client->dev, "Power on failed: %d\n", ret);
+ else
+- WRITE_ONCE(ihid->prepare_work_finished, true);
++ WRITE_ONCE(ihid->panel_follower_work_finished, true);
+
+ /*
+ * The work APIs provide a number of memory ordering guarantees
+@@ -1139,12 +1139,12 @@ static void ihid_core_panel_prepare_work(struct work_struct *work)
+ * guarantee that a write that happened in the work is visible after
+ * cancel_work_sync(). We'll add a write memory barrier here to match
+ * with i2c_hid_core_panel_unpreparing() to ensure that our write to
+- * prepare_work_finished is visible there.
++ * panel_follower_work_finished is visible there.
+ */
+ smp_wmb();
+ }
+
+-static int i2c_hid_core_panel_prepared(struct drm_panel_follower *follower)
++static int i2c_hid_core_panel_follower_resume(struct drm_panel_follower *follower)
+ {
+ struct i2c_hid *ihid = container_of(follower, struct i2c_hid, panel_follower);
+
+@@ -1152,29 +1152,36 @@ static int i2c_hid_core_panel_prepared(struct drm_panel_follower *follower)
+ * Powering on a touchscreen can be a slow process. Queue the work to
+ * the system workqueue so we don't block the panel's power up.
+ */
+- WRITE_ONCE(ihid->prepare_work_finished, false);
+- schedule_work(&ihid->panel_follower_prepare_work);
++ WRITE_ONCE(ihid->panel_follower_work_finished, false);
++ schedule_work(&ihid->panel_follower_work);
+
+ return 0;
+ }
+
+-static int i2c_hid_core_panel_unpreparing(struct drm_panel_follower *follower)
++static int i2c_hid_core_panel_follower_suspend(struct drm_panel_follower *follower)
+ {
+ struct i2c_hid *ihid = container_of(follower, struct i2c_hid, panel_follower);
+
+- cancel_work_sync(&ihid->panel_follower_prepare_work);
++ cancel_work_sync(&ihid->panel_follower_work);
+
+- /* Match with ihid_core_panel_prepare_work() */
++ /* Match with ihid_core_panel_follower_work() */
+ smp_rmb();
+- if (!READ_ONCE(ihid->prepare_work_finished))
++ if (!READ_ONCE(ihid->panel_follower_work_finished))
+ return 0;
+
+ return i2c_hid_core_suspend(ihid, true);
+ }
+
+-static const struct drm_panel_follower_funcs i2c_hid_core_panel_follower_funcs = {
+- .panel_prepared = i2c_hid_core_panel_prepared,
+- .panel_unpreparing = i2c_hid_core_panel_unpreparing,
++static const struct drm_panel_follower_funcs
++ i2c_hid_core_panel_follower_prepare_funcs = {
++ .panel_prepared = i2c_hid_core_panel_follower_resume,
++ .panel_unpreparing = i2c_hid_core_panel_follower_suspend,
++};
++
++static const struct drm_panel_follower_funcs
++ i2c_hid_core_panel_follower_enable_funcs = {
++ .panel_enabled = i2c_hid_core_panel_follower_resume,
++ .panel_disabling = i2c_hid_core_panel_follower_suspend,
+ };
+
+ static int i2c_hid_core_register_panel_follower(struct i2c_hid *ihid)
+@@ -1182,7 +1189,10 @@ static int i2c_hid_core_register_panel_follower(struct i2c_hid *ihid)
+ struct device *dev = &ihid->client->dev;
+ int ret;
+
+- ihid->panel_follower.funcs = &i2c_hid_core_panel_follower_funcs;
++ if (ihid->hid->initial_quirks & HID_QUIRK_POWER_ON_AFTER_BACKLIGHT)
++ ihid->panel_follower.funcs = &i2c_hid_core_panel_follower_enable_funcs;
++ else
++ ihid->panel_follower.funcs = &i2c_hid_core_panel_follower_prepare_funcs;
+
+ /*
+ * If we're not in control of our own power up/power down then we can't
+@@ -1237,7 +1247,7 @@ int i2c_hid_core_probe(struct i2c_client *client, struct i2chid_ops *ops,
+ init_waitqueue_head(&ihid->wait);
+ mutex_init(&ihid->cmd_lock);
+ mutex_init(&ihid->reset_lock);
+- INIT_WORK(&ihid->panel_follower_prepare_work, ihid_core_panel_prepare_work);
++ INIT_WORK(&ihid->panel_follower_work, ihid_core_panel_follower_work);
+
+ /* we need to allocate the command buffer without knowing the maximum
+ * size of the reports. Let's use HID_MIN_BUFFER_SIZE, then we do the
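
The i2c-hid rename keeps the existing ordering trick: the worker publishes a "finished" flag with WRITE_ONCE() plus smp_wmb(), and the suspend path pairs that with smp_rmb() plus READ_ONCE() after cancel_work_sync(). Below is a reduced sketch of that pairing with illustrative names; it is a sketch of the pattern, not the driver's code.

#include <asm/barrier.h>
#include <linux/compiler.h>
#include <linux/container_of.h>
#include <linux/workqueue.h>

struct follower_ctx {
	struct work_struct work;
	bool work_finished;
};

static void follower_work_fn(struct work_struct *work)
{
	struct follower_ctx *ctx = container_of(work, struct follower_ctx, work);

	/* ... power the device on ... */
	WRITE_ONCE(ctx->work_finished, true);
	smp_wmb();	/* publish the flag before the work is seen as done */
}

static void follower_suspend(struct follower_ctx *ctx)
{
	cancel_work_sync(&ctx->work);

	smp_rmb();	/* pairs with the smp_wmb() in follower_work_fn() */
	if (!READ_ONCE(ctx->work_finished))
		return;	/* power-up never completed, nothing to suspend */

	/* ... suspend the device ... */
}
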
+diff --git a/drivers/hid/i2c-hid/i2c-hid-of-elan.c b/drivers/hid/i2c-hid/i2c-hid-of-elan.c
+index 3fcff6daa0d3a6..0215f217f6d863 100644
+--- a/drivers/hid/i2c-hid/i2c-hid-of-elan.c
++++ b/drivers/hid/i2c-hid/i2c-hid-of-elan.c
+@@ -8,6 +8,7 @@
+ #include <linux/delay.h>
+ #include <linux/device.h>
+ #include <linux/gpio/consumer.h>
++#include <linux/hid.h>
+ #include <linux/i2c.h>
+ #include <linux/kernel.h>
+ #include <linux/module.h>
+@@ -23,6 +24,7 @@ struct elan_i2c_hid_chip_data {
+ unsigned int post_power_delay_ms;
+ u16 hid_descriptor_address;
+ const char *main_supply_name;
++ bool power_after_backlight;
+ };
+
+ struct i2c_hid_of_elan {
+@@ -97,6 +99,7 @@ static int i2c_hid_of_elan_probe(struct i2c_client *client)
+ {
+ struct i2c_hid_of_elan *ihid_elan;
+ int ret;
++ u32 quirks = 0;
+
+ ihid_elan = devm_kzalloc(&client->dev, sizeof(*ihid_elan), GFP_KERNEL);
+ if (!ihid_elan)
+@@ -131,8 +134,12 @@ static int i2c_hid_of_elan_probe(struct i2c_client *client)
+ }
+ }
+
++ if (ihid_elan->chip_data->power_after_backlight)
++ quirks = HID_QUIRK_POWER_ON_AFTER_BACKLIGHT;
++
+ ret = i2c_hid_core_probe(client, &ihid_elan->ops,
+- ihid_elan->chip_data->hid_descriptor_address, 0);
++ ihid_elan->chip_data->hid_descriptor_address,
++ quirks);
+ if (ret)
+ goto err_deassert_reset;
+
+@@ -150,6 +157,7 @@ static const struct elan_i2c_hid_chip_data elan_ekth6915_chip_data = {
+ .post_gpio_reset_on_delay_ms = 300,
+ .hid_descriptor_address = 0x0001,
+ .main_supply_name = "vcc33",
++ .power_after_backlight = true,
+ };
+
+ static const struct elan_i2c_hid_chip_data elan_ekth6a12nay_chip_data = {
+@@ -157,6 +165,7 @@ static const struct elan_i2c_hid_chip_data elan_ekth6a12nay_chip_data = {
+ .post_gpio_reset_on_delay_ms = 300,
+ .hid_descriptor_address = 0x0001,
+ .main_supply_name = "vcc33",
++ .power_after_backlight = true,
+ };
+
+ static const struct elan_i2c_hid_chip_data ilitek_ili9882t_chip_data = {
+diff --git a/drivers/hwmon/asus-ec-sensors.c b/drivers/hwmon/asus-ec-sensors.c
+index 4ac554731e98a7..f43efb80aabf39 100644
+--- a/drivers/hwmon/asus-ec-sensors.c
++++ b/drivers/hwmon/asus-ec-sensors.c
+@@ -396,7 +396,7 @@ static const struct ec_board_info board_info_pro_art_x870E_creator_wifi = {
+ .sensors = SENSOR_TEMP_CPU | SENSOR_TEMP_CPU_PACKAGE |
+ SENSOR_TEMP_MB | SENSOR_TEMP_VRM |
+ SENSOR_TEMP_T_SENSOR | SENSOR_FAN_CPU_OPT,
+- .mutex_path = ACPI_GLOBAL_LOCK_PSEUDO_PATH,
++ .mutex_path = ASUS_HW_ACCESS_MUTEX_SB_PCI0_SBRG_SIO1_MUT0,
+ .family = family_amd_800_series,
+ };
+
+diff --git a/drivers/hwmon/mlxreg-fan.c b/drivers/hwmon/mlxreg-fan.c
+index c25a54d5b39ad5..0ba9195c9d713e 100644
+--- a/drivers/hwmon/mlxreg-fan.c
++++ b/drivers/hwmon/mlxreg-fan.c
+@@ -113,8 +113,8 @@ struct mlxreg_fan {
+ int divider;
+ };
+
+-static int mlxreg_fan_set_cur_state(struct thermal_cooling_device *cdev,
+- unsigned long state);
++static int _mlxreg_fan_set_cur_state(struct thermal_cooling_device *cdev,
++ unsigned long state, bool thermal);
+
+ static int
+ mlxreg_fan_read(struct device *dev, enum hwmon_sensor_types type, u32 attr,
+@@ -224,8 +224,9 @@ mlxreg_fan_write(struct device *dev, enum hwmon_sensor_types type, u32 attr,
+ * last thermal state.
+ */
+ if (pwm->last_hwmon_state >= pwm->last_thermal_state)
+- return mlxreg_fan_set_cur_state(pwm->cdev,
+- pwm->last_hwmon_state);
++ return _mlxreg_fan_set_cur_state(pwm->cdev,
++ pwm->last_hwmon_state,
++ false);
+ return 0;
+ }
+ return regmap_write(fan->regmap, pwm->reg, val);
+@@ -357,9 +358,8 @@ static int mlxreg_fan_get_cur_state(struct thermal_cooling_device *cdev,
+ return 0;
+ }
+
+-static int mlxreg_fan_set_cur_state(struct thermal_cooling_device *cdev,
+- unsigned long state)
+-
++static int _mlxreg_fan_set_cur_state(struct thermal_cooling_device *cdev,
++ unsigned long state, bool thermal)
+ {
+ struct mlxreg_fan_pwm *pwm = cdev->devdata;
+ struct mlxreg_fan *fan = pwm->fan;
+@@ -369,7 +369,8 @@ static int mlxreg_fan_set_cur_state(struct thermal_cooling_device *cdev,
+ return -EINVAL;
+
+ /* Save thermal state. */
+- pwm->last_thermal_state = state;
++ if (thermal)
++ pwm->last_thermal_state = state;
+
+ state = max_t(unsigned long, state, pwm->last_hwmon_state);
+ err = regmap_write(fan->regmap, pwm->reg,
+@@ -381,6 +382,13 @@ static int mlxreg_fan_set_cur_state(struct thermal_cooling_device *cdev,
+ return 0;
+ }
+
++static int mlxreg_fan_set_cur_state(struct thermal_cooling_device *cdev,
++ unsigned long state)
++
++{
++ return _mlxreg_fan_set_cur_state(cdev, state, true);
++}
++
+ static const struct thermal_cooling_device_ops mlxreg_fan_cooling_ops = {
+ .get_max_state = mlxreg_fan_get_max_state,
+ .get_cur_state = mlxreg_fan_get_cur_state,
+diff --git a/drivers/hwtracing/coresight/coresight-catu.c b/drivers/hwtracing/coresight/coresight-catu.c
+index 5058432233da19..4c345ff2cff141 100644
+--- a/drivers/hwtracing/coresight/coresight-catu.c
++++ b/drivers/hwtracing/coresight/coresight-catu.c
+@@ -520,6 +520,10 @@ static int __catu_probe(struct device *dev, struct resource *res)
+ struct coresight_platform_data *pdata = NULL;
+ void __iomem *base;
+
++ drvdata->atclk = devm_clk_get_optional_enabled(dev, "atclk");
++ if (IS_ERR(drvdata->atclk))
++ return PTR_ERR(drvdata->atclk);
++
+ catu_desc.name = coresight_alloc_device_name(&catu_devs, dev);
+ if (!catu_desc.name)
+ return -ENOMEM;
+@@ -632,7 +636,7 @@ static int catu_platform_probe(struct platform_device *pdev)
+
+ drvdata->pclk = coresight_get_enable_apb_pclk(&pdev->dev);
+ if (IS_ERR(drvdata->pclk))
+- return -ENODEV;
++ return PTR_ERR(drvdata->pclk);
+
+ pm_runtime_get_noresume(&pdev->dev);
+ pm_runtime_set_active(&pdev->dev);
+@@ -641,11 +645,8 @@ static int catu_platform_probe(struct platform_device *pdev)
+ dev_set_drvdata(&pdev->dev, drvdata);
+ ret = __catu_probe(&pdev->dev, res);
+ pm_runtime_put(&pdev->dev);
+- if (ret) {
++ if (ret)
+ pm_runtime_disable(&pdev->dev);
+- if (!IS_ERR_OR_NULL(drvdata->pclk))
+- clk_put(drvdata->pclk);
+- }
+
+ return ret;
+ }
+@@ -659,8 +660,6 @@ static void catu_platform_remove(struct platform_device *pdev)
+
+ __catu_remove(&pdev->dev);
+ pm_runtime_disable(&pdev->dev);
+- if (!IS_ERR_OR_NULL(drvdata->pclk))
+- clk_put(drvdata->pclk);
+ }
+
+ #ifdef CONFIG_PM
+@@ -668,18 +667,26 @@ static int catu_runtime_suspend(struct device *dev)
+ {
+ struct catu_drvdata *drvdata = dev_get_drvdata(dev);
+
+- if (drvdata && !IS_ERR_OR_NULL(drvdata->pclk))
+- clk_disable_unprepare(drvdata->pclk);
++ clk_disable_unprepare(drvdata->atclk);
++ clk_disable_unprepare(drvdata->pclk);
++
+ return 0;
+ }
+
+ static int catu_runtime_resume(struct device *dev)
+ {
+ struct catu_drvdata *drvdata = dev_get_drvdata(dev);
++ int ret;
+
+- if (drvdata && !IS_ERR_OR_NULL(drvdata->pclk))
+- clk_prepare_enable(drvdata->pclk);
+- return 0;
++ ret = clk_prepare_enable(drvdata->pclk);
++ if (ret)
++ return ret;
++
++ ret = clk_prepare_enable(drvdata->atclk);
++ if (ret)
++ clk_disable_unprepare(drvdata->pclk);
++
++ return ret;
+ }
+ #endif
+
+diff --git a/drivers/hwtracing/coresight/coresight-catu.h b/drivers/hwtracing/coresight/coresight-catu.h
+index 755776cd19c5bb..6e6b7aac206dca 100644
+--- a/drivers/hwtracing/coresight/coresight-catu.h
++++ b/drivers/hwtracing/coresight/coresight-catu.h
+@@ -62,6 +62,7 @@
+
+ struct catu_drvdata {
+ struct clk *pclk;
++ struct clk *atclk;
+ void __iomem *base;
+ struct coresight_device *csdev;
+ int irq;
+diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
+index fa758cc2182755..1accd7cbd54bf0 100644
+--- a/drivers/hwtracing/coresight/coresight-core.c
++++ b/drivers/hwtracing/coresight/coresight-core.c
+@@ -3,6 +3,7 @@
+ * Copyright (c) 2012, The Linux Foundation. All rights reserved.
+ */
+
++#include <linux/bitfield.h>
+ #include <linux/build_bug.h>
+ #include <linux/kernel.h>
+ #include <linux/init.h>
+@@ -1374,8 +1375,9 @@ struct coresight_device *coresight_register(struct coresight_desc *desc)
+ goto out_unlock;
+ }
+
+- if (csdev->type == CORESIGHT_DEV_TYPE_SINK ||
+- csdev->type == CORESIGHT_DEV_TYPE_LINKSINK) {
++ if ((csdev->type == CORESIGHT_DEV_TYPE_SINK ||
++ csdev->type == CORESIGHT_DEV_TYPE_LINKSINK) &&
++ sink_ops(csdev)->alloc_buffer) {
+ ret = etm_perf_add_symlink_sink(csdev);
+
+ if (ret) {
+diff --git a/drivers/hwtracing/coresight/coresight-cpu-debug.c b/drivers/hwtracing/coresight/coresight-cpu-debug.c
+index a871d997330b09..e39dfb886688e1 100644
+--- a/drivers/hwtracing/coresight/coresight-cpu-debug.c
++++ b/drivers/hwtracing/coresight/coresight-cpu-debug.c
+@@ -699,7 +699,7 @@ static int debug_platform_probe(struct platform_device *pdev)
+
+ drvdata->pclk = coresight_get_enable_apb_pclk(&pdev->dev);
+ if (IS_ERR(drvdata->pclk))
+- return -ENODEV;
++ return PTR_ERR(drvdata->pclk);
+
+ dev_set_drvdata(&pdev->dev, drvdata);
+ pm_runtime_get_noresume(&pdev->dev);
+@@ -710,8 +710,6 @@ static int debug_platform_probe(struct platform_device *pdev)
+ if (ret) {
+ pm_runtime_put_noidle(&pdev->dev);
+ pm_runtime_disable(&pdev->dev);
+- if (!IS_ERR_OR_NULL(drvdata->pclk))
+- clk_put(drvdata->pclk);
+ }
+ return ret;
+ }
+@@ -725,8 +723,6 @@ static void debug_platform_remove(struct platform_device *pdev)
+
+ __debug_remove(&pdev->dev);
+ pm_runtime_disable(&pdev->dev);
+- if (!IS_ERR_OR_NULL(drvdata->pclk))
+- clk_put(drvdata->pclk);
+ }
+
+ #ifdef CONFIG_ACPI
+diff --git a/drivers/hwtracing/coresight/coresight-ctcu-core.c b/drivers/hwtracing/coresight/coresight-ctcu-core.c
+index c6bafc96db9633..de279efe340581 100644
+--- a/drivers/hwtracing/coresight/coresight-ctcu-core.c
++++ b/drivers/hwtracing/coresight/coresight-ctcu-core.c
+@@ -209,7 +209,7 @@ static int ctcu_probe(struct platform_device *pdev)
+
+ drvdata->apb_clk = coresight_get_enable_apb_pclk(dev);
+ if (IS_ERR(drvdata->apb_clk))
+- return -ENODEV;
++ return PTR_ERR(drvdata->apb_clk);
+
+ cfgs = of_device_get_match_data(dev);
+ if (cfgs) {
+@@ -233,12 +233,8 @@ static int ctcu_probe(struct platform_device *pdev)
+ desc.access = CSDEV_ACCESS_IOMEM(base);
+
+ drvdata->csdev = coresight_register(&desc);
+- if (IS_ERR(drvdata->csdev)) {
+- if (!IS_ERR_OR_NULL(drvdata->apb_clk))
+- clk_put(drvdata->apb_clk);
+-
++ if (IS_ERR(drvdata->csdev))
+ return PTR_ERR(drvdata->csdev);
+- }
+
+ return 0;
+ }
+@@ -275,8 +271,6 @@ static void ctcu_platform_remove(struct platform_device *pdev)
+
+ ctcu_remove(pdev);
+ pm_runtime_disable(&pdev->dev);
+- if (!IS_ERR_OR_NULL(drvdata->apb_clk))
+- clk_put(drvdata->apb_clk);
+ }
+
+ #ifdef CONFIG_PM
+diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c
+index d5efb085b30d36..8e81b41eb22264 100644
+--- a/drivers/hwtracing/coresight/coresight-etb10.c
++++ b/drivers/hwtracing/coresight/coresight-etb10.c
+@@ -730,12 +730,10 @@ static int etb_probe(struct amba_device *adev, const struct amba_id *id)
+ if (!drvdata)
+ return -ENOMEM;
+
+- drvdata->atclk = devm_clk_get(&adev->dev, "atclk"); /* optional */
+- if (!IS_ERR(drvdata->atclk)) {
+- ret = clk_prepare_enable(drvdata->atclk);
+- if (ret)
+- return ret;
+- }
++ drvdata->atclk = devm_clk_get_optional_enabled(dev, "atclk");
++ if (IS_ERR(drvdata->atclk))
++ return PTR_ERR(drvdata->atclk);
++
+ dev_set_drvdata(dev, drvdata);
+
+ /* validity for the resource is already checked by the AMBA core */
+diff --git a/drivers/hwtracing/coresight/coresight-etm3x-core.c b/drivers/hwtracing/coresight/coresight-etm3x-core.c
+index 1c6204e1442211..baba2245b1dfb3 100644
+--- a/drivers/hwtracing/coresight/coresight-etm3x-core.c
++++ b/drivers/hwtracing/coresight/coresight-etm3x-core.c
+@@ -832,12 +832,9 @@ static int etm_probe(struct amba_device *adev, const struct amba_id *id)
+
+ spin_lock_init(&drvdata->spinlock);
+
+- drvdata->atclk = devm_clk_get(&adev->dev, "atclk"); /* optional */
+- if (!IS_ERR(drvdata->atclk)) {
+- ret = clk_prepare_enable(drvdata->atclk);
+- if (ret)
+- return ret;
+- }
++ drvdata->atclk = devm_clk_get_optional_enabled(dev, "atclk");
++ if (IS_ERR(drvdata->atclk))
++ return PTR_ERR(drvdata->atclk);
+
+ drvdata->cpu = coresight_get_cpu(dev);
+ if (drvdata->cpu < 0)
+diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
+index 42e5d37403addc..4b98a7bf4cb731 100644
+--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
++++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
+@@ -4,6 +4,7 @@
+ */
+
+ #include <linux/acpi.h>
++#include <linux/bitfield.h>
+ #include <linux/bitops.h>
+ #include <linux/kernel.h>
+ #include <linux/kvm_host.h>
+@@ -528,7 +529,8 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata)
+ etm4x_relaxed_write32(csa, config->seq_rst, TRCSEQRSTEVR);
+ etm4x_relaxed_write32(csa, config->seq_state, TRCSEQSTR);
+ }
+- etm4x_relaxed_write32(csa, config->ext_inp, TRCEXTINSELR);
++ if (drvdata->numextinsel)
++ etm4x_relaxed_write32(csa, config->ext_inp, TRCEXTINSELR);
+ for (i = 0; i < drvdata->nr_cntr; i++) {
+ etm4x_relaxed_write32(csa, config->cntrldvr[i], TRCCNTRLDVRn(i));
+ etm4x_relaxed_write32(csa, config->cntr_ctrl[i], TRCCNTCTLRn(i));
+@@ -1423,6 +1425,7 @@ static void etm4_init_arch_data(void *info)
+ etmidr5 = etm4x_relaxed_read32(csa, TRCIDR5);
+ /* NUMEXTIN, bits[8:0] number of external inputs implemented */
+ drvdata->nr_ext_inp = FIELD_GET(TRCIDR5_NUMEXTIN_MASK, etmidr5);
++ drvdata->numextinsel = FIELD_GET(TRCIDR5_NUMEXTINSEL_MASK, etmidr5);
+ /* TRACEIDSIZE, bits[21:16] indicates the trace ID width */
+ drvdata->trcid_size = FIELD_GET(TRCIDR5_TRACEIDSIZE_MASK, etmidr5);
+ /* ATBTRIG, bit[22] implementation can support ATB triggers? */
+@@ -1852,7 +1855,9 @@ static int __etm4_cpu_save(struct etmv4_drvdata *drvdata)
+ state->trcseqrstevr = etm4x_read32(csa, TRCSEQRSTEVR);
+ state->trcseqstr = etm4x_read32(csa, TRCSEQSTR);
+ }
+- state->trcextinselr = etm4x_read32(csa, TRCEXTINSELR);
++
++ if (drvdata->numextinsel)
++ state->trcextinselr = etm4x_read32(csa, TRCEXTINSELR);
+
+ for (i = 0; i < drvdata->nr_cntr; i++) {
+ state->trccntrldvr[i] = etm4x_read32(csa, TRCCNTRLDVRn(i));
+@@ -1984,7 +1989,8 @@ static void __etm4_cpu_restore(struct etmv4_drvdata *drvdata)
+ etm4x_relaxed_write32(csa, state->trcseqrstevr, TRCSEQRSTEVR);
+ etm4x_relaxed_write32(csa, state->trcseqstr, TRCSEQSTR);
+ }
+- etm4x_relaxed_write32(csa, state->trcextinselr, TRCEXTINSELR);
++ if (drvdata->numextinsel)
++ etm4x_relaxed_write32(csa, state->trcextinselr, TRCEXTINSELR);
+
+ for (i = 0; i < drvdata->nr_cntr; i++) {
+ etm4x_relaxed_write32(csa, state->trccntrldvr[i], TRCCNTRLDVRn(i));
+@@ -2215,6 +2221,10 @@ static int etm4_probe(struct device *dev)
+ if (WARN_ON(!drvdata))
+ return -ENOMEM;
+
++ drvdata->atclk = devm_clk_get_optional_enabled(dev, "atclk");
++ if (IS_ERR(drvdata->atclk))
++ return PTR_ERR(drvdata->atclk);
++
+ if (pm_save_enable == PARAM_PM_SAVE_FIRMWARE)
+ pm_save_enable = coresight_loses_context_with_cpu(dev) ?
+ PARAM_PM_SAVE_SELF_HOSTED : PARAM_PM_SAVE_NEVER;
+@@ -2299,14 +2309,12 @@ static int etm4_probe_platform_dev(struct platform_device *pdev)
+
+ drvdata->pclk = coresight_get_enable_apb_pclk(&pdev->dev);
+ if (IS_ERR(drvdata->pclk))
+- return -ENODEV;
++ return PTR_ERR(drvdata->pclk);
+
+ if (res) {
+ drvdata->base = devm_ioremap_resource(&pdev->dev, res);
+- if (IS_ERR(drvdata->base)) {
+- clk_put(drvdata->pclk);
++ if (IS_ERR(drvdata->base))
+ return PTR_ERR(drvdata->base);
+- }
+ }
+
+ dev_set_drvdata(&pdev->dev, drvdata);
+@@ -2413,9 +2421,6 @@ static void etm4_remove_platform_dev(struct platform_device *pdev)
+ if (drvdata)
+ etm4_remove_dev(drvdata);
+ pm_runtime_disable(&pdev->dev);
+-
+- if (drvdata && !IS_ERR_OR_NULL(drvdata->pclk))
+- clk_put(drvdata->pclk);
+ }
+
+ static const struct amba_id etm4_ids[] = {
+@@ -2463,8 +2468,8 @@ static int etm4_runtime_suspend(struct device *dev)
+ {
+ struct etmv4_drvdata *drvdata = dev_get_drvdata(dev);
+
+- if (drvdata->pclk && !IS_ERR(drvdata->pclk))
+- clk_disable_unprepare(drvdata->pclk);
++ clk_disable_unprepare(drvdata->atclk);
++ clk_disable_unprepare(drvdata->pclk);
+
+ return 0;
+ }
+@@ -2472,11 +2477,17 @@ static int etm4_runtime_suspend(struct device *dev)
+ static int etm4_runtime_resume(struct device *dev)
+ {
+ struct etmv4_drvdata *drvdata = dev_get_drvdata(dev);
++ int ret;
+
+- if (drvdata->pclk && !IS_ERR(drvdata->pclk))
+- clk_prepare_enable(drvdata->pclk);
++ ret = clk_prepare_enable(drvdata->pclk);
++ if (ret)
++ return ret;
+
+- return 0;
++ ret = clk_prepare_enable(drvdata->atclk);
++ if (ret)
++ clk_disable_unprepare(drvdata->pclk);
++
++ return ret;
+ }
+ #endif
+
+diff --git a/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c b/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c
+index ab251865b893d8..e9eeea6240d557 100644
+--- a/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c
++++ b/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c
+@@ -4,6 +4,7 @@
+ * Author: Mathieu Poirier <mathieu.poirier@linaro.org>
+ */
+
++#include <linux/bitfield.h>
+ #include <linux/coresight.h>
+ #include <linux/pid_namespace.h>
+ #include <linux/pm_runtime.h>
+diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
+index ac649515054d90..13ec9ecef46f5b 100644
+--- a/drivers/hwtracing/coresight/coresight-etm4x.h
++++ b/drivers/hwtracing/coresight/coresight-etm4x.h
+@@ -162,6 +162,7 @@
+ #define TRCIDR4_NUMVMIDC_MASK GENMASK(31, 28)
+
+ #define TRCIDR5_NUMEXTIN_MASK GENMASK(8, 0)
++#define TRCIDR5_NUMEXTINSEL_MASK GENMASK(11, 9)
+ #define TRCIDR5_TRACEIDSIZE_MASK GENMASK(21, 16)
+ #define TRCIDR5_ATBTRIG BIT(22)
+ #define TRCIDR5_LPOVERRIDE BIT(23)
+@@ -919,7 +920,8 @@ struct etmv4_save_state {
+
+ /**
+ * struct etm4_drvdata - specifics associated to an ETM component
+- * @pclk APB clock if present, otherwise NULL
++ * @pclk: APB clock if present, otherwise NULL
++ * @atclk: Optional clock for the core parts of the ETMv4.
+ * @base: Memory mapped base address for this component.
+ * @csdev: Component vitals needed by the framework.
+ * @spinlock: Only one at a time pls.
+@@ -988,6 +990,7 @@ struct etmv4_save_state {
+ */
+ struct etmv4_drvdata {
+ struct clk *pclk;
++ struct clk *atclk;
+ void __iomem *base;
+ struct coresight_device *csdev;
+ raw_spinlock_t spinlock;
+@@ -999,6 +1002,7 @@ struct etmv4_drvdata {
+ u8 nr_cntr;
+ u8 nr_ext_inp;
+ u8 numcidc;
++ u8 numextinsel;
+ u8 numvmidc;
+ u8 nrseqstate;
+ u8 nr_event;
+diff --git a/drivers/hwtracing/coresight/coresight-funnel.c b/drivers/hwtracing/coresight/coresight-funnel.c
+index b1922dbe9292b0..b044a4125310ba 100644
+--- a/drivers/hwtracing/coresight/coresight-funnel.c
++++ b/drivers/hwtracing/coresight/coresight-funnel.c
+@@ -213,7 +213,6 @@ ATTRIBUTE_GROUPS(coresight_funnel);
+
+ static int funnel_probe(struct device *dev, struct resource *res)
+ {
+- int ret;
+ void __iomem *base;
+ struct coresight_platform_data *pdata = NULL;
+ struct funnel_drvdata *drvdata;
+@@ -231,16 +230,13 @@ static int funnel_probe(struct device *dev, struct resource *res)
+ if (!drvdata)
+ return -ENOMEM;
+
+- drvdata->atclk = devm_clk_get(dev, "atclk"); /* optional */
+- if (!IS_ERR(drvdata->atclk)) {
+- ret = clk_prepare_enable(drvdata->atclk);
+- if (ret)
+- return ret;
+- }
++ drvdata->atclk = devm_clk_get_optional_enabled(dev, "atclk");
++ if (IS_ERR(drvdata->atclk))
++ return PTR_ERR(drvdata->atclk);
+
+ drvdata->pclk = coresight_get_enable_apb_pclk(dev);
+ if (IS_ERR(drvdata->pclk))
+- return -ENODEV;
++ return PTR_ERR(drvdata->pclk);
+
+ /*
+ * Map the device base for dynamic-funnel, which has been
+@@ -248,10 +244,8 @@ static int funnel_probe(struct device *dev, struct resource *res)
+ */
+ if (res) {
+ base = devm_ioremap_resource(dev, res);
+- if (IS_ERR(base)) {
+- ret = PTR_ERR(base);
+- goto out_disable_clk;
+- }
++ if (IS_ERR(base))
++ return PTR_ERR(base);
+ drvdata->base = base;
+ desc.groups = coresight_funnel_groups;
+ desc.access = CSDEV_ACCESS_IOMEM(base);
+@@ -261,10 +255,9 @@ static int funnel_probe(struct device *dev, struct resource *res)
+ dev_set_drvdata(dev, drvdata);
+
+ pdata = coresight_get_platform_data(dev);
+- if (IS_ERR(pdata)) {
+- ret = PTR_ERR(pdata);
+- goto out_disable_clk;
+- }
++ if (IS_ERR(pdata))
++ return PTR_ERR(pdata);
++
+ dev->platform_data = pdata;
+
+ raw_spin_lock_init(&drvdata->spinlock);
+@@ -274,19 +267,10 @@ static int funnel_probe(struct device *dev, struct resource *res)
+ desc.pdata = pdata;
+ desc.dev = dev;
+ drvdata->csdev = coresight_register(&desc);
+- if (IS_ERR(drvdata->csdev)) {
+- ret = PTR_ERR(drvdata->csdev);
+- goto out_disable_clk;
+- }
++ if (IS_ERR(drvdata->csdev))
++ return PTR_ERR(drvdata->csdev);
+
+- ret = 0;
+-
+-out_disable_clk:
+- if (ret && !IS_ERR_OR_NULL(drvdata->atclk))
+- clk_disable_unprepare(drvdata->atclk);
+- if (ret && !IS_ERR_OR_NULL(drvdata->pclk))
+- clk_disable_unprepare(drvdata->pclk);
+- return ret;
++ return 0;
+ }
+
+ static int funnel_remove(struct device *dev)
+@@ -355,8 +339,6 @@ static void funnel_platform_remove(struct platform_device *pdev)
+
+ funnel_remove(&pdev->dev);
+ pm_runtime_disable(&pdev->dev);
+- if (!IS_ERR_OR_NULL(drvdata->pclk))
+- clk_put(drvdata->pclk);
+ }
+
+ static const struct of_device_id funnel_match[] = {
+diff --git a/drivers/hwtracing/coresight/coresight-replicator.c b/drivers/hwtracing/coresight/coresight-replicator.c
+index 06efd2b01a0f71..9e8bd36e7a9a2f 100644
+--- a/drivers/hwtracing/coresight/coresight-replicator.c
++++ b/drivers/hwtracing/coresight/coresight-replicator.c
+@@ -219,7 +219,6 @@ static const struct attribute_group *replicator_groups[] = {
+
+ static int replicator_probe(struct device *dev, struct resource *res)
+ {
+- int ret = 0;
+ struct coresight_platform_data *pdata = NULL;
+ struct replicator_drvdata *drvdata;
+ struct coresight_desc desc = { 0 };
+@@ -238,16 +237,13 @@ static int replicator_probe(struct device *dev, struct resource *res)
+ if (!drvdata)
+ return -ENOMEM;
+
+- drvdata->atclk = devm_clk_get(dev, "atclk"); /* optional */
+- if (!IS_ERR(drvdata->atclk)) {
+- ret = clk_prepare_enable(drvdata->atclk);
+- if (ret)
+- return ret;
+- }
++ drvdata->atclk = devm_clk_get_optional_enabled(dev, "atclk");
++ if (IS_ERR(drvdata->atclk))
++ return PTR_ERR(drvdata->atclk);
+
+ drvdata->pclk = coresight_get_enable_apb_pclk(dev);
+ if (IS_ERR(drvdata->pclk))
+- return -ENODEV;
++ return PTR_ERR(drvdata->pclk);
+
+ /*
+ * Map the device base for dynamic-replicator, which has been
+@@ -255,10 +251,8 @@ static int replicator_probe(struct device *dev, struct resource *res)
+ */
+ if (res) {
+ base = devm_ioremap_resource(dev, res);
+- if (IS_ERR(base)) {
+- ret = PTR_ERR(base);
+- goto out_disable_clk;
+- }
++ if (IS_ERR(base))
++ return PTR_ERR(base);
+ drvdata->base = base;
+ desc.groups = replicator_groups;
+ desc.access = CSDEV_ACCESS_IOMEM(base);
+@@ -272,10 +266,8 @@ static int replicator_probe(struct device *dev, struct resource *res)
+ dev_set_drvdata(dev, drvdata);
+
+ pdata = coresight_get_platform_data(dev);
+- if (IS_ERR(pdata)) {
+- ret = PTR_ERR(pdata);
+- goto out_disable_clk;
+- }
++ if (IS_ERR(pdata))
++ return PTR_ERR(pdata);
+ dev->platform_data = pdata;
+
+ raw_spin_lock_init(&drvdata->spinlock);
+@@ -286,19 +278,11 @@ static int replicator_probe(struct device *dev, struct resource *res)
+ desc.dev = dev;
+
+ drvdata->csdev = coresight_register(&desc);
+- if (IS_ERR(drvdata->csdev)) {
+- ret = PTR_ERR(drvdata->csdev);
+- goto out_disable_clk;
+- }
++ if (IS_ERR(drvdata->csdev))
++ return PTR_ERR(drvdata->csdev);
+
+ replicator_reset(drvdata);
+-
+-out_disable_clk:
+- if (ret && !IS_ERR_OR_NULL(drvdata->atclk))
+- clk_disable_unprepare(drvdata->atclk);
+- if (ret && !IS_ERR_OR_NULL(drvdata->pclk))
+- clk_disable_unprepare(drvdata->pclk);
+- return ret;
++ return 0;
+ }
+
+ static int replicator_remove(struct device *dev)
+@@ -335,8 +319,6 @@ static void replicator_platform_remove(struct platform_device *pdev)
+
+ replicator_remove(&pdev->dev);
+ pm_runtime_disable(&pdev->dev);
+- if (!IS_ERR_OR_NULL(drvdata->pclk))
+- clk_put(drvdata->pclk);
+ }
+
+ #ifdef CONFIG_PM
+diff --git a/drivers/hwtracing/coresight/coresight-stm.c b/drivers/hwtracing/coresight/coresight-stm.c
+index e45c6c7204b449..57fbe3ad0fb205 100644
+--- a/drivers/hwtracing/coresight/coresight-stm.c
++++ b/drivers/hwtracing/coresight/coresight-stm.c
+@@ -842,16 +842,13 @@ static int __stm_probe(struct device *dev, struct resource *res)
+ if (!drvdata)
+ return -ENOMEM;
+
+- drvdata->atclk = devm_clk_get(dev, "atclk"); /* optional */
+- if (!IS_ERR(drvdata->atclk)) {
+- ret = clk_prepare_enable(drvdata->atclk);
+- if (ret)
+- return ret;
+- }
++ drvdata->atclk = devm_clk_get_optional_enabled(dev, "atclk");
++ if (IS_ERR(drvdata->atclk))
++ return PTR_ERR(drvdata->atclk);
+
+ drvdata->pclk = coresight_get_enable_apb_pclk(dev);
+ if (IS_ERR(drvdata->pclk))
+- return -ENODEV;
++ return PTR_ERR(drvdata->pclk);
+ dev_set_drvdata(dev, drvdata);
+
+ base = devm_ioremap_resource(dev, res);
+@@ -1033,8 +1030,6 @@ static void stm_platform_remove(struct platform_device *pdev)
+
+ __stm_remove(&pdev->dev);
+ pm_runtime_disable(&pdev->dev);
+- if (!IS_ERR_OR_NULL(drvdata->pclk))
+- clk_put(drvdata->pclk);
+ }
+
+ #ifdef CONFIG_ACPI
+diff --git a/drivers/hwtracing/coresight/coresight-syscfg.c b/drivers/hwtracing/coresight/coresight-syscfg.c
+index 83dad24e0116d4..6836b05986e809 100644
+--- a/drivers/hwtracing/coresight/coresight-syscfg.c
++++ b/drivers/hwtracing/coresight/coresight-syscfg.c
+@@ -395,7 +395,7 @@ static void cscfg_remove_owned_csdev_configs(struct coresight_device *csdev, voi
+ if (list_empty(&csdev->config_csdev_list))
+ return;
+
+- guard(raw_spinlock_irqsave)(&csdev->cscfg_csdev_lock);
++ guard(raw_spinlock_irqsave)(&csdev->cscfg_csdev_lock);
+
+ list_for_each_entry_safe(config_csdev, tmp, &csdev->config_csdev_list, node) {
+ if (config_csdev->config_desc->load_owner == load_owner)
+diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
+index 88afb16bb6bec3..e867198b03e828 100644
+--- a/drivers/hwtracing/coresight/coresight-tmc-core.c
++++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
+@@ -789,6 +789,10 @@ static int __tmc_probe(struct device *dev, struct resource *res)
+ struct coresight_desc desc = { 0 };
+ struct coresight_dev_list *dev_list = NULL;
+
++ drvdata->atclk = devm_clk_get_optional_enabled(dev, "atclk");
++ if (IS_ERR(drvdata->atclk))
++ return PTR_ERR(drvdata->atclk);
++
+ ret = -ENOMEM;
+
+ /* Validity for the resource is already checked by the AMBA core */
+@@ -987,7 +991,7 @@ static int tmc_platform_probe(struct platform_device *pdev)
+
+ drvdata->pclk = coresight_get_enable_apb_pclk(&pdev->dev);
+ if (IS_ERR(drvdata->pclk))
+- return -ENODEV;
++ return PTR_ERR(drvdata->pclk);
+
+ dev_set_drvdata(&pdev->dev, drvdata);
+ pm_runtime_get_noresume(&pdev->dev);
+@@ -1011,8 +1015,6 @@ static void tmc_platform_remove(struct platform_device *pdev)
+
+ __tmc_remove(&pdev->dev);
+ pm_runtime_disable(&pdev->dev);
+- if (!IS_ERR_OR_NULL(drvdata->pclk))
+- clk_put(drvdata->pclk);
+ }
+
+ #ifdef CONFIG_PM
+@@ -1020,18 +1022,26 @@ static int tmc_runtime_suspend(struct device *dev)
+ {
+ struct tmc_drvdata *drvdata = dev_get_drvdata(dev);
+
+- if (drvdata && !IS_ERR_OR_NULL(drvdata->pclk))
+- clk_disable_unprepare(drvdata->pclk);
++ clk_disable_unprepare(drvdata->atclk);
++ clk_disable_unprepare(drvdata->pclk);
++
+ return 0;
+ }
+
+ static int tmc_runtime_resume(struct device *dev)
+ {
+ struct tmc_drvdata *drvdata = dev_get_drvdata(dev);
++ int ret;
+
+- if (drvdata && !IS_ERR_OR_NULL(drvdata->pclk))
+- clk_prepare_enable(drvdata->pclk);
+- return 0;
++ ret = clk_prepare_enable(drvdata->pclk);
++ if (ret)
++ return ret;
++
++ ret = clk_prepare_enable(drvdata->atclk);
++ if (ret)
++ clk_disable_unprepare(drvdata->pclk);
++
++ return ret;
+ }
+ #endif
+
+diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
+index 6541a27a018e6c..cbb4ba43915855 100644
+--- a/drivers/hwtracing/coresight/coresight-tmc.h
++++ b/drivers/hwtracing/coresight/coresight-tmc.h
+@@ -210,6 +210,7 @@ struct tmc_resrv_buf {
+
+ /**
+ * struct tmc_drvdata - specifics associated to an TMC component
++ * @atclk: optional clock for the core parts of the TMC.
+ * @pclk: APB clock if present, otherwise NULL
+ * @base: memory mapped base address for this component.
+ * @csdev: component vitals needed by the framework.
+@@ -244,6 +245,7 @@ struct tmc_resrv_buf {
+ * Used by ETR/ETF.
+ */
+ struct tmc_drvdata {
++ struct clk *atclk;
+ struct clk *pclk;
+ void __iomem *base;
+ struct coresight_device *csdev;
+diff --git a/drivers/hwtracing/coresight/coresight-tpda.c b/drivers/hwtracing/coresight/coresight-tpda.c
+index 0633f04beb240b..333b3cb236859f 100644
+--- a/drivers/hwtracing/coresight/coresight-tpda.c
++++ b/drivers/hwtracing/coresight/coresight-tpda.c
+@@ -71,6 +71,8 @@ static int tpdm_read_element_size(struct tpda_drvdata *drvdata,
+ if (tpdm_data->dsb) {
+ rc = fwnode_property_read_u32(dev_fwnode(csdev->dev.parent),
+ "qcom,dsb-element-bits", &drvdata->dsb_esize);
++ if (rc)
++ goto out;
+ }
+
+ if (tpdm_data->cmb) {
+@@ -78,6 +80,7 @@ static int tpdm_read_element_size(struct tpda_drvdata *drvdata,
+ "qcom,cmb-element-bits", &drvdata->cmb_esize);
+ }
+
++out:
+ if (rc)
+ dev_warn_once(&csdev->dev,
+ "Failed to read TPDM Element size: %d\n", rc);
+diff --git a/drivers/hwtracing/coresight/coresight-tpiu.c b/drivers/hwtracing/coresight/coresight-tpiu.c
+index 3e015928842808..8d6179c83e5d31 100644
+--- a/drivers/hwtracing/coresight/coresight-tpiu.c
++++ b/drivers/hwtracing/coresight/coresight-tpiu.c
+@@ -128,7 +128,6 @@ static const struct coresight_ops tpiu_cs_ops = {
+
+ static int __tpiu_probe(struct device *dev, struct resource *res)
+ {
+- int ret;
+ void __iomem *base;
+ struct coresight_platform_data *pdata = NULL;
+ struct tpiu_drvdata *drvdata;
+@@ -144,16 +143,13 @@ static int __tpiu_probe(struct device *dev, struct resource *res)
+
+ spin_lock_init(&drvdata->spinlock);
+
+- drvdata->atclk = devm_clk_get(dev, "atclk"); /* optional */
+- if (!IS_ERR(drvdata->atclk)) {
+- ret = clk_prepare_enable(drvdata->atclk);
+- if (ret)
+- return ret;
+- }
++ drvdata->atclk = devm_clk_get_optional_enabled(dev, "atclk");
++ if (IS_ERR(drvdata->atclk))
++ return PTR_ERR(drvdata->atclk);
+
+ drvdata->pclk = coresight_get_enable_apb_pclk(dev);
+ if (IS_ERR(drvdata->pclk))
+- return -ENODEV;
++ return PTR_ERR(drvdata->pclk);
+ dev_set_drvdata(dev, drvdata);
+
+ /* Validity for the resource is already checked by the AMBA core */
+@@ -293,8 +289,6 @@ static void tpiu_platform_remove(struct platform_device *pdev)
+
+ __tpiu_remove(&pdev->dev);
+ pm_runtime_disable(&pdev->dev);
+- if (!IS_ERR_OR_NULL(drvdata->pclk))
+- clk_put(drvdata->pclk);
+ }
+
+ #ifdef CONFIG_ACPI
+diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
+index 8267dd1a2130d3..43643d2c5bdd0a 100644
+--- a/drivers/hwtracing/coresight/coresight-trbe.c
++++ b/drivers/hwtracing/coresight/coresight-trbe.c
+@@ -23,7 +23,8 @@
+ #include "coresight-self-hosted-trace.h"
+ #include "coresight-trbe.h"
+
+-#define PERF_IDX2OFF(idx, buf) ((idx) % ((buf)->nr_pages << PAGE_SHIFT))
++#define PERF_IDX2OFF(idx, buf) \
++ ((idx) % ((unsigned long)(buf)->nr_pages << PAGE_SHIFT))
+
+ /*
+ * A padding packet that will help the user space tools
+@@ -257,6 +258,7 @@ static void trbe_drain_and_disable_local(struct trbe_cpudata *cpudata)
+ static void trbe_reset_local(struct trbe_cpudata *cpudata)
+ {
+ write_sysreg_s(0, SYS_TRBLIMITR_EL1);
++ isb();
+ trbe_drain_buffer();
+ write_sysreg_s(0, SYS_TRBPTR_EL1);
+ write_sysreg_s(0, SYS_TRBBASER_EL1);
+@@ -747,12 +749,12 @@ static void *arm_trbe_alloc_buffer(struct coresight_device *csdev,
+
+ buf = kzalloc_node(sizeof(*buf), GFP_KERNEL, trbe_alloc_node(event));
+ if (!buf)
+- return ERR_PTR(-ENOMEM);
++ return NULL;
+
+ pglist = kcalloc(nr_pages, sizeof(*pglist), GFP_KERNEL);
+ if (!pglist) {
+ kfree(buf);
+- return ERR_PTR(-ENOMEM);
++ return NULL;
+ }
+
+ for (i = 0; i < nr_pages; i++)
+@@ -762,7 +764,7 @@ static void *arm_trbe_alloc_buffer(struct coresight_device *csdev,
+ if (!buf->trbe_base) {
+ kfree(pglist);
+ kfree(buf);
+- return ERR_PTR(-ENOMEM);
++ return NULL;
+ }
+ buf->trbe_limit = buf->trbe_base + nr_pages * PAGE_SIZE;
+ buf->trbe_write = buf->trbe_base;
+@@ -1279,7 +1281,7 @@ static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cp
+ * into the device for that purpose.
+ */
+ desc.pdata = devm_kzalloc(dev, sizeof(*desc.pdata), GFP_KERNEL);
+- if (IS_ERR(desc.pdata))
++ if (!desc.pdata)
+ goto cpu_clear;
+
+ desc.type = CORESIGHT_DEV_TYPE_SINK;
+diff --git a/drivers/hwtracing/coresight/ultrasoc-smb.h b/drivers/hwtracing/coresight/ultrasoc-smb.h
+index c4c111275627b1..323f0ccb6878cb 100644
+--- a/drivers/hwtracing/coresight/ultrasoc-smb.h
++++ b/drivers/hwtracing/coresight/ultrasoc-smb.h
+@@ -7,6 +7,7 @@
+ #ifndef _ULTRASOC_SMB_H
+ #define _ULTRASOC_SMB_H
+
++#include <linux/bitfield.h>
+ #include <linux/miscdevice.h>
+ #include <linux/spinlock.h>
+
+diff --git a/drivers/i2c/busses/i2c-designware-platdrv.c b/drivers/i2c/busses/i2c-designware-platdrv.c
+index a35e4c64a1d46f..e37210d6c5f264 100644
+--- a/drivers/i2c/busses/i2c-designware-platdrv.c
++++ b/drivers/i2c/busses/i2c-designware-platdrv.c
+@@ -314,6 +314,7 @@ static int dw_i2c_plat_probe(struct platform_device *pdev)
+
+ exit_probe:
+ dw_i2c_plat_pm_cleanup(dev);
++ i2c_dw_prepare_clk(dev, false);
+ exit_reset:
+ reset_control_assert(dev->rst);
+ return ret;
+@@ -331,9 +332,11 @@ static void dw_i2c_plat_remove(struct platform_device *pdev)
+ i2c_dw_disable(dev);
+
+ pm_runtime_dont_use_autosuspend(device);
+- pm_runtime_put_sync(device);
++ pm_runtime_put_noidle(device);
+ dw_i2c_plat_pm_cleanup(dev);
+
++ i2c_dw_prepare_clk(dev, false);
++
+ i2c_dw_remove_lock_support(dev);
+
+ reset_control_assert(dev->rst);
+diff --git a/drivers/i2c/busses/i2c-k1.c b/drivers/i2c/busses/i2c-k1.c
+index b68a21fff0b56b..6b918770e612e0 100644
+--- a/drivers/i2c/busses/i2c-k1.c
++++ b/drivers/i2c/busses/i2c-k1.c
+@@ -3,6 +3,7 @@
+ * Copyright (C) 2024-2025 Troy Mitchell <troymitchell988@gmail.com>
+ */
+
++#include <linux/bitfield.h>
+ #include <linux/clk.h>
+ #include <linux/i2c.h>
+ #include <linux/iopoll.h>
+@@ -14,6 +15,7 @@
+ #define SPACEMIT_ICR 0x0 /* Control register */
+ #define SPACEMIT_ISR 0x4 /* Status register */
+ #define SPACEMIT_IDBR 0xc /* Data buffer register */
++#define SPACEMIT_IRCR 0x18 /* Reset cycle counter */
+ #define SPACEMIT_IBMR 0x1c /* Bus monitor register */
+
+ /* SPACEMIT_ICR register fields */
+@@ -25,7 +27,8 @@
+ #define SPACEMIT_CR_MODE_FAST BIT(8) /* bus mode (master operation) */
+ /* Bit 9 is reserved */
+ #define SPACEMIT_CR_UR BIT(10) /* unit reset */
+-/* Bits 11-12 are reserved */
++#define SPACEMIT_CR_RSTREQ BIT(11) /* i2c bus reset request */
++/* Bit 12 is reserved */
+ #define SPACEMIT_CR_SCLE BIT(13) /* master clock enable */
+ #define SPACEMIT_CR_IUE BIT(14) /* unit enable */
+ /* Bits 15-17 are reserved */
+@@ -76,6 +79,10 @@
+ SPACEMIT_SR_GCAD | SPACEMIT_SR_IRF | SPACEMIT_SR_ITE | \
+ SPACEMIT_SR_ALD)
+
++#define SPACEMIT_RCR_SDA_GLITCH_NOFIX BIT(7) /* bypass the SDA glitch fix */
++/* the cycles of SCL during bus reset */
++#define SPACEMIT_RCR_FIELD_RST_CYC GENMASK(3, 0)
++
+ /* SPACEMIT_IBMR register fields */
+ #define SPACEMIT_BMR_SDA BIT(0) /* SDA line level */
+ #define SPACEMIT_BMR_SCL BIT(1) /* SCL line level */
+@@ -88,6 +95,8 @@
+
+ #define SPACEMIT_SR_ERR (SPACEMIT_SR_BED | SPACEMIT_SR_RXOV | SPACEMIT_SR_ALD)
+
++#define SPACEMIT_BUS_RESET_CLK_CNT_MAX 9
++
+ enum spacemit_i2c_state {
+ SPACEMIT_STATE_IDLE,
+ SPACEMIT_STATE_START,
+@@ -160,6 +169,7 @@ static int spacemit_i2c_handle_err(struct spacemit_i2c_dev *i2c)
+ static void spacemit_i2c_conditionally_reset_bus(struct spacemit_i2c_dev *i2c)
+ {
+ u32 status;
++ u8 clk_cnt;
+
+ /* if bus is locked, reset unit. 0: locked */
+ status = readl(i2c->base + SPACEMIT_IBMR);
+@@ -169,9 +179,21 @@ static void spacemit_i2c_conditionally_reset_bus(struct spacemit_i2c_dev *i2c)
+ spacemit_i2c_reset(i2c);
+ usleep_range(10, 20);
+
+- /* check scl status again */
++ for (clk_cnt = 0; clk_cnt < SPACEMIT_BUS_RESET_CLK_CNT_MAX; clk_cnt++) {
++ status = readl(i2c->base + SPACEMIT_IBMR);
++ if (status & SPACEMIT_BMR_SDA)
++ return;
++
++ /* There's nothing left to save here, we are about to exit */
++ writel(FIELD_PREP(SPACEMIT_RCR_FIELD_RST_CYC, 1),
++ i2c->base + SPACEMIT_IRCR);
++ writel(SPACEMIT_CR_RSTREQ, i2c->base + SPACEMIT_ICR);
++ usleep_range(20, 30);
++ }
++
++ /* check sda again here */
+ status = readl(i2c->base + SPACEMIT_IBMR);
+- if (!(status & SPACEMIT_BMR_SCL))
++ if (!(status & SPACEMIT_BMR_SDA))
+ dev_warn_ratelimited(i2c->dev, "unit reset failed\n");
+ }
+
+@@ -237,6 +259,14 @@ static void spacemit_i2c_init(struct spacemit_i2c_dev *i2c)
+ val |= SPACEMIT_CR_MSDE | SPACEMIT_CR_MSDIE;
+
+ writel(val, i2c->base + SPACEMIT_ICR);
++
++ /*
++ * The glitch fix in the K1 I2C controller introduces a delay
++ * on restart signals, so we disable the fix here.
++ */
++ val = readl(i2c->base + SPACEMIT_IRCR);
++ val |= SPACEMIT_RCR_SDA_GLITCH_NOFIX;
++ writel(val, i2c->base + SPACEMIT_IRCR);
+ }
+
+ static inline void
+@@ -267,19 +297,6 @@ static void spacemit_i2c_start(struct spacemit_i2c_dev *i2c)
+ writel(val, i2c->base + SPACEMIT_ICR);
+ }
+
+-static void spacemit_i2c_stop(struct spacemit_i2c_dev *i2c)
+-{
+- u32 val;
+-
+- val = readl(i2c->base + SPACEMIT_ICR);
+- val |= SPACEMIT_CR_STOP | SPACEMIT_CR_ALDIE | SPACEMIT_CR_TB;
+-
+- if (i2c->read)
+- val |= SPACEMIT_CR_ACKNAK;
+-
+- writel(val, i2c->base + SPACEMIT_ICR);
+-}
+-
+ static int spacemit_i2c_xfer_msg(struct spacemit_i2c_dev *i2c)
+ {
+ unsigned long time_left;
+@@ -412,7 +429,6 @@ static irqreturn_t spacemit_i2c_irq_handler(int irq, void *devid)
+
+ val = readl(i2c->base + SPACEMIT_ICR);
+ val &= ~(SPACEMIT_CR_TB | SPACEMIT_CR_ACKNAK | SPACEMIT_CR_STOP | SPACEMIT_CR_START);
+- writel(val, i2c->base + SPACEMIT_ICR);
+
+ switch (i2c->state) {
+ case SPACEMIT_STATE_START:
+@@ -429,14 +445,16 @@ static irqreturn_t spacemit_i2c_irq_handler(int irq, void *devid)
+ }
+
+ if (i2c->state != SPACEMIT_STATE_IDLE) {
++ val |= SPACEMIT_CR_TB | SPACEMIT_CR_ALDIE;
++
+ if (spacemit_i2c_is_last_msg(i2c)) {
+ /* trigger next byte with stop */
+- spacemit_i2c_stop(i2c);
+- } else {
+- /* trigger next byte */
+- val |= SPACEMIT_CR_ALDIE | SPACEMIT_CR_TB;
+- writel(val, i2c->base + SPACEMIT_ICR);
++ val |= SPACEMIT_CR_STOP;
++
++ if (i2c->read)
++ val |= SPACEMIT_CR_ACKNAK;
+ }
++ writel(val, i2c->base + SPACEMIT_ICR);
+ }
+
+ err_out:
+@@ -476,12 +494,13 @@ static int spacemit_i2c_xfer(struct i2c_adapter *adapt, struct i2c_msg *msgs, in
+ spacemit_i2c_enable(i2c);
+
+ ret = spacemit_i2c_wait_bus_idle(i2c);
+- if (!ret)
++ if (!ret) {
+ ret = spacemit_i2c_xfer_msg(i2c);
+- else if (ret < 0)
+- dev_dbg(i2c->dev, "i2c transfer error: %d\n", ret);
+- else
++ if (ret < 0)
++ dev_dbg(i2c->dev, "i2c transfer error: %d\n", ret);
++ } else {
+ spacemit_i2c_check_bus_release(i2c);
++ }
+
+ spacemit_i2c_disable(i2c);
+
+diff --git a/drivers/i2c/busses/i2c-mt65xx.c b/drivers/i2c/busses/i2c-mt65xx.c
+index ab456c3717db18..dee40704825cb4 100644
+--- a/drivers/i2c/busses/i2c-mt65xx.c
++++ b/drivers/i2c/busses/i2c-mt65xx.c
+@@ -1243,6 +1243,7 @@ static int mtk_i2c_transfer(struct i2c_adapter *adap,
+ {
+ int ret;
+ int left_num = num;
++ bool write_then_read_en = false;
+ struct mtk_i2c *i2c = i2c_get_adapdata(adap);
+
+ ret = clk_bulk_enable(I2C_MT65XX_CLK_MAX, i2c->clocks);
+@@ -1256,6 +1257,7 @@ static int mtk_i2c_transfer(struct i2c_adapter *adap,
+ if (!(msgs[0].flags & I2C_M_RD) && (msgs[1].flags & I2C_M_RD) &&
+ msgs[0].addr == msgs[1].addr) {
+ i2c->auto_restart = 0;
++ write_then_read_en = true;
+ }
+ }
+
+@@ -1280,12 +1282,10 @@ static int mtk_i2c_transfer(struct i2c_adapter *adap,
+ else
+ i2c->op = I2C_MASTER_WR;
+
+- if (!i2c->auto_restart) {
+- if (num > 1) {
+- /* combined two messages into one transaction */
+- i2c->op = I2C_MASTER_WRRD;
+- left_num--;
+- }
++ if (write_then_read_en) {
++ /* combined two messages into one transaction */
++ i2c->op = I2C_MASTER_WRRD;
++ left_num--;
+ }
+
+ /* always use DMA mode. */
+@@ -1293,7 +1293,10 @@ static int mtk_i2c_transfer(struct i2c_adapter *adap,
+ if (ret < 0)
+ goto err_exit;
+
+- msgs++;
++ if (i2c->op == I2C_MASTER_WRRD)
++ msgs += 2;
++ else
++ msgs++;
+ }
+ /* the return value is number of executed messages */
+ ret = num;
+diff --git a/drivers/i3c/internals.h b/drivers/i3c/internals.h
+index 0d857cc68cc5d4..79ceaa5f5afd6f 100644
+--- a/drivers/i3c/internals.h
++++ b/drivers/i3c/internals.h
+@@ -38,7 +38,11 @@ static inline void i3c_writel_fifo(void __iomem *addr, const void *buf,
+ u32 tmp = 0;
+
+ memcpy(&tmp, buf + (nbytes & ~3), nbytes & 3);
+- writel(tmp, addr);
++ /*
++ * writesl() instead of writel() to keep FIFO
++ * byteorder on big-endian targets
++ */
++ writesl(addr, &tmp, 1);
+ }
+ }
+
+@@ -55,7 +59,11 @@ static inline void i3c_readl_fifo(const void __iomem *addr, void *buf,
+ if (nbytes & 3) {
+ u32 tmp;
+
+- tmp = readl(addr);
++ /*
++ * readsl() instead of readl() to keep FIFO
++ * byteorder on big-endian targets
++ */
++ readsl(addr, &tmp, 1);
+ memcpy(buf + (nbytes & ~3), &tmp, nbytes & 3);
+ }
+ }
+diff --git a/drivers/i3c/master/svc-i3c-master.c b/drivers/i3c/master/svc-i3c-master.c
+index 701ae165b25b79..9641e66a4e5f2d 100644
+--- a/drivers/i3c/master/svc-i3c-master.c
++++ b/drivers/i3c/master/svc-i3c-master.c
+@@ -417,6 +417,7 @@ static int svc_i3c_master_handle_ibi(struct svc_i3c_master *master,
+ SVC_I3C_MSTATUS_COMPLETE(val), 0, 1000);
+ if (ret) {
+ dev_err(master->dev, "Timeout when polling for COMPLETE\n");
++ i3c_generic_ibi_recycle_slot(data->ibi_pool, slot);
+ return ret;
+ }
+
+@@ -517,9 +518,24 @@ static void svc_i3c_master_ibi_isr(struct svc_i3c_master *master)
+ */
+ writel(SVC_I3C_MINT_IBIWON, master->regs + SVC_I3C_MSTATUS);
+
+- /* Acknowledge the incoming interrupt with the AUTOIBI mechanism */
+- writel(SVC_I3C_MCTRL_REQUEST_AUTO_IBI |
+- SVC_I3C_MCTRL_IBIRESP_AUTO,
++ /*
++ * Write REQUEST_START_ADDR request to emit broadcast address for arbitration,
++ * instead of using AUTO_IBI.
++ *
++ * Using AutoIBI request may cause controller to remain in AutoIBI state when
++ * there is a glitch on SDA line (high->low->high).
++ * 1. SDA high->low, raising an interrupt to execute IBI isr.
++ * 2. SDA low->high.
++ * 3. IBI isr writes an AutoIBI request.
++ * 4. The controller will not start AutoIBI process because SDA is not low.
++ * 5. IBIWON polling times out.
++ * 6. Controller remains in AutoIBI state and doesn't accept EmitStop request.
++ */
++ writel(SVC_I3C_MCTRL_REQUEST_START_ADDR |
++ SVC_I3C_MCTRL_TYPE_I3C |
++ SVC_I3C_MCTRL_IBIRESP_MANUAL |
++ SVC_I3C_MCTRL_DIR(SVC_I3C_MCTRL_DIR_WRITE) |
++ SVC_I3C_MCTRL_ADDR(I3C_BROADCAST_ADDR),
+ master->regs + SVC_I3C_MCTRL);
+
+ /* Wait for IBIWON, should take approximately 100us */
+@@ -539,10 +555,15 @@ static void svc_i3c_master_ibi_isr(struct svc_i3c_master *master)
+ switch (ibitype) {
+ case SVC_I3C_MSTATUS_IBITYPE_IBI:
+ dev = svc_i3c_master_dev_from_addr(master, ibiaddr);
+- if (!dev || !is_events_enabled(master, SVC_I3C_EVENT_IBI))
++ if (!dev || !is_events_enabled(master, SVC_I3C_EVENT_IBI)) {
+ svc_i3c_master_nack_ibi(master);
+- else
++ } else {
++ if (dev->info.bcr & I3C_BCR_IBI_PAYLOAD)
++ svc_i3c_master_ack_ibi(master, true);
++ else
++ svc_i3c_master_ack_ibi(master, false);
+ svc_i3c_master_handle_ibi(master, dev);
++ }
+ break;
+ case SVC_I3C_MSTATUS_IBITYPE_HOT_JOIN:
+ if (is_events_enabled(master, SVC_I3C_EVENT_HOTJOIN))
+diff --git a/drivers/iio/inkern.c b/drivers/iio/inkern.c
+index c174ebb7d5e6d1..642beb4b3360d2 100644
+--- a/drivers/iio/inkern.c
++++ b/drivers/iio/inkern.c
+@@ -11,6 +11,7 @@
+ #include <linux/mutex.h>
+ #include <linux/property.h>
+ #include <linux/slab.h>
++#include <linux/units.h>
+
+ #include <linux/iio/iio.h>
+ #include <linux/iio/iio-opaque.h>
+@@ -604,7 +605,7 @@ static int iio_convert_raw_to_processed_unlocked(struct iio_channel *chan,
+ {
+ int scale_type, scale_val, scale_val2;
+ int offset_type, offset_val, offset_val2;
+- s64 raw64 = raw;
++ s64 denominator, raw64 = raw;
+
+ offset_type = iio_channel_read(chan, &offset_val, &offset_val2,
+ IIO_CHAN_INFO_OFFSET);
+@@ -639,7 +640,7 @@ static int iio_convert_raw_to_processed_unlocked(struct iio_channel *chan,
+ * If no channel scaling is available apply consumer scale to
+ * raw value and return.
+ */
+- *processed = raw * scale;
++ *processed = raw64 * scale;
+ return 0;
+ }
+
+@@ -648,20 +649,19 @@ static int iio_convert_raw_to_processed_unlocked(struct iio_channel *chan,
+ *processed = raw64 * scale_val * scale;
+ break;
+ case IIO_VAL_INT_PLUS_MICRO:
+- if (scale_val2 < 0)
+- *processed = -raw64 * scale_val * scale;
+- else
+- *processed = raw64 * scale_val * scale;
+- *processed += div_s64(raw64 * (s64)scale_val2 * scale,
+- 1000000LL);
+- break;
+ case IIO_VAL_INT_PLUS_NANO:
+- if (scale_val2 < 0)
+- *processed = -raw64 * scale_val * scale;
+- else
+- *processed = raw64 * scale_val * scale;
+- *processed += div_s64(raw64 * (s64)scale_val2 * scale,
+- 1000000000LL);
++ switch (scale_type) {
++ case IIO_VAL_INT_PLUS_MICRO:
++ denominator = MICRO;
++ break;
++ case IIO_VAL_INT_PLUS_NANO:
++ denominator = NANO;
++ break;
++ }
++ *processed = raw64 * scale * abs(scale_val);
++ *processed += div_s64(raw64 * scale * abs(scale_val2), denominator);
++ if (scale_val < 0 || scale_val2 < 0)
++ *processed *= -1;
+ break;
+ case IIO_VAL_FRACTIONAL:
+ *processed = div_s64(raw64 * (s64)scale_val * scale,
+diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
+index be0743dac3fff3..929e89841c12a6 100644
+--- a/drivers/infiniband/core/addr.c
++++ b/drivers/infiniband/core/addr.c
+@@ -454,14 +454,10 @@ static int addr_resolve_neigh(const struct dst_entry *dst,
+ {
+ int ret = 0;
+
+- if (ndev_flags & IFF_LOOPBACK) {
++ if (ndev_flags & IFF_LOOPBACK)
+ memcpy(addr->dst_dev_addr, addr->src_dev_addr, MAX_ADDR_LEN);
+- } else {
+- if (!(ndev_flags & IFF_NOARP)) {
+- /* If the device doesn't do ARP internally */
+- ret = fetch_ha(dst, addr, dst_in, seq);
+- }
+- }
++ else
++ ret = fetch_ha(dst, addr, dst_in, seq);
+ return ret;
+ }
+
+diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
+index 92678e438ff4d5..01bede8ba10553 100644
+--- a/drivers/infiniband/core/cm.c
++++ b/drivers/infiniband/core/cm.c
+@@ -1049,8 +1049,8 @@ static noinline void cm_destroy_id_wait_timeout(struct ib_cm_id *cm_id,
+ struct cm_id_private *cm_id_priv;
+
+ cm_id_priv = container_of(cm_id, struct cm_id_private, id);
+- pr_err("%s: cm_id=%p timed out. state %d -> %d, refcnt=%d\n", __func__,
+- cm_id, old_state, cm_id->state, refcount_read(&cm_id_priv->refcount));
++ pr_err_ratelimited("%s: cm_id=%p timed out. state %d -> %d, refcnt=%d\n", __func__,
++ cm_id, old_state, cm_id->state, refcount_read(&cm_id_priv->refcount));
+ }
+
+ static void cm_destroy_id(struct ib_cm_id *cm_id, int err)
+diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
+index 53571e6b3162ca..66df5bed6a5627 100644
+--- a/drivers/infiniband/core/sa_query.c
++++ b/drivers/infiniband/core/sa_query.c
+@@ -1013,6 +1013,8 @@ int ib_nl_handle_set_timeout(struct sk_buff *skb,
+ if (timeout > IB_SA_LOCAL_SVC_TIMEOUT_MAX)
+ timeout = IB_SA_LOCAL_SVC_TIMEOUT_MAX;
+
++ spin_lock_irqsave(&ib_nl_request_lock, flags);
++
+ delta = timeout - sa_local_svc_timeout_ms;
+ if (delta < 0)
+ abs_delta = -delta;
+@@ -1020,7 +1022,6 @@ int ib_nl_handle_set_timeout(struct sk_buff *skb,
+ abs_delta = delta;
+
+ if (delta != 0) {
+- spin_lock_irqsave(&ib_nl_request_lock, flags);
+ sa_local_svc_timeout_ms = timeout;
+ list_for_each_entry(query, &ib_nl_request_list, list) {
+ if (delta < 0 && abs_delta > query->timeout)
+@@ -1038,9 +1039,10 @@ int ib_nl_handle_set_timeout(struct sk_buff *skb,
+ if (delay)
+ mod_delayed_work(ib_nl_wq, &ib_nl_timed_work,
+ (unsigned long)delay);
+- spin_unlock_irqrestore(&ib_nl_request_lock, flags);
+ }
+
++ spin_unlock_irqrestore(&ib_nl_request_lock, flags);
++
+ settimeout_out:
+ return 0;
+ }
+diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
+index d456e4fde3e1fe..5fabd39b7492af 100644
+--- a/drivers/infiniband/hw/mlx5/main.c
++++ b/drivers/infiniband/hw/mlx5/main.c
+@@ -13,6 +13,7 @@
+ #include <linux/dma-mapping.h>
+ #include <linux/slab.h>
+ #include <linux/bitmap.h>
++#include <linux/log2.h>
+ #include <linux/sched.h>
+ #include <linux/sched/mm.h>
+ #include <linux/sched/task.h>
+@@ -883,6 +884,51 @@ static void fill_esw_mgr_reg_c0(struct mlx5_core_dev *mdev,
+ resp->reg_c0.mask = mlx5_eswitch_get_vport_metadata_mask();
+ }
+
++/*
++ * Calculate maximum SQ overhead across all QP types.
++ * Other QP types (REG_UMR, UC, RC, UD/SMI/GSI, XRC_TGT)
++ * have smaller overhead than the types calculated below,
++ * so they are implicitly included.
++ */
++static u32 mlx5_ib_calc_max_sq_overhead(void)
++{
++ u32 max_overhead_xrc, overhead_ud_lso, a, b;
++
++ /* XRC_INI */
++ max_overhead_xrc = sizeof(struct mlx5_wqe_xrc_seg);
++ max_overhead_xrc += sizeof(struct mlx5_wqe_ctrl_seg);
++ a = sizeof(struct mlx5_wqe_atomic_seg) +
++ sizeof(struct mlx5_wqe_raddr_seg);
++ b = sizeof(struct mlx5_wqe_umr_ctrl_seg) +
++ sizeof(struct mlx5_mkey_seg) +
++ MLX5_IB_SQ_UMR_INLINE_THRESHOLD / MLX5_IB_UMR_OCTOWORD;
++ max_overhead_xrc += max(a, b);
++
++ /* UD with LSO */
++ overhead_ud_lso = sizeof(struct mlx5_wqe_ctrl_seg);
++ overhead_ud_lso += sizeof(struct mlx5_wqe_eth_pad);
++ overhead_ud_lso += sizeof(struct mlx5_wqe_eth_seg);
++ overhead_ud_lso += sizeof(struct mlx5_wqe_datagram_seg);
++
++ return max(max_overhead_xrc, overhead_ud_lso);
++}
++
++static u32 mlx5_ib_calc_max_qp_wr(struct mlx5_ib_dev *dev)
++{
++ struct mlx5_core_dev *mdev = dev->mdev;
++ u32 max_wqe_bb_units = 1 << MLX5_CAP_GEN(mdev, log_max_qp_sz);
++ u32 max_wqe_size;
++ /* max QP overhead + 1 SGE, no inline, no special features */
++ max_wqe_size = mlx5_ib_calc_max_sq_overhead() +
++ sizeof(struct mlx5_wqe_data_seg);
++
++ max_wqe_size = roundup_pow_of_two(max_wqe_size);
++
++ max_wqe_size = ALIGN(max_wqe_size, MLX5_SEND_WQE_BB);
++
++ return (max_wqe_bb_units * MLX5_SEND_WQE_BB) / max_wqe_size;
++}
++
+ static int mlx5_ib_query_device(struct ib_device *ibdev,
+ struct ib_device_attr *props,
+ struct ib_udata *uhw)
+@@ -1041,7 +1087,7 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
+ props->max_mr_size = ~0ull;
+ props->page_size_cap = ~(min_page_size - 1);
+ props->max_qp = 1 << MLX5_CAP_GEN(mdev, log_max_qp);
+- props->max_qp_wr = 1 << MLX5_CAP_GEN(mdev, log_max_qp_sz);
++ props->max_qp_wr = mlx5_ib_calc_max_qp_wr(dev);
+ max_rq_sg = MLX5_CAP_GEN(mdev, max_wqe_sz_rq) /
+ sizeof(struct mlx5_wqe_data_seg);
+ max_sq_desc = min_t(int, MLX5_CAP_GEN(mdev, max_wqe_sz_sq), 512);
+@@ -1793,7 +1839,8 @@ static void deallocate_uars(struct mlx5_ib_dev *dev,
+ }
+
+ static int mlx5_ib_enable_lb_mp(struct mlx5_core_dev *master,
+- struct mlx5_core_dev *slave)
++ struct mlx5_core_dev *slave,
++ struct mlx5_ib_lb_state *lb_state)
+ {
+ int err;
+
+@@ -1805,6 +1852,7 @@ static int mlx5_ib_enable_lb_mp(struct mlx5_core_dev *master,
+ if (err)
+ goto out;
+
++ lb_state->force_enable = true;
+ return 0;
+
+ out:
+@@ -1813,16 +1861,22 @@ static int mlx5_ib_enable_lb_mp(struct mlx5_core_dev *master,
+ }
+
+ static void mlx5_ib_disable_lb_mp(struct mlx5_core_dev *master,
+- struct mlx5_core_dev *slave)
++ struct mlx5_core_dev *slave,
++ struct mlx5_ib_lb_state *lb_state)
+ {
+ mlx5_nic_vport_update_local_lb(slave, false);
+ mlx5_nic_vport_update_local_lb(master, false);
++
++ lb_state->force_enable = false;
+ }
+
+ int mlx5_ib_enable_lb(struct mlx5_ib_dev *dev, bool td, bool qp)
+ {
+ int err = 0;
+
++ if (dev->lb.force_enable)
++ return 0;
++
+ mutex_lock(&dev->lb.mutex);
+ if (td)
+ dev->lb.user_td++;
+@@ -1844,6 +1898,9 @@ int mlx5_ib_enable_lb(struct mlx5_ib_dev *dev, bool td, bool qp)
+
+ void mlx5_ib_disable_lb(struct mlx5_ib_dev *dev, bool td, bool qp)
+ {
++ if (dev->lb.force_enable)
++ return;
++
+ mutex_lock(&dev->lb.mutex);
+ if (td)
+ dev->lb.user_td--;
+@@ -3523,7 +3580,7 @@ static void mlx5_ib_unbind_slave_port(struct mlx5_ib_dev *ibdev,
+
+ lockdep_assert_held(&mlx5_ib_multiport_mutex);
+
+- mlx5_ib_disable_lb_mp(ibdev->mdev, mpi->mdev);
++ mlx5_ib_disable_lb_mp(ibdev->mdev, mpi->mdev, &ibdev->lb);
+
+ mlx5_core_mp_event_replay(ibdev->mdev,
+ MLX5_DRIVER_EVENT_AFFILIATION_REMOVED,
+@@ -3620,7 +3677,7 @@ static bool mlx5_ib_bind_slave_port(struct mlx5_ib_dev *ibdev,
+ MLX5_DRIVER_EVENT_AFFILIATION_DONE,
+ &key);
+
+- err = mlx5_ib_enable_lb_mp(ibdev->mdev, mpi->mdev);
++ err = mlx5_ib_enable_lb_mp(ibdev->mdev, mpi->mdev, &ibdev->lb);
+ if (err)
+ goto unbind;
+
+diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
+index 7ffc7ee92cf035..15e3962633dc33 100644
+--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
++++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
+@@ -1109,6 +1109,7 @@ struct mlx5_ib_lb_state {
+ u32 user_td;
+ int qps;
+ bool enabled;
++ bool force_enable;
+ };
+
+ struct mlx5_ib_pf_eq {
+@@ -1802,6 +1803,10 @@ mlx5_umem_mkc_find_best_pgsz(struct mlx5_ib_dev *dev, struct ib_umem *umem,
+
+ bitmap = GENMASK_ULL(max_log_entity_size_cap, min_log_entity_size_cap);
+
++ /* In KSM mode HW requires IOVA and mkey's page size to be aligned */
++ if (access_mode == MLX5_MKC_ACCESS_MODE_KSM && iova)
++ bitmap &= GENMASK_ULL(__ffs64(iova), 0);
++
+ return ib_umem_find_best_pgsz(umem, bitmap, iova);
+ }
+
+diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
+index 6f8f353e95838a..f522820b950c71 100644
+--- a/drivers/infiniband/sw/rxe/rxe_task.c
++++ b/drivers/infiniband/sw/rxe/rxe_task.c
+@@ -132,8 +132,12 @@ static void do_task(struct rxe_task *task)
+ * yield the cpu and reschedule the task
+ */
+ if (!ret) {
+- task->state = TASK_STATE_IDLE;
+- resched = 1;
++ if (task->state != TASK_STATE_DRAINING) {
++ task->state = TASK_STATE_IDLE;
++ resched = 1;
++ } else {
++ cont = 1;
++ }
+ goto exit;
+ }
+
+diff --git a/drivers/infiniband/sw/siw/siw_verbs.c b/drivers/infiniband/sw/siw/siw_verbs.c
+index 35c3bde0d00af8..efa2f097b58289 100644
+--- a/drivers/infiniband/sw/siw/siw_verbs.c
++++ b/drivers/infiniband/sw/siw/siw_verbs.c
+@@ -769,7 +769,7 @@ int siw_post_send(struct ib_qp *base_qp, const struct ib_send_wr *wr,
+ struct siw_wqe *wqe = tx_wqe(qp);
+
+ unsigned long flags;
+- int rv = 0;
++ int rv = 0, imm_err = 0;
+
+ if (wr && !rdma_is_kernel_res(&qp->base_qp.res)) {
+ siw_dbg_qp(qp, "wr must be empty for user mapped sq\n");
+@@ -955,9 +955,17 @@ int siw_post_send(struct ib_qp *base_qp, const struct ib_send_wr *wr,
+ * Send directly if SQ processing is not in progress.
+ * Eventual immediate errors (rv < 0) do not affect the involved
+ * RI resources (Verbs, 8.3.1) and thus do not prevent from SQ
+- * processing, if new work is already pending. But rv must be passed
+- * to caller.
++ * processing, if new work is already pending. But rv and pointer
++ * to failed work request must be passed to caller.
+ */
++ if (unlikely(rv < 0)) {
++ /*
++ * Immediate error
++ */
++ siw_dbg_qp(qp, "Immediate error %d\n", rv);
++ imm_err = rv;
++ *bad_wr = wr;
++ }
+ if (wqe->wr_status != SIW_WR_IDLE) {
+ spin_unlock_irqrestore(&qp->sq_lock, flags);
+ goto skip_direct_sending;
+@@ -982,15 +990,10 @@ int siw_post_send(struct ib_qp *base_qp, const struct ib_send_wr *wr,
+
+ up_read(&qp->state_lock);
+
+- if (rv >= 0)
+- return 0;
+- /*
+- * Immediate error
+- */
+- siw_dbg_qp(qp, "error %d\n", rv);
++ if (unlikely(imm_err))
++ return imm_err;
+
+- *bad_wr = wr;
+- return rv;
++ return (rv >= 0) ? 0 : rv;
+ }
+
+ /*
+diff --git a/drivers/input/misc/uinput.c b/drivers/input/misc/uinput.c
+index 2c51ea9d01d777..13336a2fd49c8a 100644
+--- a/drivers/input/misc/uinput.c
++++ b/drivers/input/misc/uinput.c
+@@ -775,6 +775,7 @@ static int uinput_ff_upload_to_user(char __user *buffer,
+ if (in_compat_syscall()) {
+ struct uinput_ff_upload_compat ff_up_compat;
+
++ memset(&ff_up_compat, 0, sizeof(ff_up_compat));
+ ff_up_compat.request_id = ff_up->request_id;
+ ff_up_compat.retval = ff_up->retval;
+ /*
+diff --git a/drivers/input/touchscreen/atmel_mxt_ts.c b/drivers/input/touchscreen/atmel_mxt_ts.c
+index 322d5a3d40a093..ef6e2c3374ff68 100644
+--- a/drivers/input/touchscreen/atmel_mxt_ts.c
++++ b/drivers/input/touchscreen/atmel_mxt_ts.c
+@@ -3317,7 +3317,7 @@ static int mxt_probe(struct i2c_client *client)
+ if (data->reset_gpio) {
+ /* Wait a while and then de-assert the RESET GPIO line */
+ msleep(MXT_RESET_GPIO_TIME);
+- gpiod_set_value(data->reset_gpio, 0);
++ gpiod_set_value_cansleep(data->reset_gpio, 0);
+ msleep(MXT_RESET_INVALID_CHG);
+ }
+
+diff --git a/drivers/iommu/intel/debugfs.c b/drivers/iommu/intel/debugfs.c
+index affbf4a1558dee..5aa7f46a420b58 100644
+--- a/drivers/iommu/intel/debugfs.c
++++ b/drivers/iommu/intel/debugfs.c
+@@ -435,8 +435,21 @@ static int domain_translation_struct_show(struct seq_file *m,
+ }
+ pgd &= VTD_PAGE_MASK;
+ } else { /* legacy mode */
+- pgd = context->lo & VTD_PAGE_MASK;
+- agaw = context->hi & 7;
++ u8 tt = (u8)(context->lo & GENMASK_ULL(3, 2)) >> 2;
++
++ /*
++ * According to Translation Type(TT),
++ * get the page table pointer(SSPTPTR).
++ */
++ switch (tt) {
++ case CONTEXT_TT_MULTI_LEVEL:
++ case CONTEXT_TT_DEV_IOTLB:
++ pgd = context->lo & VTD_PAGE_MASK;
++ agaw = context->hi & 7;
++ break;
++ default:
++ goto iommu_unlock;
++ }
+ }
+
+ seq_printf(m, "Device %04x:%02x:%02x.%x ",
+diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
+index d09b9287165927..2c261c069001c5 100644
+--- a/drivers/iommu/intel/iommu.h
++++ b/drivers/iommu/intel/iommu.h
+@@ -541,7 +541,8 @@ enum {
+ #define pasid_supported(iommu) (sm_supported(iommu) && \
+ ecap_pasid((iommu)->ecap))
+ #define ssads_supported(iommu) (sm_supported(iommu) && \
+- ecap_slads((iommu)->ecap))
++ ecap_slads((iommu)->ecap) && \
++ ecap_smpwc(iommu->ecap))
+ #define nested_supported(iommu) (sm_supported(iommu) && \
+ ecap_nest((iommu)->ecap))
+
+diff --git a/drivers/iommu/iommu-priv.h b/drivers/iommu/iommu-priv.h
+index e236b932e7668a..c95394cd03a770 100644
+--- a/drivers/iommu/iommu-priv.h
++++ b/drivers/iommu/iommu-priv.h
+@@ -37,6 +37,8 @@ void iommu_device_unregister_bus(struct iommu_device *iommu,
+ const struct bus_type *bus,
+ struct notifier_block *nb);
+
++int iommu_mock_device_add(struct device *dev, struct iommu_device *iommu);
++
+ struct iommu_attach_handle *iommu_attach_handle_get(struct iommu_group *group,
+ ioasid_t pasid,
+ unsigned int type);
+diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
+index 060ebe330ee163..59244c744eabd2 100644
+--- a/drivers/iommu/iommu.c
++++ b/drivers/iommu/iommu.c
+@@ -304,6 +304,7 @@ void iommu_device_unregister_bus(struct iommu_device *iommu,
+ struct notifier_block *nb)
+ {
+ bus_unregister_notifier(bus, nb);
++ fwnode_remove_software_node(iommu->fwnode);
+ iommu_device_unregister(iommu);
+ }
+ EXPORT_SYMBOL_GPL(iommu_device_unregister_bus);
+@@ -326,6 +327,12 @@ int iommu_device_register_bus(struct iommu_device *iommu,
+ if (err)
+ return err;
+
++ iommu->fwnode = fwnode_create_software_node(NULL, NULL);
++ if (IS_ERR(iommu->fwnode)) {
++ bus_unregister_notifier(bus, nb);
++ return PTR_ERR(iommu->fwnode);
++ }
++
+ spin_lock(&iommu_device_lock);
+ list_add_tail(&iommu->list, &iommu_device_list);
+ spin_unlock(&iommu_device_lock);
+@@ -335,9 +342,28 @@ int iommu_device_register_bus(struct iommu_device *iommu,
+ iommu_device_unregister_bus(iommu, bus, nb);
+ return err;
+ }
++ WRITE_ONCE(iommu->ready, true);
+ return 0;
+ }
+ EXPORT_SYMBOL_GPL(iommu_device_register_bus);
++
++int iommu_mock_device_add(struct device *dev, struct iommu_device *iommu)
++{
++ int rc;
++
++ mutex_lock(&iommu_probe_device_lock);
++ rc = iommu_fwspec_init(dev, iommu->fwnode);
++ mutex_unlock(&iommu_probe_device_lock);
++
++ if (rc)
++ return rc;
++
++ rc = device_add(dev);
++ if (rc)
++ iommu_fwspec_free(dev);
++ return rc;
++}
++EXPORT_SYMBOL_GPL(iommu_mock_device_add);
+ #endif
+
+ static struct dev_iommu *dev_iommu_get(struct device *dev)
+diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
+index 61686603c76934..de178827a078a9 100644
+--- a/drivers/iommu/iommufd/selftest.c
++++ b/drivers/iommu/iommufd/selftest.c
+@@ -1126,7 +1126,7 @@ static struct mock_dev *mock_dev_create(unsigned long dev_flags)
+ goto err_put;
+ }
+
+- rc = device_add(&mdev->dev);
++ rc = iommu_mock_device_add(&mdev->dev, &mock_iommu.iommu_dev);
+ if (rc)
+ goto err_put;
+ return mdev;
+diff --git a/drivers/irqchip/irq-gic-v5-its.c b/drivers/irqchip/irq-gic-v5-its.c
+index 9290ac741949ca..2fb58d76f52147 100644
+--- a/drivers/irqchip/irq-gic-v5-its.c
++++ b/drivers/irqchip/irq-gic-v5-its.c
+@@ -191,9 +191,9 @@ static int gicv5_its_create_itt_two_level(struct gicv5_its_chip_data *its,
+ unsigned int num_events)
+ {
+ unsigned int l1_bits, l2_bits, span, events_per_l2_table;
+- unsigned int i, complete_tables, final_span, num_ents;
++ unsigned int complete_tables, final_span, num_ents;
+ __le64 *itt_l1, *itt_l2, **l2ptrs;
+- int ret;
++ int i, ret;
+ u64 val;
+
+ ret = gicv5_its_l2sz_to_l2_bits(itt_l2sz);
+@@ -949,15 +949,18 @@ static int gicv5_its_irq_domain_alloc(struct irq_domain *domain, unsigned int vi
+ device_id = its_dev->device_id;
+
+ for (i = 0; i < nr_irqs; i++) {
+- lpi = gicv5_alloc_lpi();
++ ret = gicv5_alloc_lpi();
+ if (ret < 0) {
+ pr_debug("Failed to find free LPI!\n");
+- goto out_eventid;
++ goto out_free_irqs;
+ }
++ lpi = ret;
+
+ ret = irq_domain_alloc_irqs_parent(domain, virq + i, 1, &lpi);
+- if (ret)
+- goto out_free_lpi;
++ if (ret) {
++ gicv5_free_lpi(lpi);
++ goto out_free_irqs;
++ }
+
+ /*
+ * Store eventid and deviceid into the hwirq for later use.
+@@ -977,8 +980,13 @@ static int gicv5_its_irq_domain_alloc(struct irq_domain *domain, unsigned int vi
+
+ return 0;
+
+-out_free_lpi:
+- gicv5_free_lpi(lpi);
++out_free_irqs:
++ while (--i >= 0) {
++ irqd = irq_domain_get_irq_data(domain, virq + i);
++ gicv5_free_lpi(irqd->parent_data->hwirq);
++ irq_domain_reset_irq_data(irqd);
++ irq_domain_free_irqs_parent(domain, virq + i, 1);
++ }
+ out_eventid:
+ gicv5_its_free_eventid(its_dev, event_id_base, nr_irqs);
+ return ret;
+diff --git a/drivers/irqchip/irq-sg2042-msi.c b/drivers/irqchip/irq-sg2042-msi.c
+index bcfddc51bc6a18..2fd4d94f9bd76d 100644
+--- a/drivers/irqchip/irq-sg2042-msi.c
++++ b/drivers/irqchip/irq-sg2042-msi.c
+@@ -85,6 +85,8 @@ static void sg2042_msi_irq_compose_msi_msg(struct irq_data *d, struct msi_msg *m
+
+ static const struct irq_chip sg2042_msi_middle_irq_chip = {
+ .name = "SG2042 MSI",
++ .irq_startup = irq_chip_startup_parent,
++ .irq_shutdown = irq_chip_shutdown_parent,
+ .irq_ack = sg2042_msi_irq_ack,
+ .irq_mask = irq_chip_mask_parent,
+ .irq_unmask = irq_chip_unmask_parent,
+@@ -114,6 +116,8 @@ static void sg2044_msi_irq_compose_msi_msg(struct irq_data *d, struct msi_msg *m
+
+ static struct irq_chip sg2044_msi_middle_irq_chip = {
+ .name = "SG2044 MSI",
++ .irq_startup = irq_chip_startup_parent,
++ .irq_shutdown = irq_chip_shutdown_parent,
+ .irq_ack = sg2044_msi_irq_ack,
+ .irq_mask = irq_chip_mask_parent,
+ .irq_unmask = irq_chip_unmask_parent,
+@@ -185,8 +189,10 @@ static const struct irq_domain_ops sg204x_msi_middle_domain_ops = {
+ .select = msi_lib_irq_domain_select,
+ };
+
+-#define SG2042_MSI_FLAGS_REQUIRED (MSI_FLAG_USE_DEF_DOM_OPS | \
+- MSI_FLAG_USE_DEF_CHIP_OPS)
++#define SG2042_MSI_FLAGS_REQUIRED (MSI_FLAG_USE_DEF_DOM_OPS | \
++ MSI_FLAG_USE_DEF_CHIP_OPS | \
++ MSI_FLAG_PCI_MSI_MASK_PARENT | \
++ MSI_FLAG_PCI_MSI_STARTUP_PARENT)
+
+ #define SG2042_MSI_FLAGS_SUPPORTED MSI_GENERIC_FLAGS_MASK
+
+@@ -200,10 +206,12 @@ static const struct msi_parent_ops sg2042_msi_parent_ops = {
+ .init_dev_msi_info = msi_lib_init_dev_msi_info,
+ };
+
+-#define SG2044_MSI_FLAGS_REQUIRED (MSI_FLAG_USE_DEF_DOM_OPS | \
+- MSI_FLAG_USE_DEF_CHIP_OPS)
++#define SG2044_MSI_FLAGS_REQUIRED (MSI_FLAG_USE_DEF_DOM_OPS | \
++ MSI_FLAG_USE_DEF_CHIP_OPS | \
++ MSI_FLAG_PCI_MSI_MASK_PARENT | \
++ MSI_FLAG_PCI_MSI_STARTUP_PARENT)
+
+-#define SG2044_MSI_FLAGS_SUPPORTED (MSI_GENERIC_FLAGS_MASK | \
++#define SG2044_MSI_FLAGS_SUPPORTED (MSI_GENERIC_FLAGS_MASK | \
+ MSI_FLAG_PCI_MSIX)
+
+ static const struct msi_parent_ops sg2044_msi_parent_ops = {
+diff --git a/drivers/leds/flash/leds-qcom-flash.c b/drivers/leds/flash/leds-qcom-flash.c
+index 89cf5120f5d55b..db7c2c743adc75 100644
+--- a/drivers/leds/flash/leds-qcom-flash.c
++++ b/drivers/leds/flash/leds-qcom-flash.c
+@@ -1,6 +1,6 @@
+ // SPDX-License-Identifier: GPL-2.0-only
+ /*
+- * Copyright (c) 2022, 2024 Qualcomm Innovation Center, Inc. All rights reserved.
++ * Copyright (c) 2022, 2024-2025 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+ #include <linux/bitfield.h>
+@@ -114,36 +114,39 @@ enum {
+ REG_THERM_THRSH1,
+ REG_THERM_THRSH2,
+ REG_THERM_THRSH3,
++ REG_TORCH_CLAMP,
+ REG_MAX_COUNT,
+ };
+
+ static const struct reg_field mvflash_3ch_regs[REG_MAX_COUNT] = {
+- REG_FIELD(0x08, 0, 7), /* status1 */
+- REG_FIELD(0x09, 0, 7), /* status2 */
+- REG_FIELD(0x0a, 0, 7), /* status3 */
+- REG_FIELD_ID(0x40, 0, 7, 3, 1), /* chan_timer */
+- REG_FIELD_ID(0x43, 0, 6, 3, 1), /* itarget */
+- REG_FIELD(0x46, 7, 7), /* module_en */
+- REG_FIELD(0x47, 0, 5), /* iresolution */
+- REG_FIELD_ID(0x49, 0, 2, 3, 1), /* chan_strobe */
+- REG_FIELD(0x4c, 0, 2), /* chan_en */
+- REG_FIELD(0x56, 0, 2), /* therm_thrsh1 */
+- REG_FIELD(0x57, 0, 2), /* therm_thrsh2 */
+- REG_FIELD(0x58, 0, 2), /* therm_thrsh3 */
++ [REG_STATUS1] = REG_FIELD(0x08, 0, 7),
++ [REG_STATUS2] = REG_FIELD(0x09, 0, 7),
++ [REG_STATUS3] = REG_FIELD(0x0a, 0, 7),
++ [REG_CHAN_TIMER] = REG_FIELD_ID(0x40, 0, 7, 3, 1),
++ [REG_ITARGET] = REG_FIELD_ID(0x43, 0, 6, 3, 1),
++ [REG_MODULE_EN] = REG_FIELD(0x46, 7, 7),
++ [REG_IRESOLUTION] = REG_FIELD(0x47, 0, 5),
++ [REG_CHAN_STROBE] = REG_FIELD_ID(0x49, 0, 2, 3, 1),
++ [REG_CHAN_EN] = REG_FIELD(0x4c, 0, 2),
++ [REG_THERM_THRSH1] = REG_FIELD(0x56, 0, 2),
++ [REG_THERM_THRSH2] = REG_FIELD(0x57, 0, 2),
++ [REG_THERM_THRSH3] = REG_FIELD(0x58, 0, 2),
++ [REG_TORCH_CLAMP] = REG_FIELD(0xec, 0, 6),
+ };
+
+ static const struct reg_field mvflash_4ch_regs[REG_MAX_COUNT] = {
+- REG_FIELD(0x06, 0, 7), /* status1 */
+- REG_FIELD(0x07, 0, 6), /* status2 */
+- REG_FIELD(0x09, 0, 7), /* status3 */
+- REG_FIELD_ID(0x3e, 0, 7, 4, 1), /* chan_timer */
+- REG_FIELD_ID(0x42, 0, 6, 4, 1), /* itarget */
+- REG_FIELD(0x46, 7, 7), /* module_en */
+- REG_FIELD(0x49, 0, 3), /* iresolution */
+- REG_FIELD_ID(0x4a, 0, 6, 4, 1), /* chan_strobe */
+- REG_FIELD(0x4e, 0, 3), /* chan_en */
+- REG_FIELD(0x7a, 0, 2), /* therm_thrsh1 */
+- REG_FIELD(0x78, 0, 2), /* therm_thrsh2 */
++ [REG_STATUS1] = REG_FIELD(0x06, 0, 7),
++ [REG_STATUS2] = REG_FIELD(0x07, 0, 6),
++ [REG_STATUS3] = REG_FIELD(0x09, 0, 7),
++ [REG_CHAN_TIMER] = REG_FIELD_ID(0x3e, 0, 7, 4, 1),
++ [REG_ITARGET] = REG_FIELD_ID(0x42, 0, 6, 4, 1),
++ [REG_MODULE_EN] = REG_FIELD(0x46, 7, 7),
++ [REG_IRESOLUTION] = REG_FIELD(0x49, 0, 3),
++ [REG_CHAN_STROBE] = REG_FIELD_ID(0x4a, 0, 6, 4, 1),
++ [REG_CHAN_EN] = REG_FIELD(0x4e, 0, 3),
++ [REG_THERM_THRSH1] = REG_FIELD(0x7a, 0, 2),
++ [REG_THERM_THRSH2] = REG_FIELD(0x78, 0, 2),
++ [REG_TORCH_CLAMP] = REG_FIELD(0xed, 0, 6),
+ };
+
+ struct qcom_flash_data {
+@@ -156,6 +159,7 @@ struct qcom_flash_data {
+ u8 max_channels;
+ u8 chan_en_bits;
+ u8 revision;
++ u8 torch_clamp;
+ };
+
+ struct qcom_flash_led {
+@@ -702,6 +706,7 @@ static int qcom_flash_register_led_device(struct device *dev,
+ u32 current_ua, timeout_us;
+ u32 channels[4];
+ int i, rc, count;
++ u8 torch_clamp;
+
+ count = fwnode_property_count_u32(node, "led-sources");
+ if (count <= 0) {
+@@ -751,6 +756,12 @@ static int qcom_flash_register_led_device(struct device *dev,
+ current_ua = min_t(u32, current_ua, TORCH_CURRENT_MAX_UA * led->chan_count);
+ led->max_torch_current_ma = current_ua / UA_PER_MA;
+
++ torch_clamp = (current_ua / led->chan_count) / TORCH_IRES_UA;
++ if (torch_clamp != 0)
++ torch_clamp--;
++
++ flash_data->torch_clamp = max_t(u8, flash_data->torch_clamp, torch_clamp);
++
+ if (fwnode_property_present(node, "flash-max-microamp")) {
+ flash->led_cdev.flags |= LED_DEV_CAP_FLASH;
+
+@@ -917,8 +928,7 @@ static int qcom_flash_led_probe(struct platform_device *pdev)
+ flash_data->leds_count++;
+ }
+
+- return 0;
+-
++ return regmap_field_write(flash_data->r_fields[REG_TORCH_CLAMP], flash_data->torch_clamp);
+ release:
+ while (flash_data->v4l2_flash[flash_data->leds_count] && flash_data->leds_count)
+ v4l2_flash_release(flash_data->v4l2_flash[flash_data->leds_count--]);
+diff --git a/drivers/leds/leds-lp55xx-common.c b/drivers/leds/leds-lp55xx-common.c
+index e71456a56ab8da..fd447eb7eb15e2 100644
+--- a/drivers/leds/leds-lp55xx-common.c
++++ b/drivers/leds/leds-lp55xx-common.c
+@@ -212,7 +212,7 @@ int lp55xx_update_program_memory(struct lp55xx_chip *chip,
+ * For LED chip that support page, PAGE is already set in load_engine.
+ */
+ if (!cfg->pages_per_engine)
+- start_addr += LP55xx_BYTES_PER_PAGE * idx;
++ start_addr += LP55xx_BYTES_PER_PAGE * (idx - 1);
+
+ for (page = 0; page < program_length / LP55xx_BYTES_PER_PAGE; page++) {
+ /* Write to the next page each 32 bytes (if supported) */
+diff --git a/drivers/leds/leds-max77705.c b/drivers/leds/leds-max77705.c
+index 933cb4f19be9bc..b7403b3fcf5e72 100644
+--- a/drivers/leds/leds-max77705.c
++++ b/drivers/leds/leds-max77705.c
+@@ -180,7 +180,7 @@ static int max77705_add_led(struct device *dev, struct regmap *regmap, struct fw
+
+	ret = fwnode_property_read_u32(np, "reg", &reg);
+ if (ret || reg >= MAX77705_LED_NUM_LEDS)
+- ret = -EINVAL;
++ return -EINVAL;
+
+ info = devm_kcalloc(dev, num_channels, sizeof(*info), GFP_KERNEL);
+ if (!info)
+diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h
+index c889332e533bca..0070e4462ee2fe 100644
+--- a/drivers/md/dm-core.h
++++ b/drivers/md/dm-core.h
+@@ -162,6 +162,7 @@ struct mapped_device {
+ #define DMF_SUSPENDED_INTERNALLY 7
+ #define DMF_POST_SUSPENDING 8
+ #define DMF_EMULATE_ZONE_APPEND 9
++#define DMF_QUEUE_STOPPED 10
+
+ static inline sector_t dm_get_size(struct mapped_device *md)
+ {
+diff --git a/drivers/md/dm-vdo/indexer/volume-index.c b/drivers/md/dm-vdo/indexer/volume-index.c
+index 12f954a0c5325d..afb062e1f1fb48 100644
+--- a/drivers/md/dm-vdo/indexer/volume-index.c
++++ b/drivers/md/dm-vdo/indexer/volume-index.c
+@@ -836,7 +836,7 @@ static int start_restoring_volume_sub_index(struct volume_sub_index *sub_index,
+ "%zu bytes decoded of %zu expected", offset,
+ sizeof(buffer));
+ if (result != VDO_SUCCESS)
+- result = UDS_CORRUPT_DATA;
++ return UDS_CORRUPT_DATA;
+
+ if (memcmp(header.magic, MAGIC_START_5, MAGIC_SIZE) != 0) {
+ return vdo_log_warning_strerror(UDS_CORRUPT_DATA,
+@@ -928,7 +928,7 @@ static int start_restoring_volume_index(struct volume_index *volume_index,
+ "%zu bytes decoded of %zu expected", offset,
+ sizeof(buffer));
+ if (result != VDO_SUCCESS)
+- result = UDS_CORRUPT_DATA;
++ return UDS_CORRUPT_DATA;
+
+ if (memcmp(header.magic, MAGIC_START_6, MAGIC_SIZE) != 0)
+ return vdo_log_warning_strerror(UDS_CORRUPT_DATA,
+diff --git a/drivers/md/dm.c b/drivers/md/dm.c
+index a44e8c2dccee4e..66dd5f6ce778b6 100644
+--- a/drivers/md/dm.c
++++ b/drivers/md/dm.c
+@@ -2908,7 +2908,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map,
+ {
+ bool do_lockfs = suspend_flags & DM_SUSPEND_LOCKFS_FLAG;
+ bool noflush = suspend_flags & DM_SUSPEND_NOFLUSH_FLAG;
+- int r;
++ int r = 0;
+
+ lockdep_assert_held(&md->suspend_lock);
+
+@@ -2960,8 +2960,10 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map,
+ * Stop md->queue before flushing md->wq in case request-based
+ * dm defers requests to md->wq from md->queue.
+ */
+- if (dm_request_based(md))
++ if (map && dm_request_based(md)) {
+ dm_stop_queue(md->queue);
++ set_bit(DMF_QUEUE_STOPPED, &md->flags);
++ }
+
+ flush_workqueue(md->wq);
+
+@@ -2970,7 +2972,8 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map,
+ * We call dm_wait_for_completion to wait for all existing requests
+ * to finish.
+ */
+- r = dm_wait_for_completion(md, task_state);
++ if (map)
++ r = dm_wait_for_completion(md, task_state);
+ if (!r)
+ set_bit(dmf_suspended_flag, &md->flags);
+
+@@ -2983,7 +2986,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map,
+ if (r < 0) {
+ dm_queue_flush(md);
+
+- if (dm_request_based(md))
++ if (test_and_clear_bit(DMF_QUEUE_STOPPED, &md->flags))
+ dm_start_queue(md->queue);
+
+ unlock_fs(md);
+@@ -3067,7 +3070,7 @@ static int __dm_resume(struct mapped_device *md, struct dm_table *map)
+ * so that mapping of targets can work correctly.
+ * Request-based dm is queueing the deferred I/Os in its request_queue.
+ */
+- if (dm_request_based(md))
++ if (test_and_clear_bit(DMF_QUEUE_STOPPED, &md->flags))
+ dm_start_queue(md->queue);
+
+ unlock_fs(md);
+diff --git a/drivers/media/i2c/rj54n1cb0c.c b/drivers/media/i2c/rj54n1cb0c.c
+index b7ca39f63dba84..6dfc912168510f 100644
+--- a/drivers/media/i2c/rj54n1cb0c.c
++++ b/drivers/media/i2c/rj54n1cb0c.c
+@@ -1329,10 +1329,13 @@ static int rj54n1_probe(struct i2c_client *client)
+ V4L2_CID_GAIN, 0, 127, 1, 66);
+ v4l2_ctrl_new_std(&rj54n1->hdl, &rj54n1_ctrl_ops,
+ V4L2_CID_AUTO_WHITE_BALANCE, 0, 1, 1, 1);
+- rj54n1->subdev.ctrl_handler = &rj54n1->hdl;
+- if (rj54n1->hdl.error)
+- return rj54n1->hdl.error;
+
++ if (rj54n1->hdl.error) {
++ ret = rj54n1->hdl.error;
++ goto err_free_ctrl;
++ }
++
++ rj54n1->subdev.ctrl_handler = &rj54n1->hdl;
+ rj54n1->clk_div = clk_div;
+ rj54n1->rect.left = RJ54N1_COLUMN_SKIP;
+ rj54n1->rect.top = RJ54N1_ROW_SKIP;
+diff --git a/drivers/media/i2c/vd55g1.c b/drivers/media/i2c/vd55g1.c
+index 7c39183dd44bfe..4a62d350068294 100644
+--- a/drivers/media/i2c/vd55g1.c
++++ b/drivers/media/i2c/vd55g1.c
+@@ -66,7 +66,7 @@
+ #define VD55G1_REG_READOUT_CTRL CCI_REG8(0x052e)
+ #define VD55G1_READOUT_CTRL_BIN_MODE_NORMAL 0
+ #define VD55G1_READOUT_CTRL_BIN_MODE_DIGITAL_X2 1
+-#define VD55G1_REG_DUSTER_CTRL CCI_REG8(0x03ea)
++#define VD55G1_REG_DUSTER_CTRL CCI_REG8(0x03ae)
+ #define VD55G1_DUSTER_ENABLE BIT(0)
+ #define VD55G1_DUSTER_DISABLE 0
+ #define VD55G1_DUSTER_DYN_ENABLE BIT(1)
+diff --git a/drivers/media/pci/zoran/zoran.h b/drivers/media/pci/zoran/zoran.h
+index 1cd990468d3de9..d05e222b392156 100644
+--- a/drivers/media/pci/zoran/zoran.h
++++ b/drivers/media/pci/zoran/zoran.h
+@@ -154,12 +154,6 @@ struct zoran_jpg_settings {
+
+ struct zoran;
+
+-/* zoran_fh contains per-open() settings */
+-struct zoran_fh {
+- struct v4l2_fh fh;
+- struct zoran *zr;
+-};
+-
+ struct card_info {
+ enum card_type type;
+ char name[32];
+diff --git a/drivers/media/pci/zoran/zoran_driver.c b/drivers/media/pci/zoran/zoran_driver.c
+index f42f596d3e6295..ec7fc1da4cc02f 100644
+--- a/drivers/media/pci/zoran/zoran_driver.c
++++ b/drivers/media/pci/zoran/zoran_driver.c
+@@ -511,12 +511,11 @@ static int zoran_s_fmt_vid_cap(struct file *file, void *__fh,
+ struct v4l2_format *fmt)
+ {
+ struct zoran *zr = video_drvdata(file);
+- struct zoran_fh *fh = __fh;
+ int i;
+ int res = 0;
+
+ if (fmt->fmt.pix.pixelformat == V4L2_PIX_FMT_MJPEG)
+- return zoran_s_fmt_vid_out(file, fh, fmt);
++ return zoran_s_fmt_vid_out(file, __fh, fmt);
+
+ for (i = 0; i < NUM_FORMATS; i++)
+ if (fmt->fmt.pix.pixelformat == zoran_formats[i].fourcc)
+diff --git a/drivers/media/platform/st/sti/delta/delta-mjpeg-dec.c b/drivers/media/platform/st/sti/delta/delta-mjpeg-dec.c
+index 0533d4a083d249..a078f1107300ee 100644
+--- a/drivers/media/platform/st/sti/delta/delta-mjpeg-dec.c
++++ b/drivers/media/platform/st/sti/delta/delta-mjpeg-dec.c
+@@ -239,7 +239,7 @@ static int delta_mjpeg_ipc_open(struct delta_ctx *pctx)
+ return 0;
+ }
+
+-static int delta_mjpeg_ipc_decode(struct delta_ctx *pctx, struct delta_au *au)
++static int delta_mjpeg_ipc_decode(struct delta_ctx *pctx, dma_addr_t pstart, dma_addr_t pend)
+ {
+ struct delta_dev *delta = pctx->dev;
+ struct delta_mjpeg_ctx *ctx = to_ctx(pctx);
+@@ -256,8 +256,8 @@ static int delta_mjpeg_ipc_decode(struct delta_ctx *pctx, struct delta_au *au)
+
+ memset(params, 0, sizeof(*params));
+
+- params->picture_start_addr_p = (u32)(au->paddr);
+- params->picture_end_addr_p = (u32)(au->paddr + au->size - 1);
++ params->picture_start_addr_p = pstart;
++ params->picture_end_addr_p = pend;
+
+ /*
+ * !WARNING!
+@@ -374,12 +374,14 @@ static int delta_mjpeg_decode(struct delta_ctx *pctx, struct delta_au *pau)
+ struct delta_dev *delta = pctx->dev;
+ struct delta_mjpeg_ctx *ctx = to_ctx(pctx);
+ int ret;
+- struct delta_au au = *pau;
++ void *au_vaddr = pau->vaddr;
++ dma_addr_t au_dma = pau->paddr;
++ size_t au_size = pau->size;
+ unsigned int data_offset = 0;
+ struct mjpeg_header *header = &ctx->header_struct;
+
+ if (!ctx->header) {
+- ret = delta_mjpeg_read_header(pctx, au.vaddr, au.size,
++ ret = delta_mjpeg_read_header(pctx, au_vaddr, au_size,
+ header, &data_offset);
+ if (ret) {
+ pctx->stream_errors++;
+@@ -405,17 +407,17 @@ static int delta_mjpeg_decode(struct delta_ctx *pctx, struct delta_au *pau)
+ goto err;
+ }
+
+- ret = delta_mjpeg_read_header(pctx, au.vaddr, au.size,
++ ret = delta_mjpeg_read_header(pctx, au_vaddr, au_size,
+ ctx->header, &data_offset);
+ if (ret) {
+ pctx->stream_errors++;
+ goto err;
+ }
+
+- au.paddr += data_offset;
+- au.vaddr += data_offset;
++ au_dma += data_offset;
++ au_vaddr += data_offset;
+
+- ret = delta_mjpeg_ipc_decode(pctx, &au);
++ ret = delta_mjpeg_ipc_decode(pctx, au_dma, au_dma + au_size - 1);
+ if (ret)
+ goto err;
+
+diff --git a/drivers/mfd/intel_soc_pmic_chtdc_ti.c b/drivers/mfd/intel_soc_pmic_chtdc_ti.c
+index 4c1a68c9f5750f..6daf33e07ea0a8 100644
+--- a/drivers/mfd/intel_soc_pmic_chtdc_ti.c
++++ b/drivers/mfd/intel_soc_pmic_chtdc_ti.c
+@@ -82,6 +82,8 @@ static const struct regmap_config chtdc_ti_regmap_config = {
+ .reg_bits = 8,
+ .val_bits = 8,
+ .max_register = 0xff,
++ /* The hardware does not support reading multiple registers at once */
++ .use_single_read = true,
+ };
+
+ static const struct regmap_irq chtdc_ti_irqs[] = {
+diff --git a/drivers/mfd/max77705.c b/drivers/mfd/max77705.c
+index 6b263bacb8c28d..e1a9bfd6585603 100644
+--- a/drivers/mfd/max77705.c
++++ b/drivers/mfd/max77705.c
+@@ -61,21 +61,21 @@ static const struct regmap_config max77705_regmap_config = {
+ .max_register = MAX77705_PMIC_REG_USBC_RESET,
+ };
+
+-static const struct regmap_irq max77705_topsys_irqs[] = {
+- { .mask = MAX77705_SYSTEM_IRQ_BSTEN_INT, },
+- { .mask = MAX77705_SYSTEM_IRQ_SYSUVLO_INT, },
+- { .mask = MAX77705_SYSTEM_IRQ_SYSOVLO_INT, },
+- { .mask = MAX77705_SYSTEM_IRQ_TSHDN_INT, },
+- { .mask = MAX77705_SYSTEM_IRQ_TM_INT, },
++static const struct regmap_irq max77705_irqs[] = {
++ { .mask = MAX77705_SRC_IRQ_CHG, },
++ { .mask = MAX77705_SRC_IRQ_TOP, },
++ { .mask = MAX77705_SRC_IRQ_FG, },
++ { .mask = MAX77705_SRC_IRQ_USBC, },
+ };
+
+-static const struct regmap_irq_chip max77705_topsys_irq_chip = {
+- .name = "max77705-topsys",
+- .status_base = MAX77705_PMIC_REG_SYSTEM_INT,
+- .mask_base = MAX77705_PMIC_REG_SYSTEM_INT_MASK,
++static const struct regmap_irq_chip max77705_irq_chip = {
++ .name = "max77705",
++ .status_base = MAX77705_PMIC_REG_INTSRC,
++ .ack_base = MAX77705_PMIC_REG_INTSRC,
++ .mask_base = MAX77705_PMIC_REG_INTSRC_MASK,
+ .num_regs = 1,
+- .irqs = max77705_topsys_irqs,
+- .num_irqs = ARRAY_SIZE(max77705_topsys_irqs),
++ .irqs = max77705_irqs,
++ .num_irqs = ARRAY_SIZE(max77705_irqs),
+ };
+
+ static int max77705_i2c_probe(struct i2c_client *i2c)
+@@ -108,21 +108,17 @@ static int max77705_i2c_probe(struct i2c_client *i2c)
+ if (pmic_rev != MAX77705_PASS3)
+ return dev_err_probe(dev, -ENODEV, "Rev.0x%x is not tested\n", pmic_rev);
+
++ /* Active Discharge Enable */
++ regmap_update_bits(max77705->regmap, MAX77705_PMIC_REG_MAINCTRL1, 1, 1);
++
+ ret = devm_regmap_add_irq_chip(dev, max77705->regmap,
+ i2c->irq,
+- IRQF_ONESHOT | IRQF_SHARED, 0,
+- &max77705_topsys_irq_chip,
++ IRQF_ONESHOT, 0,
++ &max77705_irq_chip,
+ &irq_data);
+ if (ret)
+ return dev_err_probe(dev, ret, "Failed to add IRQ chip\n");
+
+- /* Unmask interrupts from all blocks in interrupt source register */
+- ret = regmap_update_bits(max77705->regmap,
+- MAX77705_PMIC_REG_INTSRC_MASK,
+- MAX77705_SRC_IRQ_ALL, (unsigned int)~MAX77705_SRC_IRQ_ALL);
+- if (ret < 0)
+- return dev_err_probe(dev, ret, "Could not unmask interrupts in INTSRC\n");
+-
+ domain = regmap_irq_get_domain(irq_data);
+
+ ret = devm_mfd_add_devices(dev, PLATFORM_DEVID_NONE,
+diff --git a/drivers/mfd/rz-mtu3.c b/drivers/mfd/rz-mtu3.c
+index f3dac4a29a8324..9cdfef610398f3 100644
+--- a/drivers/mfd/rz-mtu3.c
++++ b/drivers/mfd/rz-mtu3.c
+@@ -32,7 +32,7 @@ static const unsigned long rz_mtu3_8bit_ch_reg_offs[][13] = {
+ [RZ_MTU3_CHAN_2] = MTU_8BIT_CH_1_2(0x204, 0x092, 0x205, 0x200, 0x20c, 0x201, 0x202),
+ [RZ_MTU3_CHAN_3] = MTU_8BIT_CH_3_4_6_7(0x008, 0x093, 0x02c, 0x000, 0x04c, 0x002, 0x004, 0x005, 0x038),
+ [RZ_MTU3_CHAN_4] = MTU_8BIT_CH_3_4_6_7(0x009, 0x094, 0x02d, 0x001, 0x04d, 0x003, 0x006, 0x007, 0x039),
+- [RZ_MTU3_CHAN_5] = MTU_8BIT_CH_5(0xab2, 0x1eb, 0xab4, 0xab6, 0xa84, 0xa85, 0xa86, 0xa94, 0xa95, 0xa96, 0xaa4, 0xaa5, 0xaa6),
++ [RZ_MTU3_CHAN_5] = MTU_8BIT_CH_5(0xab2, 0x895, 0xab4, 0xab6, 0xa84, 0xa85, 0xa86, 0xa94, 0xa95, 0xa96, 0xaa4, 0xaa5, 0xaa6),
+ [RZ_MTU3_CHAN_6] = MTU_8BIT_CH_3_4_6_7(0x808, 0x893, 0x82c, 0x800, 0x84c, 0x802, 0x804, 0x805, 0x838),
+ [RZ_MTU3_CHAN_7] = MTU_8BIT_CH_3_4_6_7(0x809, 0x894, 0x82d, 0x801, 0x84d, 0x803, 0x806, 0x807, 0x839),
+ [RZ_MTU3_CHAN_8] = MTU_8BIT_CH_8(0x404, 0x098, 0x400, 0x406, 0x401, 0x402, 0x403)
+diff --git a/drivers/mfd/vexpress-sysreg.c b/drivers/mfd/vexpress-sysreg.c
+index fc2daffc4352cc..77245c1e5d7df4 100644
+--- a/drivers/mfd/vexpress-sysreg.c
++++ b/drivers/mfd/vexpress-sysreg.c
+@@ -99,6 +99,7 @@ static int vexpress_sysreg_probe(struct platform_device *pdev)
+ struct resource *mem;
+ void __iomem *base;
+ struct gpio_chip *mmc_gpio_chip;
++ int ret;
+
+ mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ if (!mem)
+@@ -119,7 +120,10 @@ static int vexpress_sysreg_probe(struct platform_device *pdev)
+ bgpio_init(mmc_gpio_chip, &pdev->dev, 0x4, base + SYS_MCI,
+ NULL, NULL, NULL, NULL, 0);
+ mmc_gpio_chip->ngpio = 2;
+- devm_gpiochip_add_data(&pdev->dev, mmc_gpio_chip, NULL);
++
++ ret = devm_gpiochip_add_data(&pdev->dev, mmc_gpio_chip, NULL);
++ if (ret)
++ return ret;
+
+ return devm_mfd_add_devices(&pdev->dev, PLATFORM_DEVID_AUTO,
+ vexpress_sysreg_cells,
+diff --git a/drivers/misc/fastrpc.c b/drivers/misc/fastrpc.c
+index 53e88a1bc43044..7eec907ed45424 100644
+--- a/drivers/misc/fastrpc.c
++++ b/drivers/misc/fastrpc.c
+@@ -323,11 +323,11 @@ static void fastrpc_free_map(struct kref *ref)
+
+ perm.vmid = QCOM_SCM_VMID_HLOS;
+ perm.perm = QCOM_SCM_PERM_RWX;
+- err = qcom_scm_assign_mem(map->phys, map->size,
++ err = qcom_scm_assign_mem(map->phys, map->len,
+ &src_perms, &perm, 1);
+ if (err) {
+ dev_err(map->fl->sctx->dev, "Failed to assign memory phys 0x%llx size 0x%llx err %d\n",
+- map->phys, map->size, err);
++ map->phys, map->len, err);
+ return;
+ }
+ }
+@@ -363,26 +363,21 @@ static int fastrpc_map_get(struct fastrpc_map *map)
+
+
+ static int fastrpc_map_lookup(struct fastrpc_user *fl, int fd,
+- struct fastrpc_map **ppmap, bool take_ref)
++ struct fastrpc_map **ppmap)
+ {
+- struct fastrpc_session_ctx *sess = fl->sctx;
+ struct fastrpc_map *map = NULL;
++ struct dma_buf *buf;
+ int ret = -ENOENT;
+
++ buf = dma_buf_get(fd);
++ if (IS_ERR(buf))
++ return PTR_ERR(buf);
++
+ spin_lock(&fl->lock);
+ list_for_each_entry(map, &fl->maps, node) {
+- if (map->fd != fd)
++ if (map->fd != fd || map->buf != buf)
+ continue;
+
+- if (take_ref) {
+- ret = fastrpc_map_get(map);
+- if (ret) {
+- dev_dbg(sess->dev, "%s: Failed to get map fd=%d ret=%d\n",
+- __func__, fd, ret);
+- break;
+- }
+- }
+-
+ *ppmap = map;
+ ret = 0;
+ break;
+@@ -752,16 +747,14 @@ static const struct dma_buf_ops fastrpc_dma_buf_ops = {
+ .release = fastrpc_release,
+ };
+
+-static int fastrpc_map_create(struct fastrpc_user *fl, int fd,
++static int fastrpc_map_attach(struct fastrpc_user *fl, int fd,
+ u64 len, u32 attr, struct fastrpc_map **ppmap)
+ {
+ struct fastrpc_session_ctx *sess = fl->sctx;
+ struct fastrpc_map *map = NULL;
+ struct sg_table *table;
+- int err = 0;
+-
+- if (!fastrpc_map_lookup(fl, fd, ppmap, true))
+- return 0;
++ struct scatterlist *sgl = NULL;
++ int err = 0, sgl_index = 0;
+
+ map = kzalloc(sizeof(*map), GFP_KERNEL);
+ if (!map)
+@@ -798,7 +791,15 @@ static int fastrpc_map_create(struct fastrpc_user *fl, int fd,
+ map->phys = sg_dma_address(map->table->sgl);
+ map->phys += ((u64)fl->sctx->sid << 32);
+ }
+- map->size = len;
++ for_each_sg(map->table->sgl, sgl, map->table->nents,
++ sgl_index)
++ map->size += sg_dma_len(sgl);
++ if (len > map->size) {
++ dev_dbg(sess->dev, "Bad size passed len 0x%llx map size 0x%llx\n",
++ len, map->size);
++ err = -EINVAL;
++ goto map_err;
++ }
+ map->va = sg_virt(map->table->sgl);
+ map->len = len;
+
+@@ -815,10 +816,10 @@ static int fastrpc_map_create(struct fastrpc_user *fl, int fd,
+ dst_perms[1].vmid = fl->cctx->vmperms[0].vmid;
+ dst_perms[1].perm = QCOM_SCM_PERM_RWX;
+ map->attr = attr;
+- err = qcom_scm_assign_mem(map->phys, (u64)map->size, &src_perms, dst_perms, 2);
++ err = qcom_scm_assign_mem(map->phys, (u64)map->len, &src_perms, dst_perms, 2);
+ if (err) {
+ dev_err(sess->dev, "Failed to assign memory with phys 0x%llx size 0x%llx err %d\n",
+- map->phys, map->size, err);
++ map->phys, map->len, err);
+ goto map_err;
+ }
+ }
+@@ -839,6 +840,24 @@ static int fastrpc_map_create(struct fastrpc_user *fl, int fd,
+ return err;
+ }
+
++static int fastrpc_map_create(struct fastrpc_user *fl, int fd,
++ u64 len, u32 attr, struct fastrpc_map **ppmap)
++{
++ struct fastrpc_session_ctx *sess = fl->sctx;
++ int err = 0;
++
++ if (!fastrpc_map_lookup(fl, fd, ppmap)) {
++ if (!fastrpc_map_get(*ppmap))
++ return 0;
++ dev_dbg(sess->dev, "%s: Failed to get map fd=%d\n",
++ __func__, fd);
++ }
++
++ err = fastrpc_map_attach(fl, fd, len, attr, ppmap);
++
++ return err;
++}
++
+ /*
+ * Fastrpc payload buffer with metadata looks like:
+ *
+@@ -911,8 +930,12 @@ static int fastrpc_create_maps(struct fastrpc_invoke_ctx *ctx)
+ ctx->args[i].length == 0)
+ continue;
+
+- err = fastrpc_map_create(ctx->fl, ctx->args[i].fd,
+- ctx->args[i].length, ctx->args[i].attr, &ctx->maps[i]);
++ if (i < ctx->nbufs)
++ err = fastrpc_map_create(ctx->fl, ctx->args[i].fd,
++ ctx->args[i].length, ctx->args[i].attr, &ctx->maps[i]);
++ else
++ err = fastrpc_map_attach(ctx->fl, ctx->args[i].fd,
++ ctx->args[i].length, ctx->args[i].attr, &ctx->maps[i]);
+ if (err) {
+ dev_err(dev, "Error Creating map %d\n", err);
+ return -EINVAL;
+@@ -1071,6 +1094,7 @@ static int fastrpc_put_args(struct fastrpc_invoke_ctx *ctx,
+ struct fastrpc_phy_page *pages;
+ u64 *fdlist;
+ int i, inbufs, outbufs, handles;
++ int ret = 0;
+
+ inbufs = REMOTE_SCALARS_INBUFS(ctx->sc);
+ outbufs = REMOTE_SCALARS_OUTBUFS(ctx->sc);
+@@ -1086,23 +1110,26 @@ static int fastrpc_put_args(struct fastrpc_invoke_ctx *ctx,
+ u64 len = rpra[i].buf.len;
+
+ if (!kernel) {
+- if (copy_to_user((void __user *)dst, src, len))
+- return -EFAULT;
++ if (copy_to_user((void __user *)dst, src, len)) {
++ ret = -EFAULT;
++ goto cleanup_fdlist;
++ }
+ } else {
+ memcpy(dst, src, len);
+ }
+ }
+ }
+
++cleanup_fdlist:
+ /* Clean up fdlist which is updated by DSP */
+ for (i = 0; i < FASTRPC_MAX_FDLIST; i++) {
+ if (!fdlist[i])
+ break;
+- if (!fastrpc_map_lookup(fl, (int)fdlist[i], &mmap, false))
++ if (!fastrpc_map_lookup(fl, (int)fdlist[i], &mmap))
+ fastrpc_map_put(mmap);
+ }
+
+- return 0;
++ return ret;
+ }
+
+ static int fastrpc_invoke_send(struct fastrpc_session_ctx *sctx,
+@@ -2046,7 +2073,7 @@ static int fastrpc_req_mem_map(struct fastrpc_user *fl, char __user *argp)
+ args[0].length = sizeof(req_msg);
+
+ pages.addr = map->phys;
+- pages.size = map->size;
++ pages.size = map->len;
+
+ args[1].ptr = (u64) (uintptr_t) &pages;
+ args[1].length = sizeof(pages);
+@@ -2061,7 +2088,7 @@ static int fastrpc_req_mem_map(struct fastrpc_user *fl, char __user *argp)
+ err = fastrpc_internal_invoke(fl, true, FASTRPC_INIT_HANDLE, sc, &args[0]);
+ if (err) {
+ dev_err(dev, "mem mmap error, fd %d, vaddr %llx, size %lld\n",
+- req.fd, req.vaddrin, map->size);
++ req.fd, req.vaddrin, map->len);
+ goto err_invoke;
+ }
+
+@@ -2074,7 +2101,7 @@ static int fastrpc_req_mem_map(struct fastrpc_user *fl, char __user *argp)
+ if (copy_to_user((void __user *)argp, &req, sizeof(req))) {
+ /* unmap the memory and release the buffer */
+ req_unmap.vaddr = (uintptr_t) rsp_msg.vaddr;
+- req_unmap.length = map->size;
++ req_unmap.length = map->len;
+ fastrpc_req_mem_unmap_impl(fl, &req_unmap);
+ return -EFAULT;
+ }
+diff --git a/drivers/misc/genwqe/card_ddcb.c b/drivers/misc/genwqe/card_ddcb.c
+index 500b1feaf1f6f5..fd7d5cd50d3966 100644
+--- a/drivers/misc/genwqe/card_ddcb.c
++++ b/drivers/misc/genwqe/card_ddcb.c
+@@ -923,7 +923,7 @@ int __genwqe_execute_raw_ddcb(struct genwqe_dev *cd,
+ }
+ if (cmd->asv_length > DDCB_ASV_LENGTH) {
+ dev_err(&pci_dev->dev, "[%s] err: wrong asv_length of %d\n",
+- __func__, cmd->asiv_length);
++ __func__, cmd->asv_length);
+ return -EINVAL;
+ }
+ rc = __genwqe_enqueue_ddcb(cd, req, f_flags);
+diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
+index 1c156a3f845e11..f935175d8bf550 100644
+--- a/drivers/misc/pci_endpoint_test.c
++++ b/drivers/misc/pci_endpoint_test.c
+@@ -937,7 +937,7 @@ static long pci_endpoint_test_ioctl(struct file *file, unsigned int cmd,
+ switch (cmd) {
+ case PCITEST_BAR:
+ bar = arg;
+- if (bar > BAR_5)
++ if (bar <= NO_BAR || bar > BAR_5)
+ goto ret;
+ if (is_am654_pci_dev(pdev) && bar == BAR_0)
+ goto ret;
+diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
+index 9cc47bf94804b6..dd6cffc0df729a 100644
+--- a/drivers/mmc/core/block.c
++++ b/drivers/mmc/core/block.c
+@@ -2936,15 +2936,15 @@ static int mmc_route_rpmb_frames(struct device *dev, u8 *req,
+ return -ENOMEM;
+
+ if (write) {
+- struct rpmb_frame *frm = (struct rpmb_frame *)resp;
++ struct rpmb_frame *resp_frm = (struct rpmb_frame *)resp;
+
+ /* Send write request frame(s) */
+ set_idata(idata[0], MMC_WRITE_MULTIPLE_BLOCK,
+ 1 | MMC_CMD23_ARG_REL_WR, req, req_len);
+
+ /* Send result request frame */
+- memset(frm, 0, sizeof(*frm));
+- frm->req_resp = cpu_to_be16(RPMB_RESULT_READ);
++ memset(resp_frm, 0, sizeof(*resp_frm));
++ resp_frm->req_resp = cpu_to_be16(RPMB_RESULT_READ);
+ set_idata(idata[1], MMC_WRITE_MULTIPLE_BLOCK, 1, resp,
+ resp_len);
+
+diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
+index 7232de1c068873..5cc415ba4f550d 100644
+--- a/drivers/mmc/host/Kconfig
++++ b/drivers/mmc/host/Kconfig
+@@ -1115,6 +1115,7 @@ config MMC_LOONGSON2
+ tristate "Loongson-2K SD/SDIO/eMMC Host Interface support"
+ depends on LOONGARCH || COMPILE_TEST
+ depends on HAS_DMA
++ select REGMAP_MMIO
+ help
+ This selects support for the SD/SDIO/eMMC Host Controller on
+ Loongson-2K series CPUs.
+diff --git a/drivers/mtd/nand/raw/atmel/nand-controller.c b/drivers/mtd/nand/raw/atmel/nand-controller.c
+index db94d14a3807f5..49e00458eebeba 100644
+--- a/drivers/mtd/nand/raw/atmel/nand-controller.c
++++ b/drivers/mtd/nand/raw/atmel/nand-controller.c
+@@ -1858,7 +1858,7 @@ atmel_nand_controller_legacy_add_nands(struct atmel_nand_controller *nc)
+
+ static int atmel_nand_controller_add_nands(struct atmel_nand_controller *nc)
+ {
+- struct device_node *np, *nand_np;
++ struct device_node *np;
+ struct device *dev = nc->dev;
+ int ret, reg_cells;
+ u32 val;
+@@ -1885,7 +1885,7 @@ static int atmel_nand_controller_add_nands(struct atmel_nand_controller *nc)
+
+ reg_cells += val;
+
+- for_each_child_of_node(np, nand_np) {
++ for_each_child_of_node_scoped(np, nand_np) {
+ struct atmel_nand *nand;
+
+ nand = atmel_nand_create(nc, nand_np, reg_cells);
+diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
+index 57be04f6cb11a8..f4f0feddd9fa08 100644
+--- a/drivers/net/bonding/bond_main.c
++++ b/drivers/net/bonding/bond_main.c
+@@ -4411,7 +4411,7 @@ void bond_work_init_all(struct bonding *bond)
+ INIT_DELAYED_WORK(&bond->slave_arr_work, bond_slave_arr_handler);
+ }
+
+-static void bond_work_cancel_all(struct bonding *bond)
++void bond_work_cancel_all(struct bonding *bond)
+ {
+ cancel_delayed_work_sync(&bond->mii_work);
+ cancel_delayed_work_sync(&bond->arp_work);
+diff --git a/drivers/net/bonding/bond_netlink.c b/drivers/net/bonding/bond_netlink.c
+index 57fff2421f1b58..7a9d73ec8e91cd 100644
+--- a/drivers/net/bonding/bond_netlink.c
++++ b/drivers/net/bonding/bond_netlink.c
+@@ -579,20 +579,22 @@ static int bond_newlink(struct net_device *bond_dev,
+ struct rtnl_newlink_params *params,
+ struct netlink_ext_ack *extack)
+ {
++ struct bonding *bond = netdev_priv(bond_dev);
+ struct nlattr **data = params->data;
+ struct nlattr **tb = params->tb;
+ int err;
+
+- err = bond_changelink(bond_dev, tb, data, extack);
+- if (err < 0)
++ err = register_netdevice(bond_dev);
++ if (err)
+ return err;
+
+- err = register_netdevice(bond_dev);
+- if (!err) {
+- struct bonding *bond = netdev_priv(bond_dev);
++ netif_carrier_off(bond_dev);
++ bond_work_init_all(bond);
+
+- netif_carrier_off(bond_dev);
+- bond_work_init_all(bond);
++ err = bond_changelink(bond_dev, tb, data, extack);
++ if (err) {
++ bond_work_cancel_all(bond);
++ unregister_netdevice(bond_dev);
+ }
+
+ return err;
+diff --git a/drivers/net/ethernet/amazon/ena/ena_ethtool.c b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
+index a81d3a7a3bb9ae..fe3479b84a1f31 100644
+--- a/drivers/net/ethernet/amazon/ena/ena_ethtool.c
++++ b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
+@@ -865,7 +865,10 @@ static u32 ena_get_rxfh_indir_size(struct net_device *netdev)
+
+ static u32 ena_get_rxfh_key_size(struct net_device *netdev)
+ {
+- return ENA_HASH_KEY_SIZE;
++ struct ena_adapter *adapter = netdev_priv(netdev);
++ struct ena_rss *rss = &adapter->ena_dev->rss;
++
++ return rss->hash_key ? ENA_HASH_KEY_SIZE : 0;
+ }
+
+ static int ena_indirection_table_set(struct ena_adapter *adapter,
+diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
+index c9a5c8beb2fa81..a7e845fee4b3a2 100644
+--- a/drivers/net/ethernet/cadence/macb.h
++++ b/drivers/net/ethernet/cadence/macb.h
+@@ -213,10 +213,8 @@
+
+ #define GEM_ISR(hw_q) (0x0400 + ((hw_q) << 2))
+ #define GEM_TBQP(hw_q) (0x0440 + ((hw_q) << 2))
+-#define GEM_TBQPH(hw_q) (0x04C8)
+ #define GEM_RBQP(hw_q) (0x0480 + ((hw_q) << 2))
+ #define GEM_RBQS(hw_q) (0x04A0 + ((hw_q) << 2))
+-#define GEM_RBQPH(hw_q) (0x04D4)
+ #define GEM_IER(hw_q) (0x0600 + ((hw_q) << 2))
+ #define GEM_IDR(hw_q) (0x0620 + ((hw_q) << 2))
+ #define GEM_IMR(hw_q) (0x0640 + ((hw_q) << 2))
+@@ -1214,10 +1212,8 @@ struct macb_queue {
+ unsigned int IDR;
+ unsigned int IMR;
+ unsigned int TBQP;
+- unsigned int TBQPH;
+ unsigned int RBQS;
+ unsigned int RBQP;
+- unsigned int RBQPH;
+
+ /* Lock to protect tx_head and tx_tail */
+ spinlock_t tx_ptr_lock;
+diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
+index c769b7dbd3baf5..fc082a7a5a313b 100644
+--- a/drivers/net/ethernet/cadence/macb_main.c
++++ b/drivers/net/ethernet/cadence/macb_main.c
+@@ -51,14 +51,10 @@ struct sifive_fu540_macb_mgmt {
+ #define DEFAULT_RX_RING_SIZE 512 /* must be power of 2 */
+ #define MIN_RX_RING_SIZE 64
+ #define MAX_RX_RING_SIZE 8192
+-#define RX_RING_BYTES(bp) (macb_dma_desc_get_size(bp) \
+- * (bp)->rx_ring_size)
+
+ #define DEFAULT_TX_RING_SIZE 512 /* must be power of 2 */
+ #define MIN_TX_RING_SIZE 64
+ #define MAX_TX_RING_SIZE 4096
+-#define TX_RING_BYTES(bp) (macb_dma_desc_get_size(bp) \
+- * (bp)->tx_ring_size)
+
+ /* level of occupied TX descriptors under which we wake up TX process */
+ #define MACB_TX_WAKEUP_THRESH(bp) (3 * (bp)->tx_ring_size / 4)
+@@ -495,19 +491,19 @@ static void macb_init_buffers(struct macb *bp)
+ struct macb_queue *queue;
+ unsigned int q;
+
+- for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
+- queue_writel(queue, RBQP, lower_32_bits(queue->rx_ring_dma));
+ #ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
+- if (bp->hw_dma_cap & HW_DMA_CAP_64B)
+- queue_writel(queue, RBQPH,
+- upper_32_bits(queue->rx_ring_dma));
++ /* Single register for all queues' high 32 bits. */
++ if (bp->hw_dma_cap & HW_DMA_CAP_64B) {
++ macb_writel(bp, RBQPH,
++ upper_32_bits(bp->queues[0].rx_ring_dma));
++ macb_writel(bp, TBQPH,
++ upper_32_bits(bp->queues[0].tx_ring_dma));
++ }
+ #endif
++
++ for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
++ queue_writel(queue, RBQP, lower_32_bits(queue->rx_ring_dma));
+ queue_writel(queue, TBQP, lower_32_bits(queue->tx_ring_dma));
+-#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
+- if (bp->hw_dma_cap & HW_DMA_CAP_64B)
+- queue_writel(queue, TBQPH,
+- upper_32_bits(queue->tx_ring_dma));
+-#endif
+ }
+ }
+
+@@ -1166,10 +1162,6 @@ static void macb_tx_error_task(struct work_struct *work)
+
+ /* Reinitialize the TX desc queue */
+ queue_writel(queue, TBQP, lower_32_bits(queue->tx_ring_dma));
+-#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
+- if (bp->hw_dma_cap & HW_DMA_CAP_64B)
+- queue_writel(queue, TBQPH, upper_32_bits(queue->tx_ring_dma));
+-#endif
+ /* Make TX ring reflect state of hardware */
+ queue->tx_head = 0;
+ queue->tx_tail = 0;
+@@ -2474,35 +2466,42 @@ static void macb_free_rx_buffers(struct macb *bp)
+ }
+ }
+
++static unsigned int macb_tx_ring_size_per_queue(struct macb *bp)
++{
++ return macb_dma_desc_get_size(bp) * bp->tx_ring_size + bp->tx_bd_rd_prefetch;
++}
++
++static unsigned int macb_rx_ring_size_per_queue(struct macb *bp)
++{
++ return macb_dma_desc_get_size(bp) * bp->rx_ring_size + bp->rx_bd_rd_prefetch;
++}
++
+ static void macb_free_consistent(struct macb *bp)
+ {
++ struct device *dev = &bp->pdev->dev;
+ struct macb_queue *queue;
+ unsigned int q;
+- int size;
++ size_t size;
+
+ if (bp->rx_ring_tieoff) {
+- dma_free_coherent(&bp->pdev->dev, macb_dma_desc_get_size(bp),
++ dma_free_coherent(dev, macb_dma_desc_get_size(bp),
+ bp->rx_ring_tieoff, bp->rx_ring_tieoff_dma);
+ bp->rx_ring_tieoff = NULL;
+ }
+
+ bp->macbgem_ops.mog_free_rx_buffers(bp);
+
++ size = bp->num_queues * macb_tx_ring_size_per_queue(bp);
++ dma_free_coherent(dev, size, bp->queues[0].tx_ring, bp->queues[0].tx_ring_dma);
++
++ size = bp->num_queues * macb_rx_ring_size_per_queue(bp);
++ dma_free_coherent(dev, size, bp->queues[0].rx_ring, bp->queues[0].rx_ring_dma);
++
+ for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
+ kfree(queue->tx_skb);
+ queue->tx_skb = NULL;
+- if (queue->tx_ring) {
+- size = TX_RING_BYTES(bp) + bp->tx_bd_rd_prefetch;
+- dma_free_coherent(&bp->pdev->dev, size,
+- queue->tx_ring, queue->tx_ring_dma);
+- queue->tx_ring = NULL;
+- }
+- if (queue->rx_ring) {
+- size = RX_RING_BYTES(bp) + bp->rx_bd_rd_prefetch;
+- dma_free_coherent(&bp->pdev->dev, size,
+- queue->rx_ring, queue->rx_ring_dma);
+- queue->rx_ring = NULL;
+- }
++ queue->tx_ring = NULL;
++ queue->rx_ring = NULL;
+ }
+ }
+
+@@ -2544,35 +2543,45 @@ static int macb_alloc_rx_buffers(struct macb *bp)
+
+ static int macb_alloc_consistent(struct macb *bp)
+ {
++ struct device *dev = &bp->pdev->dev;
++ dma_addr_t tx_dma, rx_dma;
+ struct macb_queue *queue;
+ unsigned int q;
+- int size;
++ void *tx, *rx;
++ size_t size;
++
++ /*
++	 * Upper 32-bits of Tx/Rx DMA descriptor for each queue must match!
++ * We cannot enforce this guarantee, the best we can do is do a single
++ * allocation and hope it will land into alloc_pages() that guarantees
++ * natural alignment of physical addresses.
++ */
++
++ size = bp->num_queues * macb_tx_ring_size_per_queue(bp);
++ tx = dma_alloc_coherent(dev, size, &tx_dma, GFP_KERNEL);
++ if (!tx || upper_32_bits(tx_dma) != upper_32_bits(tx_dma + size - 1))
++ goto out_err;
++ netdev_dbg(bp->dev, "Allocated %zu bytes for %u TX rings at %08lx (mapped %p)\n",
++ size, bp->num_queues, (unsigned long)tx_dma, tx);
++
++ size = bp->num_queues * macb_rx_ring_size_per_queue(bp);
++ rx = dma_alloc_coherent(dev, size, &rx_dma, GFP_KERNEL);
++ if (!rx || upper_32_bits(rx_dma) != upper_32_bits(rx_dma + size - 1))
++ goto out_err;
++ netdev_dbg(bp->dev, "Allocated %zu bytes for %u RX rings at %08lx (mapped %p)\n",
++ size, bp->num_queues, (unsigned long)rx_dma, rx);
+
+ for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
+- size = TX_RING_BYTES(bp) + bp->tx_bd_rd_prefetch;
+- queue->tx_ring = dma_alloc_coherent(&bp->pdev->dev, size,
+- &queue->tx_ring_dma,
+- GFP_KERNEL);
+- if (!queue->tx_ring)
+- goto out_err;
+- netdev_dbg(bp->dev,
+- "Allocated TX ring for queue %u of %d bytes at %08lx (mapped %p)\n",
+- q, size, (unsigned long)queue->tx_ring_dma,
+- queue->tx_ring);
++ queue->tx_ring = tx + macb_tx_ring_size_per_queue(bp) * q;
++ queue->tx_ring_dma = tx_dma + macb_tx_ring_size_per_queue(bp) * q;
++
++ queue->rx_ring = rx + macb_rx_ring_size_per_queue(bp) * q;
++ queue->rx_ring_dma = rx_dma + macb_rx_ring_size_per_queue(bp) * q;
+
+ size = bp->tx_ring_size * sizeof(struct macb_tx_skb);
+ queue->tx_skb = kmalloc(size, GFP_KERNEL);
+ if (!queue->tx_skb)
+ goto out_err;
+-
+- size = RX_RING_BYTES(bp) + bp->rx_bd_rd_prefetch;
+- queue->rx_ring = dma_alloc_coherent(&bp->pdev->dev, size,
+- &queue->rx_ring_dma, GFP_KERNEL);
+- if (!queue->rx_ring)
+- goto out_err;
+- netdev_dbg(bp->dev,
+- "Allocated RX ring of %d bytes at %08lx (mapped %p)\n",
+- size, (unsigned long)queue->rx_ring_dma, queue->rx_ring);
+ }
+ if (bp->macbgem_ops.mog_alloc_rx_buffers(bp))
+ goto out_err;
+@@ -4309,12 +4318,6 @@ static int macb_init(struct platform_device *pdev)
+ queue->TBQP = GEM_TBQP(hw_q - 1);
+ queue->RBQP = GEM_RBQP(hw_q - 1);
+ queue->RBQS = GEM_RBQS(hw_q - 1);
+-#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
+- if (bp->hw_dma_cap & HW_DMA_CAP_64B) {
+- queue->TBQPH = GEM_TBQPH(hw_q - 1);
+- queue->RBQPH = GEM_RBQPH(hw_q - 1);
+- }
+-#endif
+ } else {
+ /* queue0 uses legacy registers */
+ queue->ISR = MACB_ISR;
+@@ -4323,12 +4326,6 @@ static int macb_init(struct platform_device *pdev)
+ queue->IMR = MACB_IMR;
+ queue->TBQP = MACB_TBQP;
+ queue->RBQP = MACB_RBQP;
+-#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
+- if (bp->hw_dma_cap & HW_DMA_CAP_64B) {
+- queue->TBQPH = MACB_TBQPH;
+- queue->RBQPH = MACB_RBQPH;
+- }
+-#endif
+ }
+
+ /* get irq: here we use the linux queue index, not the hardware
+@@ -5452,6 +5449,11 @@ static int __maybe_unused macb_suspend(struct device *dev)
+ */
+ tmp = macb_readl(bp, NCR);
+ macb_writel(bp, NCR, tmp & ~(MACB_BIT(TE) | MACB_BIT(RE)));
++#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
++ if (!(bp->caps & MACB_CAPS_QUEUE_DISABLE))
++ macb_writel(bp, RBQPH,
++ upper_32_bits(bp->rx_ring_tieoff_dma));
++#endif
+ for (q = 0, queue = bp->queues; q < bp->num_queues;
+ ++q, ++queue) {
+ /* Disable RX queues */
+@@ -5461,10 +5463,6 @@ static int __maybe_unused macb_suspend(struct device *dev)
+ /* Tie off RX queues */
+ queue_writel(queue, RBQP,
+ lower_32_bits(bp->rx_ring_tieoff_dma));
+-#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
+- queue_writel(queue, RBQPH,
+- upper_32_bits(bp->rx_ring_tieoff_dma));
+-#endif
+ }
+ /* Disable all interrupts */
+ queue_writel(queue, IDR, -1);
+diff --git a/drivers/net/ethernet/dlink/dl2k.c b/drivers/net/ethernet/dlink/dl2k.c
+index 6bbf6e5584e54f..1996d2e4e3e2c9 100644
+--- a/drivers/net/ethernet/dlink/dl2k.c
++++ b/drivers/net/ethernet/dlink/dl2k.c
+@@ -964,15 +964,18 @@ receive_packet (struct net_device *dev)
+ } else {
+ struct sk_buff *skb;
+
++ skb = NULL;
+ /* Small skbuffs for short packets */
+- if (pkt_len > copy_thresh) {
++ if (pkt_len <= copy_thresh)
++ skb = netdev_alloc_skb_ip_align(dev, pkt_len);
++ if (!skb) {
+ dma_unmap_single(&np->pdev->dev,
+ desc_to_dma(desc),
+ np->rx_buf_sz,
+ DMA_FROM_DEVICE);
+ skb_put (skb = np->rx_skbuff[entry], pkt_len);
+ np->rx_skbuff[entry] = NULL;
+- } else if ((skb = netdev_alloc_skb_ip_align(dev, pkt_len))) {
++ } else {
+ dma_sync_single_for_cpu(&np->pdev->dev,
+ desc_to_dma(desc),
+ np->rx_buf_sz,
+diff --git a/drivers/net/ethernet/freescale/enetc/enetc4_pf.c b/drivers/net/ethernet/freescale/enetc/enetc4_pf.c
+index b3dc1afeefd1d5..a5c1f1cef3b0c4 100644
+--- a/drivers/net/ethernet/freescale/enetc/enetc4_pf.c
++++ b/drivers/net/ethernet/freescale/enetc/enetc4_pf.c
+@@ -1030,7 +1030,7 @@ static int enetc4_pf_probe(struct pci_dev *pdev,
+ err = enetc_get_driver_data(si);
+ if (err)
+ return dev_err_probe(dev, err,
+- "Could not get VF driver data\n");
++ "Could not get PF driver data\n");
+
+ err = enetc4_pf_struct_init(si);
+ if (err)
+diff --git a/drivers/net/ethernet/freescale/enetc/ntmp.c b/drivers/net/ethernet/freescale/enetc/ntmp.c
+index ba32c1bbd9e184..0c1d343253bfb7 100644
+--- a/drivers/net/ethernet/freescale/enetc/ntmp.c
++++ b/drivers/net/ethernet/freescale/enetc/ntmp.c
+@@ -52,24 +52,19 @@ int ntmp_init_cbdr(struct netc_cbdr *cbdr, struct device *dev,
+ cbdr->addr_base_align = PTR_ALIGN(cbdr->addr_base,
+ NTMP_BASE_ADDR_ALIGN);
+
+- cbdr->next_to_clean = 0;
+- cbdr->next_to_use = 0;
+ spin_lock_init(&cbdr->ring_lock);
+
++ cbdr->next_to_use = netc_read(cbdr->regs.pir);
++ cbdr->next_to_clean = netc_read(cbdr->regs.cir);
++
+ /* Step 1: Configure the base address of the Control BD Ring */
+ netc_write(cbdr->regs.bar0, lower_32_bits(cbdr->dma_base_align));
+ netc_write(cbdr->regs.bar1, upper_32_bits(cbdr->dma_base_align));
+
+- /* Step 2: Configure the producer index register */
+- netc_write(cbdr->regs.pir, cbdr->next_to_clean);
+-
+- /* Step 3: Configure the consumer index register */
+- netc_write(cbdr->regs.cir, cbdr->next_to_use);
+-
+- /* Step4: Configure the number of BDs of the Control BD Ring */
++ /* Step 2: Configure the number of BDs of the Control BD Ring */
+ netc_write(cbdr->regs.lenr, cbdr->bd_num);
+
+- /* Step 5: Enable the Control BD Ring */
++ /* Step 3: Enable the Control BD Ring */
+ netc_write(cbdr->regs.mr, NETC_CBDR_MR_EN);
+
+ return 0;
+diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+index eaad52a83b04c0..50f90ed3107ec6 100644
+--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c
++++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+@@ -3187,18 +3187,14 @@ static int idpf_rx_splitq_clean(struct idpf_rx_queue *rxq, int budget)
+ /* get the Rx desc from Rx queue based on 'next_to_clean' */
+ rx_desc = &rxq->rx[ntc].flex_adv_nic_3_wb;
+
+- /* This memory barrier is needed to keep us from reading
+- * any other fields out of the rx_desc
+- */
+- dma_rmb();
+-
+ /* if the descriptor isn't done, no work yet to do */
+ gen_id = le16_get_bits(rx_desc->pktlen_gen_bufq_id,
+ VIRTCHNL2_RX_FLEX_DESC_ADV_GEN_M);
+-
+ if (idpf_queue_has(GEN_CHK, rxq) != gen_id)
+ break;
+
++ dma_rmb();
++
+ rxdid = FIELD_GET(VIRTCHNL2_RX_FLEX_DESC_ADV_RXDID_M,
+ rx_desc->rxdid_ucast);
+ if (rxdid != VIRTCHNL2_RXDID_2_FLEX_SPLITQ) {
+diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
+index 6330d4a0ae075d..c1f34381333d13 100644
+--- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
++++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
+@@ -702,9 +702,9 @@ int idpf_recv_mb_msg(struct idpf_adapter *adapter)
+ /* If post failed clear the only buffer we supplied */
+ if (post_err) {
+ if (dma_mem)
+- dmam_free_coherent(&adapter->pdev->dev,
+- dma_mem->size, dma_mem->va,
+- dma_mem->pa);
++ dma_free_coherent(&adapter->pdev->dev,
++ dma_mem->size, dma_mem->va,
++ dma_mem->pa);
+ break;
+ }
+
+diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
+index 5027fae0aa77a6..e808995703cfd0 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
+@@ -3542,6 +3542,7 @@ static void otx2_remove(struct pci_dev *pdev)
+ otx2_disable_mbox_intr(pf);
+ otx2_pfaf_mbox_destroy(pf);
+ pci_free_irq_vectors(pf->pdev);
++ bitmap_free(pf->af_xdp_zc_qidx);
+ pci_set_drvdata(pdev, NULL);
+ free_netdev(netdev);
+ }
+diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c
+index 7ebb6e656884ae..25381f079b97d6 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c
++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c
+@@ -854,6 +854,7 @@ static void otx2vf_remove(struct pci_dev *pdev)
+ qmem_free(vf->dev, vf->dync_lmt);
+ otx2vf_vfaf_mbox_destroy(vf);
+ pci_free_irq_vectors(vf->pdev);
++ bitmap_free(vf->af_xdp_zc_qidx);
+ pci_set_drvdata(pdev, NULL);
+ free_netdev(netdev);
+ }
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+index e395ef5f356eb5..722282cebce9a6 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+@@ -294,6 +294,10 @@ static void poll_timeout(struct mlx5_cmd_work_ent *ent)
+ return;
+ }
+ cond_resched();
++ if (mlx5_cmd_is_down(dev)) {
++ ent->ret = -ENXIO;
++ return;
++ }
+ } while (time_before(jiffies, poll_end));
+
+ ent->ret = -ETIMEDOUT;
+@@ -1070,7 +1074,7 @@ static void cmd_work_handler(struct work_struct *work)
+ poll_timeout(ent);
+ /* make sure we read the descriptor after ownership is SW */
+ rmb();
+- mlx5_cmd_comp_handler(dev, 1ULL << ent->idx, (ent->ret == -ETIMEDOUT));
++ mlx5_cmd_comp_handler(dev, 1ULL << ent->idx, !!ent->ret);
+ }
+ }
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/port_buffer.h b/drivers/net/ethernet/mellanox/mlx5/core/en/port_buffer.h
+index 66d276a1be836a..f4a19ffbb641c0 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/port_buffer.h
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/port_buffer.h
+@@ -66,23 +66,11 @@ struct mlx5e_port_buffer {
+ struct mlx5e_bufferx_reg buffer[MLX5E_MAX_NETWORK_BUFFER];
+ };
+
+-#ifdef CONFIG_MLX5_CORE_EN_DCB
+ int mlx5e_port_manual_buffer_config(struct mlx5e_priv *priv,
+ u32 change, unsigned int mtu,
+ struct ieee_pfc *pfc,
+ u32 *buffer_size,
+ u8 *prio2buffer);
+-#else
+-static inline int
+-mlx5e_port_manual_buffer_config(struct mlx5e_priv *priv,
+- u32 change, unsigned int mtu,
+- void *pfc,
+- u32 *buffer_size,
+- u8 *prio2buffer)
+-{
+- return 0;
+-}
+-#endif
+
+ int mlx5e_port_query_buffer(struct mlx5e_priv *priv,
+ struct mlx5e_port_buffer *port_buffer);
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+index 15eded36b872a2..21bb88c5d3dcee 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+@@ -49,7 +49,6 @@
+ #include "en.h"
+ #include "en/dim.h"
+ #include "en/txrx.h"
+-#include "en/port_buffer.h"
+ #include "en_tc.h"
+ #include "en_rep.h"
+ #include "en_accel/ipsec.h"
+@@ -3041,11 +3040,9 @@ int mlx5e_set_dev_port_mtu(struct mlx5e_priv *priv)
+ struct mlx5e_params *params = &priv->channels.params;
+ struct net_device *netdev = priv->netdev;
+ struct mlx5_core_dev *mdev = priv->mdev;
+- u16 mtu, prev_mtu;
++ u16 mtu;
+ int err;
+
+- mlx5e_query_mtu(mdev, params, &prev_mtu);
+-
+ err = mlx5e_set_mtu(mdev, params, params->sw_mtu);
+ if (err)
+ return err;
+@@ -3055,18 +3052,6 @@ int mlx5e_set_dev_port_mtu(struct mlx5e_priv *priv)
+ netdev_warn(netdev, "%s: VPort MTU %d is different than netdev mtu %d\n",
+ __func__, mtu, params->sw_mtu);
+
+- if (mtu != prev_mtu && MLX5_BUFFER_SUPPORTED(mdev)) {
+- err = mlx5e_port_manual_buffer_config(priv, 0, mtu,
+- NULL, NULL, NULL);
+- if (err) {
+- netdev_warn(netdev, "%s: Failed to set Xon/Xoff values with MTU %d (err %d), setting back to previous MTU %d\n",
+- __func__, mtu, err, prev_mtu);
+-
+- mlx5e_set_mtu(mdev, params, prev_mtu);
+- return err;
+- }
+- }
+-
+ params->sw_mtu = mtu;
+ return 0;
+ }
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
+index 22995131824a03..89e399606877ba 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
+@@ -27,6 +27,7 @@ struct mlx5_fw_reset {
+ struct work_struct reset_reload_work;
+ struct work_struct reset_now_work;
+ struct work_struct reset_abort_work;
++ struct delayed_work reset_timeout_work;
+ unsigned long reset_flags;
+ u8 reset_method;
+ struct timer_list timer;
+@@ -259,6 +260,8 @@ static int mlx5_sync_reset_clear_reset_requested(struct mlx5_core_dev *dev, bool
+ return -EALREADY;
+ }
+
++ if (current_work() != &fw_reset->reset_timeout_work.work)
++ cancel_delayed_work(&fw_reset->reset_timeout_work);
+ mlx5_stop_sync_reset_poll(dev);
+ if (poll_health)
+ mlx5_start_health_poll(dev);
+@@ -330,6 +333,11 @@ static int mlx5_sync_reset_set_reset_requested(struct mlx5_core_dev *dev)
+ }
+ mlx5_stop_health_poll(dev, true);
+ mlx5_start_sync_reset_poll(dev);
++
++ if (!test_bit(MLX5_FW_RESET_FLAGS_DROP_NEW_REQUESTS,
++ &fw_reset->reset_flags))
++ schedule_delayed_work(&fw_reset->reset_timeout_work,
++ msecs_to_jiffies(mlx5_tout_ms(dev, PCI_SYNC_UPDATE)));
+ return 0;
+ }
+
+@@ -739,6 +747,19 @@ static void mlx5_sync_reset_events_handle(struct mlx5_fw_reset *fw_reset, struct
+ }
+ }
+
++static void mlx5_sync_reset_timeout_work(struct work_struct *work)
++{
++ struct delayed_work *dwork = container_of(work, struct delayed_work,
++ work);
++ struct mlx5_fw_reset *fw_reset =
++ container_of(dwork, struct mlx5_fw_reset, reset_timeout_work);
++ struct mlx5_core_dev *dev = fw_reset->dev;
++
++ if (mlx5_sync_reset_clear_reset_requested(dev, true))
++ return;
++ mlx5_core_warn(dev, "PCI Sync FW Update Reset Timeout.\n");
++}
++
+ static int fw_reset_event_notifier(struct notifier_block *nb, unsigned long action, void *data)
+ {
+ struct mlx5_fw_reset *fw_reset = mlx5_nb_cof(nb, struct mlx5_fw_reset, nb);
+@@ -822,6 +843,7 @@ void mlx5_drain_fw_reset(struct mlx5_core_dev *dev)
+ cancel_work_sync(&fw_reset->reset_reload_work);
+ cancel_work_sync(&fw_reset->reset_now_work);
+ cancel_work_sync(&fw_reset->reset_abort_work);
++ cancel_delayed_work(&fw_reset->reset_timeout_work);
+ }
+
+ static const struct devlink_param mlx5_fw_reset_devlink_params[] = {
+@@ -865,6 +887,8 @@ int mlx5_fw_reset_init(struct mlx5_core_dev *dev)
+ INIT_WORK(&fw_reset->reset_reload_work, mlx5_sync_reset_reload_work);
+ INIT_WORK(&fw_reset->reset_now_work, mlx5_sync_reset_now_event);
+ INIT_WORK(&fw_reset->reset_abort_work, mlx5_sync_reset_abort_event);
++ INIT_DELAYED_WORK(&fw_reset->reset_timeout_work,
++ mlx5_sync_reset_timeout_work);
+
+ init_completion(&fw_reset->done);
+ return 0;
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c b/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
+index 9bc9bd83c2324c..cd68c4b2c0bf91 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
+@@ -489,9 +489,12 @@ static int reclaim_pages_cmd(struct mlx5_core_dev *dev,
+ u32 func_id;
+ u32 npages;
+ u32 i = 0;
++ int err;
+
+- if (!mlx5_cmd_is_down(dev))
+- return mlx5_cmd_do(dev, in, in_size, out, out_size);
++ err = mlx5_cmd_do(dev, in, in_size, out, out_size);
++ /* If FW is gone (-ENXIO), proceed to forceful reclaim */
++ if (err != -ENXIO)
++ return err;
+
+ /* No hard feelings, we want our pages back! */
+ npages = MLX5_GET(manage_pages_in, in, input_num_entries);
+diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
+index a36215195923cf..16c828dd5c1a3f 100644
+--- a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
++++ b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
+@@ -1788,7 +1788,7 @@ static u32 nfp_net_get_rxfh_key_size(struct net_device *netdev)
+ struct nfp_net *nn = netdev_priv(netdev);
+
+ if (!(nn->cap & NFP_NET_CFG_CTRL_RSS_ANY))
+- return -EOPNOTSUPP;
++ return 0;
+
+ return nfp_net_rss_key_sz(nn);
+ }
+diff --git a/drivers/net/phy/as21xxx.c b/drivers/net/phy/as21xxx.c
+index 92697f43087dcc..00527736065620 100644
+--- a/drivers/net/phy/as21xxx.c
++++ b/drivers/net/phy/as21xxx.c
+@@ -884,11 +884,12 @@ static int as21xxx_match_phy_device(struct phy_device *phydev,
+ u32 phy_id;
+ int ret;
+
+- /* Skip PHY that are not AS21xxx or already have firmware loaded */
+- if (phydev->c45_ids.device_ids[MDIO_MMD_PCS] != PHY_ID_AS21XXX)
++ /* Skip PHY that are not AS21xxx */
++ if (!phy_id_compare_vendor(phydev->c45_ids.device_ids[MDIO_MMD_PCS],
++ PHY_VENDOR_AEONSEMI))
+ return genphy_match_phy_device(phydev, phydrv);
+
+- /* Read PHY ID to handle firmware just loaded */
++ /* Read PHY ID to handle firmware loaded or HW reset */
+ ret = phy_read_mmd(phydev, MDIO_MMD_PCS, MII_PHYSID1);
+ if (ret < 0)
+ return ret;
+diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c
+index 792ddda1ad493d..85bd5d845409b9 100644
+--- a/drivers/net/usb/asix_devices.c
++++ b/drivers/net/usb/asix_devices.c
+@@ -625,6 +625,21 @@ static void ax88772_suspend(struct usbnet *dev)
+ asix_read_medium_status(dev, 1));
+ }
+
++/* Notes on PM callbacks and locking context:
++ *
++ * - asix_suspend()/asix_resume() are invoked for both runtime PM and
++ * system-wide suspend/resume. For struct usb_driver the ->resume()
++ * callback does not receive pm_message_t, so the resume type cannot
++ * be distinguished here.
++ *
++ * - The MAC driver must hold RTNL when calling phylink interfaces such as
++ * phylink_suspend()/resume(). Those calls will also perform MDIO I/O.
++ *
++ * - Taking RTNL and doing MDIO from a runtime-PM resume callback (while
++ * the USB PM lock is held) is fragile. Since autosuspend brings no
++ * measurable power saving here, we block it by holding a PM usage
++ * reference in ax88772_bind().
++ */
+ static int asix_suspend(struct usb_interface *intf, pm_message_t message)
+ {
+ struct usbnet *dev = usb_get_intfdata(intf);
+@@ -919,6 +934,13 @@ static int ax88772_bind(struct usbnet *dev, struct usb_interface *intf)
+ if (ret)
+ goto initphy_err;
+
++ /* Keep this interface runtime-PM active by taking a usage ref.
++ * Prevents runtime suspend while bound and avoids resume paths
++ * that could deadlock (autoresume under RTNL while USB PM lock
++ * is held, phylink/MDIO wants RTNL).
++ */
++ pm_runtime_get_noresume(&intf->dev);
++
+ return 0;
+
+ initphy_err:
+@@ -948,6 +970,8 @@ static void ax88772_unbind(struct usbnet *dev, struct usb_interface *intf)
+ phylink_destroy(priv->phylink);
+ ax88772_mdio_unregister(priv);
+ asix_rx_fixup_common_free(dev->driver_priv);
++ /* Drop the PM usage ref taken in bind() */
++ pm_runtime_put(&intf->dev);
+ }
+
+ static void ax88178_unbind(struct usbnet *dev, struct usb_interface *intf)
+@@ -1600,6 +1624,11 @@ static struct usb_driver asix_driver = {
+ .resume = asix_resume,
+ .reset_resume = asix_resume,
+ .disconnect = usbnet_disconnect,
++ /* usbnet enables autosuspend by default (supports_autosuspend=1).
++ * We keep runtime-PM active for AX88772* by taking a PM usage
++ * reference in ax88772_bind() (pm_runtime_get_noresume()) and
++ * dropping it in unbind(), which effectively blocks autosuspend.
++ */
+ .supports_autosuspend = 1,
+ .disable_hub_initiated_lpm = 1,
+ };
+diff --git a/drivers/net/usb/rtl8150.c b/drivers/net/usb/rtl8150.c
+index ddff6f19ff98eb..92add3daadbb18 100644
+--- a/drivers/net/usb/rtl8150.c
++++ b/drivers/net/usb/rtl8150.c
+@@ -664,7 +664,6 @@ static void rtl8150_set_multicast(struct net_device *netdev)
+ rtl8150_t *dev = netdev_priv(netdev);
+ u16 rx_creg = 0x9e;
+
+- netif_stop_queue(netdev);
+ if (netdev->flags & IFF_PROMISC) {
+ rx_creg |= 0x0001;
+ dev_info(&netdev->dev, "%s: promiscuous mode\n", netdev->name);
+@@ -678,7 +677,6 @@ static void rtl8150_set_multicast(struct net_device *netdev)
+ rx_creg &= 0x00fc;
+ }
+ async_set_registers(dev, RCR, sizeof(rx_creg), rx_creg);
+- netif_wake_queue(netdev);
+ }
+
+ static netdev_tx_t rtl8150_start_xmit(struct sk_buff *skb,
+diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c
+index cb8ae751eb3121..e595b0979a56d3 100644
+--- a/drivers/net/wireless/ath/ath10k/wmi.c
++++ b/drivers/net/wireless/ath/ath10k/wmi.c
+@@ -1764,33 +1764,32 @@ void ath10k_wmi_put_wmi_channel(struct ath10k *ar, struct wmi_channel *ch,
+
+ int ath10k_wmi_wait_for_service_ready(struct ath10k *ar)
+ {
++ unsigned long timeout = jiffies + WMI_SERVICE_READY_TIMEOUT_HZ;
+ unsigned long time_left, i;
+
+- time_left = wait_for_completion_timeout(&ar->wmi.service_ready,
+- WMI_SERVICE_READY_TIMEOUT_HZ);
+- if (!time_left) {
+- /* Sometimes the PCI HIF doesn't receive interrupt
+- * for the service ready message even if the buffer
+- * was completed. PCIe sniffer shows that it's
+- * because the corresponding CE ring doesn't fires
+- * it. Workaround here by polling CE rings once.
+- */
+- ath10k_warn(ar, "failed to receive service ready completion, polling..\n");
+-
++	/* Sometimes the PCI HIF doesn't receive an interrupt
++	 * for the service ready message even if the buffer
++	 * was completed. PCIe sniffer shows that it's
++	 * because the corresponding CE ring doesn't fire
++ * it. Workaround here by polling CE rings. Since
++ * the message could arrive at any time, continue
++ * polling until timeout.
++ */
++ do {
+ for (i = 0; i < CE_COUNT; i++)
+ ath10k_hif_send_complete_check(ar, i, 1);
+
++ /* The 100 ms granularity is a tradeoff considering scheduler
++ * overhead and response latency
++ */
+ time_left = wait_for_completion_timeout(&ar->wmi.service_ready,
+- WMI_SERVICE_READY_TIMEOUT_HZ);
+- if (!time_left) {
+- ath10k_warn(ar, "polling timed out\n");
+- return -ETIMEDOUT;
+- }
+-
+- ath10k_warn(ar, "service ready completion received, continuing normally\n");
+- }
++ msecs_to_jiffies(100));
++ if (time_left)
++ return 0;
++ } while (time_before(jiffies, timeout));
+
+- return 0;
++ ath10k_warn(ar, "failed to receive service ready completion\n");
++ return -ETIMEDOUT;
+ }
+
+ int ath10k_wmi_wait_for_unified_ready(struct ath10k *ar)
+diff --git a/drivers/net/wireless/ath/ath12k/ce.c b/drivers/net/wireless/ath/ath12k/ce.c
+index f93a419abf65ec..c5aadbc6367ce0 100644
+--- a/drivers/net/wireless/ath/ath12k/ce.c
++++ b/drivers/net/wireless/ath/ath12k/ce.c
+@@ -478,7 +478,7 @@ static void ath12k_ce_recv_process_cb(struct ath12k_ce_pipe *pipe)
+ }
+
+ while ((skb = __skb_dequeue(&list))) {
+- ath12k_dbg(ab, ATH12K_DBG_AHB, "rx ce pipe %d len %d\n",
++ ath12k_dbg(ab, ATH12K_DBG_CE, "rx ce pipe %d len %d\n",
+ pipe->pipe_num, skb->len);
+ pipe->recv_cb(ab, skb);
+ }
+diff --git a/drivers/net/wireless/ath/ath12k/debug.h b/drivers/net/wireless/ath/ath12k/debug.h
+index 48916e4e1f6014..bf254e43a68d08 100644
+--- a/drivers/net/wireless/ath/ath12k/debug.h
++++ b/drivers/net/wireless/ath/ath12k/debug.h
+@@ -26,6 +26,7 @@ enum ath12k_debug_mask {
+ ATH12K_DBG_DP_TX = 0x00002000,
+ ATH12K_DBG_DP_RX = 0x00004000,
+ ATH12K_DBG_WOW = 0x00008000,
++ ATH12K_DBG_CE = 0x00010000,
+ ATH12K_DBG_ANY = 0xffffffff,
+ };
+
+diff --git a/drivers/net/wireless/ath/ath12k/dp_mon.c b/drivers/net/wireless/ath/ath12k/dp_mon.c
+index 8189e52ed00718..009c495021489d 100644
+--- a/drivers/net/wireless/ath/ath12k/dp_mon.c
++++ b/drivers/net/wireless/ath/ath12k/dp_mon.c
+@@ -1440,6 +1440,34 @@ static void ath12k_dp_mon_parse_rx_msdu_end_err(u32 info, u32 *errmap)
+ *errmap |= HAL_RX_MPDU_ERR_MPDU_LEN;
+ }
+
++static void
++ath12k_parse_cmn_usr_info(const struct hal_phyrx_common_user_info *cmn_usr_info,
++ struct hal_rx_mon_ppdu_info *ppdu_info)
++{
++ struct hal_rx_radiotap_eht *eht = &ppdu_info->eht_info.eht;
++ u32 known, data, cp_setting, ltf_size;
++
++ known = __le32_to_cpu(eht->known);
++ known |= IEEE80211_RADIOTAP_EHT_KNOWN_GI |
++ IEEE80211_RADIOTAP_EHT_KNOWN_EHT_LTF;
++ eht->known = cpu_to_le32(known);
++
++ cp_setting = le32_get_bits(cmn_usr_info->info0,
++ HAL_RX_CMN_USR_INFO0_CP_SETTING);
++ ltf_size = le32_get_bits(cmn_usr_info->info0,
++ HAL_RX_CMN_USR_INFO0_LTF_SIZE);
++
++ data = __le32_to_cpu(eht->data[0]);
++ data |= u32_encode_bits(cp_setting, IEEE80211_RADIOTAP_EHT_DATA0_GI);
++ data |= u32_encode_bits(ltf_size, IEEE80211_RADIOTAP_EHT_DATA0_LTF);
++ eht->data[0] = cpu_to_le32(data);
++
++ if (!ppdu_info->ltf_size)
++ ppdu_info->ltf_size = ltf_size;
++ if (!ppdu_info->gi)
++ ppdu_info->gi = cp_setting;
++}
++
+ static void
+ ath12k_dp_mon_parse_status_msdu_end(struct ath12k_mon_data *pmon,
+ const struct hal_rx_msdu_end *msdu_end)
+@@ -1627,25 +1655,22 @@ ath12k_dp_mon_rx_parse_status_tlv(struct ath12k *ar,
+ const struct hal_rx_phyrx_rssi_legacy_info *rssi = tlv_data;
+
+ info[0] = __le32_to_cpu(rssi->info0);
+- info[1] = __le32_to_cpu(rssi->info1);
++ info[2] = __le32_to_cpu(rssi->info2);
+
+ /* TODO: Please note that the combined rssi will not be accurate
+ * in MU case. Rssi in MU needs to be retrieved from
+ * PHYRX_OTHER_RECEIVE_INFO TLV.
+ */
+ ppdu_info->rssi_comb =
+- u32_get_bits(info[1],
+- HAL_RX_PHYRX_RSSI_LEGACY_INFO_INFO1_RSSI_COMB);
++ u32_get_bits(info[2],
++ HAL_RX_RSSI_LEGACY_INFO_INFO2_RSSI_COMB_PPDU);
+
+ ppdu_info->bw = u32_get_bits(info[0],
+- HAL_RX_PHYRX_RSSI_LEGACY_INFO_INFO0_RX_BW);
++ HAL_RX_RSSI_LEGACY_INFO_INFO0_RX_BW);
+ break;
+ }
+- case HAL_PHYRX_OTHER_RECEIVE_INFO: {
+- const struct hal_phyrx_common_user_info *cmn_usr_info = tlv_data;
+-
+- ppdu_info->gi = le32_get_bits(cmn_usr_info->info0,
+- HAL_RX_PHY_CMN_USER_INFO0_GI);
++ case HAL_PHYRX_COMMON_USER_INFO: {
++ ath12k_parse_cmn_usr_info(tlv_data, ppdu_info);
+ break;
+ }
+ case HAL_RX_PPDU_START_USER_INFO:
+@@ -2154,8 +2179,12 @@ static void ath12k_dp_mon_update_radiotap(struct ath12k *ar,
+ spin_unlock_bh(&ar->data_lock);
+
+ rxs->flag |= RX_FLAG_MACTIME_START;
+- rxs->signal = ppduinfo->rssi_comb + noise_floor;
+ rxs->nss = ppduinfo->nss + 1;
++ if (test_bit(WMI_TLV_SERVICE_HW_DB2DBM_CONVERSION_SUPPORT,
++ ar->ab->wmi_ab.svc_map))
++ rxs->signal = ppduinfo->rssi_comb;
++ else
++ rxs->signal = ppduinfo->rssi_comb + noise_floor;
+
+ if (ppduinfo->userstats[ppduinfo->userid].ampdu_present) {
+ rxs->flag |= RX_FLAG_AMPDU_DETAILS;
+@@ -2244,6 +2273,7 @@ static void ath12k_dp_mon_update_radiotap(struct ath12k *ar,
+
+ static void ath12k_dp_mon_rx_deliver_msdu(struct ath12k *ar, struct napi_struct *napi,
+ struct sk_buff *msdu,
++ const struct hal_rx_mon_ppdu_info *ppduinfo,
+ struct ieee80211_rx_status *status,
+ u8 decap)
+ {
+@@ -2257,7 +2287,6 @@ static void ath12k_dp_mon_rx_deliver_msdu(struct ath12k *ar, struct napi_struct
+ struct ieee80211_sta *pubsta = NULL;
+ struct ath12k_peer *peer;
+ struct ath12k_skb_rxcb *rxcb = ATH12K_SKB_RXCB(msdu);
+- struct ath12k_dp_rx_info rx_info;
+ bool is_mcbc = rxcb->is_mcbc;
+ bool is_eapol_tkip = rxcb->is_eapol;
+
+@@ -2271,8 +2300,7 @@ static void ath12k_dp_mon_rx_deliver_msdu(struct ath12k *ar, struct napi_struct
+ }
+
+ spin_lock_bh(&ar->ab->base_lock);
+- rx_info.addr2_present = false;
+- peer = ath12k_dp_rx_h_find_peer(ar->ab, msdu, &rx_info);
++ peer = ath12k_peer_find_by_id(ar->ab, ppduinfo->peer_id);
+ if (peer && peer->sta) {
+ pubsta = peer->sta;
+ if (pubsta->valid_links) {
+@@ -2365,7 +2393,7 @@ static int ath12k_dp_mon_rx_deliver(struct ath12k *ar,
+ decap = mon_mpdu->decap_format;
+
+ ath12k_dp_mon_update_radiotap(ar, ppduinfo, mon_skb, rxs);
+- ath12k_dp_mon_rx_deliver_msdu(ar, napi, mon_skb, rxs, decap);
++ ath12k_dp_mon_rx_deliver_msdu(ar, napi, mon_skb, ppduinfo, rxs, decap);
+ mon_skb = skb_next;
+ } while (mon_skb);
+ rxs->flag = 0;
+diff --git a/drivers/net/wireless/ath/ath12k/dp_rx.c b/drivers/net/wireless/ath/ath12k/dp_rx.c
+index 8ab91273592c82..9048818984f198 100644
+--- a/drivers/net/wireless/ath/ath12k/dp_rx.c
++++ b/drivers/net/wireless/ath/ath12k/dp_rx.c
+@@ -21,6 +21,9 @@
+
+ #define ATH12K_DP_RX_FRAGMENT_TIMEOUT_MS (2 * HZ)
+
++static int ath12k_dp_rx_tid_delete_handler(struct ath12k_base *ab,
++ struct ath12k_dp_rx_tid *rx_tid);
++
+ static enum hal_encrypt_type ath12k_dp_rx_h_enctype(struct ath12k_base *ab,
+ struct hal_rx_desc *desc)
+ {
+@@ -769,6 +772,23 @@ static void ath12k_dp_rx_tid_del_func(struct ath12k_dp *dp, void *ctx,
+ rx_tid->qbuf.vaddr = NULL;
+ }
+
++static int ath12k_dp_rx_tid_delete_handler(struct ath12k_base *ab,
++ struct ath12k_dp_rx_tid *rx_tid)
++{
++ struct ath12k_hal_reo_cmd cmd = {};
++
++ cmd.flag = HAL_REO_CMD_FLG_NEED_STATUS;
++ cmd.addr_lo = lower_32_bits(rx_tid->qbuf.paddr_aligned);
++ cmd.addr_hi = upper_32_bits(rx_tid->qbuf.paddr_aligned);
++ cmd.upd0 |= HAL_REO_CMD_UPD0_VLD;
++ /* Observed flush cache failure, to avoid that set vld bit during delete */
++ cmd.upd1 |= HAL_REO_CMD_UPD1_VLD;
++
++ return ath12k_dp_reo_cmd_send(ab, rx_tid,
++ HAL_REO_CMD_UPDATE_RX_QUEUE, &cmd,
++ ath12k_dp_rx_tid_del_func);
++}
++
+ static void ath12k_peer_rx_tid_qref_setup(struct ath12k_base *ab, u16 peer_id, u16 tid,
+ dma_addr_t paddr)
+ {
+@@ -828,20 +848,13 @@ static void ath12k_peer_rx_tid_qref_reset(struct ath12k_base *ab, u16 peer_id, u
+ void ath12k_dp_rx_peer_tid_delete(struct ath12k *ar,
+ struct ath12k_peer *peer, u8 tid)
+ {
+- struct ath12k_hal_reo_cmd cmd = {};
+ struct ath12k_dp_rx_tid *rx_tid = &peer->rx_tid[tid];
+ int ret;
+
+ if (!rx_tid->active)
+ return;
+
+- cmd.flag = HAL_REO_CMD_FLG_NEED_STATUS;
+- cmd.addr_lo = lower_32_bits(rx_tid->qbuf.paddr_aligned);
+- cmd.addr_hi = upper_32_bits(rx_tid->qbuf.paddr_aligned);
+- cmd.upd0 = HAL_REO_CMD_UPD0_VLD;
+- ret = ath12k_dp_reo_cmd_send(ar->ab, rx_tid,
+- HAL_REO_CMD_UPDATE_RX_QUEUE, &cmd,
+- ath12k_dp_rx_tid_del_func);
++ ret = ath12k_dp_rx_tid_delete_handler(ar->ab, rx_tid);
+ if (ret) {
+ ath12k_err(ar->ab, "failed to send HAL_REO_CMD_UPDATE_RX_QUEUE cmd, tid %d (%d)\n",
+ tid, ret);
+@@ -2533,6 +2546,8 @@ void ath12k_dp_rx_h_ppdu(struct ath12k *ar, struct ath12k_dp_rx_info *rx_info)
+ channel_num = meta_data;
+ center_freq = meta_data >> 16;
+
++ rx_status->band = NUM_NL80211_BANDS;
++
+ if (center_freq >= ATH12K_MIN_6GHZ_FREQ &&
+ center_freq <= ATH12K_MAX_6GHZ_FREQ) {
+ rx_status->band = NL80211_BAND_6GHZ;
+@@ -2541,21 +2556,33 @@ void ath12k_dp_rx_h_ppdu(struct ath12k *ar, struct ath12k_dp_rx_info *rx_info)
+ rx_status->band = NL80211_BAND_2GHZ;
+ } else if (channel_num >= 36 && channel_num <= 173) {
+ rx_status->band = NL80211_BAND_5GHZ;
+- } else {
++ }
++
++ if (unlikely(rx_status->band == NUM_NL80211_BANDS ||
++ !ath12k_ar_to_hw(ar)->wiphy->bands[rx_status->band])) {
++ ath12k_warn(ar->ab, "sband is NULL for status band %d channel_num %d center_freq %d pdev_id %d\n",
++ rx_status->band, channel_num, center_freq, ar->pdev_idx);
++
+ spin_lock_bh(&ar->data_lock);
+ channel = ar->rx_channel;
+ if (channel) {
+ rx_status->band = channel->band;
+ channel_num =
+ ieee80211_frequency_to_channel(channel->center_freq);
++ rx_status->freq = ieee80211_channel_to_frequency(channel_num,
++ rx_status->band);
++ } else {
++ ath12k_err(ar->ab, "unable to determine channel, band for rx packet");
+ }
+ spin_unlock_bh(&ar->data_lock);
++ goto h_rate;
+ }
+
+ if (rx_status->band != NL80211_BAND_6GHZ)
+ rx_status->freq = ieee80211_channel_to_frequency(channel_num,
+ rx_status->band);
+
++h_rate:
+ ath12k_dp_rx_h_rate(ar, rx_info);
+ }
+
+diff --git a/drivers/net/wireless/ath/ath12k/hal_rx.h b/drivers/net/wireless/ath/ath12k/hal_rx.h
+index a3ab588aae19d6..d1ad7747b82c49 100644
+--- a/drivers/net/wireless/ath/ath12k/hal_rx.h
++++ b/drivers/net/wireless/ath/ath12k/hal_rx.h
+@@ -483,15 +483,16 @@ enum hal_rx_ul_reception_type {
+ HAL_RECEPTION_TYPE_FRAMELESS
+ };
+
+-#define HAL_RX_PHYRX_RSSI_LEGACY_INFO_INFO0_RECEPTION GENMASK(3, 0)
+-#define HAL_RX_PHYRX_RSSI_LEGACY_INFO_INFO0_RX_BW GENMASK(7, 5)
+-#define HAL_RX_PHYRX_RSSI_LEGACY_INFO_INFO1_RSSI_COMB GENMASK(15, 8)
++#define HAL_RX_RSSI_LEGACY_INFO_INFO0_RECEPTION GENMASK(3, 0)
++#define HAL_RX_RSSI_LEGACY_INFO_INFO0_RX_BW GENMASK(7, 5)
++#define HAL_RX_RSSI_LEGACY_INFO_INFO1_RSSI_COMB GENMASK(15, 8)
++#define HAL_RX_RSSI_LEGACY_INFO_INFO2_RSSI_COMB_PPDU GENMASK(7, 0)
+
+ struct hal_rx_phyrx_rssi_legacy_info {
+ __le32 info0;
+ __le32 rsvd0[39];
+ __le32 info1;
+- __le32 rsvd1;
++ __le32 info2;
+ } __packed;
+
+ #define HAL_RX_MPDU_START_INFO0_PPDU_ID GENMASK(31, 16)
+@@ -695,7 +696,8 @@ struct hal_rx_resp_req_info {
+ #define HAL_RX_MPDU_ERR_MPDU_LEN BIT(6)
+ #define HAL_RX_MPDU_ERR_UNENCRYPTED_FRAME BIT(7)
+
+-#define HAL_RX_PHY_CMN_USER_INFO0_GI GENMASK(17, 16)
++#define HAL_RX_CMN_USR_INFO0_CP_SETTING GENMASK(17, 16)
++#define HAL_RX_CMN_USR_INFO0_LTF_SIZE GENMASK(19, 18)
+
+ struct hal_phyrx_common_user_info {
+ __le32 rsvd[2];
+diff --git a/drivers/net/wireless/ath/ath12k/mac.c b/drivers/net/wireless/ath/ath12k/mac.c
+index 3a3965b79942d2..2644b5d4b0bc86 100644
+--- a/drivers/net/wireless/ath/ath12k/mac.c
++++ b/drivers/net/wireless/ath/ath12k/mac.c
+@@ -11240,8 +11240,8 @@ void ath12k_mac_fill_reg_tpc_info(struct ath12k *ar,
+ struct ieee80211_channel *chan, *temp_chan;
+ u8 pwr_lvl_idx, num_pwr_levels, pwr_reduction;
+ bool is_psd_power = false, is_tpe_present = false;
+- s8 max_tx_power[ATH12K_NUM_PWR_LEVELS],
+- psd_power, tx_power, eirp_power;
++ s8 max_tx_power[ATH12K_NUM_PWR_LEVELS], psd_power, tx_power;
++ s8 eirp_power = 0;
+ struct ath12k_vif *ahvif = arvif->ahvif;
+ u16 start_freq, center_freq;
+ u8 reg_6ghz_power_mode;
+@@ -11447,8 +11447,10 @@ static void ath12k_mac_parse_tx_pwr_env(struct ath12k *ar,
+
+ tpc_info->num_pwr_levels = max(local_psd->count,
+ reg_psd->count);
+- if (tpc_info->num_pwr_levels > ATH12K_NUM_PWR_LEVELS)
+- tpc_info->num_pwr_levels = ATH12K_NUM_PWR_LEVELS;
++ tpc_info->num_pwr_levels =
++ min3(tpc_info->num_pwr_levels,
++ IEEE80211_TPE_PSD_ENTRIES_320MHZ,
++ ATH12K_NUM_PWR_LEVELS);
+
+ for (i = 0; i < tpc_info->num_pwr_levels; i++) {
+ tpc_info->tpe[i] = min(local_psd->power[i],
+@@ -11463,8 +11465,10 @@ static void ath12k_mac_parse_tx_pwr_env(struct ath12k *ar,
+
+ tpc_info->num_pwr_levels = max(local_non_psd->count,
+ reg_non_psd->count);
+- if (tpc_info->num_pwr_levels > ATH12K_NUM_PWR_LEVELS)
+- tpc_info->num_pwr_levels = ATH12K_NUM_PWR_LEVELS;
++ tpc_info->num_pwr_levels =
++ min3(tpc_info->num_pwr_levels,
++ IEEE80211_TPE_EIRP_ENTRIES_320MHZ,
++ ATH12K_NUM_PWR_LEVELS);
+
+ for (i = 0; i < tpc_info->num_pwr_levels; i++) {
+ tpc_info->tpe[i] = min(local_non_psd->power[i],
+diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
+index 8ab7d1e34a6e14..6a3f187320fc41 100644
+--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
++++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
+@@ -997,9 +997,9 @@ static const struct sdio_device_id brcmf_sdmmc_ids[] = {
+ BRCMF_SDIO_DEVICE(SDIO_DEVICE_ID_BROADCOM_4356, WCC),
+ BRCMF_SDIO_DEVICE(SDIO_DEVICE_ID_BROADCOM_4359, WCC),
+ BRCMF_SDIO_DEVICE(SDIO_DEVICE_ID_BROADCOM_43751, WCC),
++ BRCMF_SDIO_DEVICE(SDIO_DEVICE_ID_BROADCOM_43752, WCC),
+ BRCMF_SDIO_DEVICE(SDIO_DEVICE_ID_BROADCOM_CYPRESS_4373, CYW),
+ BRCMF_SDIO_DEVICE(SDIO_DEVICE_ID_BROADCOM_CYPRESS_43012, CYW),
+- BRCMF_SDIO_DEVICE(SDIO_DEVICE_ID_BROADCOM_CYPRESS_43752, CYW),
+ BRCMF_SDIO_DEVICE(SDIO_DEVICE_ID_BROADCOM_CYPRESS_89359, CYW),
+ CYW_SDIO_DEVICE(SDIO_DEVICE_ID_BROADCOM_CYPRESS_43439, CYW),
+ { /* end: all zeroes */ }
+diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c
+index 9074ab49e80685..4239f2b21e5423 100644
+--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c
++++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c
+@@ -738,8 +738,8 @@ static u32 brcmf_chip_tcm_rambase(struct brcmf_chip_priv *ci)
+ case BRCM_CC_4364_CHIP_ID:
+ case CY_CC_4373_CHIP_ID:
+ return 0x160000;
+- case CY_CC_43752_CHIP_ID:
+ case BRCM_CC_43751_CHIP_ID:
++ case BRCM_CC_43752_CHIP_ID:
+ case BRCM_CC_4377_CHIP_ID:
+ return 0x170000;
+ case BRCM_CC_4378_CHIP_ID:
+@@ -1452,7 +1452,7 @@ bool brcmf_chip_sr_capable(struct brcmf_chip *pub)
+ return (reg & CC_SR_CTL0_ENABLE_MASK) != 0;
+ case BRCM_CC_4359_CHIP_ID:
+ case BRCM_CC_43751_CHIP_ID:
+- case CY_CC_43752_CHIP_ID:
++ case BRCM_CC_43752_CHIP_ID:
+ case CY_CC_43012_CHIP_ID:
+ addr = CORE_CC_REG(pmu->base, retention_ctl);
+ reg = chip->ops->read32(chip->ctx, addr);
+diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
+index 8a0bad5119a0dd..8cf9d7e7c3f70c 100644
+--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
++++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
+@@ -655,10 +655,10 @@ static const struct brcmf_firmware_mapping brcmf_sdio_fwnames[] = {
+ BRCMF_FW_ENTRY(BRCM_CC_4356_CHIP_ID, 0xFFFFFFFF, 4356),
+ BRCMF_FW_ENTRY(BRCM_CC_4359_CHIP_ID, 0xFFFFFFFF, 4359),
+ BRCMF_FW_ENTRY(BRCM_CC_43751_CHIP_ID, 0xFFFFFFFF, 43752),
++ BRCMF_FW_ENTRY(BRCM_CC_43752_CHIP_ID, 0xFFFFFFFF, 43752),
+ BRCMF_FW_ENTRY(CY_CC_4373_CHIP_ID, 0xFFFFFFFF, 4373),
+ BRCMF_FW_ENTRY(CY_CC_43012_CHIP_ID, 0xFFFFFFFF, 43012),
+ BRCMF_FW_ENTRY(CY_CC_43439_CHIP_ID, 0xFFFFFFFF, 43439),
+- BRCMF_FW_ENTRY(CY_CC_43752_CHIP_ID, 0xFFFFFFFF, 43752)
+ };
+
+ #define TXCTL_CREDITS 2
+@@ -3426,8 +3426,8 @@ static int brcmf_sdio_download_firmware(struct brcmf_sdio *bus,
+ static bool brcmf_sdio_aos_no_decode(struct brcmf_sdio *bus)
+ {
+ if (bus->ci->chip == BRCM_CC_43751_CHIP_ID ||
+- bus->ci->chip == CY_CC_43012_CHIP_ID ||
+- bus->ci->chip == CY_CC_43752_CHIP_ID)
++ bus->ci->chip == BRCM_CC_43752_CHIP_ID ||
++ bus->ci->chip == CY_CC_43012_CHIP_ID)
+ return true;
+ else
+ return false;
+@@ -4278,8 +4278,8 @@ static void brcmf_sdio_firmware_callback(struct device *dev, int err,
+
+ switch (sdiod->func1->device) {
+ case SDIO_DEVICE_ID_BROADCOM_43751:
++ case SDIO_DEVICE_ID_BROADCOM_43752:
+ case SDIO_DEVICE_ID_BROADCOM_CYPRESS_4373:
+- case SDIO_DEVICE_ID_BROADCOM_CYPRESS_43752:
+ brcmf_dbg(INFO, "set F2 watermark to 0x%x*4 bytes\n",
+ CY_4373_F2_WATERMARK);
+ brcmf_sdiod_writeb(sdiod, SBSDIO_WATERMARK,
+diff --git a/drivers/net/wireless/broadcom/brcm80211/include/brcm_hw_ids.h b/drivers/net/wireless/broadcom/brcm80211/include/brcm_hw_ids.h
+index b39c5c1ee18b6e..df3b67ba4db290 100644
+--- a/drivers/net/wireless/broadcom/brcm80211/include/brcm_hw_ids.h
++++ b/drivers/net/wireless/broadcom/brcm80211/include/brcm_hw_ids.h
+@@ -60,7 +60,6 @@
+ #define CY_CC_4373_CHIP_ID 0x4373
+ #define CY_CC_43012_CHIP_ID 43012
+ #define CY_CC_43439_CHIP_ID 43439
+-#define CY_CC_43752_CHIP_ID 43752
+
+ /* USB Device IDs */
+ #define BRCM_USB_43143_DEVICE_ID 0xbd1e
+diff --git a/drivers/net/wireless/intel/iwlwifi/fw/regulatory.h b/drivers/net/wireless/intel/iwlwifi/fw/regulatory.h
+index a07c512b6ed43e..735482e7adf560 100644
+--- a/drivers/net/wireless/intel/iwlwifi/fw/regulatory.h
++++ b/drivers/net/wireless/intel/iwlwifi/fw/regulatory.h
+@@ -12,7 +12,6 @@
+ #include "fw/api/phy.h"
+ #include "fw/api/config.h"
+ #include "fw/api/nvm-reg.h"
+-#include "fw/img.h"
+ #include "iwl-trans.h"
+
+ #define BIOS_SAR_MAX_PROFILE_NUM 4
+diff --git a/drivers/net/wireless/marvell/mwifiex/cfg80211.c b/drivers/net/wireless/marvell/mwifiex/cfg80211.c
+index 4c8c7a5fdf23e2..be23a29e7de095 100644
+--- a/drivers/net/wireless/marvell/mwifiex/cfg80211.c
++++ b/drivers/net/wireless/marvell/mwifiex/cfg80211.c
+@@ -686,10 +686,9 @@ static void mwifiex_reg_notifier(struct wiphy *wiphy,
+ return;
+ }
+
+- /* Don't send world or same regdom info to firmware */
+- if (strncmp(request->alpha2, "00", 2) &&
+- strncmp(request->alpha2, adapter->country_code,
+- sizeof(request->alpha2))) {
++ /* Don't send same regdom info to firmware */
++ if (strncmp(request->alpha2, adapter->country_code,
++ sizeof(request->alpha2)) != 0) {
+ memcpy(adapter->country_code, request->alpha2,
+ sizeof(request->alpha2));
+ mwifiex_send_domain_info_cmd_fw(wiphy);
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7603/soc.c b/drivers/net/wireless/mediatek/mt76/mt7603/soc.c
+index 08590aa68356f7..1dd37237204807 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7603/soc.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7603/soc.c
+@@ -48,7 +48,7 @@ mt76_wmac_probe(struct platform_device *pdev)
+
+ return 0;
+ error:
+- ieee80211_free_hw(mt76_hw(dev));
++ mt76_free_device(mdev);
+ return ret;
+ }
+
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/eeprom.h b/drivers/net/wireless/mediatek/mt76/mt7915/eeprom.h
+index 31aec0f40232ae..73611c9d26e151 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7915/eeprom.h
++++ b/drivers/net/wireless/mediatek/mt76/mt7915/eeprom.h
+@@ -50,9 +50,9 @@ enum mt7915_eeprom_field {
+ #define MT_EE_CAL_GROUP_SIZE_7975 (54 * MT_EE_CAL_UNIT + 16)
+ #define MT_EE_CAL_GROUP_SIZE_7976 (94 * MT_EE_CAL_UNIT + 16)
+ #define MT_EE_CAL_GROUP_SIZE_7916_6G (94 * MT_EE_CAL_UNIT + 16)
++#define MT_EE_CAL_GROUP_SIZE_7981 (144 * MT_EE_CAL_UNIT + 16)
+ #define MT_EE_CAL_DPD_SIZE_V1 (54 * MT_EE_CAL_UNIT)
+ #define MT_EE_CAL_DPD_SIZE_V2 (300 * MT_EE_CAL_UNIT)
+-#define MT_EE_CAL_DPD_SIZE_V2_7981 (102 * MT_EE_CAL_UNIT) /* no 6g dpd data */
+
+ #define MT_EE_WIFI_CONF0_TX_PATH GENMASK(2, 0)
+ #define MT_EE_WIFI_CONF0_RX_PATH GENMASK(5, 3)
+@@ -180,6 +180,8 @@ mt7915_get_cal_group_size(struct mt7915_dev *dev)
+ val = FIELD_GET(MT_EE_WIFI_CONF0_BAND_SEL, val);
+ return (val == MT_EE_V2_BAND_SEL_6GHZ) ? MT_EE_CAL_GROUP_SIZE_7916_6G :
+ MT_EE_CAL_GROUP_SIZE_7916;
++ } else if (is_mt7981(&dev->mt76)) {
++ return MT_EE_CAL_GROUP_SIZE_7981;
+ } else if (mt7915_check_adie(dev, false)) {
+ return MT_EE_CAL_GROUP_SIZE_7976;
+ } else {
+@@ -192,8 +194,6 @@ mt7915_get_cal_dpd_size(struct mt7915_dev *dev)
+ {
+ if (is_mt7915(&dev->mt76))
+ return MT_EE_CAL_DPD_SIZE_V1;
+- else if (is_mt7981(&dev->mt76))
+- return MT_EE_CAL_DPD_SIZE_V2_7981;
+ else
+ return MT_EE_CAL_DPD_SIZE_V2;
+ }
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7915/mcu.c
+index 2928e75b239762..c1fdd3c4f1ba6e 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7915/mcu.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7915/mcu.c
+@@ -3052,30 +3052,15 @@ static int mt7915_dpd_freq_idx(struct mt7915_dev *dev, u16 freq, u8 bw)
+ /* 5G BW160 */
+ 5250, 5570, 5815
+ };
+- static const u16 freq_list_v2_7981[] = {
+- /* 5G BW20 */
+- 5180, 5200, 5220, 5240,
+- 5260, 5280, 5300, 5320,
+- 5500, 5520, 5540, 5560,
+- 5580, 5600, 5620, 5640,
+- 5660, 5680, 5700, 5720,
+- 5745, 5765, 5785, 5805,
+- 5825, 5845, 5865, 5885,
+- /* 5G BW160 */
+- 5250, 5570, 5815
+- };
+- const u16 *freq_list = freq_list_v1;
+- int n_freqs = ARRAY_SIZE(freq_list_v1);
+- int idx;
++ const u16 *freq_list;
++ int idx, n_freqs;
+
+ if (!is_mt7915(&dev->mt76)) {
+- if (is_mt7981(&dev->mt76)) {
+- freq_list = freq_list_v2_7981;
+- n_freqs = ARRAY_SIZE(freq_list_v2_7981);
+- } else {
+- freq_list = freq_list_v2;
+- n_freqs = ARRAY_SIZE(freq_list_v2);
+- }
++ freq_list = freq_list_v2;
++ n_freqs = ARRAY_SIZE(freq_list_v2);
++ } else {
++ freq_list = freq_list_v1;
++ n_freqs = ARRAY_SIZE(freq_list_v1);
+ }
+
+ if (freq < 4000) {
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7996/init.c b/drivers/net/wireless/mediatek/mt76/mt7996/init.c
+index a9599c286328eb..84015ab24af625 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7996/init.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7996/init.c
+@@ -671,13 +671,20 @@ static int mt7996_register_phy(struct mt7996_dev *dev, enum mt76_band_id band)
+
+ /* init wiphy according to mphy and phy */
+ mt7996_init_wiphy_band(mphy->hw, phy);
+- ret = mt7996_init_tx_queues(mphy->priv,
+- MT_TXQ_ID(band),
+- MT7996_TX_RING_SIZE,
+- MT_TXQ_RING_BASE(band) + hif1_ofs,
+- wed);
+- if (ret)
+- goto error;
++
++ if (is_mt7996(&dev->mt76) && !dev->hif2 && band == MT_BAND1) {
++ int i;
++
++ for (i = 0; i <= MT_TXQ_PSD; i++)
++ mphy->q_tx[i] = dev->mt76.phys[MT_BAND0]->q_tx[0];
++ } else {
++ ret = mt7996_init_tx_queues(mphy->priv, MT_TXQ_ID(band),
++ MT7996_TX_RING_SIZE,
++ MT_TXQ_RING_BASE(band) + hif1_ofs,
++ wed);
++ if (ret)
++ goto error;
++ }
+
+ ret = mt76_register_phy(mphy, true, mt76_rates,
+ ARRAY_SIZE(mt76_rates));
+@@ -727,6 +734,7 @@ void mt7996_wfsys_reset(struct mt7996_dev *dev)
+ static int mt7996_wed_rro_init(struct mt7996_dev *dev)
+ {
+ #ifdef CONFIG_NET_MEDIATEK_SOC_WED
++ u32 val = FIELD_PREP(WED_RRO_ADDR_SIGNATURE_MASK, 0xff);
+ struct mtk_wed_device *wed = &dev->mt76.mmio.wed;
+ u32 reg = MT_RRO_ADDR_ELEM_SEG_ADDR0;
+ struct mt7996_wed_rro_addr *addr;
+@@ -766,7 +774,7 @@ static int mt7996_wed_rro_init(struct mt7996_dev *dev)
+
+ addr = dev->wed_rro.addr_elem[i].ptr;
+ for (j = 0; j < MT7996_RRO_WINDOW_MAX_SIZE; j++) {
+- addr->signature = 0xff;
++ addr->data = cpu_to_le32(val);
+ addr++;
+ }
+
+@@ -784,7 +792,7 @@ static int mt7996_wed_rro_init(struct mt7996_dev *dev)
+ dev->wed_rro.session.ptr = ptr;
+ addr = dev->wed_rro.session.ptr;
+ for (i = 0; i < MT7996_RRO_WINDOW_MAX_LEN; i++) {
+- addr->signature = 0xff;
++ addr->data = cpu_to_le32(val);
+ addr++;
+ }
+
+@@ -884,6 +892,7 @@ static void mt7996_wed_rro_free(struct mt7996_dev *dev)
+ static void mt7996_wed_rro_work(struct work_struct *work)
+ {
+ #ifdef CONFIG_NET_MEDIATEK_SOC_WED
++ u32 val = FIELD_PREP(WED_RRO_ADDR_SIGNATURE_MASK, 0xff);
+ struct mt7996_dev *dev;
+ LIST_HEAD(list);
+
+@@ -920,7 +929,7 @@ static void mt7996_wed_rro_work(struct work_struct *work)
+ MT7996_RRO_WINDOW_MAX_LEN;
+ reset:
+ elem = ptr + elem_id * sizeof(*elem);
+- elem->signature = 0xff;
++ elem->data |= cpu_to_le32(val);
+ }
+ mt7996_mcu_wed_rro_reset_sessions(dev, e->id);
+ out:
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7996/mac.c b/drivers/net/wireless/mediatek/mt76/mt7996/mac.c
+index b3fcca9bbb9589..28477702c18b3d 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7996/mac.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7996/mac.c
+@@ -1766,13 +1766,10 @@ void mt7996_tx_token_put(struct mt7996_dev *dev)
+ static int
+ mt7996_mac_restart(struct mt7996_dev *dev)
+ {
+- struct mt7996_phy *phy2, *phy3;
+ struct mt76_dev *mdev = &dev->mt76;
++ struct mt7996_phy *phy;
+ int i, ret;
+
+- phy2 = mt7996_phy2(dev);
+- phy3 = mt7996_phy3(dev);
+-
+ if (dev->hif2) {
+ mt76_wr(dev, MT_INT1_MASK_CSR, 0x0);
+ mt76_wr(dev, MT_INT1_SOURCE_CSR, ~0);
+@@ -1784,20 +1781,14 @@ mt7996_mac_restart(struct mt7996_dev *dev)
+ mt76_wr(dev, MT_PCIE1_MAC_INT_ENABLE, 0x0);
+ }
+
+- set_bit(MT76_RESET, &dev->mphy.state);
+ set_bit(MT76_MCU_RESET, &dev->mphy.state);
++ mt7996_for_each_phy(dev, phy)
++ set_bit(MT76_RESET, &phy->mt76->state);
+ wake_up(&dev->mt76.mcu.wait);
+- if (phy2)
+- set_bit(MT76_RESET, &phy2->mt76->state);
+- if (phy3)
+- set_bit(MT76_RESET, &phy3->mt76->state);
+
+ /* lock/unlock all queues to ensure that no tx is pending */
+- mt76_txq_schedule_all(&dev->mphy);
+- if (phy2)
+- mt76_txq_schedule_all(phy2->mt76);
+- if (phy3)
+- mt76_txq_schedule_all(phy3->mt76);
++ mt7996_for_each_phy(dev, phy)
++ mt76_txq_schedule_all(phy->mt76);
+
+ /* disable all tx/rx napi */
+ mt76_worker_disable(&dev->mt76.tx_worker);
+@@ -1855,36 +1846,25 @@ mt7996_mac_restart(struct mt7996_dev *dev)
+ goto out;
+
+ mt7996_mac_init(dev);
+- mt7996_init_txpower(&dev->phy);
+- mt7996_init_txpower(phy2);
+- mt7996_init_txpower(phy3);
++ mt7996_for_each_phy(dev, phy)
++ mt7996_init_txpower(phy);
+ ret = mt7996_txbf_init(dev);
++ if (ret)
++ goto out;
+
+- if (test_bit(MT76_STATE_RUNNING, &dev->mphy.state)) {
+- ret = mt7996_run(&dev->phy);
+- if (ret)
+- goto out;
+- }
+-
+- if (phy2 && test_bit(MT76_STATE_RUNNING, &phy2->mt76->state)) {
+- ret = mt7996_run(phy2);
+- if (ret)
+- goto out;
+- }
++ mt7996_for_each_phy(dev, phy) {
++ if (!test_bit(MT76_STATE_RUNNING, &phy->mt76->state))
++ continue;
+
+- if (phy3 && test_bit(MT76_STATE_RUNNING, &phy3->mt76->state)) {
+- ret = mt7996_run(phy3);
++ ret = mt7996_run(&dev->phy);
+ if (ret)
+ goto out;
+ }
+
+ out:
+ /* reset done */
+- clear_bit(MT76_RESET, &dev->mphy.state);
+- if (phy2)
+- clear_bit(MT76_RESET, &phy2->mt76->state);
+- if (phy3)
+- clear_bit(MT76_RESET, &phy3->mt76->state);
++ mt7996_for_each_phy(dev, phy)
++ clear_bit(MT76_RESET, &phy->mt76->state);
+
+ napi_enable(&dev->mt76.tx_napi);
+ local_bh_disable();
+@@ -1898,26 +1878,18 @@ mt7996_mac_restart(struct mt7996_dev *dev)
+ static void
+ mt7996_mac_full_reset(struct mt7996_dev *dev)
+ {
+- struct mt7996_phy *phy2, *phy3;
++ struct ieee80211_hw *hw = mt76_hw(dev);
++ struct mt7996_phy *phy;
+ int i;
+
+- phy2 = mt7996_phy2(dev);
+- phy3 = mt7996_phy3(dev);
+ dev->recovery.hw_full_reset = true;
+
+ wake_up(&dev->mt76.mcu.wait);
+- ieee80211_stop_queues(mt76_hw(dev));
+- if (phy2)
+- ieee80211_stop_queues(phy2->mt76->hw);
+- if (phy3)
+- ieee80211_stop_queues(phy3->mt76->hw);
++ ieee80211_stop_queues(hw);
+
+ cancel_work_sync(&dev->wed_rro.work);
+- cancel_delayed_work_sync(&dev->mphy.mac_work);
+- if (phy2)
+- cancel_delayed_work_sync(&phy2->mt76->mac_work);
+- if (phy3)
+- cancel_delayed_work_sync(&phy3->mt76->mac_work);
++ mt7996_for_each_phy(dev, phy)
++ cancel_delayed_work_sync(&phy->mt76->mac_work);
+
+ mutex_lock(&dev->mt76.mutex);
+ for (i = 0; i < 10; i++) {
+@@ -1930,40 +1902,23 @@ mt7996_mac_full_reset(struct mt7996_dev *dev)
+ dev_err(dev->mt76.dev, "chip full reset failed\n");
+
+ ieee80211_restart_hw(mt76_hw(dev));
+- if (phy2)
+- ieee80211_restart_hw(phy2->mt76->hw);
+- if (phy3)
+- ieee80211_restart_hw(phy3->mt76->hw);
+-
+ ieee80211_wake_queues(mt76_hw(dev));
+- if (phy2)
+- ieee80211_wake_queues(phy2->mt76->hw);
+- if (phy3)
+- ieee80211_wake_queues(phy3->mt76->hw);
+
+ dev->recovery.hw_full_reset = false;
+- ieee80211_queue_delayed_work(mt76_hw(dev),
+- &dev->mphy.mac_work,
+- MT7996_WATCHDOG_TIME);
+- if (phy2)
+- ieee80211_queue_delayed_work(phy2->mt76->hw,
+- &phy2->mt76->mac_work,
+- MT7996_WATCHDOG_TIME);
+- if (phy3)
+- ieee80211_queue_delayed_work(phy3->mt76->hw,
+- &phy3->mt76->mac_work,
++ mt7996_for_each_phy(dev, phy)
++ ieee80211_queue_delayed_work(hw, &phy->mt76->mac_work,
+ MT7996_WATCHDOG_TIME);
+ }
+
+ void mt7996_mac_reset_work(struct work_struct *work)
+ {
+- struct mt7996_phy *phy2, *phy3;
++ struct ieee80211_hw *hw;
+ struct mt7996_dev *dev;
++ struct mt7996_phy *phy;
+ int i;
+
+ dev = container_of(work, struct mt7996_dev, reset_work);
+- phy2 = mt7996_phy2(dev);
+- phy3 = mt7996_phy3(dev);
++ hw = mt76_hw(dev);
+
+ /* chip full reset */
+ if (dev->recovery.restart) {
+@@ -1994,7 +1949,7 @@ void mt7996_mac_reset_work(struct work_struct *work)
+ return;
+
+ dev_info(dev->mt76.dev,"\n%s L1 SER recovery start.",
+- wiphy_name(dev->mt76.hw->wiphy));
++ wiphy_name(hw->wiphy));
+
+ if (mtk_wed_device_active(&dev->mt76.mmio.wed_hif2))
+ mtk_wed_device_stop(&dev->mt76.mmio.wed_hif2);
+@@ -2003,25 +1958,17 @@ void mt7996_mac_reset_work(struct work_struct *work)
+ mtk_wed_device_stop(&dev->mt76.mmio.wed);
+
+ ieee80211_stop_queues(mt76_hw(dev));
+- if (phy2)
+- ieee80211_stop_queues(phy2->mt76->hw);
+- if (phy3)
+- ieee80211_stop_queues(phy3->mt76->hw);
+
+ set_bit(MT76_RESET, &dev->mphy.state);
+ set_bit(MT76_MCU_RESET, &dev->mphy.state);
+ wake_up(&dev->mt76.mcu.wait);
+
+ cancel_work_sync(&dev->wed_rro.work);
+- cancel_delayed_work_sync(&dev->mphy.mac_work);
+- if (phy2) {
+- set_bit(MT76_RESET, &phy2->mt76->state);
+- cancel_delayed_work_sync(&phy2->mt76->mac_work);
+- }
+- if (phy3) {
+- set_bit(MT76_RESET, &phy3->mt76->state);
+- cancel_delayed_work_sync(&phy3->mt76->mac_work);
++ mt7996_for_each_phy(dev, phy) {
++ set_bit(MT76_RESET, &phy->mt76->state);
++ cancel_delayed_work_sync(&phy->mt76->mac_work);
+ }
++
+ mt76_worker_disable(&dev->mt76.tx_worker);
+ mt76_for_each_q_rx(&dev->mt76, i) {
+ if (mtk_wed_device_active(&dev->mt76.mmio.wed) &&
+@@ -2074,11 +2021,8 @@ void mt7996_mac_reset_work(struct work_struct *work)
+ }
+
+ clear_bit(MT76_MCU_RESET, &dev->mphy.state);
+- clear_bit(MT76_RESET, &dev->mphy.state);
+- if (phy2)
+- clear_bit(MT76_RESET, &phy2->mt76->state);
+- if (phy3)
+- clear_bit(MT76_RESET, &phy3->mt76->state);
++ mt7996_for_each_phy(dev, phy)
++ clear_bit(MT76_RESET, &phy->mt76->state);
+
+ mt76_for_each_q_rx(&dev->mt76, i) {
+ if (mtk_wed_device_active(&dev->mt76.mmio.wed) &&
+@@ -2100,25 +2044,14 @@ void mt7996_mac_reset_work(struct work_struct *work)
+ napi_schedule(&dev->mt76.tx_napi);
+ local_bh_enable();
+
+- ieee80211_wake_queues(mt76_hw(dev));
+- if (phy2)
+- ieee80211_wake_queues(phy2->mt76->hw);
+- if (phy3)
+- ieee80211_wake_queues(phy3->mt76->hw);
++ ieee80211_wake_queues(hw);
+
+ mutex_unlock(&dev->mt76.mutex);
+
+ mt7996_update_beacons(dev);
+
+- ieee80211_queue_delayed_work(mt76_hw(dev), &dev->mphy.mac_work,
+- MT7996_WATCHDOG_TIME);
+- if (phy2)
+- ieee80211_queue_delayed_work(phy2->mt76->hw,
+- &phy2->mt76->mac_work,
+- MT7996_WATCHDOG_TIME);
+- if (phy3)
+- ieee80211_queue_delayed_work(phy3->mt76->hw,
+- &phy3->mt76->mac_work,
++ mt7996_for_each_phy(dev, phy)
++ ieee80211_queue_delayed_work(hw, &phy->mt76->mac_work,
+ MT7996_WATCHDOG_TIME);
+ dev_info(dev->mt76.dev,"\n%s L1 SER recovery completed.",
+ wiphy_name(dev->mt76.hw->wiphy));
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7996/main.c b/drivers/net/wireless/mediatek/mt76/mt7996/main.c
+index 84f731b387d20a..d01b5778da20e9 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7996/main.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7996/main.c
+@@ -138,6 +138,28 @@ static int get_omac_idx(enum nl80211_iftype type, u64 mask)
+ return -1;
+ }
+
++static int get_own_mld_idx(u64 mask, bool group_mld)
++{
++ u8 start = group_mld ? 0 : 16;
++ u8 end = group_mld ? 15 : 63;
++ int idx;
++
++ idx = get_free_idx(mask, start, end);
++ if (idx)
++ return idx - 1;
++
++ /* If the 16-63 range is not available, perform another lookup in the
++ * range 0-15
++ */
++ if (!group_mld) {
++ idx = get_free_idx(mask, 0, 15);
++ if (idx)
++ return idx - 1;
++ }
++
++ return -EINVAL;
++}
++
+ static void
+ mt7996_init_bitrate_mask(struct ieee80211_vif *vif, struct mt7996_vif_link *mlink)
+ {
+@@ -279,7 +301,7 @@ int mt7996_vif_link_add(struct mt76_phy *mphy, struct ieee80211_vif *vif,
+ struct mt7996_dev *dev = phy->dev;
+ u8 band_idx = phy->mt76->band_idx;
+ struct mt76_txq *mtxq;
+- int idx, ret;
++ int mld_idx, idx, ret;
+
+ mlink->idx = __ffs64(~dev->mt76.vif_mask);
+ if (mlink->idx >= mt7996_max_interface_num(dev))
+@@ -289,6 +311,17 @@ int mt7996_vif_link_add(struct mt76_phy *mphy, struct ieee80211_vif *vif,
+ if (idx < 0)
+ return -ENOSPC;
+
++ if (!dev->mld_idx_mask) { /* first link in the group */
++ mvif->mld_group_idx = get_own_mld_idx(dev->mld_idx_mask, true);
++ mvif->mld_remap_idx = get_free_idx(dev->mld_remap_idx_mask,
++ 0, 15);
++ }
++
++ mld_idx = get_own_mld_idx(dev->mld_idx_mask, false);
++ if (mld_idx < 0)
++ return -ENOSPC;
++
++ link->mld_idx = mld_idx;
+ link->phy = phy;
+ mlink->omac_idx = idx;
+ mlink->band_idx = band_idx;
+@@ -301,6 +334,11 @@ int mt7996_vif_link_add(struct mt76_phy *mphy, struct ieee80211_vif *vif,
+ return ret;
+
+ dev->mt76.vif_mask |= BIT_ULL(mlink->idx);
++ if (!dev->mld_idx_mask) {
++ dev->mld_idx_mask |= BIT_ULL(mvif->mld_group_idx);
++ dev->mld_remap_idx_mask |= BIT_ULL(mvif->mld_remap_idx);
++ }
++ dev->mld_idx_mask |= BIT_ULL(link->mld_idx);
+ phy->omac_mask |= BIT_ULL(mlink->omac_idx);
+
+ idx = MT7996_WTBL_RESERVED - mlink->idx;
+@@ -380,7 +418,13 @@ void mt7996_vif_link_remove(struct mt76_phy *mphy, struct ieee80211_vif *vif,
+ }
+
+ dev->mt76.vif_mask &= ~BIT_ULL(mlink->idx);
++ dev->mld_idx_mask &= ~BIT_ULL(link->mld_idx);
+ phy->omac_mask &= ~BIT_ULL(mlink->omac_idx);
++ if (!(dev->mld_idx_mask & ~BIT_ULL(mvif->mld_group_idx))) {
++ /* last link */
++ dev->mld_idx_mask &= ~BIT_ULL(mvif->mld_group_idx);
++ dev->mld_remap_idx_mask &= ~BIT_ULL(mvif->mld_remap_idx);
++ }
+
+ spin_lock_bh(&dev->mt76.sta_poll_lock);
+ if (!list_empty(&msta_link->wcid.poll_list))
+@@ -1036,16 +1080,17 @@ mt7996_mac_sta_add_links(struct mt7996_dev *dev, struct ieee80211_vif *vif,
+ goto error_unlink;
+ }
+
+- err = mt7996_mac_sta_init_link(dev, link_conf, link_sta, link,
+- link_id);
+- if (err)
+- goto error_unlink;
+-
+ mphy = mt76_vif_link_phy(&link->mt76);
+ if (!mphy) {
+ err = -EINVAL;
+ goto error_unlink;
+ }
++
++ err = mt7996_mac_sta_init_link(dev, link_conf, link_sta, link,
++ link_id);
++ if (err)
++ goto error_unlink;
++
+ mphy->num_sta++;
+ }
+
+@@ -1327,11 +1372,13 @@ mt7996_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
+ case IEEE80211_AMPDU_RX_START:
+ mt76_rx_aggr_start(&dev->mt76, &msta_link->wcid, tid,
+ ssn, params->buf_size);
+- ret = mt7996_mcu_add_rx_ba(dev, params, link, true);
++ ret = mt7996_mcu_add_rx_ba(dev, params, link,
++ msta_link, true);
+ break;
+ case IEEE80211_AMPDU_RX_STOP:
+ mt76_rx_aggr_stop(&dev->mt76, &msta_link->wcid, tid);
+- ret = mt7996_mcu_add_rx_ba(dev, params, link, false);
++ ret = mt7996_mcu_add_rx_ba(dev, params, link,
++ msta_link, false);
+ break;
+ case IEEE80211_AMPDU_TX_OPERATIONAL:
+ mtxq->aggr = true;
+@@ -1617,19 +1664,13 @@ static void mt7996_sta_statistics(struct ieee80211_hw *hw,
+ }
+ }
+
+-static void mt7996_link_rate_ctrl_update(void *data, struct ieee80211_sta *sta)
++static void mt7996_link_rate_ctrl_update(void *data,
++ struct mt7996_sta_link *msta_link)
+ {
+- struct mt7996_sta *msta = (struct mt7996_sta *)sta->drv_priv;
++ struct mt7996_sta *msta = msta_link->sta;
+ struct mt7996_dev *dev = msta->vif->deflink.phy->dev;
+- struct mt7996_sta_link *msta_link;
+ u32 *changed = data;
+
+- rcu_read_lock();
+-
+- msta_link = rcu_dereference(msta->link[msta->deflink_id]);
+- if (!msta_link)
+- goto out;
+-
+ spin_lock_bh(&dev->mt76.sta_poll_lock);
+
+ msta_link->changed |= *changed;
+@@ -1637,8 +1678,6 @@ static void mt7996_link_rate_ctrl_update(void *data, struct ieee80211_sta *sta)
+ list_add_tail(&msta_link->rc_list, &dev->sta_rc_list);
+
+ spin_unlock_bh(&dev->mt76.sta_poll_lock);
+-out:
+- rcu_read_unlock();
+ }
+
+ static void mt7996_link_sta_rc_update(struct ieee80211_hw *hw,
+@@ -1646,11 +1685,32 @@ static void mt7996_link_sta_rc_update(struct ieee80211_hw *hw,
+ struct ieee80211_link_sta *link_sta,
+ u32 changed)
+ {
+- struct mt7996_dev *dev = mt7996_hw_dev(hw);
+ struct ieee80211_sta *sta = link_sta->sta;
++ struct mt7996_sta *msta = (struct mt7996_sta *)sta->drv_priv;
++ struct mt7996_sta_link *msta_link;
+
+- mt7996_link_rate_ctrl_update(&changed, sta);
+- ieee80211_queue_work(hw, &dev->rc_work);
++ rcu_read_lock();
++
++ msta_link = rcu_dereference(msta->link[link_sta->link_id]);
++ if (msta_link) {
++ struct mt7996_dev *dev = mt7996_hw_dev(hw);
++
++ mt7996_link_rate_ctrl_update(&changed, msta_link);
++ ieee80211_queue_work(hw, &dev->rc_work);
++ }
++
++ rcu_read_unlock();
++}
++
++static void mt7996_sta_rate_ctrl_update(void *data, struct ieee80211_sta *sta)
++{
++ struct mt7996_sta *msta = (struct mt7996_sta *)sta->drv_priv;
++ struct mt7996_sta_link *msta_link;
++ u32 *changed = data;
++
++ msta_link = rcu_dereference(msta->link[msta->deflink_id]);
++ if (msta_link)
++ mt7996_link_rate_ctrl_update(&changed, msta_link);
+ }
+
+ static int
+@@ -1671,7 +1731,7 @@ mt7996_set_bitrate_mask(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
+ * - multiple rates: if it's not in range format i.e 0-{7,8,9} for VHT
+ * then multiple MCS setting (MCS 4,5,6) is not supported.
+ */
+- ieee80211_iterate_stations_atomic(hw, mt7996_link_rate_ctrl_update,
++ ieee80211_iterate_stations_atomic(hw, mt7996_sta_rate_ctrl_update,
+ &changed);
+ ieee80211_queue_work(hw, &dev->rc_work);
+
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c
+index 0be03eb3cf4613..aad58f7831c7b2 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c
+@@ -899,17 +899,28 @@ mt7996_mcu_bss_txcmd_tlv(struct sk_buff *skb, bool en)
+ }
+
+ static void
+-mt7996_mcu_bss_mld_tlv(struct sk_buff *skb, struct mt76_vif_link *mlink)
++mt7996_mcu_bss_mld_tlv(struct sk_buff *skb,
++ struct ieee80211_bss_conf *link_conf,
++ struct mt7996_vif_link *link)
+ {
++ struct ieee80211_vif *vif = link_conf->vif;
++ struct mt7996_vif *mvif = (struct mt7996_vif *)vif->drv_priv;
+ struct bss_mld_tlv *mld;
+ struct tlv *tlv;
+
+ tlv = mt7996_mcu_add_uni_tlv(skb, UNI_BSS_INFO_MLD, sizeof(*mld));
+-
+ mld = (struct bss_mld_tlv *)tlv;
+- mld->group_mld_id = 0xff;
+- mld->own_mld_id = mlink->idx;
+- mld->remap_idx = 0xff;
++ mld->own_mld_id = link->mld_idx;
++ mld->link_id = link_conf->link_id;
++
++ if (ieee80211_vif_is_mld(vif)) {
++ mld->group_mld_id = mvif->mld_group_idx;
++ mld->remap_idx = mvif->mld_remap_idx;
++ memcpy(mld->mac_addr, vif->addr, ETH_ALEN);
++ } else {
++ mld->group_mld_id = 0xff;
++ mld->remap_idx = 0xff;
++ }
+ }
+
+ static void
+@@ -1108,6 +1119,8 @@ int mt7996_mcu_add_bss_info(struct mt7996_phy *phy, struct ieee80211_vif *vif,
+ goto out;
+
+ if (enable) {
++ struct mt7996_vif_link *link;
++
+ mt7996_mcu_bss_rfch_tlv(skb, phy);
+ mt7996_mcu_bss_bmc_tlv(skb, mlink, phy);
+ mt7996_mcu_bss_ra_tlv(skb, phy);
+@@ -1118,7 +1131,8 @@ int mt7996_mcu_add_bss_info(struct mt7996_phy *phy, struct ieee80211_vif *vif,
+ mt7996_mcu_bss_he_tlv(skb, vif, link_conf, phy);
+
+ /* this tag is necessary no matter if the vif is MLD */
+- mt7996_mcu_bss_mld_tlv(skb, mlink);
++ link = container_of(mlink, struct mt7996_vif_link, mt76);
++ mt7996_mcu_bss_mld_tlv(skb, link_conf, link);
+ }
+
+ mt7996_mcu_bss_mbssid_tlv(skb, link_conf, enable);
+@@ -1149,9 +1163,8 @@ int mt7996_mcu_set_timing(struct mt7996_phy *phy, struct ieee80211_vif *vif,
+ static int
+ mt7996_mcu_sta_ba(struct mt7996_dev *dev, struct mt76_vif_link *mvif,
+ struct ieee80211_ampdu_params *params,
+- bool enable, bool tx)
++ struct mt76_wcid *wcid, bool enable, bool tx)
+ {
+- struct mt76_wcid *wcid = (struct mt76_wcid *)params->sta->drv_priv;
+ struct sta_rec_ba_uni *ba;
+ struct sk_buff *skb;
+ struct tlv *tlv;
+@@ -1185,14 +1198,17 @@ int mt7996_mcu_add_tx_ba(struct mt7996_dev *dev,
+ if (enable && !params->amsdu)
+ msta_link->wcid.amsdu = false;
+
+- return mt7996_mcu_sta_ba(dev, &link->mt76, params, enable, true);
++ return mt7996_mcu_sta_ba(dev, &link->mt76, params, &msta_link->wcid,
++ enable, true);
+ }
+
+ int mt7996_mcu_add_rx_ba(struct mt7996_dev *dev,
+ struct ieee80211_ampdu_params *params,
+- struct mt7996_vif_link *link, bool enable)
++ struct mt7996_vif_link *link,
++ struct mt7996_sta_link *msta_link, bool enable)
+ {
+- return mt7996_mcu_sta_ba(dev, &link->mt76, params, enable, false);
++ return mt7996_mcu_sta_ba(dev, &link->mt76, params, &msta_link->wcid,
++ enable, false);
+ }
+
+ static void
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7996/mcu.h b/drivers/net/wireless/mediatek/mt76/mt7996/mcu.h
+index 130ea95626d5b1..7b21d6ae7e4350 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7996/mcu.h
++++ b/drivers/net/wireless/mediatek/mt76/mt7996/mcu.h
+@@ -481,7 +481,8 @@ struct bss_mld_tlv {
+ u8 own_mld_id;
+ u8 mac_addr[ETH_ALEN];
+ u8 remap_idx;
+- u8 __rsv[3];
++ u8 link_id;
++ u8 __rsv[2];
+ } __packed;
+
+ struct sta_rec_ht_uni {
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7996/mt7996.h b/drivers/net/wireless/mediatek/mt76/mt7996/mt7996.h
+index 8509d508e1e19c..048d9a9898c6ec 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7996/mt7996.h
++++ b/drivers/net/wireless/mediatek/mt76/mt7996/mt7996.h
+@@ -248,11 +248,16 @@ struct mt7996_vif_link {
+
+ struct ieee80211_tx_queue_params queue_params[IEEE80211_NUM_ACS];
+ struct cfg80211_bitrate_mask bitrate_mask;
++
++ u8 mld_idx;
+ };
+
+ struct mt7996_vif {
+ struct mt7996_vif_link deflink; /* must be first */
+ struct mt76_vif_data mt76;
++
++ u8 mld_group_idx;
++ u8 mld_remap_idx;
+ };
+
+ /* crash-dump */
+@@ -272,13 +277,12 @@ struct mt7996_hif {
+ int irq;
+ };
+
++#define WED_RRO_ADDR_SIGNATURE_MASK GENMASK(31, 24)
++#define WED_RRO_ADDR_COUNT_MASK GENMASK(14, 4)
++#define WED_RRO_ADDR_HEAD_HIGH_MASK GENMASK(3, 0)
+ struct mt7996_wed_rro_addr {
+- u32 head_low;
+- u32 head_high : 4;
+- u32 count: 11;
+- u32 oor: 1;
+- u32 rsv : 8;
+- u32 signature : 8;
++ __le32 head_low;
++ __le32 data;
+ };
+
+ struct mt7996_wed_rro_session_id {
+@@ -337,6 +341,9 @@ struct mt7996_dev {
+ u32 q_int_mask[MT7996_MAX_QUEUE];
+ u32 q_wfdma_mask;
+
++ u64 mld_idx_mask;
++ u64 mld_remap_idx_mask;
++
+ const struct mt76_bus_ops *bus_ops;
+ struct mt7996_phy phy;
+
+@@ -608,7 +615,8 @@ int mt7996_mcu_add_tx_ba(struct mt7996_dev *dev,
+ struct mt7996_sta_link *msta_link, bool enable);
+ int mt7996_mcu_add_rx_ba(struct mt7996_dev *dev,
+ struct ieee80211_ampdu_params *params,
+- struct mt7996_vif_link *link, bool enable);
++ struct mt7996_vif_link *link,
++ struct mt7996_sta_link *msta_link, bool enable);
+ int mt7996_mcu_update_bss_color(struct mt7996_dev *dev,
+ struct mt76_vif_link *mlink,
+ struct cfg80211_he_bss_color *he_bss_color);
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7996/pci.c b/drivers/net/wireless/mediatek/mt76/mt7996/pci.c
+index 19e99bc1c6c415..f5ce50056ee94e 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7996/pci.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7996/pci.c
+@@ -137,6 +137,7 @@ static int mt7996_pci_probe(struct pci_dev *pdev,
+ mdev = &dev->mt76;
+ mt7996_wfsys_reset(dev);
+ hif2 = mt7996_pci_init_hif2(pdev);
++ dev->hif2 = hif2;
+
+ ret = mt7996_mmio_wed_init(dev, pdev, false, &irq);
+ if (ret < 0)
+@@ -161,7 +162,6 @@ static int mt7996_pci_probe(struct pci_dev *pdev,
+
+ if (hif2) {
+ hif2_dev = container_of(hif2->dev, struct pci_dev, dev);
+- dev->hif2 = hif2;
+
+ ret = mt7996_mmio_wed_init(dev, hif2_dev, true, &hif2_irq);
+ if (ret < 0)
+diff --git a/drivers/net/wireless/realtek/rtw88/led.c b/drivers/net/wireless/realtek/rtw88/led.c
+index 25aa6cbaa7286b..4cc62e49d1679a 100644
+--- a/drivers/net/wireless/realtek/rtw88/led.c
++++ b/drivers/net/wireless/realtek/rtw88/led.c
+@@ -6,13 +6,17 @@
+ #include "debug.h"
+ #include "led.h"
+
+-static int rtw_led_set_blocking(struct led_classdev *led,
+- enum led_brightness brightness)
++static int rtw_led_set(struct led_classdev *led,
++ enum led_brightness brightness)
+ {
+ struct rtw_dev *rtwdev = container_of(led, struct rtw_dev, led_cdev);
+
++ mutex_lock(&rtwdev->mutex);
++
+ rtwdev->chip->ops->led_set(led, brightness);
+
++ mutex_unlock(&rtwdev->mutex);
++
+ return 0;
+ }
+
+@@ -36,10 +40,7 @@ void rtw_led_init(struct rtw_dev *rtwdev)
+ if (!rtwdev->chip->ops->led_set)
+ return;
+
+- if (rtw_hci_type(rtwdev) == RTW_HCI_TYPE_PCIE)
+- led->brightness_set = rtwdev->chip->ops->led_set;
+- else
+- led->brightness_set_blocking = rtw_led_set_blocking;
++ led->brightness_set_blocking = rtw_led_set;
+
+ snprintf(rtwdev->led_name, sizeof(rtwdev->led_name),
+ "rtw88-%s", dev_name(rtwdev->dev));
+diff --git a/drivers/net/wireless/realtek/rtw89/core.c b/drivers/net/wireless/realtek/rtw89/core.c
+index b9c2224dde4a37..1837f17239ab60 100644
+--- a/drivers/net/wireless/realtek/rtw89/core.c
++++ b/drivers/net/wireless/realtek/rtw89/core.c
+@@ -3456,6 +3456,7 @@ int rtw89_core_send_nullfunc(struct rtw89_dev *rtwdev, struct rtw89_vif_link *rt
+ rtwsta_link = rtwsta->links[rtwvif_link->link_id];
+ if (unlikely(!rtwsta_link)) {
+ ret = -ENOLINK;
++ dev_kfree_skb_any(skb);
+ goto out;
+ }
+
+diff --git a/drivers/net/wireless/realtek/rtw89/ser.c b/drivers/net/wireless/realtek/rtw89/ser.c
+index fe7beff8c42465..f99e179f7ff9fe 100644
+--- a/drivers/net/wireless/realtek/rtw89/ser.c
++++ b/drivers/net/wireless/realtek/rtw89/ser.c
+@@ -205,7 +205,6 @@ static void rtw89_ser_hdl_work(struct work_struct *work)
+
+ static int ser_send_msg(struct rtw89_ser *ser, u8 event)
+ {
+- struct rtw89_dev *rtwdev = container_of(ser, struct rtw89_dev, ser);
+ struct ser_msg *msg = NULL;
+
+ if (test_bit(RTW89_SER_DRV_STOP_RUN, ser->flags))
+@@ -221,7 +220,7 @@ static int ser_send_msg(struct rtw89_ser *ser, u8 event)
+ list_add(&msg->list, &ser->msg_q);
+ spin_unlock_irq(&ser->msg_q_lock);
+
+- ieee80211_queue_work(rtwdev->hw, &ser->ser_hdl_work);
++ schedule_work(&ser->ser_hdl_work);
+ return 0;
+ }
+
+diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
+index 201fc8809a628c..012fcfc79a73b1 100644
+--- a/drivers/nvme/host/auth.c
++++ b/drivers/nvme/host/auth.c
+@@ -331,9 +331,10 @@ static int nvme_auth_set_dhchap_reply_data(struct nvme_ctrl *ctrl,
+ } else {
+ memset(chap->c2, 0, chap->hash_len);
+ }
+- if (ctrl->opts->concat)
++ if (ctrl->opts->concat) {
+ chap->s2 = 0;
+- else
++ chap->bi_directional = false;
++ } else
+ chap->s2 = nvme_auth_get_seqnum();
+ data->seqnum = cpu_to_le32(chap->s2);
+ if (chap->host_key_len) {
+diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
+index c0fe8cfb7229e1..1413788ca7d523 100644
+--- a/drivers/nvme/host/tcp.c
++++ b/drivers/nvme/host/tcp.c
+@@ -2250,6 +2250,9 @@ static int nvme_tcp_configure_admin_queue(struct nvme_ctrl *ctrl, bool new)
+ if (error)
+ goto out_cleanup_tagset;
+
++ if (ctrl->opts->concat && !ctrl->tls_pskid)
++ return 0;
++
+ error = nvme_enable_ctrl(ctrl);
+ if (error)
+ goto out_stop_queue;
+diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
+index a9b18c051f5bd8..6725c34dd7c90a 100644
+--- a/drivers/nvme/target/fc.c
++++ b/drivers/nvme/target/fc.c
+@@ -54,6 +54,8 @@ struct nvmet_fc_ls_req_op { /* for an LS RQST XMT */
+ int ls_error;
+ struct list_head lsreq_list; /* tgtport->ls_req_list */
+ bool req_queued;
++
++ struct work_struct put_work;
+ };
+
+
+@@ -111,8 +113,6 @@ struct nvmet_fc_tgtport {
+ struct nvmet_fc_port_entry *pe;
+ struct kref ref;
+ u32 max_sg_cnt;
+-
+- struct work_struct put_work;
+ };
+
+ struct nvmet_fc_port_entry {
+@@ -235,12 +235,13 @@ static int nvmet_fc_tgt_a_get(struct nvmet_fc_tgt_assoc *assoc);
+ static void nvmet_fc_tgt_q_put(struct nvmet_fc_tgt_queue *queue);
+ static int nvmet_fc_tgt_q_get(struct nvmet_fc_tgt_queue *queue);
+ static void nvmet_fc_tgtport_put(struct nvmet_fc_tgtport *tgtport);
+-static void nvmet_fc_put_tgtport_work(struct work_struct *work)
++static void nvmet_fc_put_lsop_work(struct work_struct *work)
+ {
+- struct nvmet_fc_tgtport *tgtport =
+- container_of(work, struct nvmet_fc_tgtport, put_work);
++ struct nvmet_fc_ls_req_op *lsop =
++ container_of(work, struct nvmet_fc_ls_req_op, put_work);
+
+- nvmet_fc_tgtport_put(tgtport);
++ nvmet_fc_tgtport_put(lsop->tgtport);
++ kfree(lsop);
+ }
+ static int nvmet_fc_tgtport_get(struct nvmet_fc_tgtport *tgtport);
+ static void nvmet_fc_handle_fcp_rqst(struct nvmet_fc_tgtport *tgtport,
+@@ -367,7 +368,7 @@ __nvmet_fc_finish_ls_req(struct nvmet_fc_ls_req_op *lsop)
+ DMA_BIDIRECTIONAL);
+
+ out_putwork:
+- queue_work(nvmet_wq, &tgtport->put_work);
++ queue_work(nvmet_wq, &lsop->put_work);
+ }
+
+ static int
+@@ -388,6 +389,7 @@ __nvmet_fc_send_ls_req(struct nvmet_fc_tgtport *tgtport,
+ lsreq->done = done;
+ lsop->req_queued = false;
+ INIT_LIST_HEAD(&lsop->lsreq_list);
++ INIT_WORK(&lsop->put_work, nvmet_fc_put_lsop_work);
+
+ lsreq->rqstdma = fc_dma_map_single(tgtport->dev, lsreq->rqstaddr,
+ lsreq->rqstlen + lsreq->rsplen,
+@@ -447,8 +449,6 @@ nvmet_fc_disconnect_assoc_done(struct nvmefc_ls_req *lsreq, int status)
+ __nvmet_fc_finish_ls_req(lsop);
+
+ /* fc-nvme target doesn't care about success or failure of cmd */
+-
+- kfree(lsop);
+ }
+
+ /*
+@@ -1410,7 +1410,6 @@ nvmet_fc_register_targetport(struct nvmet_fc_port_info *pinfo,
+ kref_init(&newrec->ref);
+ ida_init(&newrec->assoc_cnt);
+ newrec->max_sg_cnt = template->max_sgl_segments;
+- INIT_WORK(&newrec->put_work, nvmet_fc_put_tgtport_work);
+
+ ret = nvmet_fc_alloc_ls_iodlist(newrec);
+ if (ret) {
+diff --git a/drivers/nvme/target/fcloop.c b/drivers/nvme/target/fcloop.c
+index 257b497d515a89..5dffcc5becae86 100644
+--- a/drivers/nvme/target/fcloop.c
++++ b/drivers/nvme/target/fcloop.c
+@@ -496,13 +496,15 @@ fcloop_t2h_xmt_ls_rsp(struct nvme_fc_local_port *localport,
+ if (!targetport) {
+ /*
+ * The target port is gone. The target doesn't expect any
+- * response anymore and the ->done call is not valid
+- * because the resources have been freed by
+- * nvmet_fc_free_pending_reqs.
++ * response anymore and thus lsreq can't be accessed anymore.
+ *
+ * We end up here from delete association exchange:
+ * nvmet_fc_xmt_disconnect_assoc sends an async request.
++ *
++ * Return success because this is what LLDDs do; silently
++ * drop the response.
+ */
++ lsrsp->done(lsrsp);
+ kmem_cache_free(lsreq_cache, tls_req);
+ return 0;
+ }
+diff --git a/drivers/pci/controller/cadence/pci-j721e.c b/drivers/pci/controller/cadence/pci-j721e.c
+index 6c93f39d028883..5e445a7bda3328 100644
+--- a/drivers/pci/controller/cadence/pci-j721e.c
++++ b/drivers/pci/controller/cadence/pci-j721e.c
+@@ -549,7 +549,7 @@ static int j721e_pcie_probe(struct platform_device *pdev)
+
+ ret = j721e_pcie_ctrl_init(pcie);
+ if (ret < 0) {
+- dev_err_probe(dev, ret, "pm_runtime_get_sync failed\n");
++ dev_err_probe(dev, ret, "j721e_pcie_ctrl_init failed\n");
+ goto err_get_sync;
+ }
+
+diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
+index 00f52d472dcdd7..cc71a2d90cd48c 100644
+--- a/drivers/pci/controller/dwc/pcie-designware.h
++++ b/drivers/pci/controller/dwc/pcie-designware.h
+@@ -123,7 +123,6 @@
+ #define GEN3_RELATED_OFF_GEN3_EQ_DISABLE BIT(16)
+ #define GEN3_RELATED_OFF_RATE_SHADOW_SEL_SHIFT 24
+ #define GEN3_RELATED_OFF_RATE_SHADOW_SEL_MASK GENMASK(25, 24)
+-#define GEN3_RELATED_OFF_RATE_SHADOW_SEL_16_0GT 0x1
+
+ #define GEN3_EQ_CONTROL_OFF 0x8A8
+ #define GEN3_EQ_CONTROL_OFF_FB_MODE GENMASK(3, 0)
+diff --git a/drivers/pci/controller/dwc/pcie-qcom-common.c b/drivers/pci/controller/dwc/pcie-qcom-common.c
+index 3aad19b56da8f6..0c6f4514f922f4 100644
+--- a/drivers/pci/controller/dwc/pcie-qcom-common.c
++++ b/drivers/pci/controller/dwc/pcie-qcom-common.c
+@@ -8,9 +8,11 @@
+ #include "pcie-designware.h"
+ #include "pcie-qcom-common.h"
+
+-void qcom_pcie_common_set_16gt_equalization(struct dw_pcie *pci)
++void qcom_pcie_common_set_equalization(struct dw_pcie *pci)
+ {
++ struct device *dev = pci->dev;
+ u32 reg;
++ u16 speed;
+
+ /*
+ * GEN3_RELATED_OFF register is repurposed to apply equalization
+@@ -19,32 +21,40 @@ void qcom_pcie_common_set_16gt_equalization(struct dw_pcie *pci)
+ * determines the data rate for which these equalization settings are
+ * applied.
+ */
+- reg = dw_pcie_readl_dbi(pci, GEN3_RELATED_OFF);
+- reg &= ~GEN3_RELATED_OFF_GEN3_ZRXDC_NONCOMPL;
+- reg &= ~GEN3_RELATED_OFF_RATE_SHADOW_SEL_MASK;
+- reg |= FIELD_PREP(GEN3_RELATED_OFF_RATE_SHADOW_SEL_MASK,
+- GEN3_RELATED_OFF_RATE_SHADOW_SEL_16_0GT);
+- dw_pcie_writel_dbi(pci, GEN3_RELATED_OFF, reg);
+
+- reg = dw_pcie_readl_dbi(pci, GEN3_EQ_FB_MODE_DIR_CHANGE_OFF);
+- reg &= ~(GEN3_EQ_FMDC_T_MIN_PHASE23 |
+- GEN3_EQ_FMDC_N_EVALS |
+- GEN3_EQ_FMDC_MAX_PRE_CUSROR_DELTA |
+- GEN3_EQ_FMDC_MAX_POST_CUSROR_DELTA);
+- reg |= FIELD_PREP(GEN3_EQ_FMDC_T_MIN_PHASE23, 0x1) |
+- FIELD_PREP(GEN3_EQ_FMDC_N_EVALS, 0xd) |
+- FIELD_PREP(GEN3_EQ_FMDC_MAX_PRE_CUSROR_DELTA, 0x5) |
+- FIELD_PREP(GEN3_EQ_FMDC_MAX_POST_CUSROR_DELTA, 0x5);
+- dw_pcie_writel_dbi(pci, GEN3_EQ_FB_MODE_DIR_CHANGE_OFF, reg);
++ for (speed = PCIE_SPEED_8_0GT; speed <= pcie_link_speed[pci->max_link_speed]; speed++) {
++ if (speed > PCIE_SPEED_32_0GT) {
++ dev_warn(dev, "Skipped equalization settings for unsupported data rate\n");
++ break;
++ }
+
+- reg = dw_pcie_readl_dbi(pci, GEN3_EQ_CONTROL_OFF);
+- reg &= ~(GEN3_EQ_CONTROL_OFF_FB_MODE |
+- GEN3_EQ_CONTROL_OFF_PHASE23_EXIT_MODE |
+- GEN3_EQ_CONTROL_OFF_FOM_INC_INITIAL_EVAL |
+- GEN3_EQ_CONTROL_OFF_PSET_REQ_VEC);
+- dw_pcie_writel_dbi(pci, GEN3_EQ_CONTROL_OFF, reg);
++ reg = dw_pcie_readl_dbi(pci, GEN3_RELATED_OFF);
++ reg &= ~GEN3_RELATED_OFF_GEN3_ZRXDC_NONCOMPL;
++ reg &= ~GEN3_RELATED_OFF_RATE_SHADOW_SEL_MASK;
++ reg |= FIELD_PREP(GEN3_RELATED_OFF_RATE_SHADOW_SEL_MASK,
++ speed - PCIE_SPEED_8_0GT);
++ dw_pcie_writel_dbi(pci, GEN3_RELATED_OFF, reg);
++
++ reg = dw_pcie_readl_dbi(pci, GEN3_EQ_FB_MODE_DIR_CHANGE_OFF);
++ reg &= ~(GEN3_EQ_FMDC_T_MIN_PHASE23 |
++ GEN3_EQ_FMDC_N_EVALS |
++ GEN3_EQ_FMDC_MAX_PRE_CUSROR_DELTA |
++ GEN3_EQ_FMDC_MAX_POST_CUSROR_DELTA);
++ reg |= FIELD_PREP(GEN3_EQ_FMDC_T_MIN_PHASE23, 0x1) |
++ FIELD_PREP(GEN3_EQ_FMDC_N_EVALS, 0xd) |
++ FIELD_PREP(GEN3_EQ_FMDC_MAX_PRE_CUSROR_DELTA, 0x5) |
++ FIELD_PREP(GEN3_EQ_FMDC_MAX_POST_CUSROR_DELTA, 0x5);
++ dw_pcie_writel_dbi(pci, GEN3_EQ_FB_MODE_DIR_CHANGE_OFF, reg);
++
++ reg = dw_pcie_readl_dbi(pci, GEN3_EQ_CONTROL_OFF);
++ reg &= ~(GEN3_EQ_CONTROL_OFF_FB_MODE |
++ GEN3_EQ_CONTROL_OFF_PHASE23_EXIT_MODE |
++ GEN3_EQ_CONTROL_OFF_FOM_INC_INITIAL_EVAL |
++ GEN3_EQ_CONTROL_OFF_PSET_REQ_VEC);
++ dw_pcie_writel_dbi(pci, GEN3_EQ_CONTROL_OFF, reg);
++ }
+ }
+-EXPORT_SYMBOL_GPL(qcom_pcie_common_set_16gt_equalization);
++EXPORT_SYMBOL_GPL(qcom_pcie_common_set_equalization);
+
+ void qcom_pcie_common_set_16gt_lane_margining(struct dw_pcie *pci)
+ {
+diff --git a/drivers/pci/controller/dwc/pcie-qcom-common.h b/drivers/pci/controller/dwc/pcie-qcom-common.h
+index 7d88d29e476611..7f5ca2fd9a72fc 100644
+--- a/drivers/pci/controller/dwc/pcie-qcom-common.h
++++ b/drivers/pci/controller/dwc/pcie-qcom-common.h
+@@ -8,7 +8,7 @@
+
+ struct dw_pcie;
+
+-void qcom_pcie_common_set_16gt_equalization(struct dw_pcie *pci);
++void qcom_pcie_common_set_equalization(struct dw_pcie *pci);
+ void qcom_pcie_common_set_16gt_lane_margining(struct dw_pcie *pci);
+
+ #endif
+diff --git a/drivers/pci/controller/dwc/pcie-qcom-ep.c b/drivers/pci/controller/dwc/pcie-qcom-ep.c
+index bf7c6ac0f3e396..aaf060bf39d40b 100644
+--- a/drivers/pci/controller/dwc/pcie-qcom-ep.c
++++ b/drivers/pci/controller/dwc/pcie-qcom-ep.c
+@@ -511,10 +511,10 @@ static int qcom_pcie_perst_deassert(struct dw_pcie *pci)
+ goto err_disable_resources;
+ }
+
+- if (pcie_link_speed[pci->max_link_speed] == PCIE_SPEED_16_0GT) {
+- qcom_pcie_common_set_16gt_equalization(pci);
++ qcom_pcie_common_set_equalization(pci);
++
++ if (pcie_link_speed[pci->max_link_speed] == PCIE_SPEED_16_0GT)
+ qcom_pcie_common_set_16gt_lane_margining(pci);
+- }
+
+ /*
+ * The physical address of the MMIO region which is exposed as the BAR
+diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
+index 294babe1816e4d..a93740ae602f2a 100644
+--- a/drivers/pci/controller/dwc/pcie-qcom.c
++++ b/drivers/pci/controller/dwc/pcie-qcom.c
+@@ -322,10 +322,10 @@ static int qcom_pcie_start_link(struct dw_pcie *pci)
+ {
+ struct qcom_pcie *pcie = to_qcom_pcie(pci);
+
+- if (pcie_link_speed[pci->max_link_speed] == PCIE_SPEED_16_0GT) {
+- qcom_pcie_common_set_16gt_equalization(pci);
++ qcom_pcie_common_set_equalization(pci);
++
++ if (pcie_link_speed[pci->max_link_speed] == PCIE_SPEED_16_0GT)
+ qcom_pcie_common_set_16gt_lane_margining(pci);
+- }
+
+ /* Enable Link Training state machine */
+ if (pcie->cfg->ops->ltssm_enable)
+@@ -1740,6 +1740,8 @@ static int qcom_pcie_parse_ports(struct qcom_pcie *pcie)
+ int ret = -ENOENT;
+
+ for_each_available_child_of_node_scoped(dev->of_node, of_port) {
++ if (!of_node_is_type(of_port, "pci"))
++ continue;
+ ret = qcom_pcie_parse_port(pcie, of_port);
+ if (ret)
+ goto err_port_del;
+diff --git a/drivers/pci/controller/dwc/pcie-rcar-gen4.c b/drivers/pci/controller/dwc/pcie-rcar-gen4.c
+index 18055807a4f5f9..c16c4c2be4993a 100644
+--- a/drivers/pci/controller/dwc/pcie-rcar-gen4.c
++++ b/drivers/pci/controller/dwc/pcie-rcar-gen4.c
+@@ -182,8 +182,17 @@ static int rcar_gen4_pcie_common_init(struct rcar_gen4_pcie *rcar)
+ return ret;
+ }
+
+- if (!reset_control_status(dw->core_rsts[DW_PCIE_PWR_RST].rstc))
++ if (!reset_control_status(dw->core_rsts[DW_PCIE_PWR_RST].rstc)) {
+ reset_control_assert(dw->core_rsts[DW_PCIE_PWR_RST].rstc);
++ /*
++ * R-Car V4H Reference Manual R19UH0186EJ0130 Rev.1.30 Apr.
++ * 21, 2025 page 585 Figure 9.3.2 Software Reset flow (B)
++ * indicates that for peripherals in HSC domain, after
++ * reset has been asserted by writing a matching reset bit
++ * into register SRCR, it is mandatory to wait 1ms.
++ */
++ fsleep(1000);
++ }
+
+ val = readl(rcar->base + PCIEMSR0);
+ if (rcar->drvdata->mode == DW_PCIE_RC_TYPE) {
+@@ -204,6 +213,19 @@ static int rcar_gen4_pcie_common_init(struct rcar_gen4_pcie *rcar)
+ if (ret)
+ goto err_unprepare;
+
++ /*
++ * Assure the reset is latched and the core is ready for DBI access.
++ * On R-Car V4H, the PCIe reset is asynchronous and does not take
++ * effect immediately, but needs a short time to complete. In case
++ * DBI access happens in that short time, that access generates an
++ * SError. To make sure that condition can never happen, read back the
++ * state of the reset, which should turn the asynchronous reset into
++ * synchronous one, and wait a little over 1ms to add additional
++ * safety margin.
++ */
++ reset_control_status(dw->core_rsts[DW_PCIE_PWR_RST].rstc);
++ fsleep(1000);
++
+ if (rcar->drvdata->additional_common_init)
+ rcar->drvdata->additional_common_init(rcar);
+
+@@ -711,7 +733,7 @@ static int rcar_gen4_pcie_ltssm_control(struct rcar_gen4_pcie *rcar, bool enable
+ val &= ~APP_HOLD_PHY_RST;
+ writel(val, rcar->base + PCIERSTCTRL1);
+
+- ret = readl_poll_timeout(rcar->phy_base + 0x0f8, val, !(val & BIT(18)), 100, 10000);
++ ret = readl_poll_timeout(rcar->phy_base + 0x0f8, val, val & BIT(18), 100, 10000);
+ if (ret < 0)
+ return ret;
+
+diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
+index 4f26086f25daf8..0c0734aa14b680 100644
+--- a/drivers/pci/controller/dwc/pcie-tegra194.c
++++ b/drivers/pci/controller/dwc/pcie-tegra194.c
+@@ -1722,9 +1722,9 @@ static void pex_ep_event_pex_rst_assert(struct tegra_pcie_dw *pcie)
+ ret);
+ }
+
+- ret = tegra_pcie_bpmp_set_pll_state(pcie, false);
++ ret = tegra_pcie_bpmp_set_ctrl_state(pcie, false);
+ if (ret)
+- dev_err(pcie->dev, "Failed to turn off UPHY: %d\n", ret);
++ dev_err(pcie->dev, "Failed to disable controller: %d\n", ret);
+
+ pcie->ep_state = EP_STATE_DISABLED;
+ dev_dbg(pcie->dev, "Uninitialization of endpoint is completed\n");
+diff --git a/drivers/pci/controller/pci-tegra.c b/drivers/pci/controller/pci-tegra.c
+index 467ddc701adce2..bb88767a379795 100644
+--- a/drivers/pci/controller/pci-tegra.c
++++ b/drivers/pci/controller/pci-tegra.c
+@@ -1344,7 +1344,7 @@ static int tegra_pcie_port_get_phys(struct tegra_pcie_port *port)
+ unsigned int i;
+ int err;
+
+- port->phys = devm_kcalloc(dev, sizeof(phy), port->lanes, GFP_KERNEL);
++ port->phys = devm_kcalloc(dev, port->lanes, sizeof(phy), GFP_KERNEL);
+ if (!port->phys)
+ return -ENOMEM;
+
+diff --git a/drivers/pci/controller/pci-xgene-msi.c b/drivers/pci/controller/pci-xgene-msi.c
+index 0a37a3f1809c50..654639bccd10e3 100644
+--- a/drivers/pci/controller/pci-xgene-msi.c
++++ b/drivers/pci/controller/pci-xgene-msi.c
+@@ -311,7 +311,7 @@ static int xgene_msi_handler_setup(struct platform_device *pdev)
+ msi_val = xgene_msi_int_read(xgene_msi, i);
+ if (msi_val) {
+ dev_err(&pdev->dev, "Failed to clear spurious IRQ\n");
+- return EINVAL;
++ return -EINVAL;
+ }
+
+ irq = platform_get_irq(pdev, i);
+diff --git a/drivers/pci/controller/pcie-rcar-host.c b/drivers/pci/controller/pcie-rcar-host.c
+index fe288fd770c493..4780e0109e5834 100644
+--- a/drivers/pci/controller/pcie-rcar-host.c
++++ b/drivers/pci/controller/pcie-rcar-host.c
+@@ -584,7 +584,7 @@ static irqreturn_t rcar_pcie_msi_irq(int irq, void *data)
+ unsigned int index = find_first_bit(&reg, 32);
+ int ret;
+
+- ret = generic_handle_domain_irq(msi->domain->parent, index);
++ ret = generic_handle_domain_irq(msi->domain, index);
+ if (ret) {
+ /* Unknown MSI, just clear it */
+ dev_dbg(dev, "unexpected MSI\n");
+diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
+index e091193bd8a8a1..044f5ea0716d1f 100644
+--- a/drivers/pci/endpoint/functions/pci-epf-test.c
++++ b/drivers/pci/endpoint/functions/pci-epf-test.c
+@@ -301,15 +301,20 @@ static void pci_epf_test_clean_dma_chan(struct pci_epf_test *epf_test)
+ if (!epf_test->dma_supported)
+ return;
+
+- dma_release_channel(epf_test->dma_chan_tx);
+- if (epf_test->dma_chan_tx == epf_test->dma_chan_rx) {
++ if (epf_test->dma_chan_tx) {
++ dma_release_channel(epf_test->dma_chan_tx);
++ if (epf_test->dma_chan_tx == epf_test->dma_chan_rx) {
++ epf_test->dma_chan_tx = NULL;
++ epf_test->dma_chan_rx = NULL;
++ return;
++ }
+ epf_test->dma_chan_tx = NULL;
+- epf_test->dma_chan_rx = NULL;
+- return;
+ }
+
+- dma_release_channel(epf_test->dma_chan_rx);
+- epf_test->dma_chan_rx = NULL;
++ if (epf_test->dma_chan_rx) {
++ dma_release_channel(epf_test->dma_chan_rx);
++ epf_test->dma_chan_rx = NULL;
++ }
+ }
+
+ static void pci_epf_test_print_rate(struct pci_epf_test *epf_test,
+@@ -772,12 +777,24 @@ static void pci_epf_test_disable_doorbell(struct pci_epf_test *epf_test,
+ u32 status = le32_to_cpu(reg->status);
+ struct pci_epf *epf = epf_test->epf;
+ struct pci_epc *epc = epf->epc;
++ int ret;
+
+ if (bar < BAR_0)
+ goto set_status_err;
+
+ pci_epf_test_doorbell_cleanup(epf_test);
+- pci_epc_clear_bar(epc, epf->func_no, epf->vfunc_no, &epf_test->db_bar);
++
++ /*
++ * The doorbell feature temporarily overrides the inbound translation
++ * to point to the address stored in epf_test->db_bar.phys_addr, i.e.,
++ * it calls set_bar() twice without ever calling clear_bar(), as
++ * calling clear_bar() would clear the BAR's PCI address assigned by
++ * the host. Thus, when disabling the doorbell, restore the inbound
++ * translation to point to the memory allocated for the BAR.
++ */
++ ret = pci_epc_set_bar(epc, epf->func_no, epf->vfunc_no, &epf->bar[bar]);
++ if (ret)
++ goto set_status_err;
+
+ status |= STATUS_DOORBELL_DISABLE_SUCCESS;
+ reg->status = cpu_to_le32(status);
+diff --git a/drivers/pci/endpoint/pci-ep-msi.c b/drivers/pci/endpoint/pci-ep-msi.c
+index 9ca89cbfec15df..1b58357b905fab 100644
+--- a/drivers/pci/endpoint/pci-ep-msi.c
++++ b/drivers/pci/endpoint/pci-ep-msi.c
+@@ -24,7 +24,7 @@ static void pci_epf_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
+ struct pci_epf *epf;
+
+ epc = pci_epc_get(dev_name(msi_desc_to_dev(desc)));
+- if (!epc)
++ if (IS_ERR(epc))
+ return;
+
+ epf = list_first_entry_or_null(&epc->pci_epf, struct pci_epf, list);
+diff --git a/drivers/pci/msi/irqdomain.c b/drivers/pci/msi/irqdomain.c
+index 0938ef7ebabf2d..b11b7f63f0d6fc 100644
+--- a/drivers/pci/msi/irqdomain.c
++++ b/drivers/pci/msi/irqdomain.c
+@@ -148,6 +148,28 @@ static void pci_device_domain_set_desc(msi_alloc_info_t *arg, struct msi_desc *d
+ arg->hwirq = desc->msi_index;
+ }
+
++static void cond_shutdown_parent(struct irq_data *data)
++{
++ struct msi_domain_info *info = data->domain->host_data;
++
++ if (unlikely(info->flags & MSI_FLAG_PCI_MSI_STARTUP_PARENT))
++ irq_chip_shutdown_parent(data);
++ else if (unlikely(info->flags & MSI_FLAG_PCI_MSI_MASK_PARENT))
++ irq_chip_mask_parent(data);
++}
++
++static unsigned int cond_startup_parent(struct irq_data *data)
++{
++ struct msi_domain_info *info = data->domain->host_data;
++
++ if (unlikely(info->flags & MSI_FLAG_PCI_MSI_STARTUP_PARENT))
++ return irq_chip_startup_parent(data);
++ else if (unlikely(info->flags & MSI_FLAG_PCI_MSI_MASK_PARENT))
++ irq_chip_unmask_parent(data);
++
++ return 0;
++}
++
+ static __always_inline void cond_mask_parent(struct irq_data *data)
+ {
+ struct msi_domain_info *info = data->domain->host_data;
+@@ -164,6 +186,23 @@ static __always_inline void cond_unmask_parent(struct irq_data *data)
+ irq_chip_unmask_parent(data);
+ }
+
++static void pci_irq_shutdown_msi(struct irq_data *data)
++{
++ struct msi_desc *desc = irq_data_get_msi_desc(data);
++
++ pci_msi_mask(desc, BIT(data->irq - desc->irq));
++ cond_shutdown_parent(data);
++}
++
++static unsigned int pci_irq_startup_msi(struct irq_data *data)
++{
++ struct msi_desc *desc = irq_data_get_msi_desc(data);
++ unsigned int ret = cond_startup_parent(data);
++
++ pci_msi_unmask(desc, BIT(data->irq - desc->irq));
++ return ret;
++}
++
+ static void pci_irq_mask_msi(struct irq_data *data)
+ {
+ struct msi_desc *desc = irq_data_get_msi_desc(data);
+@@ -194,6 +233,8 @@ static void pci_irq_unmask_msi(struct irq_data *data)
+ static const struct msi_domain_template pci_msi_template = {
+ .chip = {
+ .name = "PCI-MSI",
++ .irq_startup = pci_irq_startup_msi,
++ .irq_shutdown = pci_irq_shutdown_msi,
+ .irq_mask = pci_irq_mask_msi,
+ .irq_unmask = pci_irq_unmask_msi,
+ .irq_write_msi_msg = pci_msi_domain_write_msg,
+@@ -210,6 +251,20 @@ static const struct msi_domain_template pci_msi_template = {
+ },
+ };
+
++static void pci_irq_shutdown_msix(struct irq_data *data)
++{
++ pci_msix_mask(irq_data_get_msi_desc(data));
++ cond_shutdown_parent(data);
++}
++
++static unsigned int pci_irq_startup_msix(struct irq_data *data)
++{
++ unsigned int ret = cond_startup_parent(data);
++
++ pci_msix_unmask(irq_data_get_msi_desc(data));
++ return ret;
++}
++
+ static void pci_irq_mask_msix(struct irq_data *data)
+ {
+ pci_msix_mask(irq_data_get_msi_desc(data));
+@@ -234,6 +289,8 @@ EXPORT_SYMBOL_GPL(pci_msix_prepare_desc);
+ static const struct msi_domain_template pci_msix_template = {
+ .chip = {
+ .name = "PCI-MSIX",
++ .irq_startup = pci_irq_startup_msix,
++ .irq_shutdown = pci_irq_shutdown_msix,
+ .irq_mask = pci_irq_mask_msix,
+ .irq_unmask = pci_irq_unmask_msix,
+ .irq_write_msi_msg = pci_msi_domain_write_msg,
+diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
+index ddb25960ea47d1..9369377725fa03 100644
+--- a/drivers/pci/pci-acpi.c
++++ b/drivers/pci/pci-acpi.c
+@@ -122,6 +122,8 @@ phys_addr_t acpi_pci_root_get_mcfg_addr(acpi_handle handle)
+
+ bool pci_acpi_preserve_config(struct pci_host_bridge *host_bridge)
+ {
++ bool ret = false;
++
+ if (ACPI_HANDLE(&host_bridge->dev)) {
+ union acpi_object *obj;
+
+@@ -135,11 +137,11 @@ bool pci_acpi_preserve_config(struct pci_host_bridge *host_bridge)
+ 1, DSM_PCI_PRESERVE_BOOT_CONFIG,
+ NULL, ACPI_TYPE_INTEGER);
+ if (obj && obj->integer.value == 0)
+- return true;
++ ret = true;
+ ACPI_FREE(obj);
+ }
+
+- return false;
++ return ret;
+ }
+
+ /* _HPX PCI Setting Record (Type 0); same as _HPP */
+diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
+index e286c197d7167a..55abc5e17b8b10 100644
+--- a/drivers/pci/pcie/aer.c
++++ b/drivers/pci/pcie/aer.c
+@@ -786,6 +786,9 @@ static void pci_rootport_aer_stats_incr(struct pci_dev *pdev,
+
+ static int aer_ratelimit(struct pci_dev *dev, unsigned int severity)
+ {
++ if (!dev->aer_info)
++ return 1;
++
+ switch (severity) {
+ case AER_NONFATAL:
+ return __ratelimit(&dev->aer_info->nonfatal_ratelimit);
+diff --git a/drivers/pci/pwrctrl/slot.c b/drivers/pci/pwrctrl/slot.c
+index 6e138310b45b9f..3320494b62d890 100644
+--- a/drivers/pci/pwrctrl/slot.c
++++ b/drivers/pci/pwrctrl/slot.c
+@@ -49,13 +49,14 @@ static int pci_pwrctrl_slot_probe(struct platform_device *pdev)
+ ret = regulator_bulk_enable(slot->num_supplies, slot->supplies);
+ if (ret < 0) {
+ dev_err_probe(dev, ret, "Failed to enable slot regulators\n");
+- goto err_regulator_free;
++ regulator_bulk_free(slot->num_supplies, slot->supplies);
++ return ret;
+ }
+
+ ret = devm_add_action_or_reset(dev, devm_pci_pwrctrl_slot_power_off,
+ slot);
+ if (ret)
+- goto err_regulator_disable;
++ return ret;
+
+ clk = devm_clk_get_optional_enabled(dev, NULL);
+ if (IS_ERR(clk)) {
+@@ -70,13 +71,6 @@ static int pci_pwrctrl_slot_probe(struct platform_device *pdev)
+ return dev_err_probe(dev, ret, "Failed to register pwrctrl driver\n");
+
+ return 0;
+-
+-err_regulator_disable:
+- regulator_bulk_disable(slot->num_supplies, slot->supplies);
+-err_regulator_free:
+- regulator_bulk_free(slot->num_supplies, slot->supplies);
+-
+- return ret;
+ }
+
+ static const struct of_device_id pci_pwrctrl_slot_of_match[] = {
+diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
+index 369e77ad5f13ff..8f14cb324e0183 100644
+--- a/drivers/perf/arm_spe_pmu.c
++++ b/drivers/perf/arm_spe_pmu.c
+@@ -97,7 +97,8 @@ struct arm_spe_pmu {
+ #define to_spe_pmu(p) (container_of(p, struct arm_spe_pmu, pmu))
+
+ /* Convert a free-running index from perf into an SPE buffer offset */
+-#define PERF_IDX2OFF(idx, buf) ((idx) % ((buf)->nr_pages << PAGE_SHIFT))
++#define PERF_IDX2OFF(idx, buf) \
++ ((idx) % ((unsigned long)(buf)->nr_pages << PAGE_SHIFT))
+
+ /* Keep track of our dynamic hotplug state */
+ static enum cpuhp_state arm_spe_pmu_online;
+diff --git a/drivers/phy/rockchip/phy-rockchip-naneng-combphy.c b/drivers/phy/rockchip/phy-rockchip-naneng-combphy.c
+index ce91fb1d51671d..17c6310f4b54bf 100644
+--- a/drivers/phy/rockchip/phy-rockchip-naneng-combphy.c
++++ b/drivers/phy/rockchip/phy-rockchip-naneng-combphy.c
+@@ -137,6 +137,8 @@ struct rockchip_combphy_grfcfg {
+ struct combphy_reg pipe_xpcs_phy_ready;
+ struct combphy_reg pipe_pcie1l0_sel;
+ struct combphy_reg pipe_pcie1l1_sel;
++ struct combphy_reg u3otg0_port_en;
++ struct combphy_reg u3otg1_port_en;
+ };
+
+ struct rockchip_combphy_cfg {
+@@ -594,6 +596,14 @@ static int rk3568_combphy_cfg(struct rockchip_combphy_priv *priv)
+ rockchip_combphy_param_write(priv->phy_grf, &cfg->pipe_txcomp_sel, false);
+ rockchip_combphy_param_write(priv->phy_grf, &cfg->pipe_txelec_sel, false);
+ rockchip_combphy_param_write(priv->phy_grf, &cfg->usb_mode_set, true);
++ switch (priv->id) {
++ case 0:
++ rockchip_combphy_param_write(priv->pipe_grf, &cfg->u3otg0_port_en, true);
++ break;
++ case 1:
++ rockchip_combphy_param_write(priv->pipe_grf, &cfg->u3otg1_port_en, true);
++ break;
++ }
+ break;
+
+ case PHY_TYPE_SATA:
+@@ -737,6 +747,8 @@ static const struct rockchip_combphy_grfcfg rk3568_combphy_grfcfgs = {
+ /* pipe-grf */
+ .pipe_con0_for_sata = { 0x0000, 15, 0, 0x00, 0x2220 },
+ .pipe_xpcs_phy_ready = { 0x0040, 2, 2, 0x00, 0x01 },
++ .u3otg0_port_en = { 0x0104, 15, 0, 0x0181, 0x1100 },
++ .u3otg1_port_en = { 0x0144, 15, 0, 0x0181, 0x1100 },
+ };
+
+ static const struct rockchip_combphy_cfg rk3568_combphy_cfgs = {
+diff --git a/drivers/pinctrl/Kconfig b/drivers/pinctrl/Kconfig
+index be1ca8e85754bc..0402626c4b98bb 100644
+--- a/drivers/pinctrl/Kconfig
++++ b/drivers/pinctrl/Kconfig
+@@ -211,6 +211,8 @@ config PINCTRL_EIC7700
+ depends on ARCH_ESWIN || COMPILE_TEST
+ select PINMUX
+ select GENERIC_PINCONF
++ select REGULATOR
++ select REGULATOR_FIXED_VOLTAGE
+ help
+ This driver support for the pin controller in ESWIN's EIC7700 SoC,
+ which supports pin multiplexing, pin configuration,and rgmii voltage
+diff --git a/drivers/pinctrl/meson/pinctrl-meson-gxl.c b/drivers/pinctrl/meson/pinctrl-meson-gxl.c
+index 9171de657f9780..a75762e4d26418 100644
+--- a/drivers/pinctrl/meson/pinctrl-meson-gxl.c
++++ b/drivers/pinctrl/meson/pinctrl-meson-gxl.c
+@@ -187,6 +187,9 @@ static const unsigned int i2c_sda_c_pins[] = { GPIODV_28 };
+ static const unsigned int i2c_sck_c_dv19_pins[] = { GPIODV_19 };
+ static const unsigned int i2c_sda_c_dv18_pins[] = { GPIODV_18 };
+
++static const unsigned int i2c_sck_d_pins[] = { GPIOX_11 };
++static const unsigned int i2c_sda_d_pins[] = { GPIOX_10 };
++
+ static const unsigned int eth_mdio_pins[] = { GPIOZ_0 };
+ static const unsigned int eth_mdc_pins[] = { GPIOZ_1 };
+ static const unsigned int eth_clk_rx_clk_pins[] = { GPIOZ_2 };
+@@ -411,6 +414,8 @@ static const struct meson_pmx_group meson_gxl_periphs_groups[] = {
+ GPIO_GROUP(GPIO_TEST_N),
+
+ /* Bank X */
++ GROUP(i2c_sda_d, 5, 5),
++ GROUP(i2c_sck_d, 5, 4),
+ GROUP(sdio_d0, 5, 31),
+ GROUP(sdio_d1, 5, 30),
+ GROUP(sdio_d2, 5, 29),
+@@ -651,6 +656,10 @@ static const char * const i2c_c_groups[] = {
+ "i2c_sck_c", "i2c_sda_c", "i2c_sda_c_dv18", "i2c_sck_c_dv19",
+ };
+
++static const char * const i2c_d_groups[] = {
++ "i2c_sck_d", "i2c_sda_d",
++};
++
+ static const char * const eth_groups[] = {
+ "eth_mdio", "eth_mdc", "eth_clk_rx_clk", "eth_rx_dv",
+ "eth_rxd0", "eth_rxd1", "eth_rxd2", "eth_rxd3",
+@@ -777,6 +786,7 @@ static const struct meson_pmx_func meson_gxl_periphs_functions[] = {
+ FUNCTION(i2c_a),
+ FUNCTION(i2c_b),
+ FUNCTION(i2c_c),
++ FUNCTION(i2c_d),
+ FUNCTION(eth),
+ FUNCTION(pwm_a),
+ FUNCTION(pwm_b),
+diff --git a/drivers/pinctrl/pinctrl-eic7700.c b/drivers/pinctrl/pinctrl-eic7700.c
+index 4874b55323439a..ffcd0ec5c2dc6c 100644
+--- a/drivers/pinctrl/pinctrl-eic7700.c
++++ b/drivers/pinctrl/pinctrl-eic7700.c
+@@ -634,7 +634,7 @@ static int eic7700_pinctrl_probe(struct platform_device *pdev)
+ return PTR_ERR(pc->base);
+
+ regulator = devm_regulator_get(dev, "vrgmii");
+- if (IS_ERR_OR_NULL(regulator)) {
++ if (IS_ERR(regulator)) {
+ return dev_err_probe(dev, PTR_ERR(regulator),
+ "failed to get vrgmii regulator\n");
+ }
+diff --git a/drivers/pinctrl/pinmux.c b/drivers/pinctrl/pinmux.c
+index 79814758a08457..07a478b2c48740 100644
+--- a/drivers/pinctrl/pinmux.c
++++ b/drivers/pinctrl/pinmux.c
+@@ -337,7 +337,7 @@ static int pinmux_func_name_to_selector(struct pinctrl_dev *pctldev,
+ while (selector < nfuncs) {
+ const char *fname = ops->get_function_name(pctldev, selector);
+
+- if (!strcmp(function, fname))
++ if (fname && !strcmp(function, fname))
+ return selector;
+
+ selector++;
+diff --git a/drivers/pinctrl/renesas/pinctrl-rzg2l.c b/drivers/pinctrl/renesas/pinctrl-rzg2l.c
+index c52263c2a7b093..22bc5b8f65fdee 100644
+--- a/drivers/pinctrl/renesas/pinctrl-rzg2l.c
++++ b/drivers/pinctrl/renesas/pinctrl-rzg2l.c
+@@ -1124,7 +1124,7 @@ static u32 rzg3s_oen_read(struct rzg2l_pinctrl *pctrl, unsigned int _pin)
+
+ bit = rzg3s_pin_to_oen_bit(pctrl, _pin);
+ if (bit < 0)
+- return bit;
++ return 0;
+
+ return !(readb(pctrl->base + ETH_MODE) & BIT(bit));
+ }
+diff --git a/drivers/pinctrl/renesas/pinctrl.c b/drivers/pinctrl/renesas/pinctrl.c
+index 29d16c9c1bd194..3a742f74ecd1dc 100644
+--- a/drivers/pinctrl/renesas/pinctrl.c
++++ b/drivers/pinctrl/renesas/pinctrl.c
+@@ -726,7 +726,8 @@ static int sh_pfc_pinconf_group_set(struct pinctrl_dev *pctldev, unsigned group,
+ struct sh_pfc_pinctrl *pmx = pinctrl_dev_get_drvdata(pctldev);
+ const unsigned int *pins;
+ unsigned int num_pins;
+- unsigned int i, ret;
++ unsigned int i;
++ int ret;
+
+ pins = pmx->pfc->info->groups[group].pins;
+ num_pins = pmx->pfc->info->groups[group].nr_pins;
+diff --git a/drivers/power/supply/cw2015_battery.c b/drivers/power/supply/cw2015_battery.c
+index f63c3c41045155..382dff8805c623 100644
+--- a/drivers/power/supply/cw2015_battery.c
++++ b/drivers/power/supply/cw2015_battery.c
+@@ -702,8 +702,7 @@ static int cw_bat_probe(struct i2c_client *client)
+ if (!cw_bat->battery_workqueue)
+ return -ENOMEM;
+
+- devm_delayed_work_autocancel(&client->dev,
+- &cw_bat->battery_delay_work, cw_bat_work);
++ devm_delayed_work_autocancel(&client->dev, &cw_bat->battery_delay_work, cw_bat_work);
+ queue_delayed_work(cw_bat->battery_workqueue,
+ &cw_bat->battery_delay_work, msecs_to_jiffies(10));
+ return 0;
+diff --git a/drivers/power/supply/max77705_charger.c b/drivers/power/supply/max77705_charger.c
+index 329b430d0e5065..a8762bdd2c7c6a 100644
+--- a/drivers/power/supply/max77705_charger.c
++++ b/drivers/power/supply/max77705_charger.c
+@@ -40,13 +40,13 @@ static enum power_supply_property max77705_charger_props[] = {
+ POWER_SUPPLY_PROP_INPUT_CURRENT_LIMIT,
+ };
+
+-static int max77705_chgin_irq(void *irq_drv_data)
++static irqreturn_t max77705_chgin_irq(int irq, void *irq_drv_data)
+ {
+- struct max77705_charger_data *charger = irq_drv_data;
++ struct max77705_charger_data *chg = irq_drv_data;
+
+- queue_work(charger->wqueue, &charger->chgin_work);
++ queue_work(chg->wqueue, &chg->chgin_work);
+
+- return 0;
++ return IRQ_HANDLED;
+ }
+
+ static const struct regmap_irq max77705_charger_irqs[] = {
+@@ -64,7 +64,6 @@ static struct regmap_irq_chip max77705_charger_irq_chip = {
+ .name = "max77705-charger",
+ .status_base = MAX77705_CHG_REG_INT,
+ .mask_base = MAX77705_CHG_REG_INT_MASK,
+- .handle_post_irq = max77705_chgin_irq,
+ .num_regs = 1,
+ .irqs = max77705_charger_irqs,
+ .num_irqs = ARRAY_SIZE(max77705_charger_irqs),
+@@ -74,8 +73,7 @@ static int max77705_charger_enable(struct max77705_charger_data *chg)
+ {
+ int rv;
+
+- rv = regmap_update_bits(chg->regmap, MAX77705_CHG_REG_CNFG_09,
+- MAX77705_CHG_EN_MASK, MAX77705_CHG_EN_MASK);
++ rv = regmap_field_write(chg->rfield[MAX77705_CHG_EN], 1);
+ if (rv)
+ dev_err(chg->dev, "unable to enable the charger: %d\n", rv);
+
+@@ -87,10 +85,7 @@ static void max77705_charger_disable(void *data)
+ struct max77705_charger_data *chg = data;
+ int rv;
+
+- rv = regmap_update_bits(chg->regmap,
+- MAX77705_CHG_REG_CNFG_09,
+- MAX77705_CHG_EN_MASK,
+- MAX77705_CHG_DISABLE);
++ rv = regmap_field_write(chg->rfield[MAX77705_CHG_EN], MAX77705_CHG_DISABLE);
+ if (rv)
+ dev_err(chg->dev, "unable to disable the charger: %d\n", rv);
+ }
+@@ -109,19 +104,19 @@ static int max77705_get_online(struct regmap *regmap, int *val)
+ return 0;
+ }
+
+-static int max77705_check_battery(struct max77705_charger_data *charger, int *val)
++static int max77705_check_battery(struct max77705_charger_data *chg, int *val)
+ {
+ unsigned int reg_data;
+ unsigned int reg_data2;
+- struct regmap *regmap = charger->regmap;
++ struct regmap *regmap = chg->regmap;
+
+ regmap_read(regmap, MAX77705_CHG_REG_INT_OK, &reg_data);
+
+- dev_dbg(charger->dev, "CHG_INT_OK(0x%x)\n", reg_data);
++ dev_dbg(chg->dev, "CHG_INT_OK(0x%x)\n", reg_data);
+
+ regmap_read(regmap, MAX77705_CHG_REG_DETAILS_00, &reg_data2);
+
+- dev_dbg(charger->dev, "CHG_DETAILS00(0x%x)\n", reg_data2);
++ dev_dbg(chg->dev, "CHG_DETAILS00(0x%x)\n", reg_data2);
+
+ if ((reg_data & MAX77705_BATP_OK) || !(reg_data2 & MAX77705_BATP_DTLS))
+ *val = true;
+@@ -131,13 +126,13 @@ static int max77705_check_battery(struct max77705_charger_data *charger, int *va
+ return 0;
+ }
+
+-static int max77705_get_charge_type(struct max77705_charger_data *charger, int *val)
++static int max77705_get_charge_type(struct max77705_charger_data *chg, int *val)
+ {
+- struct regmap *regmap = charger->regmap;
+- unsigned int reg_data;
++ struct regmap *regmap = chg->regmap;
++ unsigned int reg_data, chg_en;
+
+- regmap_read(regmap, MAX77705_CHG_REG_CNFG_09, &reg_data);
+- if (!MAX77705_CHARGER_CHG_CHARGING(reg_data)) {
++ regmap_field_read(chg->rfield[MAX77705_CHG_EN], &chg_en);
++ if (!chg_en) {
+ *val = POWER_SUPPLY_CHARGE_TYPE_NONE;
+ return 0;
+ }
+@@ -159,13 +154,13 @@ static int max77705_get_charge_type(struct max77705_charger_data *charger, int *
+ return 0;
+ }
+
+-static int max77705_get_status(struct max77705_charger_data *charger, int *val)
++static int max77705_get_status(struct max77705_charger_data *chg, int *val)
+ {
+- struct regmap *regmap = charger->regmap;
+- unsigned int reg_data;
++ struct regmap *regmap = chg->regmap;
++ unsigned int reg_data, chg_en;
+
+- regmap_read(regmap, MAX77705_CHG_REG_CNFG_09, &reg_data);
+- if (!MAX77705_CHARGER_CHG_CHARGING(reg_data)) {
++ regmap_field_read(chg->rfield[MAX77705_CHG_EN], &chg_en);
++ if (!chg_en) {
+ *val = POWER_SUPPLY_CHARGE_TYPE_NONE;
+ return 0;
+ }
+@@ -234,10 +229,10 @@ static int max77705_get_vbus_state(struct regmap *regmap, int *value)
+ return 0;
+ }
+
+-static int max77705_get_battery_health(struct max77705_charger_data *charger,
++static int max77705_get_battery_health(struct max77705_charger_data *chg,
+ int *value)
+ {
+- struct regmap *regmap = charger->regmap;
++ struct regmap *regmap = chg->regmap;
+ unsigned int bat_dtls;
+
+ regmap_read(regmap, MAX77705_CHG_REG_DETAILS_01, &bat_dtls);
+@@ -245,16 +240,16 @@ static int max77705_get_battery_health(struct max77705_charger_data *charger,
+
+ switch (bat_dtls) {
+ case MAX77705_BATTERY_NOBAT:
+- dev_dbg(charger->dev, "%s: No battery and the charger is suspended\n",
++ dev_dbg(chg->dev, "%s: No battery and the chg is suspended\n",
+ __func__);
+ *value = POWER_SUPPLY_HEALTH_NO_BATTERY;
+ break;
+ case MAX77705_BATTERY_PREQUALIFICATION:
+- dev_dbg(charger->dev, "%s: battery is okay but its voltage is low(~VPQLB)\n",
++ dev_dbg(chg->dev, "%s: battery is okay but its voltage is low(~VPQLB)\n",
+ __func__);
+ break;
+ case MAX77705_BATTERY_DEAD:
+- dev_dbg(charger->dev, "%s: battery dead\n", __func__);
++ dev_dbg(chg->dev, "%s: battery dead\n", __func__);
+ *value = POWER_SUPPLY_HEALTH_DEAD;
+ break;
+ case MAX77705_BATTERY_GOOD:
+@@ -262,11 +257,11 @@ static int max77705_get_battery_health(struct max77705_charger_data *charger,
+ *value = POWER_SUPPLY_HEALTH_GOOD;
+ break;
+ case MAX77705_BATTERY_OVERVOLTAGE:
+- dev_dbg(charger->dev, "%s: battery ovp\n", __func__);
++ dev_dbg(chg->dev, "%s: battery ovp\n", __func__);
+ *value = POWER_SUPPLY_HEALTH_OVERVOLTAGE;
+ break;
+ default:
+- dev_dbg(charger->dev, "%s: battery unknown\n", __func__);
++ dev_dbg(chg->dev, "%s: battery unknown\n", __func__);
+ *value = POWER_SUPPLY_HEALTH_UNSPEC_FAILURE;
+ break;
+ }
+@@ -274,9 +269,9 @@ static int max77705_get_battery_health(struct max77705_charger_data *charger,
+ return 0;
+ }
+
+-static int max77705_get_health(struct max77705_charger_data *charger, int *val)
++static int max77705_get_health(struct max77705_charger_data *chg, int *val)
+ {
+- struct regmap *regmap = charger->regmap;
++ struct regmap *regmap = chg->regmap;
+ int ret, is_online = 0;
+
+ ret = max77705_get_online(regmap, &is_online);
+@@ -287,24 +282,19 @@ static int max77705_get_health(struct max77705_charger_data *charger, int *val)
+ if (ret || (*val != POWER_SUPPLY_HEALTH_GOOD))
+ return ret;
+ }
+- return max77705_get_battery_health(charger, val);
++ return max77705_get_battery_health(chg, val);
+ }
+
+-static int max77705_get_input_current(struct max77705_charger_data *charger,
++static int max77705_get_input_current(struct max77705_charger_data *chg,
+ int *val)
+ {
+ unsigned int reg_data;
+ int get_current = 0;
+- struct regmap *regmap = charger->regmap;
+-
+- regmap_read(regmap, MAX77705_CHG_REG_CNFG_09, &reg_data);
+
+- reg_data &= MAX77705_CHG_CHGIN_LIM_MASK;
++ regmap_field_read(chg->rfield[MAX77705_CHG_CHGIN_LIM], &reg_data);
+
+ if (reg_data <= 3)
+ get_current = MAX77705_CURRENT_CHGIN_MIN;
+- else if (reg_data >= MAX77705_CHG_CHGIN_LIM_MASK)
+- get_current = MAX77705_CURRENT_CHGIN_MAX;
+ else
+ get_current = (reg_data + 1) * MAX77705_CURRENT_CHGIN_STEP;
+
+@@ -313,26 +303,23 @@ static int max77705_get_input_current(struct max77705_charger_data *charger,
+ return 0;
+ }
+
+-static int max77705_get_charge_current(struct max77705_charger_data *charger,
++static int max77705_get_charge_current(struct max77705_charger_data *chg,
+ int *val)
+ {
+ unsigned int reg_data;
+- struct regmap *regmap = charger->regmap;
+
+- regmap_read(regmap, MAX77705_CHG_REG_CNFG_02, &reg_data);
+- reg_data &= MAX77705_CHG_CC;
++ regmap_field_read(chg->rfield[MAX77705_CHG_CC_LIM], &reg_data);
+
+ *val = reg_data <= 0x2 ? MAX77705_CURRENT_CHGIN_MIN : reg_data * MAX77705_CURRENT_CHG_STEP;
+
+ return 0;
+ }
+
+-static int max77705_set_float_voltage(struct max77705_charger_data *charger,
++static int max77705_set_float_voltage(struct max77705_charger_data *chg,
+ int float_voltage)
+ {
+ int float_voltage_mv;
+ unsigned int reg_data = 0;
+- struct regmap *regmap = charger->regmap;
+
+ float_voltage_mv = float_voltage / 1000;
+ reg_data = float_voltage_mv <= 4000 ? 0x0 :
+@@ -340,20 +327,16 @@ static int max77705_set_float_voltage(struct max77705_charger_data *charger,
+ (float_voltage_mv <= 4200) ? (float_voltage_mv - 4000) / 50 :
+ (((float_voltage_mv - 4200) / 10) + 0x04);
+
+- return regmap_update_bits(regmap, MAX77705_CHG_REG_CNFG_04,
+- MAX77705_CHG_CV_PRM_MASK,
+- (reg_data << MAX77705_CHG_CV_PRM_SHIFT));
++ return regmap_field_write(chg->rfield[MAX77705_CHG_CV_PRM], reg_data);
+ }
+
+-static int max77705_get_float_voltage(struct max77705_charger_data *charger,
++static int max77705_get_float_voltage(struct max77705_charger_data *chg,
+ int *val)
+ {
+ unsigned int reg_data = 0;
+ int voltage_mv;
+- struct regmap *regmap = charger->regmap;
+
+- regmap_read(regmap, MAX77705_CHG_REG_CNFG_04, &reg_data);
+- reg_data &= MAX77705_CHG_PRM_MASK;
++ regmap_field_read(chg->rfield[MAX77705_CHG_CV_PRM], &reg_data);
+ voltage_mv = reg_data <= 0x04 ? reg_data * 50 + 4000 :
+ (reg_data - 4) * 10 + 4200;
+ *val = voltage_mv * 1000;
+@@ -365,28 +348,28 @@ static int max77705_chg_get_property(struct power_supply *psy,
+ enum power_supply_property psp,
+ union power_supply_propval *val)
+ {
+- struct max77705_charger_data *charger = power_supply_get_drvdata(psy);
+- struct regmap *regmap = charger->regmap;
++ struct max77705_charger_data *chg = power_supply_get_drvdata(psy);
++ struct regmap *regmap = chg->regmap;
+
+ switch (psp) {
+ case POWER_SUPPLY_PROP_ONLINE:
+ return max77705_get_online(regmap, &val->intval);
+ case POWER_SUPPLY_PROP_PRESENT:
+- return max77705_check_battery(charger, &val->intval);
++ return max77705_check_battery(chg, &val->intval);
+ case POWER_SUPPLY_PROP_STATUS:
+- return max77705_get_status(charger, &val->intval);
++ return max77705_get_status(chg, &val->intval);
+ case POWER_SUPPLY_PROP_CHARGE_TYPE:
+- return max77705_get_charge_type(charger, &val->intval);
++ return max77705_get_charge_type(chg, &val->intval);
+ case POWER_SUPPLY_PROP_HEALTH:
+- return max77705_get_health(charger, &val->intval);
++ return max77705_get_health(chg, &val->intval);
+ case POWER_SUPPLY_PROP_INPUT_CURRENT_LIMIT:
+- return max77705_get_input_current(charger, &val->intval);
++ return max77705_get_input_current(chg, &val->intval);
+ case POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT:
+- return max77705_get_charge_current(charger, &val->intval);
++ return max77705_get_charge_current(chg, &val->intval);
+ case POWER_SUPPLY_PROP_CONSTANT_CHARGE_VOLTAGE:
+- return max77705_get_float_voltage(charger, &val->intval);
++ return max77705_get_float_voltage(chg, &val->intval);
+ case POWER_SUPPLY_PROP_VOLTAGE_MAX_DESIGN:
+- val->intval = charger->bat_info->voltage_max_design_uv;
++ val->intval = chg->bat_info->voltage_max_design_uv;
+ break;
+ case POWER_SUPPLY_PROP_MODEL_NAME:
+ val->strval = max77705_charger_model;
+@@ -410,15 +393,14 @@ static const struct power_supply_desc max77705_charger_psy_desc = {
+
+ static void max77705_chgin_isr_work(struct work_struct *work)
+ {
+- struct max77705_charger_data *charger =
++ struct max77705_charger_data *chg =
+ container_of(work, struct max77705_charger_data, chgin_work);
+
+- power_supply_changed(charger->psy_chg);
++ power_supply_changed(chg->psy_chg);
+ }
+
+ static void max77705_charger_initialize(struct max77705_charger_data *chg)
+ {
+- u8 reg_data;
+ struct power_supply_battery_info *info;
+ struct regmap *regmap = chg->regmap;
+
+@@ -429,45 +411,31 @@ static void max77705_charger_initialize(struct max77705_charger_data *chg)
+
+ /* unlock charger setting protect */
+ /* slowest LX slope */
+- reg_data = MAX77705_CHGPROT_MASK | MAX77705_SLOWEST_LX_SLOPE;
+- regmap_update_bits(regmap, MAX77705_CHG_REG_CNFG_06, reg_data,
+- reg_data);
++ regmap_field_write(chg->rfield[MAX77705_CHGPROT], MAX77705_CHGPROT_UNLOCKED);
++ regmap_field_write(chg->rfield[MAX77705_LX_SLOPE], MAX77705_SLOWEST_LX_SLOPE);
+
+ /* fast charge timer disable */
+ /* restart threshold disable */
+ /* pre-qual charge disable */
+- reg_data = (MAX77705_FCHGTIME_DISABLE << MAX77705_FCHGTIME_SHIFT) |
+- (MAX77705_CHG_RSTRT_DISABLE << MAX77705_CHG_RSTRT_SHIFT) |
+- (MAX77705_CHG_PQEN_DISABLE << MAX77705_PQEN_SHIFT);
+- regmap_update_bits(regmap, MAX77705_CHG_REG_CNFG_01,
+- (MAX77705_FCHGTIME_MASK |
+- MAX77705_CHG_RSTRT_MASK |
+- MAX77705_PQEN_MASK),
+- reg_data);
+-
+- /* OTG off(UNO on), boost off */
+- regmap_update_bits(regmap, MAX77705_CHG_REG_CNFG_00,
+- MAX77705_OTG_CTRL, 0);
++ regmap_field_write(chg->rfield[MAX77705_FCHGTIME], MAX77705_FCHGTIME_DISABLE);
++ regmap_field_write(chg->rfield[MAX77705_CHG_RSTRT], MAX77705_CHG_RSTRT_DISABLE);
++ regmap_field_write(chg->rfield[MAX77705_CHG_PQEN], MAX77705_CHG_PQEN_DISABLE);
++
++ regmap_field_write(chg->rfield[MAX77705_MODE],
++ MAX77705_CHG_MASK | MAX77705_BUCK_MASK);
+
+ /* charge current 450mA(default) */
+ /* otg current limit 900mA */
+- regmap_update_bits(regmap, MAX77705_CHG_REG_CNFG_02,
+- MAX77705_OTG_ILIM_MASK,
+- MAX77705_OTG_ILIM_900 << MAX77705_OTG_ILIM_SHIFT);
++ regmap_field_write(chg->rfield[MAX77705_OTG_ILIM], MAX77705_OTG_ILIM_900);
+
+ /* BAT to SYS OCP 4.80A */
+- regmap_update_bits(regmap, MAX77705_CHG_REG_CNFG_05,
+- MAX77705_REG_B2SOVRC_MASK,
+- MAX77705_B2SOVRC_4_8A << MAX77705_REG_B2SOVRC_SHIFT);
++ regmap_field_write(chg->rfield[MAX77705_REG_B2SOVRC], MAX77705_B2SOVRC_4_8A);
++
+ /* top off current 150mA */
+ /* top off timer 30min */
+- reg_data = (MAX77705_TO_ITH_150MA << MAX77705_TO_ITH_SHIFT) |
+- (MAX77705_TO_TIME_30M << MAX77705_TO_TIME_SHIFT) |
+- (MAX77705_SYS_TRACK_DISABLE << MAX77705_SYS_TRACK_DIS_SHIFT);
+- regmap_update_bits(regmap, MAX77705_CHG_REG_CNFG_03,
+- (MAX77705_TO_ITH_MASK |
+- MAX77705_TO_TIME_MASK |
+- MAX77705_SYS_TRACK_DIS_MASK), reg_data);
++ regmap_field_write(chg->rfield[MAX77705_TO], MAX77705_TO_ITH_150MA);
++ regmap_field_write(chg->rfield[MAX77705_TO_TIME], MAX77705_TO_TIME_30M);
++ regmap_field_write(chg->rfield[MAX77705_SYS_TRACK], MAX77705_SYS_TRACK_DISABLE);
+
+ /* cv voltage 4.2V or 4.35V */
+ /* MINVSYS 3.6V(default) */
+@@ -478,28 +446,21 @@ static void max77705_charger_initialize(struct max77705_charger_data *chg)
+ max77705_set_float_voltage(chg, info->voltage_max_design_uv);
+ }
+
+- regmap_update_bits(regmap, MAX77705_CHG_REG_CNFG_12,
+- MAX77705_VCHGIN_REG_MASK, MAX77705_VCHGIN_4_5);
+- regmap_update_bits(regmap, MAX77705_CHG_REG_CNFG_12,
+- MAX77705_WCIN_REG_MASK, MAX77705_WCIN_4_5);
++ regmap_field_write(chg->rfield[MAX77705_VCHGIN], MAX77705_VCHGIN_4_5);
++ regmap_field_write(chg->rfield[MAX77705_WCIN], MAX77705_WCIN_4_5);
+
+ /* Watchdog timer */
+ regmap_update_bits(regmap, MAX77705_CHG_REG_CNFG_00,
+ MAX77705_WDTEN_MASK, 0);
+
+- /* Active Discharge Enable */
+- regmap_update_bits(regmap, MAX77705_PMIC_REG_MAINCTRL1, 1, 1);
+-
+ /* VBYPSET=5.0V */
+- regmap_update_bits(regmap, MAX77705_CHG_REG_CNFG_11, MAX77705_VBYPSET_MASK, 0);
++ regmap_field_write(chg->rfield[MAX77705_VBYPSET], 0);
+
+ /* Switching Frequency : 1.5MHz */
+- regmap_update_bits(regmap, MAX77705_CHG_REG_CNFG_08, MAX77705_REG_FSW_MASK,
+- (MAX77705_CHG_FSW_1_5MHz << MAX77705_REG_FSW_SHIFT));
++ regmap_field_write(chg->rfield[MAX77705_REG_FSW], MAX77705_CHG_FSW_1_5MHz);
+
+ /* Auto skip mode */
+- regmap_update_bits(regmap, MAX77705_CHG_REG_CNFG_12, MAX77705_REG_DISKIP_MASK,
+- (MAX77705_AUTO_SKIP << MAX77705_REG_DISKIP_SHIFT));
++ regmap_field_write(chg->rfield[MAX77705_REG_DISKIP], MAX77705_AUTO_SKIP);
+ }
+
+ static int max77705_charger_probe(struct i2c_client *i2c)
+@@ -523,11 +484,13 @@ static int max77705_charger_probe(struct i2c_client *i2c)
+ if (IS_ERR(chg->regmap))
+ return PTR_ERR(chg->regmap);
+
+- ret = regmap_update_bits(chg->regmap,
+- MAX77705_CHG_REG_INT_MASK,
+- MAX77705_CHGIN_IM, 0);
+- if (ret)
+- return ret;
++ for (int i = 0; i < MAX77705_N_REGMAP_FIELDS; i++) {
++ chg->rfield[i] = devm_regmap_field_alloc(dev, chg->regmap,
++ max77705_reg_field[i]);
++ if (IS_ERR(chg->rfield[i]))
++ return dev_err_probe(dev, PTR_ERR(chg->rfield[i]),
++ "cannot allocate regmap field\n");
++ }
+
+ pscfg.fwnode = dev_fwnode(dev);
+ pscfg.drv_data = chg;
+@@ -538,7 +501,7 @@ static int max77705_charger_probe(struct i2c_client *i2c)
+
+ max77705_charger_irq_chip.irq_drv_data = chg;
+ ret = devm_regmap_add_irq_chip(chg->dev, chg->regmap, i2c->irq,
+- IRQF_ONESHOT | IRQF_SHARED, 0,
++ IRQF_ONESHOT, 0,
+ &max77705_charger_irq_chip,
+ &irq_data);
+ if (ret)
+@@ -556,6 +519,15 @@ static int max77705_charger_probe(struct i2c_client *i2c)
+
+ max77705_charger_initialize(chg);
+
++ ret = devm_request_threaded_irq(dev, regmap_irq_get_virq(irq_data, MAX77705_CHGIN_I),
++ NULL, max77705_chgin_irq,
++ IRQF_TRIGGER_NONE,
++ "chgin-irq", chg);
++ if (ret) {
++ dev_err_probe(dev, ret, "Failed to Request chgin IRQ\n");
++ goto destroy_wq;
++ }
++
+ ret = max77705_charger_enable(chg);
+ if (ret) {
+ dev_err_probe(dev, ret, "failed to enable charge\n");
+diff --git a/drivers/pps/kapi.c b/drivers/pps/kapi.c
+index 92d1b62ea239d7..e9389876229eaa 100644
+--- a/drivers/pps/kapi.c
++++ b/drivers/pps/kapi.c
+@@ -109,16 +109,13 @@ struct pps_device *pps_register_source(struct pps_source_info *info,
+ if (err < 0) {
+ pr_err("%s: unable to create char device\n",
+ info->name);
+- goto kfree_pps;
++ goto pps_register_source_exit;
+ }
+
+ dev_dbg(&pps->dev, "new PPS source %s\n", info->name);
+
+ return pps;
+
+-kfree_pps:
+- kfree(pps);
+-
+ pps_register_source_exit:
+ pr_err("%s: unable to register source\n", info->name);
+
+diff --git a/drivers/pps/pps.c b/drivers/pps/pps.c
+index 9463232af8d2e6..c6b8b647827611 100644
+--- a/drivers/pps/pps.c
++++ b/drivers/pps/pps.c
+@@ -374,6 +374,7 @@ int pps_register_cdev(struct pps_device *pps)
+ pps->info.name);
+ err = -EBUSY;
+ }
++ kfree(pps);
+ goto out_unlock;
+ }
+ pps->id = err;
+@@ -383,13 +384,11 @@ int pps_register_cdev(struct pps_device *pps)
+ pps->dev.devt = MKDEV(pps_major, pps->id);
+ dev_set_drvdata(&pps->dev, pps);
+ dev_set_name(&pps->dev, "pps%d", pps->id);
++ pps->dev.release = pps_device_destruct;
+ err = device_register(&pps->dev);
+ if (err)
+ goto free_idr;
+
+- /* Override the release function with our own */
+- pps->dev.release = pps_device_destruct;
+-
+ pr_debug("source %s got cdev (%d:%d)\n", pps->info.name, pps_major,
+ pps->id);
+
+diff --git a/drivers/ptp/ptp_private.h b/drivers/ptp/ptp_private.h
+index b352df4cd3f972..f329263f33aa12 100644
+--- a/drivers/ptp/ptp_private.h
++++ b/drivers/ptp/ptp_private.h
+@@ -22,6 +22,7 @@
+ #define PTP_MAX_TIMESTAMPS 128
+ #define PTP_BUF_TIMESTAMPS 30
+ #define PTP_DEFAULT_MAX_VCLOCKS 20
++#define PTP_MAX_VCLOCKS_LIMIT (KMALLOC_MAX_SIZE/(sizeof(int)))
+ #define PTP_MAX_CHANNELS 2048
+
+ enum {
+diff --git a/drivers/ptp/ptp_sysfs.c b/drivers/ptp/ptp_sysfs.c
+index 6b1b8f57cd9510..200eaf50069681 100644
+--- a/drivers/ptp/ptp_sysfs.c
++++ b/drivers/ptp/ptp_sysfs.c
+@@ -284,7 +284,7 @@ static ssize_t max_vclocks_store(struct device *dev,
+ size_t size;
+ u32 max;
+
+- if (kstrtou32(buf, 0, &max) || max == 0)
++ if (kstrtou32(buf, 0, &max) || max == 0 || max > PTP_MAX_VCLOCKS_LIMIT)
+ return -EINVAL;
+
+ if (max == ptp->max_vclocks)
+diff --git a/drivers/pwm/pwm-loongson.c b/drivers/pwm/pwm-loongson.c
+index 1ba16168cbb408..31a57edecfd0ba 100644
+--- a/drivers/pwm/pwm-loongson.c
++++ b/drivers/pwm/pwm-loongson.c
+@@ -49,7 +49,7 @@
+ #define LOONGSON_PWM_CTRL_REG_DZONE BIT(10) /* Anti-dead Zone Enable Bit */
+
+ /* default input clk frequency for the ACPI case */
+-#define LOONGSON_PWM_FREQ_DEFAULT 50000 /* Hz */
++#define LOONGSON_PWM_FREQ_DEFAULT 50000000 /* Hz */
+
+ struct pwm_loongson_ddata {
+ struct clk *clk;
+diff --git a/drivers/pwm/pwm-tiehrpwm.c b/drivers/pwm/pwm-tiehrpwm.c
+index 0125e73b98dfb4..7a86cb090f76f1 100644
+--- a/drivers/pwm/pwm-tiehrpwm.c
++++ b/drivers/pwm/pwm-tiehrpwm.c
+@@ -36,7 +36,7 @@
+
+ #define CLKDIV_MAX 7
+ #define HSPCLKDIV_MAX 7
+-#define PERIOD_MAX 0xFFFF
++#define PERIOD_MAX 0x10000
+
+ /* compare module registers */
+ #define CMPA 0x12
+@@ -65,14 +65,10 @@
+ #define AQCTL_ZRO_FRCHIGH BIT(1)
+ #define AQCTL_ZRO_FRCTOGGLE (BIT(1) | BIT(0))
+
+-#define AQCTL_CHANA_POLNORMAL (AQCTL_CAU_FRCLOW | AQCTL_PRD_FRCHIGH | \
+- AQCTL_ZRO_FRCHIGH)
+-#define AQCTL_CHANA_POLINVERSED (AQCTL_CAU_FRCHIGH | AQCTL_PRD_FRCLOW | \
+- AQCTL_ZRO_FRCLOW)
+-#define AQCTL_CHANB_POLNORMAL (AQCTL_CBU_FRCLOW | AQCTL_PRD_FRCHIGH | \
+- AQCTL_ZRO_FRCHIGH)
+-#define AQCTL_CHANB_POLINVERSED (AQCTL_CBU_FRCHIGH | AQCTL_PRD_FRCLOW | \
+- AQCTL_ZRO_FRCLOW)
++#define AQCTL_CHANA_POLNORMAL (AQCTL_CAU_FRCLOW | AQCTL_ZRO_FRCHIGH)
++#define AQCTL_CHANA_POLINVERSED (AQCTL_CAU_FRCHIGH | AQCTL_ZRO_FRCLOW)
++#define AQCTL_CHANB_POLNORMAL (AQCTL_CBU_FRCLOW | AQCTL_ZRO_FRCHIGH)
++#define AQCTL_CHANB_POLINVERSED (AQCTL_CBU_FRCHIGH | AQCTL_ZRO_FRCLOW)
+
+ #define AQSFRC_RLDCSF_MASK (BIT(7) | BIT(6))
+ #define AQSFRC_RLDCSF_ZRO 0
+@@ -108,7 +104,6 @@ struct ehrpwm_pwm_chip {
+ unsigned long clk_rate;
+ void __iomem *mmio_base;
+ unsigned long period_cycles[NUM_PWM_CHANNEL];
+- enum pwm_polarity polarity[NUM_PWM_CHANNEL];
+ struct clk *tbclk;
+ struct ehrpwm_context ctx;
+ };
+@@ -166,7 +161,7 @@ static int set_prescale_div(unsigned long rqst_prescaler, u16 *prescale_div,
+
+ *prescale_div = (1 << clkdiv) *
+ (hspclkdiv ? (hspclkdiv * 2) : 1);
+- if (*prescale_div > rqst_prescaler) {
++ if (*prescale_div >= rqst_prescaler) {
+ *tb_clk_div = (clkdiv << TBCTL_CLKDIV_SHIFT) |
+ (hspclkdiv << TBCTL_HSPCLKDIV_SHIFT);
+ return 0;
+@@ -177,51 +172,20 @@ static int set_prescale_div(unsigned long rqst_prescaler, u16 *prescale_div,
+ return 1;
+ }
+
+-static void configure_polarity(struct ehrpwm_pwm_chip *pc, int chan)
+-{
+- u16 aqctl_val, aqctl_mask;
+- unsigned int aqctl_reg;
+-
+- /*
+- * Configure PWM output to HIGH/LOW level on counter
+- * reaches compare register value and LOW/HIGH level
+- * on counter value reaches period register value and
+- * zero value on counter
+- */
+- if (chan == 1) {
+- aqctl_reg = AQCTLB;
+- aqctl_mask = AQCTL_CBU_MASK;
+-
+- if (pc->polarity[chan] == PWM_POLARITY_INVERSED)
+- aqctl_val = AQCTL_CHANB_POLINVERSED;
+- else
+- aqctl_val = AQCTL_CHANB_POLNORMAL;
+- } else {
+- aqctl_reg = AQCTLA;
+- aqctl_mask = AQCTL_CAU_MASK;
+-
+- if (pc->polarity[chan] == PWM_POLARITY_INVERSED)
+- aqctl_val = AQCTL_CHANA_POLINVERSED;
+- else
+- aqctl_val = AQCTL_CHANA_POLNORMAL;
+- }
+-
+- aqctl_mask |= AQCTL_PRD_MASK | AQCTL_ZRO_MASK;
+- ehrpwm_modify(pc->mmio_base, aqctl_reg, aqctl_mask, aqctl_val);
+-}
+-
+ /*
+ * period_ns = 10^9 * (ps_divval * period_cycles) / PWM_CLK_RATE
+ * duty_ns = 10^9 * (ps_divval * duty_cycles) / PWM_CLK_RATE
+ */
+ static int ehrpwm_pwm_config(struct pwm_chip *chip, struct pwm_device *pwm,
+- u64 duty_ns, u64 period_ns)
++ u64 duty_ns, u64 period_ns, enum pwm_polarity polarity)
+ {
+ struct ehrpwm_pwm_chip *pc = to_ehrpwm_pwm_chip(chip);
+ u32 period_cycles, duty_cycles;
+ u16 ps_divval, tb_divval;
+ unsigned int i, cmp_reg;
+ unsigned long long c;
++ u16 aqctl_val, aqctl_mask;
++ unsigned int aqctl_reg;
+
+ if (period_ns > NSEC_PER_SEC)
+ return -ERANGE;
+@@ -231,15 +195,10 @@ static int ehrpwm_pwm_config(struct pwm_chip *chip, struct pwm_device *pwm,
+ do_div(c, NSEC_PER_SEC);
+ period_cycles = (unsigned long)c;
+
+- if (period_cycles < 1) {
+- period_cycles = 1;
+- duty_cycles = 1;
+- } else {
+- c = pc->clk_rate;
+- c = c * duty_ns;
+- do_div(c, NSEC_PER_SEC);
+- duty_cycles = (unsigned long)c;
+- }
++ c = pc->clk_rate;
++ c = c * duty_ns;
++ do_div(c, NSEC_PER_SEC);
++ duty_cycles = (unsigned long)c;
+
+ /*
+ * Period values should be same for multiple PWM channels as IP uses
+@@ -265,52 +224,73 @@ static int ehrpwm_pwm_config(struct pwm_chip *chip, struct pwm_device *pwm,
+ pc->period_cycles[pwm->hwpwm] = period_cycles;
+
+ /* Configure clock prescaler to support Low frequency PWM wave */
+- if (set_prescale_div(period_cycles/PERIOD_MAX, &ps_divval,
++ if (set_prescale_div(DIV_ROUND_UP(period_cycles, PERIOD_MAX), &ps_divval,
+ &tb_divval)) {
+ dev_err(pwmchip_parent(chip), "Unsupported values\n");
+ return -EINVAL;
+ }
+
+- pm_runtime_get_sync(pwmchip_parent(chip));
+-
+- /* Update clock prescaler values */
+- ehrpwm_modify(pc->mmio_base, TBCTL, TBCTL_CLKDIV_MASK, tb_divval);
+-
+ /* Update period & duty cycle with presacler division */
+ period_cycles = period_cycles / ps_divval;
+ duty_cycles = duty_cycles / ps_divval;
+
+- /* Configure shadow loading on Period register */
+- ehrpwm_modify(pc->mmio_base, TBCTL, TBCTL_PRDLD_MASK, TBCTL_PRDLD_SHDW);
++ if (period_cycles < 1)
++ period_cycles = 1;
+
+- ehrpwm_write(pc->mmio_base, TBPRD, period_cycles);
++ pm_runtime_get_sync(pwmchip_parent(chip));
+
+- /* Configure ehrpwm counter for up-count mode */
+- ehrpwm_modify(pc->mmio_base, TBCTL, TBCTL_CTRMODE_MASK,
+- TBCTL_CTRMODE_UP);
++ /* Update clock prescaler values */
++ ehrpwm_modify(pc->mmio_base, TBCTL, TBCTL_CLKDIV_MASK, tb_divval);
+
+- if (pwm->hwpwm == 1)
++ if (pwm->hwpwm == 1) {
+ /* Channel 1 configured with compare B register */
+ cmp_reg = CMPB;
+- else
++
++ aqctl_reg = AQCTLB;
++ aqctl_mask = AQCTL_CBU_MASK;
++
++ if (polarity == PWM_POLARITY_INVERSED)
++ aqctl_val = AQCTL_CHANB_POLINVERSED;
++ else
++ aqctl_val = AQCTL_CHANB_POLNORMAL;
++
++ /* if duty_cycle is big, don't toggle on CBU */
++ if (duty_cycles > period_cycles)
++ aqctl_val &= ~AQCTL_CBU_MASK;
++
++ } else {
+ /* Channel 0 configured with compare A register */
+ cmp_reg = CMPA;
+
+- ehrpwm_write(pc->mmio_base, cmp_reg, duty_cycles);
++ aqctl_reg = AQCTLA;
++ aqctl_mask = AQCTL_CAU_MASK;
+
+- pm_runtime_put_sync(pwmchip_parent(chip));
++ if (polarity == PWM_POLARITY_INVERSED)
++ aqctl_val = AQCTL_CHANA_POLINVERSED;
++ else
++ aqctl_val = AQCTL_CHANA_POLNORMAL;
+
+- return 0;
+-}
++ /* if duty_cycle is big, don't toggle on CAU */
++ if (duty_cycles > period_cycles)
++ aqctl_val &= ~AQCTL_CAU_MASK;
++ }
+
+-static int ehrpwm_pwm_set_polarity(struct pwm_chip *chip,
+- struct pwm_device *pwm,
+- enum pwm_polarity polarity)
+-{
+- struct ehrpwm_pwm_chip *pc = to_ehrpwm_pwm_chip(chip);
++ aqctl_mask |= AQCTL_PRD_MASK | AQCTL_ZRO_MASK;
++ ehrpwm_modify(pc->mmio_base, aqctl_reg, aqctl_mask, aqctl_val);
++
++ /* Configure shadow loading on Period register */
++ ehrpwm_modify(pc->mmio_base, TBCTL, TBCTL_PRDLD_MASK, TBCTL_PRDLD_SHDW);
++
++ ehrpwm_write(pc->mmio_base, TBPRD, period_cycles - 1);
+
+- /* Configuration of polarity in hardware delayed, do at enable */
+- pc->polarity[pwm->hwpwm] = polarity;
++ /* Configure ehrpwm counter for up-count mode */
++ ehrpwm_modify(pc->mmio_base, TBCTL, TBCTL_CTRMODE_MASK,
++ TBCTL_CTRMODE_UP);
++
++ if (!(duty_cycles > period_cycles))
++ ehrpwm_write(pc->mmio_base, cmp_reg, duty_cycles);
++
++ pm_runtime_put_sync(pwmchip_parent(chip));
+
+ return 0;
+ }
+@@ -339,9 +319,6 @@ static int ehrpwm_pwm_enable(struct pwm_chip *chip, struct pwm_device *pwm)
+
+ ehrpwm_modify(pc->mmio_base, AQCSFRC, aqcsfrc_mask, aqcsfrc_val);
+
+- /* Channels polarity can be configured from action qualifier module */
+- configure_polarity(pc, pwm->hwpwm);
+-
+ /* Enable TBCLK */
+ ret = clk_enable(pc->tbclk);
+ if (ret) {
+@@ -391,12 +368,7 @@ static void ehrpwm_pwm_free(struct pwm_chip *chip, struct pwm_device *pwm)
+ {
+ struct ehrpwm_pwm_chip *pc = to_ehrpwm_pwm_chip(chip);
+
+- if (pwm_is_enabled(pwm)) {
+- dev_warn(pwmchip_parent(chip), "Removing PWM device without disabling\n");
+- pm_runtime_put_sync(pwmchip_parent(chip));
+- }
+-
+- /* set period value to zero on free */
++ /* Don't let a pwm without consumer block requests to the other channel */
+ pc->period_cycles[pwm->hwpwm] = 0;
+ }
+
+@@ -411,10 +383,6 @@ static int ehrpwm_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
+ ehrpwm_pwm_disable(chip, pwm);
+ enabled = false;
+ }
+-
+- err = ehrpwm_pwm_set_polarity(chip, pwm, state->polarity);
+- if (err)
+- return err;
+ }
+
+ if (!state->enabled) {
+@@ -423,7 +391,7 @@ static int ehrpwm_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
+ return 0;
+ }
+
+- err = ehrpwm_pwm_config(chip, pwm, state->duty_cycle, state->period);
++ err = ehrpwm_pwm_config(chip, pwm, state->duty_cycle, state->period, state->polarity);
+ if (err)
+ return err;
+
+diff --git a/drivers/regulator/scmi-regulator.c b/drivers/regulator/scmi-regulator.c
+index 9df726f10ad121..6d609c42e4793b 100644
+--- a/drivers/regulator/scmi-regulator.c
++++ b/drivers/regulator/scmi-regulator.c
+@@ -257,7 +257,8 @@ static int process_scmi_regulator_of_node(struct scmi_device *sdev,
+ struct device_node *np,
+ struct scmi_regulator_info *rinfo)
+ {
+- u32 dom, ret;
++ u32 dom;
++ int ret;
+
+ ret = of_property_read_u32(np, "reg", &dom);
+ if (ret)
+diff --git a/drivers/remoteproc/pru_rproc.c b/drivers/remoteproc/pru_rproc.c
+index 842e4b6cc5f9fc..5e3eb7b86a0e34 100644
+--- a/drivers/remoteproc/pru_rproc.c
++++ b/drivers/remoteproc/pru_rproc.c
+@@ -340,7 +340,7 @@ EXPORT_SYMBOL_GPL(pru_rproc_put);
+ */
+ int pru_rproc_set_ctable(struct rproc *rproc, enum pru_ctable_idx c, u32 addr)
+ {
+- struct pru_rproc *pru = rproc->priv;
++ struct pru_rproc *pru;
+ unsigned int reg;
+ u32 mask, set;
+ u16 idx;
+@@ -352,6 +352,7 @@ int pru_rproc_set_ctable(struct rproc *rproc, enum pru_ctable_idx c, u32 addr)
+ if (!rproc->dev.parent || !is_pru_rproc(rproc->dev.parent))
+ return -ENODEV;
+
++ pru = rproc->priv;
+ /* pointer is 16 bit and index is 8-bit so mask out the rest */
+ idx_mask = (c >= PRU_C28) ? 0xFFFF : 0xFF;
+
+diff --git a/drivers/remoteproc/qcom_q6v5.c b/drivers/remoteproc/qcom_q6v5.c
+index 4ee5e67a9f03f5..769c6d6d6a7316 100644
+--- a/drivers/remoteproc/qcom_q6v5.c
++++ b/drivers/remoteproc/qcom_q6v5.c
+@@ -156,9 +156,6 @@ int qcom_q6v5_wait_for_start(struct qcom_q6v5 *q6v5, int timeout)
+ int ret;
+
+ ret = wait_for_completion_timeout(&q6v5->start_done, timeout);
+- if (!ret)
+- disable_irq(q6v5->handover_irq);
+-
+ return !ret ? -ETIMEDOUT : 0;
+ }
+ EXPORT_SYMBOL_GPL(qcom_q6v5_wait_for_start);
+diff --git a/drivers/remoteproc/qcom_q6v5_mss.c b/drivers/remoteproc/qcom_q6v5_mss.c
+index 0c0199fb0e68d6..3087d895b87f44 100644
+--- a/drivers/remoteproc/qcom_q6v5_mss.c
++++ b/drivers/remoteproc/qcom_q6v5_mss.c
+@@ -498,6 +498,8 @@ static void q6v5_debug_policy_load(struct q6v5 *qproc, void *mba_region)
+ release_firmware(dp_fw);
+ }
+
++#define MSM8974_B00_OFFSET 0x1000
++
+ static int q6v5_load(struct rproc *rproc, const struct firmware *fw)
+ {
+ struct q6v5 *qproc = rproc->priv;
+@@ -516,7 +518,14 @@ static int q6v5_load(struct rproc *rproc, const struct firmware *fw)
+ return -EBUSY;
+ }
+
+- memcpy(mba_region, fw->data, fw->size);
++ if ((qproc->version == MSS_MSM8974 ||
++ qproc->version == MSS_MSM8226 ||
++ qproc->version == MSS_MSM8926) &&
++ fw->size > MSM8974_B00_OFFSET &&
++ !memcmp(fw->data, ELFMAG, SELFMAG))
++ memcpy(mba_region, fw->data + MSM8974_B00_OFFSET, fw->size - MSM8974_B00_OFFSET);
++ else
++ memcpy(mba_region, fw->data, fw->size);
+ q6v5_debug_policy_load(qproc, mba_region);
+ memunmap(mba_region);
+
+diff --git a/drivers/remoteproc/qcom_q6v5_pas.c b/drivers/remoteproc/qcom_q6v5_pas.c
+index 02e29171cbbee2..f3ec5b06261e8b 100644
+--- a/drivers/remoteproc/qcom_q6v5_pas.c
++++ b/drivers/remoteproc/qcom_q6v5_pas.c
+@@ -42,6 +42,7 @@ struct qcom_pas_data {
+ int pas_id;
+ int dtb_pas_id;
+ int lite_pas_id;
++ int lite_dtb_pas_id;
+ unsigned int minidump_id;
+ bool auto_boot;
+ bool decrypt_shutdown;
+@@ -80,6 +81,7 @@ struct qcom_pas {
+ int pas_id;
+ int dtb_pas_id;
+ int lite_pas_id;
++ int lite_dtb_pas_id;
+ unsigned int minidump_id;
+ int crash_reason_smem;
+ unsigned int smem_host_id;
+@@ -226,6 +228,8 @@ static int qcom_pas_load(struct rproc *rproc, const struct firmware *fw)
+
+ if (pas->lite_pas_id)
+ ret = qcom_scm_pas_shutdown(pas->lite_pas_id);
++ if (pas->lite_dtb_pas_id)
++ qcom_scm_pas_shutdown(pas->lite_dtb_pas_id);
+
+ if (pas->dtb_pas_id) {
+ ret = request_firmware(&pas->dtb_firmware, pas->dtb_firmware_name, pas->dev);
+@@ -722,6 +726,7 @@ static int qcom_pas_probe(struct platform_device *pdev)
+ pas->minidump_id = desc->minidump_id;
+ pas->pas_id = desc->pas_id;
+ pas->lite_pas_id = desc->lite_pas_id;
++ pas->lite_dtb_pas_id = desc->lite_dtb_pas_id;
+ pas->info_name = desc->sysmon_name;
+ pas->smem_host_id = desc->smem_host_id;
+ pas->decrypt_shutdown = desc->decrypt_shutdown;
+@@ -1085,6 +1090,7 @@ static const struct qcom_pas_data x1e80100_adsp_resource = {
+ .pas_id = 1,
+ .dtb_pas_id = 0x24,
+ .lite_pas_id = 0x1f,
++ .lite_dtb_pas_id = 0x29,
+ .minidump_id = 5,
+ .auto_boot = true,
+ .proxy_pd_names = (char*[]){
+diff --git a/drivers/rpmsg/qcom_smd.c b/drivers/rpmsg/qcom_smd.c
+index 87c944d4b4f318..1cbe457b4e96fa 100644
+--- a/drivers/rpmsg/qcom_smd.c
++++ b/drivers/rpmsg/qcom_smd.c
+@@ -1368,7 +1368,7 @@ static int qcom_smd_parse_edge(struct device *dev,
+ edge->mbox_client.knows_txdone = true;
+ edge->mbox_chan = mbox_request_channel(&edge->mbox_client, 0);
+ if (IS_ERR(edge->mbox_chan)) {
+- if (PTR_ERR(edge->mbox_chan) != -ENODEV) {
++ if (PTR_ERR(edge->mbox_chan) != -ENOENT) {
+ ret = dev_err_probe(dev, PTR_ERR(edge->mbox_chan),
+ "failed to acquire IPC mailbox\n");
+ goto put_node;
+diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c
+index 869b5d4db44cb6..d953225f6cc24c 100644
+--- a/drivers/scsi/libsas/sas_expander.c
++++ b/drivers/scsi/libsas/sas_expander.c
+@@ -1313,10 +1313,7 @@ static int sas_check_parent_topology(struct domain_device *child)
+ int i;
+ int res = 0;
+
+- if (!child->parent)
+- return 0;
+-
+- if (!dev_is_expander(child->parent->dev_type))
++ if (!dev_parent_is_expander(child))
+ return 0;
+
+ parent_ex = &child->parent->ex_dev;
+diff --git a/drivers/scsi/mpt3sas/mpt3sas_transport.c b/drivers/scsi/mpt3sas/mpt3sas_transport.c
+index dc74ebc6405ace..66fd301f03b0d5 100644
+--- a/drivers/scsi/mpt3sas/mpt3sas_transport.c
++++ b/drivers/scsi/mpt3sas/mpt3sas_transport.c
+@@ -987,11 +987,9 @@ mpt3sas_transport_port_remove(struct MPT3SAS_ADAPTER *ioc, u64 sas_address,
+ list_for_each_entry_safe(mpt3sas_phy, next_phy,
+ &mpt3sas_port->phy_list, port_siblings) {
+ if ((ioc->logging_level & MPT_DEBUG_TRANSPORT))
+- dev_printk(KERN_INFO, &mpt3sas_port->port->dev,
+- "remove: sas_addr(0x%016llx), phy(%d)\n",
+- (unsigned long long)
+- mpt3sas_port->remote_identify.sas_address,
+- mpt3sas_phy->phy_id);
++ ioc_info(ioc, "remove: sas_addr(0x%016llx), phy(%d)\n",
++ (unsigned long long) mpt3sas_port->remote_identify.sas_address,
++ mpt3sas_phy->phy_id);
+ mpt3sas_phy->phy_belongs_to_port = 0;
+ if (!ioc->remove_host)
+ sas_port_delete_phy(mpt3sas_port->port,
+diff --git a/drivers/scsi/myrs.c b/drivers/scsi/myrs.c
+index 95af3bb03834c3..a58abd796603b0 100644
+--- a/drivers/scsi/myrs.c
++++ b/drivers/scsi/myrs.c
+@@ -498,14 +498,14 @@ static bool myrs_enable_mmio_mbox(struct myrs_hba *cs,
+ /* Temporary dma mapping, used only in the scope of this function */
+ mbox = dma_alloc_coherent(&pdev->dev, sizeof(union myrs_cmd_mbox),
+ &mbox_addr, GFP_KERNEL);
+- if (dma_mapping_error(&pdev->dev, mbox_addr))
++ if (!mbox)
+ return false;
+
+ /* These are the base addresses for the command memory mailbox array */
+ cs->cmd_mbox_size = MYRS_MAX_CMD_MBOX * sizeof(union myrs_cmd_mbox);
+ cmd_mbox = dma_alloc_coherent(&pdev->dev, cs->cmd_mbox_size,
+ &cs->cmd_mbox_addr, GFP_KERNEL);
+- if (dma_mapping_error(&pdev->dev, cs->cmd_mbox_addr)) {
++ if (!cmd_mbox) {
+ dev_err(&pdev->dev, "Failed to map command mailbox\n");
+ goto out_free;
+ }
+@@ -520,7 +520,7 @@ static bool myrs_enable_mmio_mbox(struct myrs_hba *cs,
+ cs->stat_mbox_size = MYRS_MAX_STAT_MBOX * sizeof(struct myrs_stat_mbox);
+ stat_mbox = dma_alloc_coherent(&pdev->dev, cs->stat_mbox_size,
+ &cs->stat_mbox_addr, GFP_KERNEL);
+- if (dma_mapping_error(&pdev->dev, cs->stat_mbox_addr)) {
++ if (!stat_mbox) {
+ dev_err(&pdev->dev, "Failed to map status mailbox\n");
+ goto out_free;
+ }
+@@ -533,7 +533,7 @@ static bool myrs_enable_mmio_mbox(struct myrs_hba *cs,
+ cs->fwstat_buf = dma_alloc_coherent(&pdev->dev,
+ sizeof(struct myrs_fwstat),
+ &cs->fwstat_addr, GFP_KERNEL);
+- if (dma_mapping_error(&pdev->dev, cs->fwstat_addr)) {
++ if (!cs->fwstat_buf) {
+ dev_err(&pdev->dev, "Failed to map firmware health buffer\n");
+ cs->fwstat_buf = NULL;
+ goto out_free;
+diff --git a/drivers/scsi/pm8001/pm8001_hwi.c b/drivers/scsi/pm8001/pm8001_hwi.c
+index 42a4eeac24c941..8005995a317c1e 100644
+--- a/drivers/scsi/pm8001/pm8001_hwi.c
++++ b/drivers/scsi/pm8001/pm8001_hwi.c
+@@ -2163,8 +2163,7 @@ mpi_sata_completion(struct pm8001_hba_info *pm8001_ha, void *piomb)
+ /* Print sas address of IO failed device */
+ if ((status != IO_SUCCESS) && (status != IO_OVERFLOW) &&
+ (status != IO_UNDERFLOW)) {
+- if (!((t->dev->parent) &&
+- (dev_is_expander(t->dev->parent->dev_type)))) {
++ if (!dev_parent_is_expander(t->dev)) {
+ for (i = 0, j = 4; j <= 7 && i <= 3; i++, j++)
+ sata_addr_low[i] = pm8001_ha->sas_addr[j];
+ for (i = 0, j = 0; j <= 3 && i <= 3; i++, j++)
+@@ -4168,7 +4167,6 @@ static int pm8001_chip_reg_dev_req(struct pm8001_hba_info *pm8001_ha,
+ u16 firstBurstSize = 0;
+ u16 ITNT = 2000;
+ struct domain_device *dev = pm8001_dev->sas_device;
+- struct domain_device *parent_dev = dev->parent;
+ struct pm8001_port *port = dev->port->lldd_port;
+
+ memset(&payload, 0, sizeof(payload));
+@@ -4186,10 +4184,9 @@ static int pm8001_chip_reg_dev_req(struct pm8001_hba_info *pm8001_ha,
+ dev_is_expander(pm8001_dev->dev_type))
+ stp_sspsmp_sata = 0x01; /*ssp or smp*/
+ }
+- if (parent_dev && dev_is_expander(parent_dev->dev_type))
+- phy_id = parent_dev->ex_dev.ex_phy->phy_id;
+- else
+- phy_id = pm8001_dev->attached_phy;
++
++ phy_id = pm80xx_get_local_phy_id(dev);
++
+ opc = OPC_INB_REG_DEV;
+ linkrate = (pm8001_dev->sas_device->linkrate < dev->port->linkrate) ?
+ pm8001_dev->sas_device->linkrate : dev->port->linkrate;
+diff --git a/drivers/scsi/pm8001/pm8001_sas.c b/drivers/scsi/pm8001/pm8001_sas.c
+index f7067878b34f3b..c5354263b45e86 100644
+--- a/drivers/scsi/pm8001/pm8001_sas.c
++++ b/drivers/scsi/pm8001/pm8001_sas.c
+@@ -130,6 +130,16 @@ static void pm80xx_get_tag_opcodes(struct sas_task *task, int *ata_op,
+ }
+ }
+
++u32 pm80xx_get_local_phy_id(struct domain_device *dev)
++{
++ struct pm8001_device *pm8001_dev = dev->lldd_dev;
++
++ if (dev_parent_is_expander(dev))
++ return dev->parent->ex_dev.ex_phy->phy_id;
++
++ return pm8001_dev->attached_phy;
++}
++
+ void pm80xx_show_pending_commands(struct pm8001_hba_info *pm8001_ha,
+ struct pm8001_device *target_pm8001_dev)
+ {
+@@ -477,7 +487,7 @@ int pm8001_queue_command(struct sas_task *task, gfp_t gfp_flags)
+ struct pm8001_device *pm8001_dev = dev->lldd_dev;
+ bool internal_abort = sas_is_internal_abort(task);
+ struct pm8001_hba_info *pm8001_ha;
+- struct pm8001_port *port = NULL;
++ struct pm8001_port *port;
+ struct pm8001_ccb_info *ccb;
+ unsigned long flags;
+ u32 n_elem = 0;
+@@ -502,8 +512,7 @@ int pm8001_queue_command(struct sas_task *task, gfp_t gfp_flags)
+
+ spin_lock_irqsave(&pm8001_ha->lock, flags);
+
+- pm8001_dev = dev->lldd_dev;
+- port = pm8001_ha->phy[pm8001_dev->attached_phy].port;
++ port = dev->port->lldd_port;
+
+ if (!internal_abort &&
+ (DEV_IS_GONE(pm8001_dev) || !port || !port->port_attached)) {
+@@ -701,7 +710,7 @@ static int pm8001_dev_found_notify(struct domain_device *dev)
+ dev->lldd_dev = pm8001_device;
+ pm8001_device->dev_type = dev->dev_type;
+ pm8001_device->dcompletion = &completion;
+- if (parent_dev && dev_is_expander(parent_dev->dev_type)) {
++ if (dev_parent_is_expander(dev)) {
+ int phy_id;
+
+ phy_id = sas_find_attached_phy_id(&parent_dev->ex_dev, dev);
+@@ -766,7 +775,13 @@ static void pm8001_dev_gone_notify(struct domain_device *dev)
+ spin_lock_irqsave(&pm8001_ha->lock, flags);
+ }
+ PM8001_CHIP_DISP->dereg_dev_req(pm8001_ha, device_id);
+- pm8001_ha->phy[pm8001_dev->attached_phy].phy_attached = 0;
++
++ /*
++ * The phy array only contains local phys. Thus, we cannot clear
++ * phy_attached for a device behind an expander.
++ */
++ if (!dev_parent_is_expander(dev))
++ pm8001_ha->phy[pm8001_dev->attached_phy].phy_attached = 0;
+ pm8001_free_dev(pm8001_dev);
+ } else {
+ pm8001_dbg(pm8001_ha, DISC, "Found dev has gone.\n");
+@@ -1048,7 +1063,7 @@ int pm8001_abort_task(struct sas_task *task)
+ struct pm8001_hba_info *pm8001_ha;
+ struct pm8001_device *pm8001_dev;
+ int rc = TMF_RESP_FUNC_FAILED, ret;
+- u32 phy_id, port_id;
++ u32 port_id;
+ struct sas_task_slow slow_task;
+
+ if (!task->lldd_task || !task->dev)
+@@ -1057,7 +1072,6 @@ int pm8001_abort_task(struct sas_task *task)
+ dev = task->dev;
+ pm8001_dev = dev->lldd_dev;
+ pm8001_ha = pm8001_find_ha_by_dev(dev);
+- phy_id = pm8001_dev->attached_phy;
+
+ if (PM8001_CHIP_DISP->fatal_errors(pm8001_ha)) {
+ // If the controller is seeing fatal errors
+@@ -1089,7 +1103,8 @@ int pm8001_abort_task(struct sas_task *task)
+ if (pm8001_ha->chip_id == chip_8006) {
+ DECLARE_COMPLETION_ONSTACK(completion_reset);
+ DECLARE_COMPLETION_ONSTACK(completion);
+- struct pm8001_phy *phy = pm8001_ha->phy + phy_id;
++ u32 phy_id = pm80xx_get_local_phy_id(dev);
++ struct pm8001_phy *phy = &pm8001_ha->phy[phy_id];
+ port_id = phy->port->port_id;
+
+ /* 1. Set Device state as Recovery */
+diff --git a/drivers/scsi/pm8001/pm8001_sas.h b/drivers/scsi/pm8001/pm8001_sas.h
+index 334485bb2c12d3..91b2cdf3535cdd 100644
+--- a/drivers/scsi/pm8001/pm8001_sas.h
++++ b/drivers/scsi/pm8001/pm8001_sas.h
+@@ -798,6 +798,7 @@ void pm8001_setds_completion(struct domain_device *dev);
+ void pm8001_tmf_aborted(struct sas_task *task);
+ void pm80xx_show_pending_commands(struct pm8001_hba_info *pm8001_ha,
+ struct pm8001_device *dev);
++u32 pm80xx_get_local_phy_id(struct domain_device *dev);
+
+ #endif
+
+diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
+index c1bae995a41284..31960b72c1e92c 100644
+--- a/drivers/scsi/pm8001/pm80xx_hwi.c
++++ b/drivers/scsi/pm8001/pm80xx_hwi.c
+@@ -2340,8 +2340,7 @@ mpi_sata_completion(struct pm8001_hba_info *pm8001_ha,
+ /* Print sas address of IO failed device */
+ if ((status != IO_SUCCESS) && (status != IO_OVERFLOW) &&
+ (status != IO_UNDERFLOW)) {
+- if (!((t->dev->parent) &&
+- (dev_is_expander(t->dev->parent->dev_type)))) {
++ if (!dev_parent_is_expander(t->dev)) {
+ for (i = 0, j = 4; i <= 3 && j <= 7; i++, j++)
+ sata_addr_low[i] = pm8001_ha->sas_addr[j];
+ for (i = 0, j = 0; i <= 3 && j <= 3; i++, j++)
+@@ -4780,7 +4779,6 @@ static int pm80xx_chip_reg_dev_req(struct pm8001_hba_info *pm8001_ha,
+ u16 firstBurstSize = 0;
+ u16 ITNT = 2000;
+ struct domain_device *dev = pm8001_dev->sas_device;
+- struct domain_device *parent_dev = dev->parent;
+ struct pm8001_port *port = dev->port->lldd_port;
+
+ memset(&payload, 0, sizeof(payload));
+@@ -4799,10 +4797,8 @@ static int pm80xx_chip_reg_dev_req(struct pm8001_hba_info *pm8001_ha,
+ dev_is_expander(pm8001_dev->dev_type))
+ stp_sspsmp_sata = 0x01; /*ssp or smp*/
+ }
+- if (parent_dev && dev_is_expander(parent_dev->dev_type))
+- phy_id = parent_dev->ex_dev.ex_phy->phy_id;
+- else
+- phy_id = pm8001_dev->attached_phy;
++
++ phy_id = pm80xx_get_local_phy_id(dev);
+
+ opc = OPC_INB_REG_DEV;
+
+diff --git a/drivers/scsi/qla2xxx/qla_edif.c b/drivers/scsi/qla2xxx/qla_edif.c
+index 91bbd3b75bff97..ccd4485087a106 100644
+--- a/drivers/scsi/qla2xxx/qla_edif.c
++++ b/drivers/scsi/qla2xxx/qla_edif.c
+@@ -1798,7 +1798,7 @@ qla24xx_sadb_update(struct bsg_job *bsg_job)
+ switch (rval) {
+ case QLA_SUCCESS:
+ break;
+- case EAGAIN:
++ case -EAGAIN:
+ msleep(EDIF_MSLEEP_INTERVAL);
+ cnt++;
+ if (cnt < EDIF_RETRY_COUNT)
+@@ -3649,7 +3649,7 @@ int qla_edif_process_els(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
+ p->e.extra_rx_xchg_address, p->e.extra_control_flags,
+ sp->handle, sp->remap.req.len, bsg_job);
+ break;
+- case EAGAIN:
++ case -EAGAIN:
+ msleep(EDIF_MSLEEP_INTERVAL);
+ cnt++;
+ if (cnt < EDIF_RETRY_COUNT)
+diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
+index be211ff22acbd1..6a2e1c7fd1251a 100644
+--- a/drivers/scsi/qla2xxx/qla_init.c
++++ b/drivers/scsi/qla2xxx/qla_init.c
+@@ -2059,11 +2059,11 @@ static void qla_marker_sp_done(srb_t *sp, int res)
+ int cnt = 5; \
+ do { \
+ if (_chip_gen != sp->vha->hw->chip_reset || _login_gen != sp->fcport->login_gen) {\
+- _rval = EINVAL; \
++ _rval = -EINVAL; \
+ break; \
+ } \
+ _rval = qla2x00_start_sp(_sp); \
+- if (_rval == EAGAIN) \
++ if (_rval == -EAGAIN) \
+ msleep(1); \
+ else \
+ break; \
+diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c
+index 8ee2e337c9e1b7..316594aa40cc5a 100644
+--- a/drivers/scsi/qla2xxx/qla_nvme.c
++++ b/drivers/scsi/qla2xxx/qla_nvme.c
+@@ -419,7 +419,7 @@ static int qla_nvme_xmt_ls_rsp(struct nvme_fc_local_port *lport,
+ switch (rval) {
+ case QLA_SUCCESS:
+ break;
+- case EAGAIN:
++ case -EAGAIN:
+ msleep(PURLS_MSLEEP_INTERVAL);
+ cnt++;
+ if (cnt < PURLS_RETRY_COUNT)
+diff --git a/drivers/soc/mediatek/mtk-svs.c b/drivers/soc/mediatek/mtk-svs.c
+index 7c349a94b45c03..f45537546553ec 100644
+--- a/drivers/soc/mediatek/mtk-svs.c
++++ b/drivers/soc/mediatek/mtk-svs.c
+@@ -2165,10 +2165,18 @@ static struct device *svs_add_device_link(struct svs_platform *svsp,
+ return dev;
+ }
+
++static void svs_put_device(void *_dev)
++{
++ struct device *dev = _dev;
++
++ put_device(dev);
++}
++
+ static int svs_mt8192_platform_probe(struct svs_platform *svsp)
+ {
+ struct device *dev;
+ u32 idx;
++ int ret;
+
+ svsp->rst = devm_reset_control_get_optional(svsp->dev, "svs_rst");
+ if (IS_ERR(svsp->rst))
+@@ -2179,6 +2187,7 @@ static int svs_mt8192_platform_probe(struct svs_platform *svsp)
+ if (IS_ERR(dev))
+ return dev_err_probe(svsp->dev, PTR_ERR(dev),
+ "failed to get lvts device\n");
++ put_device(dev);
+
+ for (idx = 0; idx < svsp->bank_max; idx++) {
+ struct svs_bank *svsb = &svsp->banks[idx];
+@@ -2188,6 +2197,7 @@ static int svs_mt8192_platform_probe(struct svs_platform *svsp)
+ case SVSB_SWID_CPU_LITTLE:
+ case SVSB_SWID_CPU_BIG:
+ svsb->opp_dev = get_cpu_device(bdata->cpu_id);
++ get_device(svsb->opp_dev);
+ break;
+ case SVSB_SWID_CCI:
+ svsb->opp_dev = svs_add_device_link(svsp, "cci");
+@@ -2207,6 +2217,11 @@ static int svs_mt8192_platform_probe(struct svs_platform *svsp)
+ return dev_err_probe(svsp->dev, PTR_ERR(svsb->opp_dev),
+ "failed to get OPP device for bank %d\n",
+ idx);
++
++ ret = devm_add_action_or_reset(svsp->dev, svs_put_device,
++ svsb->opp_dev);
++ if (ret)
++ return ret;
+ }
+
+ return 0;
+@@ -2216,11 +2231,13 @@ static int svs_mt8183_platform_probe(struct svs_platform *svsp)
+ {
+ struct device *dev;
+ u32 idx;
++ int ret;
+
+ dev = svs_add_device_link(svsp, "thermal-sensor");
+ if (IS_ERR(dev))
+ return dev_err_probe(svsp->dev, PTR_ERR(dev),
+ "failed to get thermal device\n");
++ put_device(dev);
+
+ for (idx = 0; idx < svsp->bank_max; idx++) {
+ struct svs_bank *svsb = &svsp->banks[idx];
+@@ -2230,6 +2247,7 @@ static int svs_mt8183_platform_probe(struct svs_platform *svsp)
+ case SVSB_SWID_CPU_LITTLE:
+ case SVSB_SWID_CPU_BIG:
+ svsb->opp_dev = get_cpu_device(bdata->cpu_id);
++ get_device(svsb->opp_dev);
+ break;
+ case SVSB_SWID_CCI:
+ svsb->opp_dev = svs_add_device_link(svsp, "cci");
+@@ -2246,6 +2264,11 @@ static int svs_mt8183_platform_probe(struct svs_platform *svsp)
+ return dev_err_probe(svsp->dev, PTR_ERR(svsb->opp_dev),
+ "failed to get OPP device for bank %d\n",
+ idx);
++
++ ret = devm_add_action_or_reset(svsp->dev, svs_put_device,
++ svsb->opp_dev);
++ if (ret)
++ return ret;
+ }
+
+ return 0;
+diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
+index fdab2b1067dbb1..c6f7d5c9c493d9 100644
+--- a/drivers/soc/qcom/rpmh-rsc.c
++++ b/drivers/soc/qcom/rpmh-rsc.c
+@@ -453,13 +453,10 @@ static irqreturn_t tcs_tx_done(int irq, void *p)
+
+ trace_rpmh_tx_done(drv, i, req);
+
+- /*
+- * If wake tcs was re-purposed for sending active
+- * votes, clear AMC trigger & enable modes and
++ /* Clear AMC trigger & enable modes and
+ * disable interrupt for this TCS
+ */
+- if (!drv->tcs[ACTIVE_TCS].num_tcs)
+- __tcs_set_trigger(drv, i, false);
++ __tcs_set_trigger(drv, i, false);
+ skip:
+ /* Reclaim the TCS */
+ write_tcs_reg(drv, drv->regs[RSC_DRV_CMD_ENABLE], i, 0);
+diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
+index a388f372b27a7f..19c2a6eb9922ac 100644
+--- a/drivers/spi/spi.c
++++ b/drivers/spi/spi.c
+@@ -2449,7 +2449,7 @@ static int of_spi_parse_dt(struct spi_controller *ctlr, struct spi_device *spi,
+ if (rc > ctlr->num_chipselect) {
+ dev_err(&ctlr->dev, "%pOF has number of CS > ctlr->num_chipselect (%d)\n",
+ nc, rc);
+- return rc;
++ return -EINVAL;
+ }
+ if ((of_property_present(nc, "parallel-memories")) &&
+ (!(ctlr->flags & SPI_CONTROLLER_MULTI_CS))) {
+diff --git a/drivers/staging/media/ipu7/ipu7.c b/drivers/staging/media/ipu7/ipu7.c
+index 1b4f01db13ca2c..ee6b63717ed369 100644
+--- a/drivers/staging/media/ipu7/ipu7.c
++++ b/drivers/staging/media/ipu7/ipu7.c
+@@ -2248,20 +2248,13 @@ void ipu7_dump_fw_error_log(const struct ipu7_bus_device *adev)
+ }
+ EXPORT_SYMBOL_NS_GPL(ipu7_dump_fw_error_log, "INTEL_IPU7");
+
+-static int ipu7_pci_config_setup(struct pci_dev *dev)
++static void ipu7_pci_config_setup(struct pci_dev *dev)
+ {
+ u16 pci_command;
+- int ret;
+
+ pci_read_config_word(dev, PCI_COMMAND, &pci_command);
+ pci_command |= PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER;
+ pci_write_config_word(dev, PCI_COMMAND, pci_command);
+-
+- ret = pci_enable_msi(dev);
+- if (ret)
+- dev_err(&dev->dev, "Failed to enable msi (%d)\n", ret);
+-
+- return ret;
+ }
+
+ static int ipu7_map_fw_code_region(struct ipu7_bus_device *sys,
+@@ -2435,7 +2428,6 @@ static int ipu7_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+ if (!isp)
+ return -ENOMEM;
+
+- dev_set_name(dev, "intel-ipu7");
+ isp->pdev = pdev;
+ INIT_LIST_HEAD(&isp->devices);
+
+@@ -2510,13 +2502,15 @@ static int ipu7_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+
+ dma_set_max_seg_size(dev, UINT_MAX);
+
+- ret = ipu7_pci_config_setup(pdev);
+- if (ret)
+- return ret;
++ ipu7_pci_config_setup(pdev);
++
++ ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
++ if (ret < 0)
++ return dev_err_probe(dev, ret, "Failed to alloc irq vector\n");
+
+ ret = ipu_buttress_init(isp);
+ if (ret)
+- return ret;
++ goto pci_irq_free;
+
+ dev_info(dev, "firmware cpd file: %s\n", isp->cpd_fw_name);
+
+@@ -2632,6 +2626,8 @@ static int ipu7_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+ release_firmware(isp->cpd_fw);
+ buttress_exit:
+ ipu_buttress_exit(isp);
++pci_irq_free:
++ pci_free_irq_vectors(pdev);
+
+ return ret;
+ }
+@@ -2648,6 +2644,9 @@ static void ipu7_pci_remove(struct pci_dev *pdev)
+ if (!IS_ERR_OR_NULL(isp->fw_code_region))
+ vfree(isp->fw_code_region);
+
++ ipu7_mmu_cleanup(isp->isys->mmu);
++ ipu7_mmu_cleanup(isp->psys->mmu);
++
+ ipu7_bus_del_devices(pdev);
+
+ pm_runtime_forbid(&pdev->dev);
+@@ -2656,9 +2655,6 @@ static void ipu7_pci_remove(struct pci_dev *pdev)
+ ipu_buttress_exit(isp);
+
+ release_firmware(isp->cpd_fw);
+-
+- ipu7_mmu_cleanup(isp->psys->mmu);
+- ipu7_mmu_cleanup(isp->isys->mmu);
+ }
+
+ static void ipu7_pci_reset_prepare(struct pci_dev *pdev)
+diff --git a/drivers/tee/tee_shm.c b/drivers/tee/tee_shm.c
+index 2a7d253d9c554c..8e50476eb71fbc 100644
+--- a/drivers/tee/tee_shm.c
++++ b/drivers/tee/tee_shm.c
+@@ -321,6 +321,14 @@ register_shm_helper(struct tee_context *ctx, struct iov_iter *iter, u32 flags,
+ if (unlikely(len <= 0)) {
+ ret = len ? ERR_PTR(len) : ERR_PTR(-ENOMEM);
+ goto err_free_shm_pages;
++ } else if (DIV_ROUND_UP(len + off, PAGE_SIZE) != num_pages) {
++ /*
++ * If we only got a few pages, update to release the
++ * correct amount below.
++ */
++ shm->num_pages = len / PAGE_SIZE;
++ ret = ERR_PTR(-ENOMEM);
++ goto err_put_shm_pages;
+ }
+
+ /*
+diff --git a/drivers/thermal/qcom/Kconfig b/drivers/thermal/qcom/Kconfig
+index 2c7f3f9a26ebbb..a6bb01082ec697 100644
+--- a/drivers/thermal/qcom/Kconfig
++++ b/drivers/thermal/qcom/Kconfig
+@@ -34,7 +34,8 @@ config QCOM_SPMI_TEMP_ALARM
+
+ config QCOM_LMH
+ tristate "Qualcomm Limits Management Hardware"
+- depends on ARCH_QCOM && QCOM_SCM
++ depends on ARCH_QCOM || COMPILE_TEST
++ select QCOM_SCM
+ help
+ This enables initialization of Qualcomm limits management
+ hardware(LMh). LMh allows for hardware-enforced mitigation for cpus based on
+diff --git a/drivers/thermal/qcom/lmh.c b/drivers/thermal/qcom/lmh.c
+index 75eaa9a68ab8aa..c681a3c89ffa0b 100644
+--- a/drivers/thermal/qcom/lmh.c
++++ b/drivers/thermal/qcom/lmh.c
+@@ -5,6 +5,8 @@
+ */
+ #include <linux/module.h>
+ #include <linux/interrupt.h>
++#include <linux/irq.h>
++#include <linux/irqdesc.h>
+ #include <linux/irqdomain.h>
+ #include <linux/err.h>
+ #include <linux/platform_device.h>
+diff --git a/drivers/thunderbolt/tunnel.c b/drivers/thunderbolt/tunnel.c
+index d52efe3f658ce6..8333fc7f3d551e 100644
+--- a/drivers/thunderbolt/tunnel.c
++++ b/drivers/thunderbolt/tunnel.c
+@@ -1073,6 +1073,7 @@ static void tb_dp_dprx_work(struct work_struct *work)
+
+ if (tunnel->callback)
+ tunnel->callback(tunnel, tunnel->callback_data);
++ tb_tunnel_put(tunnel);
+ }
+
+ static int tb_dp_dprx_start(struct tb_tunnel *tunnel)
+@@ -1100,8 +1101,8 @@ static void tb_dp_dprx_stop(struct tb_tunnel *tunnel)
+ if (tunnel->dprx_started) {
+ tunnel->dprx_started = false;
+ tunnel->dprx_canceled = true;
+- cancel_delayed_work(&tunnel->dprx_work);
+- tb_tunnel_put(tunnel);
++ if (cancel_delayed_work(&tunnel->dprx_work))
++ tb_tunnel_put(tunnel);
+ }
+ }
+
+diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c
+index 7fc535452c0b30..553d8c70352b18 100644
+--- a/drivers/tty/n_gsm.c
++++ b/drivers/tty/n_gsm.c
+@@ -461,6 +461,7 @@ static int gsm_send_packet(struct gsm_mux *gsm, struct gsm_msg *msg);
+ static struct gsm_dlci *gsm_dlci_alloc(struct gsm_mux *gsm, int addr);
+ static void gsmld_write_trigger(struct gsm_mux *gsm);
+ static void gsmld_write_task(struct work_struct *work);
++static int gsm_modem_send_initial_msc(struct gsm_dlci *dlci);
+
+ /**
+ * gsm_fcs_add - update FCS
+@@ -2174,7 +2175,7 @@ static void gsm_dlci_open(struct gsm_dlci *dlci)
+ pr_debug("DLCI %d goes open.\n", dlci->addr);
+ /* Send current modem state */
+ if (dlci->addr) {
+- gsm_modem_update(dlci, 0);
++ gsm_modem_send_initial_msc(dlci);
+ } else {
+ /* Start keep-alive control */
+ gsm->ka_num = 0;
+@@ -4161,6 +4162,28 @@ static int gsm_modem_upd_via_msc(struct gsm_dlci *dlci, u8 brk)
+ return gsm_control_wait(dlci->gsm, ctrl);
+ }
+
++/**
++ * gsm_modem_send_initial_msc - Send initial modem status message
++ *
++ * @dlci channel
++ *
++ * Send an initial MSC message after DLCI open to set the initial
++ * modem status lines. This is only done for basic mode.
++ * Does not wait for a response as we cannot block the input queue
++ * processing.
++ */
++static int gsm_modem_send_initial_msc(struct gsm_dlci *dlci)
++{
++ u8 modembits[2];
++
++ if (dlci->adaption != 1 || dlci->gsm->encoding != GSM_BASIC_OPT)
++ return 0;
++
++ modembits[0] = (dlci->addr << 2) | 2 | EA; /* DLCI, Valid, EA */
++ modembits[1] = (gsm_encode_modem(dlci) << 1) | EA;
++ return gsm_control_command(dlci->gsm, CMD_MSC, (const u8 *)&modembits, 2);
++}
++
+ /**
+ * gsm_modem_update - send modem status line state
+ * @dlci: channel
+diff --git a/drivers/tty/serial/max310x.c b/drivers/tty/serial/max310x.c
+index ce260e9949c3c2..d9a0100b92d2b9 100644
+--- a/drivers/tty/serial/max310x.c
++++ b/drivers/tty/serial/max310x.c
+@@ -1644,6 +1644,8 @@ static int max310x_i2c_probe(struct i2c_client *client)
+ port_client = devm_i2c_new_dummy_device(&client->dev,
+ client->adapter,
+ port_addr);
++ if (IS_ERR(port_client))
++ return PTR_ERR(port_client);
+
+ regcfg_i2c.name = max310x_regmap_name(i);
+ 		regmaps[i] = devm_regmap_init_i2c(port_client, &regcfg_i2c);
+diff --git a/drivers/ufs/core/ufs-sysfs.c b/drivers/ufs/core/ufs-sysfs.c
+index 4bd7d491e3c5ac..0086816b27cd90 100644
+--- a/drivers/ufs/core/ufs-sysfs.c
++++ b/drivers/ufs/core/ufs-sysfs.c
+@@ -512,6 +512,8 @@ static ssize_t pm_qos_enable_show(struct device *dev,
+ {
+ struct ufs_hba *hba = dev_get_drvdata(dev);
+
++ guard(mutex)(&hba->pm_qos_mutex);
++
+ return sysfs_emit(buf, "%d\n", hba->pm_qos_enabled);
+ }
+
+diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
+index 9a43102b2b21e8..96a0f5fcc0e577 100644
+--- a/drivers/ufs/core/ufshcd.c
++++ b/drivers/ufs/core/ufshcd.c
+@@ -1045,6 +1045,7 @@ EXPORT_SYMBOL_GPL(ufshcd_is_hba_active);
+ */
+ void ufshcd_pm_qos_init(struct ufs_hba *hba)
+ {
++ guard(mutex)(&hba->pm_qos_mutex);
+
+ if (hba->pm_qos_enabled)
+ return;
+@@ -1061,6 +1062,8 @@ void ufshcd_pm_qos_init(struct ufs_hba *hba)
+ */
+ void ufshcd_pm_qos_exit(struct ufs_hba *hba)
+ {
++ guard(mutex)(&hba->pm_qos_mutex);
++
+ if (!hba->pm_qos_enabled)
+ return;
+
+@@ -1075,6 +1078,8 @@ void ufshcd_pm_qos_exit(struct ufs_hba *hba)
+ */
+ static void ufshcd_pm_qos_update(struct ufs_hba *hba, bool on)
+ {
++ guard(mutex)(&hba->pm_qos_mutex);
++
+ if (!hba->pm_qos_enabled)
+ return;
+
+@@ -10669,6 +10674,9 @@ int ufshcd_init(struct ufs_hba *hba, void __iomem *mmio_base, unsigned int irq)
+ */
+ spin_lock_init(&hba->clk_gating.lock);
+
++ /* Initialize mutex for PM QoS request synchronization */
++ mutex_init(&hba->pm_qos_mutex);
++
+ /*
+ * Set the default power management level for runtime and system PM.
+ * Host controller drivers can override them in their
+@@ -10756,6 +10764,7 @@ int ufshcd_init(struct ufs_hba *hba, void __iomem *mmio_base, unsigned int irq)
+ mutex_init(&hba->ee_ctrl_mutex);
+
+ mutex_init(&hba->wb_mutex);
++
+ init_rwsem(&hba->clk_scaling_lock);
+
+ ufshcd_init_clk_gating(hba);
+diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
+index f19efad4d6f8d9..3f8e2e27697fbe 100644
+--- a/drivers/uio/uio_hv_generic.c
++++ b/drivers/uio/uio_hv_generic.c
+@@ -111,7 +111,6 @@ static void hv_uio_channel_cb(void *context)
+ struct hv_device *hv_dev;
+ struct hv_uio_private_data *pdata;
+
+- chan->inbound.ring_buffer->interrupt_mask = 1;
+ virt_mb();
+
+ /*
+@@ -183,8 +182,6 @@ hv_uio_new_channel(struct vmbus_channel *new_sc)
+ return;
+ }
+
+- /* Disable interrupts on sub channel */
+- new_sc->inbound.ring_buffer->interrupt_mask = 1;
+ set_channel_read_mode(new_sc, HV_CALL_ISR);
+ ret = hv_create_ring_sysfs(new_sc, hv_uio_ring_mmap);
+ if (ret) {
+@@ -227,9 +224,7 @@ hv_uio_open(struct uio_info *info, struct inode *inode)
+
+ ret = vmbus_connect_ring(dev->channel,
+ hv_uio_channel_cb, dev->channel);
+- if (ret == 0)
+- dev->channel->inbound.ring_buffer->interrupt_mask = 1;
+- else
++ if (ret)
+ atomic_dec(&pdata->refcnt);
+
+ return ret;
+diff --git a/drivers/usb/cdns3/cdnsp-pci.c b/drivers/usb/cdns3/cdnsp-pci.c
+index 8c361b8394e959..5e7b88ca8b96c0 100644
+--- a/drivers/usb/cdns3/cdnsp-pci.c
++++ b/drivers/usb/cdns3/cdnsp-pci.c
+@@ -85,7 +85,7 @@ static int cdnsp_pci_probe(struct pci_dev *pdev,
+ cdnsp = kzalloc(sizeof(*cdnsp), GFP_KERNEL);
+ if (!cdnsp) {
+ ret = -ENOMEM;
+- goto disable_pci;
++ goto put_pci;
+ }
+ }
+
+@@ -168,9 +168,6 @@ static int cdnsp_pci_probe(struct pci_dev *pdev,
+ if (!pci_is_enabled(func))
+ kfree(cdnsp);
+
+-disable_pci:
+- pci_disable_device(pdev);
+-
+ put_pci:
+ pci_dev_put(func);
+
+diff --git a/drivers/usb/gadget/configfs.c b/drivers/usb/gadget/configfs.c
+index f94ea196ce547b..6bcac85c55501d 100644
+--- a/drivers/usb/gadget/configfs.c
++++ b/drivers/usb/gadget/configfs.c
+@@ -1750,6 +1750,8 @@ static int configfs_composite_bind(struct usb_gadget *gadget,
+ cdev->use_os_string = true;
+ cdev->b_vendor_code = gi->b_vendor_code;
+ memcpy(cdev->qw_sign, gi->qw_sign, OS_STRING_QW_SIGN_LEN);
++ } else {
++ cdev->use_os_string = false;
+ }
+
+ if (gadget_is_otg(gadget) && !otg_desc[0]) {
+diff --git a/drivers/usb/host/max3421-hcd.c b/drivers/usb/host/max3421-hcd.c
+index dcf31a592f5d11..4b5f03f683f775 100644
+--- a/drivers/usb/host/max3421-hcd.c
++++ b/drivers/usb/host/max3421-hcd.c
+@@ -1916,7 +1916,7 @@ max3421_probe(struct spi_device *spi)
+ if (hcd) {
+ kfree(max3421_hcd->tx);
+ kfree(max3421_hcd->rx);
+- if (max3421_hcd->spi_thread)
++ if (!IS_ERR_OR_NULL(max3421_hcd->spi_thread))
+ kthread_stop(max3421_hcd->spi_thread);
+ usb_put_hcd(hcd);
+ }
+diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
+index 4f8f5aab109d0c..6309200e93dc3c 100644
+--- a/drivers/usb/host/xhci-ring.c
++++ b/drivers/usb/host/xhci-ring.c
+@@ -1262,19 +1262,16 @@ static void xhci_handle_cmd_stop_ep(struct xhci_hcd *xhci, int slot_id,
+ * Stopped state, but it will soon change to Running.
+ *
+ * Assume this bug on unexpected Stop Endpoint failures.
+- * Keep retrying until the EP starts and stops again.
++ * Keep retrying until the EP starts and stops again, on
++ * chips where this is known to help. Wait for 100ms.
+ */
++ if (time_is_before_jiffies(ep->stop_time + msecs_to_jiffies(100)))
++ break;
+ fallthrough;
+ case EP_STATE_RUNNING:
+ /* Race, HW handled stop ep cmd before ep was running */
+ xhci_dbg(xhci, "Stop ep completion ctx error, ctx_state %d\n",
+ GET_EP_CTX_STATE(ep_ctx));
+- /*
+- * Don't retry forever if we guessed wrong or a defective HC never starts
+- * the EP or says 'Running' but fails the command. We must give back TDs.
+- */
+- if (time_is_before_jiffies(ep->stop_time + msecs_to_jiffies(100)))
+- break;
+
+ command = xhci_alloc_command(xhci, false, GFP_ATOMIC);
+ if (!command) {
+diff --git a/drivers/usb/misc/Kconfig b/drivers/usb/misc/Kconfig
+index 6497c4e81e951a..9bf8fc6247baca 100644
+--- a/drivers/usb/misc/Kconfig
++++ b/drivers/usb/misc/Kconfig
+@@ -147,6 +147,7 @@ config USB_APPLEDISPLAY
+ config USB_QCOM_EUD
+ tristate "QCOM Embedded USB Debugger(EUD) Driver"
+ depends on ARCH_QCOM || COMPILE_TEST
++ select QCOM_SCM
+ select USB_ROLE_SWITCH
+ help
+ This module enables support for Qualcomm Technologies, Inc.
+diff --git a/drivers/usb/misc/qcom_eud.c b/drivers/usb/misc/qcom_eud.c
+index 83079c414b4f28..05c8bdc943a88d 100644
+--- a/drivers/usb/misc/qcom_eud.c
++++ b/drivers/usb/misc/qcom_eud.c
+@@ -15,6 +15,7 @@
+ #include <linux/slab.h>
+ #include <linux/sysfs.h>
+ #include <linux/usb/role.h>
++#include <linux/firmware/qcom/qcom_scm.h>
+
+ #define EUD_REG_INT1_EN_MASK 0x0024
+ #define EUD_REG_INT_STATUS_1 0x0044
+@@ -34,7 +35,7 @@ struct eud_chip {
+ struct device *dev;
+ struct usb_role_switch *role_sw;
+ void __iomem *base;
+- void __iomem *mode_mgr;
++ phys_addr_t mode_mgr;
+ unsigned int int_status;
+ int irq;
+ bool enabled;
+@@ -43,18 +44,29 @@ struct eud_chip {
+
+ static int enable_eud(struct eud_chip *priv)
+ {
++ int ret;
++
++ ret = qcom_scm_io_writel(priv->mode_mgr + EUD_REG_EUD_EN2, 1);
++ if (ret)
++ return ret;
++
+ writel(EUD_ENABLE, priv->base + EUD_REG_CSR_EUD_EN);
+ writel(EUD_INT_VBUS | EUD_INT_SAFE_MODE,
+ priv->base + EUD_REG_INT1_EN_MASK);
+- writel(1, priv->mode_mgr + EUD_REG_EUD_EN2);
+
+ return usb_role_switch_set_role(priv->role_sw, USB_ROLE_DEVICE);
+ }
+
+-static void disable_eud(struct eud_chip *priv)
++static int disable_eud(struct eud_chip *priv)
+ {
++ int ret;
++
++ ret = qcom_scm_io_writel(priv->mode_mgr + EUD_REG_EUD_EN2, 0);
++ if (ret)
++ return ret;
++
+ writel(0, priv->base + EUD_REG_CSR_EUD_EN);
+- writel(0, priv->mode_mgr + EUD_REG_EUD_EN2);
++ return 0;
+ }
+
+ static ssize_t enable_show(struct device *dev,
+@@ -82,11 +94,12 @@ static ssize_t enable_store(struct device *dev,
+ chip->enabled = enable;
+ else
+ disable_eud(chip);
++
+ } else {
+- disable_eud(chip);
++ ret = disable_eud(chip);
+ }
+
+- return count;
++ return ret < 0 ? ret : count;
+ }
+
+ static DEVICE_ATTR_RW(enable);
+@@ -178,6 +191,7 @@ static void eud_role_switch_release(void *data)
+ static int eud_probe(struct platform_device *pdev)
+ {
+ struct eud_chip *chip;
++ struct resource *res;
+ int ret;
+
+ chip = devm_kzalloc(&pdev->dev, sizeof(*chip), GFP_KERNEL);
+@@ -200,9 +214,10 @@ static int eud_probe(struct platform_device *pdev)
+ if (IS_ERR(chip->base))
+ return PTR_ERR(chip->base);
+
+- chip->mode_mgr = devm_platform_ioremap_resource(pdev, 1);
+- if (IS_ERR(chip->mode_mgr))
+- return PTR_ERR(chip->mode_mgr);
++ res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
++ if (!res)
++ return -ENODEV;
++ chip->mode_mgr = res->start;
+
+ chip->irq = platform_get_irq(pdev, 0);
+ if (chip->irq < 0)
+diff --git a/drivers/usb/phy/phy-twl6030-usb.c b/drivers/usb/phy/phy-twl6030-usb.c
+index 49d79c1257f3a4..8c09db750bfd63 100644
+--- a/drivers/usb/phy/phy-twl6030-usb.c
++++ b/drivers/usb/phy/phy-twl6030-usb.c
+@@ -328,9 +328,8 @@ static int twl6030_set_vbus(struct phy_companion *comparator, bool enabled)
+
+ static int twl6030_usb_probe(struct platform_device *pdev)
+ {
+- u32 ret;
+ struct twl6030_usb *twl;
+- int status, err;
++ int status, err, ret;
+ struct device_node *np = pdev->dev.of_node;
+ struct device *dev = &pdev->dev;
+
+diff --git a/drivers/usb/typec/tipd/core.c b/drivers/usb/typec/tipd/core.c
+index dcf141ada07812..1c80296c3b273e 100644
+--- a/drivers/usb/typec/tipd/core.c
++++ b/drivers/usb/typec/tipd/core.c
+@@ -545,24 +545,23 @@ static irqreturn_t cd321x_interrupt(int irq, void *data)
+ if (!event)
+ goto err_unlock;
+
++ tps6598x_write64(tps, TPS_REG_INT_CLEAR1, event);
++
+ if (!tps6598x_read_status(tps, &status))
+- goto err_clear_ints;
++ goto err_unlock;
+
+ if (event & APPLE_CD_REG_INT_POWER_STATUS_UPDATE)
+ if (!tps6598x_read_power_status(tps))
+- goto err_clear_ints;
++ goto err_unlock;
+
+ if (event & APPLE_CD_REG_INT_DATA_STATUS_UPDATE)
+ if (!tps6598x_read_data_status(tps))
+- goto err_clear_ints;
++ goto err_unlock;
+
+ /* Handle plug insert or removal */
+ if (event & APPLE_CD_REG_INT_PLUG_EVENT)
+ tps6598x_handle_plug_event(tps, status);
+
+-err_clear_ints:
+- tps6598x_write64(tps, TPS_REG_INT_CLEAR1, event);
+-
+ err_unlock:
+ mutex_unlock(&tps->lock);
+
+@@ -668,25 +667,24 @@ static irqreturn_t tps6598x_interrupt(int irq, void *data)
+ if (!(event1[0] | event1[1] | event2[0] | event2[1]))
+ goto err_unlock;
+
++ tps6598x_block_write(tps, TPS_REG_INT_CLEAR1, event1, intev_len);
++ tps6598x_block_write(tps, TPS_REG_INT_CLEAR2, event2, intev_len);
++
+ if (!tps6598x_read_status(tps, &status))
+- goto err_clear_ints;
++ goto err_unlock;
+
+ if ((event1[0] | event2[0]) & TPS_REG_INT_POWER_STATUS_UPDATE)
+ if (!tps6598x_read_power_status(tps))
+- goto err_clear_ints;
++ goto err_unlock;
+
+ if ((event1[0] | event2[0]) & TPS_REG_INT_DATA_STATUS_UPDATE)
+ if (!tps6598x_read_data_status(tps))
+- goto err_clear_ints;
++ goto err_unlock;
+
+ /* Handle plug insert or removal */
+ if ((event1[0] | event2[0]) & TPS_REG_INT_PLUG_EVENT)
+ tps6598x_handle_plug_event(tps, status);
+
+-err_clear_ints:
+- tps6598x_block_write(tps, TPS_REG_INT_CLEAR1, event1, intev_len);
+- tps6598x_block_write(tps, TPS_REG_INT_CLEAR2, event2, intev_len);
+-
+ err_unlock:
+ mutex_unlock(&tps->lock);
+
+diff --git a/drivers/usb/usbip/vhci_hcd.c b/drivers/usb/usbip/vhci_hcd.c
+index e70fba9f55d6a0..0d6c10a8490c0b 100644
+--- a/drivers/usb/usbip/vhci_hcd.c
++++ b/drivers/usb/usbip/vhci_hcd.c
+@@ -765,6 +765,17 @@ static int vhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag
+ ctrlreq->wValue, vdev->rhport);
+
+ vdev->udev = usb_get_dev(urb->dev);
++ /*
++ * NOTE: A similar operation has been done via
++ * USB_REQ_GET_DESCRIPTOR handler below, which is
++ * supposed to always precede USB_REQ_SET_ADDRESS.
++ *
++ * It's not entirely clear if operating on a different
++ * usb_device instance here is a real possibility,
++ * otherwise this call and vdev->udev assignment above
++ * should be dropped.
++ */
++ dev_pm_syscore_device(&vdev->udev->dev, true);
+ usb_put_dev(old);
+
+ spin_lock(&vdev->ud.lock);
+@@ -785,6 +796,17 @@ static int vhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag
+ "Not yet?:Get_Descriptor to device 0 (get max pipe size)\n");
+
+ vdev->udev = usb_get_dev(urb->dev);
++ /*
++ * Set syscore PM flag for the virtually attached
++ * devices to ensure they will not enter suspend on
++ * the client side.
++ *
++ * Note this doesn't have any impact on the physical
++ * devices attached to the host system on the server
++ * side, hence there is no need to undo the operation
++ * on disconnect.
++ */
++ dev_pm_syscore_device(&vdev->udev->dev, true);
+ usb_put_dev(old);
+ goto out;
+
+diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
+index 397f5e44513639..fde33f54e99ec5 100644
+--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
++++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
+@@ -1612,8 +1612,10 @@ static void hisi_acc_vfio_debug_init(struct hisi_acc_vf_core_device *hisi_acc_vd
+ }
+
+ migf = kzalloc(sizeof(*migf), GFP_KERNEL);
+- if (!migf)
++ if (!migf) {
++ dput(vfio_dev_migration);
+ return;
++ }
+ hisi_acc_vdev->debug_migf = migf;
+
+ vfio_hisi_acc = debugfs_create_dir("hisi_acc", vfio_dev_migration);
+@@ -1623,6 +1625,8 @@ static void hisi_acc_vfio_debug_init(struct hisi_acc_vf_core_device *hisi_acc_vd
+ hisi_acc_vf_migf_read);
+ debugfs_create_devm_seqfile(dev, "cmd_state", vfio_hisi_acc,
+ hisi_acc_vf_debug_cmd);
++
++ dput(vfio_dev_migration);
+ }
+
+ static void hisi_acc_vf_debugfs_exit(struct hisi_acc_vf_core_device *hisi_acc_vdev)
+diff --git a/drivers/vfio/pci/pds/dirty.c b/drivers/vfio/pci/pds/dirty.c
+index c51f5e4c3dd6d2..481992142f7901 100644
+--- a/drivers/vfio/pci/pds/dirty.c
++++ b/drivers/vfio/pci/pds/dirty.c
+@@ -82,7 +82,7 @@ static int pds_vfio_dirty_alloc_bitmaps(struct pds_vfio_region *region,
+
+ host_ack_bmp = vzalloc(bytes);
+ if (!host_ack_bmp) {
+- bitmap_free(host_seq_bmp);
++ vfree(host_seq_bmp);
+ return -ENOMEM;
+ }
+
+diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
+index 9f27c3f6091b80..925858cc60964b 100644
+--- a/drivers/vhost/vringh.c
++++ b/drivers/vhost/vringh.c
+@@ -1115,6 +1115,7 @@ static inline int copy_from_iotlb(const struct vringh *vrh, void *dst,
+ struct iov_iter iter;
+ u64 translated;
+ int ret;
++ size_t size;
+
+ ret = iotlb_translate(vrh, (u64)(uintptr_t)src,
+ len - total_translated, &translated,
+@@ -1132,9 +1133,9 @@ static inline int copy_from_iotlb(const struct vringh *vrh, void *dst,
+ translated);
+ }
+
+- ret = copy_from_iter(dst, translated, &iter);
+- if (ret < 0)
+- return ret;
++ size = copy_from_iter(dst, translated, &iter);
++ if (size != translated)
++ return -EFAULT;
+
+ src += translated;
+ dst += translated;
+@@ -1161,6 +1162,7 @@ static inline int copy_to_iotlb(const struct vringh *vrh, void *dst,
+ struct iov_iter iter;
+ u64 translated;
+ int ret;
++ size_t size;
+
+ ret = iotlb_translate(vrh, (u64)(uintptr_t)dst,
+ len - total_translated, &translated,
+@@ -1178,9 +1180,9 @@ static inline int copy_to_iotlb(const struct vringh *vrh, void *dst,
+ translated);
+ }
+
+- ret = copy_to_iter(src, translated, &iter);
+- if (ret < 0)
+- return ret;
++ size = copy_to_iter(src, translated, &iter);
++ if (size != translated)
++ return -EFAULT;
+
+ src += translated;
+ dst += translated;
+diff --git a/drivers/video/fbdev/simplefb.c b/drivers/video/fbdev/simplefb.c
+index 1893815dc67f4c..6acf5a00c2bacf 100644
+--- a/drivers/video/fbdev/simplefb.c
++++ b/drivers/video/fbdev/simplefb.c
+@@ -93,6 +93,7 @@ struct simplefb_par {
+
+ static void simplefb_clocks_destroy(struct simplefb_par *par);
+ static void simplefb_regulators_destroy(struct simplefb_par *par);
++static void simplefb_detach_genpds(void *res);
+
+ /*
+ * fb_ops.fb_destroy is called by the last put_fb_info() call at the end
+@@ -105,6 +106,7 @@ static void simplefb_destroy(struct fb_info *info)
+
+ simplefb_regulators_destroy(info->par);
+ simplefb_clocks_destroy(info->par);
++ simplefb_detach_genpds(info->par);
+ if (info->screen_base)
+ iounmap(info->screen_base);
+
+@@ -445,13 +447,14 @@ static void simplefb_detach_genpds(void *res)
+ if (!IS_ERR_OR_NULL(par->genpds[i]))
+ dev_pm_domain_detach(par->genpds[i], true);
+ }
++ par->num_genpds = 0;
+ }
+
+ static int simplefb_attach_genpds(struct simplefb_par *par,
+ struct platform_device *pdev)
+ {
+ struct device *dev = &pdev->dev;
+- unsigned int i;
++ unsigned int i, num_genpds;
+ int err;
+
+ err = of_count_phandle_with_args(dev->of_node, "power-domains",
+@@ -465,26 +468,35 @@ static int simplefb_attach_genpds(struct simplefb_par *par,
+ return err;
+ }
+
+- par->num_genpds = err;
++ num_genpds = err;
+
+ /*
+ * Single power-domain devices are handled by the driver core, so
+ * nothing to do here.
+ */
+- if (par->num_genpds <= 1)
++ if (num_genpds <= 1) {
++ par->num_genpds = num_genpds;
+ return 0;
++ }
+
+- par->genpds = devm_kcalloc(dev, par->num_genpds, sizeof(*par->genpds),
++ par->genpds = devm_kcalloc(dev, num_genpds, sizeof(*par->genpds),
+ GFP_KERNEL);
+ if (!par->genpds)
+ return -ENOMEM;
+
+- par->genpd_links = devm_kcalloc(dev, par->num_genpds,
++ par->genpd_links = devm_kcalloc(dev, num_genpds,
+ sizeof(*par->genpd_links),
+ GFP_KERNEL);
+ if (!par->genpd_links)
+ return -ENOMEM;
+
++ /*
++ * Set par->num_genpds only after genpds and genpd_links are allocated
++ * to exit early from simplefb_detach_genpds() without full
++ * initialisation.
++ */
++ par->num_genpds = num_genpds;
++
+ for (i = 0; i < par->num_genpds; i++) {
+ par->genpds[i] = dev_pm_domain_attach_by_id(dev, i);
+ if (IS_ERR(par->genpds[i])) {
+@@ -506,9 +518,10 @@ static int simplefb_attach_genpds(struct simplefb_par *par,
+ dev_warn(dev, "failed to link power-domain %u\n", i);
+ }
+
+- return devm_add_action_or_reset(dev, simplefb_detach_genpds, par);
++ return 0;
+ }
+ #else
++static void simplefb_detach_genpds(void *res) { }
+ static int simplefb_attach_genpds(struct simplefb_par *par,
+ struct platform_device *pdev)
+ {
+@@ -622,18 +635,20 @@ static int simplefb_probe(struct platform_device *pdev)
+ ret = devm_aperture_acquire_for_platform_device(pdev, par->base, par->size);
+ if (ret) {
+ dev_err(&pdev->dev, "Unable to acquire aperture: %d\n", ret);
+- goto error_regulators;
++ goto error_genpds;
+ }
+ ret = register_framebuffer(info);
+ if (ret < 0) {
+ dev_err(&pdev->dev, "Unable to register simplefb: %d\n", ret);
+- goto error_regulators;
++ goto error_genpds;
+ }
+
+ dev_info(&pdev->dev, "fb%d: simplefb registered!\n", info->node);
+
+ return 0;
+
++error_genpds:
++ simplefb_detach_genpds(par);
+ error_regulators:
+ simplefb_regulators_destroy(par);
+ error_clocks:
+diff --git a/drivers/watchdog/intel_oc_wdt.c b/drivers/watchdog/intel_oc_wdt.c
+index 7c0551106981b0..a39892c10770eb 100644
+--- a/drivers/watchdog/intel_oc_wdt.c
++++ b/drivers/watchdog/intel_oc_wdt.c
+@@ -41,6 +41,7 @@
+ struct intel_oc_wdt {
+ struct watchdog_device wdd;
+ struct resource *ctrl_res;
++ struct watchdog_info info;
+ bool locked;
+ };
+
+@@ -115,7 +116,6 @@ static const struct watchdog_ops intel_oc_wdt_ops = {
+
+ static int intel_oc_wdt_setup(struct intel_oc_wdt *oc_wdt)
+ {
+- struct watchdog_info *info;
+ unsigned long val;
+
+ val = inl(INTEL_OC_WDT_CTRL_REG(oc_wdt));
+@@ -134,7 +134,6 @@ static int intel_oc_wdt_setup(struct intel_oc_wdt *oc_wdt)
+ set_bit(WDOG_HW_RUNNING, &oc_wdt->wdd.status);
+
+ if (oc_wdt->locked) {
+- info = (struct watchdog_info *)&intel_oc_wdt_info;
+ /*
+ * Set nowayout unconditionally as we cannot stop
+ * the watchdog.
+@@ -145,7 +144,7 @@ static int intel_oc_wdt_setup(struct intel_oc_wdt *oc_wdt)
+ * and inform the core we can't change it.
+ */
+ oc_wdt->wdd.timeout = (val & INTEL_OC_WDT_TOV) + 1;
+- info->options &= ~WDIOF_SETTIMEOUT;
++ oc_wdt->info.options &= ~WDIOF_SETTIMEOUT;
+
+ dev_info(oc_wdt->wdd.parent,
+ "Register access locked, heartbeat fixed at: %u s\n",
+@@ -193,7 +192,8 @@ static int intel_oc_wdt_probe(struct platform_device *pdev)
+ wdd->min_timeout = INTEL_OC_WDT_MIN_TOV;
+ wdd->max_timeout = INTEL_OC_WDT_MAX_TOV;
+ wdd->timeout = INTEL_OC_WDT_DEF_TOV;
+- wdd->info = &intel_oc_wdt_info;
++ oc_wdt->info = intel_oc_wdt_info;
++ wdd->info = &oc_wdt->info;
+ wdd->ops = &intel_oc_wdt_ops;
+ wdd->parent = dev;
+
+diff --git a/drivers/watchdog/mpc8xxx_wdt.c b/drivers/watchdog/mpc8xxx_wdt.c
+index 867f9f31137971..a4b497ecfa2051 100644
+--- a/drivers/watchdog/mpc8xxx_wdt.c
++++ b/drivers/watchdog/mpc8xxx_wdt.c
+@@ -100,6 +100,8 @@ static int mpc8xxx_wdt_start(struct watchdog_device *w)
+ ddata->swtc = tmp >> 16;
+ set_bit(WDOG_HW_RUNNING, &ddata->wdd.status);
+
++ mpc8xxx_wdt_keepalive(ddata);
++
+ return 0;
+ }
+
+diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
+index b21cb72835ccf4..4eafe3817e11c8 100644
+--- a/fs/btrfs/extent_io.c
++++ b/fs/btrfs/extent_io.c
+@@ -1621,7 +1621,7 @@ static noinline_for_stack int extent_writepage_io(struct btrfs_inode *inode,
+ struct btrfs_fs_info *fs_info = inode->root->fs_info;
+ unsigned long range_bitmap = 0;
+ bool submitted_io = false;
+- bool error = false;
++ int found_error = 0;
+ const u64 folio_start = folio_pos(folio);
+ const unsigned int blocks_per_folio = btrfs_blocks_per_folio(fs_info, folio);
+ u64 cur;
+@@ -1685,7 +1685,8 @@ static noinline_for_stack int extent_writepage_io(struct btrfs_inode *inode,
+ */
+ btrfs_mark_ordered_io_finished(inode, folio, cur,
+ fs_info->sectorsize, false);
+- error = true;
++ if (!found_error)
++ found_error = ret;
+ continue;
+ }
+ submitted_io = true;
+@@ -1702,11 +1703,11 @@ static noinline_for_stack int extent_writepage_io(struct btrfs_inode *inode,
+ * If we hit any error, the corresponding sector will have its dirty
+ * flag cleared and writeback finished, thus no need to handle the error case.
+ */
+- if (!submitted_io && !error) {
++ if (!submitted_io && !found_error) {
+ btrfs_folio_set_writeback(fs_info, folio, start, len);
+ btrfs_folio_clear_writeback(fs_info, folio, start, len);
+ }
+- return ret;
++ return found_error;
+ }
+
+ /*
+diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
+index 18db1053cdf087..cd8a09e3d1dc01 100644
+--- a/fs/btrfs/inode.c
++++ b/fs/btrfs/inode.c
+@@ -6479,6 +6479,7 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
+ if (!args->subvol)
+ btrfs_inherit_iflags(BTRFS_I(inode), BTRFS_I(dir));
+
++ btrfs_set_inode_mapping_order(BTRFS_I(inode));
+ if (S_ISREG(inode->i_mode)) {
+ if (btrfs_test_opt(fs_info, NODATASUM))
+ BTRFS_I(inode)->flags |= BTRFS_INODE_NODATASUM;
+@@ -6486,7 +6487,6 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
+ BTRFS_I(inode)->flags |= BTRFS_INODE_NODATACOW |
+ BTRFS_INODE_NODATASUM;
+ btrfs_update_inode_mapping_flags(BTRFS_I(inode));
+- btrfs_set_inode_mapping_order(BTRFS_I(inode));
+ }
+
+ ret = btrfs_insert_inode_locked(inode);
+diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
+index b002e9b734f99c..56c8005b24a344 100644
+--- a/fs/cramfs/inode.c
++++ b/fs/cramfs/inode.c
+@@ -412,7 +412,7 @@ static int cramfs_physmem_mmap(struct file *file, struct vm_area_struct *vma)
+ vm_fault_t vmf;
+ unsigned long off = i * PAGE_SIZE;
+ vmf = vmf_insert_mixed(vma, vma->vm_start + off,
+- address + off);
++ PHYS_PFN(address + off));
+ if (vmf & VM_FAULT_ERROR)
+ ret = vm_fault_to_errno(vmf, 0);
+ }
+diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
+index 2d73297003d25a..625b8ae8f67f09 100644
+--- a/fs/erofs/zdata.c
++++ b/fs/erofs/zdata.c
+@@ -1835,7 +1835,7 @@ static void z_erofs_pcluster_readmore(struct z_erofs_frontend *f,
+ map->m_la = end;
+ err = z_erofs_map_blocks_iter(inode, map,
+ EROFS_GET_BLOCKS_READMORE);
+- if (err)
++ if (err || !(map->m_flags & EROFS_MAP_ENCODED))
+ return;
+
+ /* expand ra for the trailing edge if readahead */
+@@ -1847,7 +1847,7 @@ static void z_erofs_pcluster_readmore(struct z_erofs_frontend *f,
+ end = round_up(end, PAGE_SIZE);
+ } else {
+ end = round_up(map->m_la, PAGE_SIZE);
+- if (!map->m_llen)
++ if (!(map->m_flags & EROFS_MAP_ENCODED) || !map->m_llen)
+ return;
+ }
+
+diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
+index 01a6e2de7fc3ef..72e02df72c4c94 100644
+--- a/fs/ext4/ext4.h
++++ b/fs/ext4/ext4.h
+@@ -1981,6 +1981,16 @@ static inline bool ext4_verity_in_progress(struct inode *inode)
+
+ #define NEXT_ORPHAN(inode) EXT4_I(inode)->i_dtime
+
++/*
++ * Check whether the inode is tracked as orphan (either in orphan file or
++ * orphan list).
++ */
++static inline bool ext4_inode_orphan_tracked(struct inode *inode)
++{
++ return ext4_test_inode_state(inode, EXT4_STATE_ORPHAN_FILE) ||
++ !list_empty(&EXT4_I(inode)->i_orphan);
++}
++
+ /*
+ * Codes for operating systems
+ */
+diff --git a/fs/ext4/file.c b/fs/ext4/file.c
+index 93240e35ee363e..7a8b3093218921 100644
+--- a/fs/ext4/file.c
++++ b/fs/ext4/file.c
+@@ -354,7 +354,7 @@ static void ext4_inode_extension_cleanup(struct inode *inode, bool need_trunc)
+ * to cleanup the orphan list in ext4_handle_inode_extension(). Do it
+ * now.
+ */
+- if (!list_empty(&EXT4_I(inode)->i_orphan) && inode->i_nlink) {
++ if (ext4_inode_orphan_tracked(inode) && inode->i_nlink) {
+ handle_t *handle = ext4_journal_start(inode, EXT4_HT_INODE, 2);
+
+ if (IS_ERR(handle)) {
+diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
+index 5b7a15db4953a3..5230452e29dd8b 100644
+--- a/fs/ext4/inode.c
++++ b/fs/ext4/inode.c
+@@ -4748,7 +4748,7 @@ static int ext4_fill_raw_inode(struct inode *inode, struct ext4_inode *raw_inode
+ * old inodes get re-used with the upper 16 bits of the
+ * uid/gid intact.
+ */
+- if (ei->i_dtime && list_empty(&ei->i_orphan)) {
++ if (ei->i_dtime && !ext4_inode_orphan_tracked(inode)) {
+ raw_inode->i_uid_high = 0;
+ raw_inode->i_gid_high = 0;
+ } else {
+diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
+index 5898d92ba19f14..6070d3c86678e3 100644
+--- a/fs/ext4/mballoc.c
++++ b/fs/ext4/mballoc.c
+@@ -3655,16 +3655,26 @@ static void ext4_discard_work(struct work_struct *work)
+
+ static inline void ext4_mb_avg_fragment_size_destroy(struct ext4_sb_info *sbi)
+ {
++ if (!sbi->s_mb_avg_fragment_size)
++ return;
++
+ for (int i = 0; i < MB_NUM_ORDERS(sbi->s_sb); i++)
+ xa_destroy(&sbi->s_mb_avg_fragment_size[i]);
++
+ kfree(sbi->s_mb_avg_fragment_size);
++ sbi->s_mb_avg_fragment_size = NULL;
+ }
+
+ static inline void ext4_mb_largest_free_orders_destroy(struct ext4_sb_info *sbi)
+ {
++ if (!sbi->s_mb_largest_free_orders)
++ return;
++
+ for (int i = 0; i < MB_NUM_ORDERS(sbi->s_sb); i++)
+ xa_destroy(&sbi->s_mb_largest_free_orders[i]);
++
+ kfree(sbi->s_mb_largest_free_orders);
++ sbi->s_mb_largest_free_orders = NULL;
+ }
+
+ int ext4_mb_init(struct super_block *sb)
+diff --git a/fs/ext4/orphan.c b/fs/ext4/orphan.c
+index 524d4658fa408d..0fbcce67ffd4e4 100644
+--- a/fs/ext4/orphan.c
++++ b/fs/ext4/orphan.c
+@@ -109,11 +109,7 @@ int ext4_orphan_add(handle_t *handle, struct inode *inode)
+
+ WARN_ON_ONCE(!(inode->i_state & (I_NEW | I_FREEING)) &&
+ !inode_is_locked(inode));
+- /*
+- * Inode orphaned in orphan file or in orphan list?
+- */
+- if (ext4_test_inode_state(inode, EXT4_STATE_ORPHAN_FILE) ||
+- !list_empty(&EXT4_I(inode)->i_orphan))
++ if (ext4_inode_orphan_tracked(inode))
+ return 0;
+
+ /*
+diff --git a/fs/ext4/super.c b/fs/ext4/super.c
+index 699c15db28a82f..ba497387b9c863 100644
+--- a/fs/ext4/super.c
++++ b/fs/ext4/super.c
+@@ -1438,9 +1438,9 @@ static void ext4_free_in_core_inode(struct inode *inode)
+
+ static void ext4_destroy_inode(struct inode *inode)
+ {
+- if (!list_empty(&(EXT4_I(inode)->i_orphan))) {
++ if (ext4_inode_orphan_tracked(inode)) {
+ ext4_msg(inode->i_sb, KERN_ERR,
+- "Inode %lu (%p): orphan list check failed!",
++ "Inode %lu (%p): inode tracked as orphan!",
+ inode->i_ino, EXT4_I(inode));
+ print_hex_dump(KERN_INFO, "", DUMP_PREFIX_ADDRESS, 16, 4,
+ EXT4_I(inode), sizeof(struct ext4_inode_info),
+diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
+index 5c1f47e45dab47..72bc05b913af74 100644
+--- a/fs/f2fs/compress.c
++++ b/fs/f2fs/compress.c
+@@ -1245,20 +1245,29 @@ int f2fs_truncate_partial_cluster(struct inode *inode, u64 from, bool lock)
+
+ for (i = cluster_size - 1; i >= 0; i--) {
+ struct folio *folio = page_folio(rpages[i]);
+- loff_t start = folio->index << PAGE_SHIFT;
++ loff_t start = (loff_t)folio->index << PAGE_SHIFT;
++ loff_t offset = from > start ? from - start : 0;
+
+- if (from <= start) {
+- folio_zero_segment(folio, 0, folio_size(folio));
+- } else {
+- folio_zero_segment(folio, from - start,
+- folio_size(folio));
++ folio_zero_segment(folio, offset, folio_size(folio));
++
++ if (from >= start)
+ break;
+- }
+ }
+
+ f2fs_compress_write_end(inode, fsdata, start_idx, true);
++
++ err = filemap_write_and_wait_range(inode->i_mapping,
++ round_down(from, cluster_size << PAGE_SHIFT),
++ LLONG_MAX);
++ if (err)
++ return err;
++
++ truncate_pagecache(inode, from);
++
++ err = f2fs_do_truncate_blocks(inode,
++ round_up(from, PAGE_SIZE), lock);
+ }
+- return 0;
++ return err;
+ }
+
+ static int f2fs_write_compressed_pages(struct compress_ctx *cc,
+diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
+index 7961e0ddfca3aa..50c90bd0392357 100644
+--- a/fs/f2fs/data.c
++++ b/fs/f2fs/data.c
+@@ -911,7 +911,7 @@ int f2fs_merge_page_bio(struct f2fs_io_info *fio)
+ if (fio->io_wbc)
+ wbc_account_cgroup_owner(fio->io_wbc, folio, folio_size(folio));
+
+- inc_page_count(fio->sbi, WB_DATA_TYPE(data_folio, false));
++ inc_page_count(fio->sbi, WB_DATA_TYPE(folio, false));
+
+ *fio->last_block = fio->new_blkaddr;
+ *fio->bio = bio;
+@@ -1778,12 +1778,13 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map, int flag)
+ if (map->m_flags & F2FS_MAP_MAPPED) {
+ unsigned int ofs = start_pgofs - map->m_lblk;
+
+- f2fs_update_read_extent_cache_range(&dn,
+- start_pgofs, map->m_pblk + ofs,
+- map->m_len - ofs);
++ if (map->m_len > ofs)
++ f2fs_update_read_extent_cache_range(&dn,
++ start_pgofs, map->m_pblk + ofs,
++ map->m_len - ofs);
+ }
+ if (map->m_next_extent)
+- *map->m_next_extent = pgofs + 1;
++ *map->m_next_extent = is_hole ? pgofs + 1 : pgofs;
+ }
+ f2fs_put_dnode(&dn);
+ unlock_out:
+diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
+index b4b62ac46bc640..dac7d44885e471 100644
+--- a/fs/f2fs/f2fs.h
++++ b/fs/f2fs/f2fs.h
+@@ -2361,8 +2361,6 @@ static inline bool __allow_reserved_blocks(struct f2fs_sb_info *sbi,
+ {
+ if (!inode)
+ return true;
+- if (!test_opt(sbi, RESERVE_ROOT))
+- return false;
+ if (IS_NOQUOTA(inode))
+ return true;
+ if (uid_eq(F2FS_OPTION(sbi).s_resuid, current_fsuid()))
+@@ -2383,7 +2381,7 @@ static inline unsigned int get_available_block_count(struct f2fs_sb_info *sbi,
+ avail_user_block_count = sbi->user_block_count -
+ sbi->current_reserved_blocks;
+
+- if (!__allow_reserved_blocks(sbi, inode, cap))
++ if (test_opt(sbi, RESERVE_ROOT) && !__allow_reserved_blocks(sbi, inode, cap))
+ avail_user_block_count -= F2FS_OPTION(sbi).root_reserved_blocks;
+
+ if (unlikely(is_sbi_flag_set(sbi, SBI_CP_DISABLED))) {
+diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
+index 42faaed6a02da0..ffa045b39c01de 100644
+--- a/fs/f2fs/file.c
++++ b/fs/f2fs/file.c
+@@ -35,15 +35,23 @@
+ #include <trace/events/f2fs.h>
+ #include <uapi/linux/f2fs.h>
+
+-static void f2fs_zero_post_eof_page(struct inode *inode, loff_t new_size)
++static void f2fs_zero_post_eof_page(struct inode *inode,
++ loff_t new_size, bool lock)
+ {
+ loff_t old_size = i_size_read(inode);
+
+ if (old_size >= new_size)
+ return;
+
++ if (mapping_empty(inode->i_mapping))
++ return;
++
++ if (lock)
++ filemap_invalidate_lock(inode->i_mapping);
+ /* zero or drop pages only in range of [old_size, new_size] */
+- truncate_pagecache(inode, old_size);
++ truncate_inode_pages_range(inode->i_mapping, old_size, new_size);
++ if (lock)
++ filemap_invalidate_unlock(inode->i_mapping);
+ }
+
+ static vm_fault_t f2fs_filemap_fault(struct vm_fault *vmf)
+@@ -114,9 +122,7 @@ static vm_fault_t f2fs_vm_page_mkwrite(struct vm_fault *vmf)
+
+ f2fs_bug_on(sbi, f2fs_has_inline_data(inode));
+
+- filemap_invalidate_lock(inode->i_mapping);
+- f2fs_zero_post_eof_page(inode, (folio->index + 1) << PAGE_SHIFT);
+- filemap_invalidate_unlock(inode->i_mapping);
++ f2fs_zero_post_eof_page(inode, (folio->index + 1) << PAGE_SHIFT, true);
+
+ file_update_time(vmf->vma->vm_file);
+ filemap_invalidate_lock_shared(inode->i_mapping);
+@@ -904,8 +910,16 @@ int f2fs_truncate(struct inode *inode)
+ /* we should check inline_data size */
+ if (!f2fs_may_inline_data(inode)) {
+ err = f2fs_convert_inline_inode(inode);
+- if (err)
++ if (err) {
++ /*
++ * Always truncate page #0 to avoid page cache
++ * leak in evict() path.
++ */
++ truncate_inode_pages_range(inode->i_mapping,
++ F2FS_BLK_TO_BYTES(0),
++ F2FS_BLK_END_BYTES(0));
+ return err;
++ }
+ }
+
+ err = f2fs_truncate_blocks(inode, i_size_read(inode), true);
+@@ -1141,7 +1155,7 @@ int f2fs_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
+ filemap_invalidate_lock(inode->i_mapping);
+
+ if (attr->ia_size > old_size)
+- f2fs_zero_post_eof_page(inode, attr->ia_size);
++ f2fs_zero_post_eof_page(inode, attr->ia_size, false);
+ truncate_setsize(inode, attr->ia_size);
+
+ if (attr->ia_size <= old_size)
+@@ -1260,9 +1274,7 @@ static int f2fs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
+ if (ret)
+ return ret;
+
+- filemap_invalidate_lock(inode->i_mapping);
+- f2fs_zero_post_eof_page(inode, offset + len);
+- filemap_invalidate_unlock(inode->i_mapping);
++ f2fs_zero_post_eof_page(inode, offset + len, true);
+
+ pg_start = ((unsigned long long) offset) >> PAGE_SHIFT;
+ pg_end = ((unsigned long long) offset + len) >> PAGE_SHIFT;
+@@ -1547,7 +1559,7 @@ static int f2fs_do_collapse(struct inode *inode, loff_t offset, loff_t len)
+ f2fs_down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+ filemap_invalidate_lock(inode->i_mapping);
+
+- f2fs_zero_post_eof_page(inode, offset + len);
++ f2fs_zero_post_eof_page(inode, offset + len, false);
+
+ f2fs_lock_op(sbi);
+ f2fs_drop_extent_tree(inode);
+@@ -1670,9 +1682,7 @@ static int f2fs_zero_range(struct inode *inode, loff_t offset, loff_t len,
+ if (ret)
+ return ret;
+
+- filemap_invalidate_lock(mapping);
+- f2fs_zero_post_eof_page(inode, offset + len);
+- filemap_invalidate_unlock(mapping);
++ f2fs_zero_post_eof_page(inode, offset + len, true);
+
+ pg_start = ((unsigned long long) offset) >> PAGE_SHIFT;
+ pg_end = ((unsigned long long) offset + len) >> PAGE_SHIFT;
+@@ -1806,7 +1816,7 @@ static int f2fs_insert_range(struct inode *inode, loff_t offset, loff_t len)
+ f2fs_down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+ filemap_invalidate_lock(mapping);
+
+- f2fs_zero_post_eof_page(inode, offset + len);
++ f2fs_zero_post_eof_page(inode, offset + len, false);
+ truncate_pagecache(inode, offset);
+
+ while (!ret && idx > pg_start) {
+@@ -1864,9 +1874,7 @@ static int f2fs_expand_inode_data(struct inode *inode, loff_t offset,
+ if (err)
+ return err;
+
+- filemap_invalidate_lock(inode->i_mapping);
+- f2fs_zero_post_eof_page(inode, offset + len);
+- filemap_invalidate_unlock(inode->i_mapping);
++ f2fs_zero_post_eof_page(inode, offset + len, true);
+
+ f2fs_balance_fs(sbi, true);
+
+@@ -4914,9 +4922,8 @@ static ssize_t f2fs_write_checks(struct kiocb *iocb, struct iov_iter *from)
+ if (err)
+ return err;
+
+- filemap_invalidate_lock(inode->i_mapping);
+- f2fs_zero_post_eof_page(inode, iocb->ki_pos + iov_iter_count(from));
+- filemap_invalidate_unlock(inode->i_mapping);
++ f2fs_zero_post_eof_page(inode,
++ iocb->ki_pos + iov_iter_count(from), true);
+ return count;
+ }
+
+diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
+index c0f209f7468829..5734e038646852 100644
+--- a/fs/f2fs/gc.c
++++ b/fs/f2fs/gc.c
+@@ -1794,6 +1794,13 @@ static int do_garbage_collect(struct f2fs_sb_info *sbi,
+ struct folio *sum_folio = filemap_get_folio(META_MAPPING(sbi),
+ GET_SUM_BLOCK(sbi, segno));
+
++ if (is_cursec(sbi, GET_SEC_FROM_SEG(sbi, segno))) {
++ f2fs_err(sbi, "%s: segment %u is used by log",
++ __func__, segno);
++ f2fs_bug_on(sbi, 1);
++ goto skip;
++ }
++
+ if (get_valid_blocks(sbi, segno, false) == 0)
+ goto freed;
+ if (gc_type == BG_GC && __is_large_section(sbi) &&
+@@ -1805,7 +1812,7 @@ static int do_garbage_collect(struct f2fs_sb_info *sbi,
+
+ sum = folio_address(sum_folio);
+ if (type != GET_SUM_TYPE((&sum->footer))) {
+- f2fs_err(sbi, "Inconsistent segment (%u) type [%d, %d] in SSA and SIT",
++ f2fs_err(sbi, "Inconsistent segment (%u) type [%d, %d] in SIT and SSA",
+ segno, type, GET_SUM_TYPE((&sum->footer)));
+ f2fs_stop_checkpoint(sbi, false,
+ STOP_CP_REASON_CORRUPTED_SUMMARY);
+@@ -2068,6 +2075,13 @@ int f2fs_gc_range(struct f2fs_sb_info *sbi,
+ .iroot = RADIX_TREE_INIT(gc_list.iroot, GFP_NOFS),
+ };
+
++ /*
++ * avoid migrating empty section, as it can be allocated by
++ * log in parallel.
++ */
++ if (!get_valid_blocks(sbi, segno, true))
++ continue;
++
+ if (is_cursec(sbi, GET_SEC_FROM_SEG(sbi, segno)))
+ continue;
+
+diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
+index e16c4e2830c298..8086a3456e4d36 100644
+--- a/fs/f2fs/super.c
++++ b/fs/f2fs/super.c
+@@ -988,6 +988,10 @@ static int f2fs_parse_param(struct fs_context *fc, struct fs_parameter *param)
+ ctx_set_opt(ctx, F2FS_MOUNT_DISABLE_CHECKPOINT);
+ break;
+ case Opt_checkpoint_enable:
++ F2FS_CTX_INFO(ctx).unusable_cap_perc = 0;
++ ctx->spec_mask |= F2FS_SPEC_checkpoint_disable_cap_perc;
++ F2FS_CTX_INFO(ctx).unusable_cap = 0;
++ ctx->spec_mask |= F2FS_SPEC_checkpoint_disable_cap;
+ ctx_clear_opt(ctx, F2FS_MOUNT_DISABLE_CHECKPOINT);
+ break;
+ default:
+@@ -1185,7 +1189,11 @@ static int f2fs_check_quota_consistency(struct fs_context *fc,
+ goto err_jquota_change;
+
+ if (old_qname) {
+- if (strcmp(old_qname, new_qname) == 0) {
++ if (!new_qname) {
++ f2fs_info(sbi, "remove qf_name %s",
++ old_qname);
++ continue;
++ } else if (strcmp(old_qname, new_qname) == 0) {
+ ctx->qname_mask &= ~(1 << i);
+ continue;
+ }
+diff --git a/fs/fuse/file.c b/fs/fuse/file.c
+index 4adcf09d4b01a6..c7351ca0706524 100644
+--- a/fs/fuse/file.c
++++ b/fs/fuse/file.c
+@@ -1175,7 +1175,6 @@ static ssize_t fuse_fill_write_pages(struct fuse_io_args *ia,
+ num = min(iov_iter_count(ii), fc->max_write);
+
+ ap->args.in_pages = true;
+- ap->descs[0].offset = offset;
+
+ while (num && ap->num_folios < max_folios) {
+ size_t tmp;
+diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
+index 72d95185a39f61..bc67fa058c8459 100644
+--- a/fs/gfs2/file.c
++++ b/fs/gfs2/file.c
+@@ -1442,6 +1442,7 @@ static int gfs2_lock(struct file *file, int cmd, struct file_lock *fl)
+ struct gfs2_inode *ip = GFS2_I(file->f_mapping->host);
+ struct gfs2_sbd *sdp = GFS2_SB(file->f_mapping->host);
+ struct lm_lockstruct *ls = &sdp->sd_lockstruct;
++ int ret;
+
+ if (!(fl->c.flc_flags & FL_POSIX))
+ return -ENOLCK;
+@@ -1450,14 +1451,20 @@ static int gfs2_lock(struct file *file, int cmd, struct file_lock *fl)
+ locks_lock_file_wait(file, fl);
+ return -EIO;
+ }
+- if (cmd == F_CANCELLK)
+- return dlm_posix_cancel(ls->ls_dlm, ip->i_no_addr, file, fl);
+- else if (IS_GETLK(cmd))
+- return dlm_posix_get(ls->ls_dlm, ip->i_no_addr, file, fl);
+- else if (lock_is_unlock(fl))
+- return dlm_posix_unlock(ls->ls_dlm, ip->i_no_addr, file, fl);
+- else
+- return dlm_posix_lock(ls->ls_dlm, ip->i_no_addr, file, cmd, fl);
++ down_read(&ls->ls_sem);
++ ret = -ENODEV;
++ if (likely(ls->ls_dlm != NULL)) {
++ if (cmd == F_CANCELLK)
++ ret = dlm_posix_cancel(ls->ls_dlm, ip->i_no_addr, file, fl);
++ else if (IS_GETLK(cmd))
++ ret = dlm_posix_get(ls->ls_dlm, ip->i_no_addr, file, fl);
++ else if (lock_is_unlock(fl))
++ ret = dlm_posix_unlock(ls->ls_dlm, ip->i_no_addr, file, fl);
++ else
++ ret = dlm_posix_lock(ls->ls_dlm, ip->i_no_addr, file, cmd, fl);
++ }
++ up_read(&ls->ls_sem);
++ return ret;
+ }
+
+ static void __flock_holder_uninit(struct file *file, struct gfs2_holder *fl_gh)
+diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
+index b6fd1cb17de7ba..a6535413a0b46a 100644
+--- a/fs/gfs2/glock.c
++++ b/fs/gfs2/glock.c
+@@ -502,7 +502,7 @@ static bool do_promote(struct gfs2_glock *gl)
+ */
+ if (list_is_first(&gh->gh_list, &gl->gl_holders))
+ return false;
+- do_error(gl, 0);
++ do_error(gl, 0); /* Fail queued try locks */
+ break;
+ }
+ set_bit(HIF_HOLDER, &gh->gh_iflags);
+@@ -703,44 +703,25 @@ __acquires(&gl->gl_lockref.lock)
+ lck_flags &= (LM_FLAG_TRY | LM_FLAG_TRY_1CB | LM_FLAG_NOEXP);
+ GLOCK_BUG_ON(gl, gl->gl_state == target);
+ GLOCK_BUG_ON(gl, gl->gl_state == gl->gl_target);
+- if ((target == LM_ST_UNLOCKED || target == LM_ST_DEFERRED) &&
+- glops->go_inval) {
+- /*
+- * If another process is already doing the invalidate, let that
+- * finish first. The glock state machine will get back to this
+- * holder again later.
+- */
+- if (test_and_set_bit(GLF_INVALIDATE_IN_PROGRESS,
+- &gl->gl_flags))
+- return;
+- do_error(gl, 0); /* Fail queued try locks */
+- }
+- gl->gl_req = target;
+- set_bit(GLF_BLOCKING, &gl->gl_flags);
+- if ((gl->gl_req == LM_ST_UNLOCKED) ||
+- (gl->gl_state == LM_ST_EXCLUSIVE) ||
+- (lck_flags & (LM_FLAG_TRY|LM_FLAG_TRY_1CB)))
+- clear_bit(GLF_BLOCKING, &gl->gl_flags);
+- if (!glops->go_inval && !glops->go_sync)
++ if (!glops->go_inval || !glops->go_sync)
+ goto skip_inval;
+
+ spin_unlock(&gl->gl_lockref.lock);
+- if (glops->go_sync) {
+- ret = glops->go_sync(gl);
+- /* If we had a problem syncing (due to io errors or whatever,
+- * we should not invalidate the metadata or tell dlm to
+- * release the glock to other nodes.
+- */
+- if (ret) {
+- if (cmpxchg(&sdp->sd_log_error, 0, ret)) {
+- fs_err(sdp, "Error %d syncing glock \n", ret);
+- gfs2_dump_glock(NULL, gl, true);
+- }
+- spin_lock(&gl->gl_lockref.lock);
+- goto skip_inval;
++ ret = glops->go_sync(gl);
++ /* If we had a problem syncing (due to io errors or whatever,
++ * we should not invalidate the metadata or tell dlm to
++ * release the glock to other nodes.
++ */
++ if (ret) {
++ if (cmpxchg(&sdp->sd_log_error, 0, ret)) {
++ fs_err(sdp, "Error %d syncing glock\n", ret);
++ gfs2_dump_glock(NULL, gl, true);
+ }
++ spin_lock(&gl->gl_lockref.lock);
++ goto skip_inval;
+ }
+- if (test_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags)) {
++
++ if (target == LM_ST_UNLOCKED || target == LM_ST_DEFERRED) {
+ /*
+ * The call to go_sync should have cleared out the ail list.
+ * If there are still items, we have a problem. We ought to
+@@ -755,7 +736,6 @@ __acquires(&gl->gl_lockref.lock)
+ gfs2_dump_glock(NULL, gl, true);
+ }
+ glops->go_inval(gl, target == LM_ST_DEFERRED ? 0 : DIO_METADATA);
+- clear_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags);
+ }
+ spin_lock(&gl->gl_lockref.lock);
+
+@@ -805,8 +785,6 @@ __acquires(&gl->gl_lockref.lock)
+ clear_bit(GLF_DEMOTE_IN_PROGRESS, &gl->gl_flags);
+ gfs2_glock_queue_work(gl, GL_GLOCK_DFT_HOLD);
+ return;
+- } else {
+- clear_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags);
+ }
+ }
+
+@@ -816,21 +794,22 @@ __acquires(&gl->gl_lockref.lock)
+ ret = ls->ls_ops->lm_lock(gl, target, lck_flags);
+ spin_lock(&gl->gl_lockref.lock);
+
+- if (ret == -EINVAL && gl->gl_target == LM_ST_UNLOCKED &&
+- target == LM_ST_UNLOCKED &&
+- test_bit(DFL_UNMOUNT, &ls->ls_recover_flags)) {
++ if (!ret) {
++ /* The operation will be completed asynchronously. */
++ return;
++ }
++ clear_bit(GLF_PENDING_REPLY, &gl->gl_flags);
++
++ if (ret == -ENODEV && gl->gl_target == LM_ST_UNLOCKED &&
++ target == LM_ST_UNLOCKED) {
+ /*
+ * The lockspace has been released and the lock has
+ * been unlocked implicitly.
+ */
+- } else if (ret) {
++ } else {
+ fs_err(sdp, "lm_lock ret %d\n", ret);
+ target = gl->gl_state | LM_OUT_ERROR;
+- } else {
+- /* The operation will be completed asynchronously. */
+- return;
+ }
+- clear_bit(GLF_PENDING_REPLY, &gl->gl_flags);
+ }
+
+ /* Complete the operation now. */
+@@ -1462,6 +1441,24 @@ void gfs2_print_dbg(struct seq_file *seq, const char *fmt, ...)
+ va_end(args);
+ }
+
++static bool gfs2_should_queue_trylock(struct gfs2_glock *gl,
++ struct gfs2_holder *gh)
++{
++ struct gfs2_holder *current_gh, *gh2;
++
++ current_gh = find_first_holder(gl);
++ if (current_gh && !may_grant(gl, current_gh, gh))
++ return false;
++
++ list_for_each_entry(gh2, &gl->gl_holders, gh_list) {
++ if (test_bit(HIF_HOLDER, &gh2->gh_iflags))
++ continue;
++ if (!(gh2->gh_flags & (LM_FLAG_TRY | LM_FLAG_TRY_1CB)))
++ return false;
++ }
++ return true;
++}
++
+ static inline bool pid_is_meaningful(const struct gfs2_holder *gh)
+ {
+ if (!(gh->gh_flags & GL_NOPID))
+@@ -1480,27 +1477,20 @@ static inline bool pid_is_meaningful(const struct gfs2_holder *gh)
+ */
+
+ static inline void add_to_queue(struct gfs2_holder *gh)
+-__releases(&gl->gl_lockref.lock)
+-__acquires(&gl->gl_lockref.lock)
+ {
+ struct gfs2_glock *gl = gh->gh_gl;
+ struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
+ struct gfs2_holder *gh2;
+- int try_futile = 0;
+
+ GLOCK_BUG_ON(gl, gh->gh_owner_pid == NULL);
+ if (test_and_set_bit(HIF_WAIT, &gh->gh_iflags))
+ GLOCK_BUG_ON(gl, true);
+
+- if (gh->gh_flags & (LM_FLAG_TRY | LM_FLAG_TRY_1CB)) {
+- if (test_bit(GLF_LOCK, &gl->gl_flags)) {
+- struct gfs2_holder *current_gh;
+-
+- current_gh = find_first_holder(gl);
+- try_futile = !may_grant(gl, current_gh, gh);
+- }
+- if (test_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags))
+- goto fail;
++ if ((gh->gh_flags & (LM_FLAG_TRY | LM_FLAG_TRY_1CB)) &&
++ !gfs2_should_queue_trylock(gl, gh)) {
++ gh->gh_error = GLR_TRYFAILED;
++ gfs2_holder_wake(gh);
++ return;
+ }
+
+ list_for_each_entry(gh2, &gl->gl_holders, gh_list) {
+@@ -1512,15 +1502,6 @@ __acquires(&gl->gl_lockref.lock)
+ continue;
+ goto trap_recursive;
+ }
+- list_for_each_entry(gh2, &gl->gl_holders, gh_list) {
+- if (try_futile &&
+- !(gh2->gh_flags & (LM_FLAG_TRY | LM_FLAG_TRY_1CB))) {
+-fail:
+- gh->gh_error = GLR_TRYFAILED;
+- gfs2_holder_wake(gh);
+- return;
+- }
+- }
+ trace_gfs2_glock_queue(gh, 1);
+ gfs2_glstats_inc(gl, GFS2_LKS_QCOUNT);
+ gfs2_sbstats_inc(gl, GFS2_LKS_QCOUNT);
+@@ -2321,8 +2302,6 @@ static const char *gflags2str(char *buf, const struct gfs2_glock *gl)
+ *p++ = 'y';
+ if (test_bit(GLF_LFLUSH, gflags))
+ *p++ = 'f';
+- if (test_bit(GLF_INVALIDATE_IN_PROGRESS, gflags))
+- *p++ = 'i';
+ if (test_bit(GLF_PENDING_REPLY, gflags))
+ *p++ = 'R';
+ if (test_bit(GLF_HAVE_REPLY, gflags))
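
The new gfs2_should_queue_trylock() above decides whether a try request is worth queueing at all: it must be grantable next to the current holder and must not end up parked behind a waiter that is not itself a try request. A minimal userspace sketch of that predicate follows; the holder/waiter types and the may_grant() compatibility rule are placeholders, not the gfs2 structures.

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Illustrative stand-ins for the glock holder list, not gfs2 types. */
struct holder {
    bool is_holder;   /* already granted */
    bool is_try;      /* queued with a try flag */
};

/* Placeholder compatibility rule for the sketch only. */
static bool may_grant(const struct holder *current_holder)
{
    return current_holder == NULL;
}

/*
 * Queue a try request only if the current holder (if any) does not
 * conflict and everything already waiting is itself a try request;
 * otherwise fail it immediately instead of parking it behind a
 * blocking waiter.
 */
static bool should_queue_trylock(const struct holder *current_holder,
                                 const struct holder *waiters, size_t n)
{
    if (current_holder && !may_grant(current_holder))
        return false;

    for (size_t i = 0; i < n; i++) {
        if (waiters[i].is_holder)
            continue;
        if (!waiters[i].is_try)
            return false;
    }
    return true;
}

int main(void)
{
    struct holder waiters[] = { { .is_try = true }, { .is_try = false } };

    printf("%d\n", should_queue_trylock(NULL, waiters, 2)); /* 0: blocking waiter queued */
    return 0;
}
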
+diff --git a/fs/gfs2/glock.h b/fs/gfs2/glock.h
+index 9339a3bff6eeb1..d041b922b45e3b 100644
+--- a/fs/gfs2/glock.h
++++ b/fs/gfs2/glock.h
+@@ -68,6 +68,10 @@ enum {
+ * also be granted in SHARED. The preferred state is whichever is compatible
+ * with other granted locks, or the specified state if no other locks exist.
+ *
++ * In addition, when a lock is already held in EX mode locally, a SHARED or
++ * DEFERRED mode request with the LM_FLAG_ANY flag set will be granted.
++ * (The LM_FLAG_ANY flag is only used for SHARED mode requests currently.)
++ *
+ * LM_FLAG_NODE_SCOPE
+ * This holder agrees to share the lock within this node. In other words,
+ * the glock is held in EX mode according to DLM, but local holders on the
+diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
+index d4ad82f47eeea4..3fcb7ab198d474 100644
+--- a/fs/gfs2/incore.h
++++ b/fs/gfs2/incore.h
+@@ -319,7 +319,6 @@ enum {
+ GLF_DEMOTE_IN_PROGRESS = 5,
+ GLF_DIRTY = 6,
+ GLF_LFLUSH = 7,
+- GLF_INVALIDATE_IN_PROGRESS = 8,
+ GLF_HAVE_REPLY = 9,
+ GLF_INITIAL = 10,
+ GLF_HAVE_FROZEN_REPLY = 11,
+@@ -658,6 +657,8 @@ struct lm_lockstruct {
+ struct completion ls_sync_wait; /* {control,mounted}_{lock,unlock} */
+ char *ls_lvb_bits;
+
++ struct rw_semaphore ls_sem;
++
+ spinlock_t ls_recover_spin; /* protects following fields */
+ unsigned long ls_recover_flags; /* DFL_ */
+ uint32_t ls_recover_mount; /* gen in first recover_done cb */
+diff --git a/fs/gfs2/lock_dlm.c b/fs/gfs2/lock_dlm.c
+index cee5d199d2d870..6db37c20587d18 100644
+--- a/fs/gfs2/lock_dlm.c
++++ b/fs/gfs2/lock_dlm.c
+@@ -58,6 +58,7 @@ static inline void gfs2_update_stats(struct gfs2_lkstats *s, unsigned index,
+ /**
+ * gfs2_update_reply_times - Update locking statistics
+ * @gl: The glock to update
++ * @blocking: The operation may have been blocking
+ *
+ * This assumes that gl->gl_dstamp has been set earlier.
+ *
+@@ -72,12 +73,12 @@ static inline void gfs2_update_stats(struct gfs2_lkstats *s, unsigned index,
+ * TRY_1CB flags are set are classified as non-blocking. All
+ * other DLM requests are counted as (potentially) blocking.
+ */
+-static inline void gfs2_update_reply_times(struct gfs2_glock *gl)
++static inline void gfs2_update_reply_times(struct gfs2_glock *gl,
++ bool blocking)
+ {
+ struct gfs2_pcpu_lkstats *lks;
+ const unsigned gltype = gl->gl_name.ln_type;
+- unsigned index = test_bit(GLF_BLOCKING, &gl->gl_flags) ?
+- GFS2_LKS_SRTTB : GFS2_LKS_SRTT;
++ unsigned index = blocking ? GFS2_LKS_SRTTB : GFS2_LKS_SRTT;
+ s64 rtt;
+
+ preempt_disable();
+@@ -119,14 +120,18 @@ static inline void gfs2_update_request_times(struct gfs2_glock *gl)
+ static void gdlm_ast(void *arg)
+ {
+ struct gfs2_glock *gl = arg;
++ bool blocking;
+ unsigned ret;
+
++ blocking = test_bit(GLF_BLOCKING, &gl->gl_flags);
++ gfs2_update_reply_times(gl, blocking);
++ clear_bit(GLF_BLOCKING, &gl->gl_flags);
++
+ /* If the glock is dead, we only react to a dlm_unlock() reply. */
+ if (__lockref_is_dead(&gl->gl_lockref) &&
+ gl->gl_lksb.sb_status != -DLM_EUNLOCK)
+ return;
+
+- gfs2_update_reply_times(gl);
+ BUG_ON(gl->gl_lksb.sb_flags & DLM_SBF_DEMOTED);
+
+ if ((gl->gl_lksb.sb_flags & DLM_SBF_VALNOTVALID) && gl->gl_lksb.sb_lvbptr)
+@@ -241,7 +246,7 @@ static bool down_conversion(int cur, int req)
+ }
+
+ static u32 make_flags(struct gfs2_glock *gl, const unsigned int gfs_flags,
+- const int cur, const int req)
++ const int req, bool blocking)
+ {
+ u32 lkf = 0;
+
+@@ -274,7 +279,7 @@ static u32 make_flags(struct gfs2_glock *gl, const unsigned int gfs_flags,
+ * "upward" lock conversions or else DLM will reject the
+ * request as invalid.
+ */
+- if (!down_conversion(cur, req))
++ if (blocking)
+ lkf |= DLM_LKF_QUECVT;
+ }
+
+@@ -294,14 +299,20 @@ static int gdlm_lock(struct gfs2_glock *gl, unsigned int req_state,
+ unsigned int flags)
+ {
+ struct lm_lockstruct *ls = &gl->gl_name.ln_sbd->sd_lockstruct;
++ bool blocking;
+ int cur, req;
+ u32 lkf;
+ char strname[GDLM_STRNAME_BYTES] = "";
+ int error;
+
++ gl->gl_req = req_state;
+ cur = make_mode(gl->gl_name.ln_sbd, gl->gl_state);
+ req = make_mode(gl->gl_name.ln_sbd, req_state);
+- lkf = make_flags(gl, flags, cur, req);
++ blocking = !down_conversion(cur, req) &&
++ !(flags & (LM_FLAG_TRY|LM_FLAG_TRY_1CB));
++ lkf = make_flags(gl, flags, req, blocking);
++ if (blocking)
++ set_bit(GLF_BLOCKING, &gl->gl_flags);
+ gfs2_glstats_inc(gl, GFS2_LKS_DCOUNT);
+ gfs2_sbstats_inc(gl, GFS2_LKS_DCOUNT);
+ if (test_bit(GLF_INITIAL, &gl->gl_flags)) {
+@@ -318,8 +329,13 @@ static int gdlm_lock(struct gfs2_glock *gl, unsigned int req_state,
+ */
+
+ again:
+- error = dlm_lock(ls->ls_dlm, req, &gl->gl_lksb, lkf, strname,
+- GDLM_STRNAME_BYTES - 1, 0, gdlm_ast, gl, gdlm_bast);
++ down_read(&ls->ls_sem);
++ error = -ENODEV;
++ if (likely(ls->ls_dlm != NULL)) {
++ error = dlm_lock(ls->ls_dlm, req, &gl->gl_lksb, lkf, strname,
++ GDLM_STRNAME_BYTES - 1, 0, gdlm_ast, gl, gdlm_bast);
++ }
++ up_read(&ls->ls_sem);
+ if (error == -EBUSY) {
+ msleep(20);
+ goto again;
+@@ -341,7 +357,6 @@ static void gdlm_put_lock(struct gfs2_glock *gl)
+ return;
+ }
+
+- clear_bit(GLF_BLOCKING, &gl->gl_flags);
+ gfs2_glstats_inc(gl, GFS2_LKS_DCOUNT);
+ gfs2_sbstats_inc(gl, GFS2_LKS_DCOUNT);
+ gfs2_update_request_times(gl);
+@@ -369,8 +384,13 @@ static void gdlm_put_lock(struct gfs2_glock *gl)
+ flags |= DLM_LKF_VALBLK;
+
+ again:
+- error = dlm_unlock(ls->ls_dlm, gl->gl_lksb.sb_lkid, flags,
+- NULL, gl);
++ down_read(&ls->ls_sem);
++ error = -ENODEV;
++ if (likely(ls->ls_dlm != NULL)) {
++ error = dlm_unlock(ls->ls_dlm, gl->gl_lksb.sb_lkid, flags,
++ NULL, gl);
++ }
++ up_read(&ls->ls_sem);
+ if (error == -EBUSY) {
+ msleep(20);
+ goto again;
+@@ -386,7 +406,12 @@ static void gdlm_put_lock(struct gfs2_glock *gl)
+ static void gdlm_cancel(struct gfs2_glock *gl)
+ {
+ struct lm_lockstruct *ls = &gl->gl_name.ln_sbd->sd_lockstruct;
+- dlm_unlock(ls->ls_dlm, gl->gl_lksb.sb_lkid, DLM_LKF_CANCEL, NULL, gl);
++
++ down_read(&ls->ls_sem);
++ if (likely(ls->ls_dlm != NULL)) {
++ dlm_unlock(ls->ls_dlm, gl->gl_lksb.sb_lkid, DLM_LKF_CANCEL, NULL, gl);
++ }
++ up_read(&ls->ls_sem);
+ }
+
+ /*
+@@ -567,7 +592,11 @@ static int sync_unlock(struct gfs2_sbd *sdp, struct dlm_lksb *lksb, char *name)
+ struct lm_lockstruct *ls = &sdp->sd_lockstruct;
+ int error;
+
+- error = dlm_unlock(ls->ls_dlm, lksb->sb_lkid, 0, lksb, ls);
++ down_read(&ls->ls_sem);
++ error = -ENODEV;
++ if (likely(ls->ls_dlm != NULL))
++ error = dlm_unlock(ls->ls_dlm, lksb->sb_lkid, 0, lksb, ls);
++ up_read(&ls->ls_sem);
+ if (error) {
+ fs_err(sdp, "%s lkid %x error %d\n",
+ name, lksb->sb_lkid, error);
+@@ -594,9 +623,14 @@ static int sync_lock(struct gfs2_sbd *sdp, int mode, uint32_t flags,
+ memset(strname, 0, GDLM_STRNAME_BYTES);
+ snprintf(strname, GDLM_STRNAME_BYTES, "%8x%16x", LM_TYPE_NONDISK, num);
+
+- error = dlm_lock(ls->ls_dlm, mode, lksb, flags,
+- strname, GDLM_STRNAME_BYTES - 1,
+- 0, sync_wait_cb, ls, NULL);
++ down_read(&ls->ls_sem);
++ error = -ENODEV;
++ if (likely(ls->ls_dlm != NULL)) {
++ error = dlm_lock(ls->ls_dlm, mode, lksb, flags,
++ strname, GDLM_STRNAME_BYTES - 1,
++ 0, sync_wait_cb, ls, NULL);
++ }
++ up_read(&ls->ls_sem);
+ if (error) {
+ fs_err(sdp, "%s lkid %x flags %x mode %d error %d\n",
+ name, lksb->sb_lkid, flags, mode, error);
+@@ -1323,6 +1357,7 @@ static int gdlm_mount(struct gfs2_sbd *sdp, const char *table)
+ */
+
+ INIT_DELAYED_WORK(&sdp->sd_control_work, gfs2_control_func);
++ ls->ls_dlm = NULL;
+ spin_lock_init(&ls->ls_recover_spin);
+ ls->ls_recover_flags = 0;
+ ls->ls_recover_mount = 0;
+@@ -1357,6 +1392,7 @@ static int gdlm_mount(struct gfs2_sbd *sdp, const char *table)
+ * create/join lockspace
+ */
+
++ init_rwsem(&ls->ls_sem);
+ error = dlm_new_lockspace(fsname, cluster, flags, GDLM_LVB_SIZE,
+ &gdlm_lockspace_ops, sdp, &ops_result,
+ &ls->ls_dlm);
+@@ -1436,10 +1472,12 @@ static void gdlm_unmount(struct gfs2_sbd *sdp)
+
+ /* mounted_lock and control_lock will be purged in dlm recovery */
+ release:
++ down_write(&ls->ls_sem);
+ if (ls->ls_dlm) {
+ dlm_release_lockspace(ls->ls_dlm, 2);
+ ls->ls_dlm = NULL;
+ }
++ up_write(&ls->ls_sem);
+
+ free_recover_size(ls);
+ }
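
The recurring pattern in the gfs2 hunks above is that every dlm_lock()/dlm_unlock() call is now made under down_read(&ls->ls_sem) with an explicit NULL check on ls->ls_dlm, while gdlm_unmount() clears the pointer under down_write(). A minimal userspace sketch of that idea using POSIX rwlocks; the lockspace type and the ls_call()/ls_release() helpers are illustrative, not the kernel code. Build with -pthread.

#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct lockspace {
    pthread_rwlock_t sem;
    void *dlm;              /* NULL once the lockspace has been released */
};

/* Readers call into the backend only while the pointer is pinned. */
static int ls_call(struct lockspace *ls, int (*op)(void *))
{
    int ret = -ENODEV;

    pthread_rwlock_rdlock(&ls->sem);
    if (ls->dlm != NULL)
        ret = op(ls->dlm);
    pthread_rwlock_unlock(&ls->sem);
    return ret;
}

/* Teardown takes the write side so it cannot race with any caller. */
static void ls_release(struct lockspace *ls)
{
    pthread_rwlock_wrlock(&ls->sem);
    free(ls->dlm);
    ls->dlm = NULL;
    pthread_rwlock_unlock(&ls->sem);
}

static int dummy_op(void *dlm)
{
    (void)dlm;
    return 0;
}

int main(void)
{
    struct lockspace ls = { .dlm = malloc(1) };

    pthread_rwlock_init(&ls.sem, NULL);
    printf("before release: %d\n", ls_call(&ls, dummy_op));  /* 0 */
    ls_release(&ls);
    printf("after release: %d\n", ls_call(&ls, dummy_op));   /* -ENODEV */
    return 0;
}
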
+diff --git a/fs/gfs2/trace_gfs2.h b/fs/gfs2/trace_gfs2.h
+index 26036ffc3f338e..1c2507a273180b 100644
+--- a/fs/gfs2/trace_gfs2.h
++++ b/fs/gfs2/trace_gfs2.h
+@@ -52,7 +52,6 @@
+ {(1UL << GLF_DEMOTE_IN_PROGRESS), "p" }, \
+ {(1UL << GLF_DIRTY), "y" }, \
+ {(1UL << GLF_LFLUSH), "f" }, \
+- {(1UL << GLF_INVALIDATE_IN_PROGRESS), "i" }, \
+ {(1UL << GLF_PENDING_REPLY), "R" }, \
+ {(1UL << GLF_HAVE_REPLY), "r" }, \
+ {(1UL << GLF_INITIAL), "a" }, \
+diff --git a/fs/hfsplus/dir.c b/fs/hfsplus/dir.c
+index 876bbb80fb4dce..1b3e27a0d5e038 100644
+--- a/fs/hfsplus/dir.c
++++ b/fs/hfsplus/dir.c
+@@ -204,7 +204,7 @@ static int hfsplus_readdir(struct file *file, struct dir_context *ctx)
+ fd.entrylength);
+ type = be16_to_cpu(entry.type);
+ len = NLS_MAX_CHARSET_SIZE * HFSPLUS_MAX_STRLEN;
+- err = hfsplus_uni2asc(sb, &fd.key->cat.name, strbuf, &len);
++ err = hfsplus_uni2asc_str(sb, &fd.key->cat.name, strbuf, &len);
+ if (err)
+ goto out;
+ if (type == HFSPLUS_FOLDER) {
+diff --git a/fs/hfsplus/hfsplus_fs.h b/fs/hfsplus/hfsplus_fs.h
+index 96a5c24813dd6d..2311e4be4e865b 100644
+--- a/fs/hfsplus/hfsplus_fs.h
++++ b/fs/hfsplus/hfsplus_fs.h
+@@ -521,8 +521,12 @@ int hfsplus_strcasecmp(const struct hfsplus_unistr *s1,
+ const struct hfsplus_unistr *s2);
+ int hfsplus_strcmp(const struct hfsplus_unistr *s1,
+ const struct hfsplus_unistr *s2);
+-int hfsplus_uni2asc(struct super_block *sb, const struct hfsplus_unistr *ustr,
+- char *astr, int *len_p);
++int hfsplus_uni2asc_str(struct super_block *sb,
++ const struct hfsplus_unistr *ustr, char *astr,
++ int *len_p);
++int hfsplus_uni2asc_xattr_str(struct super_block *sb,
++ const struct hfsplus_attr_unistr *ustr,
++ char *astr, int *len_p);
+ int hfsplus_asc2uni(struct super_block *sb, struct hfsplus_unistr *ustr,
+ int max_unistr_len, const char *astr, int len);
+ int hfsplus_hash_dentry(const struct dentry *dentry, struct qstr *str);
+diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
+index 36b6cf2a3abba4..862ba27f1628a8 100644
+--- a/fs/hfsplus/unicode.c
++++ b/fs/hfsplus/unicode.c
+@@ -119,9 +119,8 @@ static u16 *hfsplus_compose_lookup(u16 *p, u16 cc)
+ return NULL;
+ }
+
+-int hfsplus_uni2asc(struct super_block *sb,
+- const struct hfsplus_unistr *ustr,
+- char *astr, int *len_p)
++static int hfsplus_uni2asc(struct super_block *sb, const struct hfsplus_unistr *ustr,
++ int max_len, char *astr, int *len_p)
+ {
+ const hfsplus_unichr *ip;
+ struct nls_table *nls = HFSPLUS_SB(sb)->nls;
+@@ -134,8 +133,8 @@ int hfsplus_uni2asc(struct super_block *sb,
+ ip = ustr->unicode;
+
+ ustrlen = be16_to_cpu(ustr->length);
+- if (ustrlen > HFSPLUS_MAX_STRLEN) {
+- ustrlen = HFSPLUS_MAX_STRLEN;
++ if (ustrlen > max_len) {
++ ustrlen = max_len;
+ pr_err("invalid length %u has been corrected to %d\n",
+ be16_to_cpu(ustr->length), ustrlen);
+ }
+@@ -256,6 +255,21 @@ int hfsplus_uni2asc(struct super_block *sb,
+ return res;
+ }
+
++inline int hfsplus_uni2asc_str(struct super_block *sb,
++ const struct hfsplus_unistr *ustr, char *astr,
++ int *len_p)
++{
++ return hfsplus_uni2asc(sb, ustr, HFSPLUS_MAX_STRLEN, astr, len_p);
++}
++
++inline int hfsplus_uni2asc_xattr_str(struct super_block *sb,
++ const struct hfsplus_attr_unistr *ustr,
++ char *astr, int *len_p)
++{
++ return hfsplus_uni2asc(sb, (const struct hfsplus_unistr *)ustr,
++ HFSPLUS_ATTR_MAX_STRLEN, astr, len_p);
++}
++
+ /*
+ * Convert one or more ASCII characters into a single unicode character.
+ * Returns the number of ASCII characters corresponding to the unicode char.
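
The refactor above turns hfsplus_uni2asc() into an internal helper that clamps the on-disk length to a caller-supplied bound, with thin wrappers choosing HFSPLUS_MAX_STRLEN for catalog names and HFSPLUS_ATTR_MAX_STRLEN for xattr keys. A rough standalone sketch of that wrapper-picks-the-bound shape, with made-up limits and a trivial stand-in conversion:

#include <stdint.h>
#include <stdio.h>

/* Illustrative bounds standing in for HFSPLUS_MAX_STRLEN and
 * HFSPLUS_ATTR_MAX_STRLEN; not the real on-disk limits. */
#define NAME_MAX_UNITS  255
#define XATTR_MAX_UNITS 127

/* Core decoder: never trust the on-disk length past the bound the
 * caller knows is valid for this key type.  The "conversion" here just
 * narrows 16-bit units to bytes to keep the sketch short. */
static void uni2asc(const uint16_t *units, unsigned int disk_len,
                    unsigned int max_units, char *out, size_t out_sz)
{
    unsigned int len = disk_len;

    if (len > max_units) {
        fprintf(stderr, "invalid length %u corrected to %u\n", len, max_units);
        len = max_units;
    }
    if (len >= out_sz)
        len = (unsigned int)out_sz - 1;
    for (unsigned int i = 0; i < len; i++)
        out[i] = (char)units[i];
    out[len] = '\0';
}

/* Thin wrappers pick the bound, mirroring the *_str / *_xattr_str split. */
static void uni2asc_name(const uint16_t *u, unsigned int n, char *out, size_t sz)
{
    uni2asc(u, n, NAME_MAX_UNITS, out, sz);
}

static void uni2asc_xattr(const uint16_t *u, unsigned int n, char *out, size_t sz)
{
    uni2asc(u, n, XATTR_MAX_UNITS, out, sz);
}

int main(void)
{
    uint16_t raw[3] = { 'a', 'b', 'c' };
    char buf[16];

    uni2asc_name(raw, 3, buf, sizeof(buf));
    puts(buf);
    uni2asc_xattr(raw, 3, buf, sizeof(buf));
    puts(buf);
    return 0;
}
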
+diff --git a/fs/hfsplus/xattr.c b/fs/hfsplus/xattr.c
+index 18dc3d254d218c..c951fa9835aa12 100644
+--- a/fs/hfsplus/xattr.c
++++ b/fs/hfsplus/xattr.c
+@@ -735,9 +735,9 @@ ssize_t hfsplus_listxattr(struct dentry *dentry, char *buffer, size_t size)
+ goto end_listxattr;
+
+ xattr_name_len = NLS_MAX_CHARSET_SIZE * HFSPLUS_ATTR_MAX_STRLEN;
+- if (hfsplus_uni2asc(inode->i_sb,
+- (const struct hfsplus_unistr *)&fd.key->attr.key_name,
+- strbuf, &xattr_name_len)) {
++ if (hfsplus_uni2asc_xattr_str(inode->i_sb,
++ &fd.key->attr.key_name, strbuf,
++ &xattr_name_len)) {
+ pr_err("unicode conversion failed\n");
+ res = -EIO;
+ goto end_listxattr;
+diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c
+index 97abf62f109d2e..31ce210f032acd 100644
+--- a/fs/nfs/localio.c
++++ b/fs/nfs/localio.c
+@@ -49,11 +49,6 @@ struct nfs_local_fsync_ctx {
+ static bool localio_enabled __read_mostly = true;
+ module_param(localio_enabled, bool, 0644);
+
+-static bool localio_O_DIRECT_semantics __read_mostly = false;
+-module_param(localio_O_DIRECT_semantics, bool, 0644);
+-MODULE_PARM_DESC(localio_O_DIRECT_semantics,
+- "LOCALIO will use O_DIRECT semantics to filesystem.");
+-
+ static inline bool nfs_client_is_local(const struct nfs_client *clp)
+ {
+ return !!rcu_access_pointer(clp->cl_uuid.net);
+@@ -321,12 +316,9 @@ nfs_local_iocb_alloc(struct nfs_pgio_header *hdr,
+ return NULL;
+ }
+
+- if (localio_O_DIRECT_semantics &&
+- test_bit(NFS_IOHDR_ODIRECT, &hdr->flags)) {
+- iocb->kiocb.ki_filp = file;
++ init_sync_kiocb(&iocb->kiocb, file);
++ if (test_bit(NFS_IOHDR_ODIRECT, &hdr->flags))
+ iocb->kiocb.ki_flags = IOCB_DIRECT;
+- } else
+- init_sync_kiocb(&iocb->kiocb, file);
+
+ iocb->kiocb.ki_pos = hdr->args.offset;
+ iocb->hdr = hdr;
+@@ -336,6 +328,30 @@ nfs_local_iocb_alloc(struct nfs_pgio_header *hdr,
+ return iocb;
+ }
+
++static bool nfs_iov_iter_aligned_bvec(const struct iov_iter *i,
++ loff_t offset, unsigned int addr_mask, unsigned int len_mask)
++{
++ const struct bio_vec *bvec = i->bvec;
++ size_t skip = i->iov_offset;
++ size_t size = i->count;
++
++ if ((offset | size) & len_mask)
++ return false;
++ do {
++ size_t len = bvec->bv_len;
++
++ if (len > size)
++ len = size;
++ if ((unsigned long)(bvec->bv_offset + skip) & addr_mask)
++ return false;
++ bvec++;
++ size -= len;
++ skip = 0;
++ } while (size);
++
++ return true;
++}
++
+ static void
+ nfs_local_iter_init(struct iov_iter *i, struct nfs_local_kiocb *iocb, int dir)
+ {
+@@ -345,6 +361,25 @@ nfs_local_iter_init(struct iov_iter *i, struct nfs_local_kiocb *iocb, int dir)
+ hdr->args.count + hdr->args.pgbase);
+ if (hdr->args.pgbase != 0)
+ iov_iter_advance(i, hdr->args.pgbase);
++
++ if (iocb->kiocb.ki_flags & IOCB_DIRECT) {
++ u32 nf_dio_mem_align, nf_dio_offset_align, nf_dio_read_offset_align;
++ /* Verify the IO is DIO-aligned as required */
++ nfs_to->nfsd_file_dio_alignment(iocb->localio, &nf_dio_mem_align,
++ &nf_dio_offset_align,
++ &nf_dio_read_offset_align);
++ if (dir == READ)
++ nf_dio_offset_align = nf_dio_read_offset_align;
++
++ if (nf_dio_mem_align && nf_dio_offset_align &&
++ nfs_iov_iter_aligned_bvec(i, hdr->args.offset,
++ nf_dio_mem_align - 1,
++ nf_dio_offset_align - 1))
++ return; /* is DIO-aligned */
++
++ /* Fallback to using buffered for this misaligned IO */
++ iocb->kiocb.ki_flags &= ~IOCB_DIRECT;
++ }
+ }
+
+ static void
+@@ -405,6 +440,11 @@ nfs_local_read_done(struct nfs_local_kiocb *iocb, long status)
+ struct nfs_pgio_header *hdr = iocb->hdr;
+ struct file *filp = iocb->kiocb.ki_filp;
+
++ if ((iocb->kiocb.ki_flags & IOCB_DIRECT) && status == -EINVAL) {
++ /* Underlying FS will return -EINVAL if misaligned DIO is attempted. */
++ pr_info_ratelimited("nfs: Unexpected direct I/O read alignment failure\n");
++ }
++
+ nfs_local_pgio_done(hdr, status);
+
+ /*
+@@ -597,6 +637,11 @@ nfs_local_write_done(struct nfs_local_kiocb *iocb, long status)
+
+ dprintk("%s: wrote %ld bytes.\n", __func__, status > 0 ? status : 0);
+
++ if ((iocb->kiocb.ki_flags & IOCB_DIRECT) && status == -EINVAL) {
++ /* Underlying FS will return -EINVAL if misaligned DIO is attempted. */
++ pr_info_ratelimited("nfs: Unexpected direct I/O write alignment failure\n");
++ }
++
+ /* Handle short writes as if they are ENOSPC */
+ if (status > 0 && status < hdr->args.count) {
+ hdr->mds_offset += status;
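
nfs_iov_iter_aligned_bvec() above gates LOCALIO direct I/O on the file offset, the total length and every memory segment matching the exported file's DIO alignment masks, falling back to buffered I/O otherwise. A simplified, runnable sketch of that check (plain structs instead of bio_vec/iov_iter, and the iov_offset skip handling is omitted):

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative segment type standing in for struct bio_vec. */
struct seg {
    uint32_t offset;   /* offset of the data within its page/buffer */
    uint32_t len;
};

/*
 * Direct I/O is only attempted when the file offset, the total length
 * and the start of every memory segment satisfy the device's
 * alignment masks (mask = required_alignment - 1).
 */
static bool dio_aligned(const struct seg *segs, size_t nsegs,
                        uint64_t file_offset, uint64_t total,
                        uint32_t addr_mask, uint32_t len_mask)
{
    if ((file_offset | total) & len_mask)
        return false;
    for (size_t i = 0; i < nsegs && total; i++) {
        uint64_t len = segs[i].len < total ? segs[i].len : total;

        if (segs[i].offset & addr_mask)
            return false;
        total -= len;
    }
    return total == 0;   /* segments must cover the whole request */
}

int main(void)
{
    struct seg segs[] = { { .offset = 0, .len = 4096 }, { .offset = 512, .len = 4096 } };

    /* 512-byte memory alignment, 4096-byte offset/length alignment */
    printf("%d\n", dio_aligned(segs, 2, 8192, 8192, 511, 4095)); /* 1 */
    printf("%d\n", dio_aligned(segs, 2, 100, 8192, 511, 4095));  /* 0 */
    return 0;
}
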
+diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
+index ce61253efd45b4..611e6283c194ff 100644
+--- a/fs/nfs/nfs4proc.c
++++ b/fs/nfs/nfs4proc.c
+@@ -9442,7 +9442,7 @@ static int nfs4_verify_back_channel_attrs(struct nfs41_create_session_args *args
+ goto out;
+ if (rcvd->max_rqst_sz > sent->max_rqst_sz)
+ return -EINVAL;
+- if (rcvd->max_resp_sz < sent->max_resp_sz)
++ if (rcvd->max_resp_sz > sent->max_resp_sz)
+ return -EINVAL;
+ if (rcvd->max_resp_sz_cached > sent->max_resp_sz_cached)
+ return -EINVAL;
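
The one-character fix above flips the max_resp_sz comparison so that, like its neighbours, a back-channel value returned by the server may only be equal to or smaller than what the client offered. A tiny standalone model of that validation (the struct and values are illustrative):

#include <stdio.h>

/* Illustrative subset of the CREATE_SESSION back-channel attributes. */
struct chan_attrs {
    unsigned int max_rqst_sz;
    unsigned int max_resp_sz;
    unsigned int max_resp_sz_cached;
};

/* The server may shrink what the client offered but never enlarge it. */
static int verify_back_channel(const struct chan_attrs *sent,
                               const struct chan_attrs *rcvd)
{
    if (rcvd->max_rqst_sz > sent->max_rqst_sz)
        return -1;
    if (rcvd->max_resp_sz > sent->max_resp_sz)
        return -1;
    if (rcvd->max_resp_sz_cached > sent->max_resp_sz_cached)
        return -1;
    return 0;
}

int main(void)
{
    struct chan_attrs sent = { 4096, 4096, 2048 };
    struct chan_attrs rcvd = { 4096, 8192, 2048 };

    printf("%d\n", verify_back_channel(&sent, &rcvd)); /* -1: response too large */
    return 0;
}
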
+diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
+index 732abf6b92a569..7ca1dedf4e04a7 100644
+--- a/fs/nfsd/filecache.c
++++ b/fs/nfsd/filecache.c
+@@ -231,6 +231,9 @@ nfsd_file_alloc(struct net *net, struct inode *inode, unsigned char need,
+ refcount_set(&nf->nf_ref, 1);
+ nf->nf_may = need;
+ nf->nf_mark = NULL;
++ nf->nf_dio_mem_align = 0;
++ nf->nf_dio_offset_align = 0;
++ nf->nf_dio_read_offset_align = 0;
+ return nf;
+ }
+
+@@ -1069,6 +1072,35 @@ nfsd_file_is_cached(struct inode *inode)
+ return ret;
+ }
+
++static __be32
++nfsd_file_get_dio_attrs(const struct svc_fh *fhp, struct nfsd_file *nf)
++{
++ struct inode *inode = file_inode(nf->nf_file);
++ struct kstat stat;
++ __be32 status;
++
++ /* Currently only need to get DIO alignment info for regular files */
++ if (!S_ISREG(inode->i_mode))
++ return nfs_ok;
++
++ status = fh_getattr(fhp, &stat);
++ if (status != nfs_ok)
++ return status;
++
++ trace_nfsd_file_get_dio_attrs(inode, &stat);
++
++ if (stat.result_mask & STATX_DIOALIGN) {
++ nf->nf_dio_mem_align = stat.dio_mem_align;
++ nf->nf_dio_offset_align = stat.dio_offset_align;
++ }
++ if (stat.result_mask & STATX_DIO_READ_ALIGN)
++ nf->nf_dio_read_offset_align = stat.dio_read_offset_align;
++ else
++ nf->nf_dio_read_offset_align = nf->nf_dio_offset_align;
++
++ return nfs_ok;
++}
++
+ static __be32
+ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct net *net,
+ struct svc_cred *cred,
+@@ -1187,6 +1219,8 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct net *net,
+ }
+ status = nfserrno(ret);
+ trace_nfsd_file_open(nf, status);
++ if (status == nfs_ok)
++ status = nfsd_file_get_dio_attrs(fhp, nf);
+ }
+ } else
+ status = nfserr_jukebox;
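
nfsd_file_get_dio_attrs() above caches the DIO alignment attributes at file-open time via fh_getattr(). The same information is visible from userspace through statx(); a sketch follows, assuming Linux 6.1+ headers for STATX_DIOALIGN (the newer STATX_DIO_READ_ALIGN / stx_dio_read_offset_align pair used by the patch is left out to keep the example portable):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    struct statx stx;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }
    if (statx(AT_FDCWD, argv[1], 0, STATX_DIOALIGN, &stx) != 0) {
        perror("statx");
        return 1;
    }
    if (stx.stx_mask & STATX_DIOALIGN)
        printf("dio mem align %u, dio offset align %u\n",
               stx.stx_dio_mem_align, stx.stx_dio_offset_align);
    else
        printf("filesystem does not report DIO alignment\n");
    return 0;
}
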
+diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
+index 722b26c71e454a..237a05c74211b0 100644
+--- a/fs/nfsd/filecache.h
++++ b/fs/nfsd/filecache.h
+@@ -54,6 +54,10 @@ struct nfsd_file {
+ struct list_head nf_gc;
+ struct rcu_head nf_rcu;
+ ktime_t nf_birthtime;
++
++ u32 nf_dio_mem_align;
++ u32 nf_dio_offset_align;
++ u32 nf_dio_read_offset_align;
+ };
+
+ int nfsd_file_cache_init(void);
+diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c
+index cb237f1b902a76..9e0a37cd29d8af 100644
+--- a/fs/nfsd/localio.c
++++ b/fs/nfsd/localio.c
+@@ -117,6 +117,16 @@ nfsd_open_local_fh(struct net *net, struct auth_domain *dom,
+ return localio;
+ }
+
++static void nfsd_file_dio_alignment(struct nfsd_file *nf,
++ u32 *nf_dio_mem_align,
++ u32 *nf_dio_offset_align,
++ u32 *nf_dio_read_offset_align)
++{
++ *nf_dio_mem_align = nf->nf_dio_mem_align;
++ *nf_dio_offset_align = nf->nf_dio_offset_align;
++ *nf_dio_read_offset_align = nf->nf_dio_read_offset_align;
++}
++
+ static const struct nfsd_localio_operations nfsd_localio_ops = {
+ .nfsd_net_try_get = nfsd_net_try_get,
+ .nfsd_net_put = nfsd_net_put,
+@@ -124,6 +134,7 @@ static const struct nfsd_localio_operations nfsd_localio_ops = {
+ .nfsd_file_put_local = nfsd_file_put_local,
+ .nfsd_file_get_local = nfsd_file_get_local,
+ .nfsd_file_file = nfsd_file_file,
++ .nfsd_file_dio_alignment = nfsd_file_dio_alignment,
+ };
+
+ void nfsd_localio_ops_init(void)
+diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
+index a664fdf1161e9a..6e2c8e2aab10a9 100644
+--- a/fs/nfsd/trace.h
++++ b/fs/nfsd/trace.h
+@@ -1133,6 +1133,33 @@ TRACE_EVENT(nfsd_file_alloc,
+ )
+ );
+
++TRACE_EVENT(nfsd_file_get_dio_attrs,
++ TP_PROTO(
++ const struct inode *inode,
++ const struct kstat *stat
++ ),
++ TP_ARGS(inode, stat),
++ TP_STRUCT__entry(
++ __field(const void *, inode)
++ __field(unsigned long, mask)
++ __field(u32, mem_align)
++ __field(u32, offset_align)
++ __field(u32, read_offset_align)
++ ),
++ TP_fast_assign(
++ __entry->inode = inode;
++ __entry->mask = stat->result_mask;
++ __entry->mem_align = stat->dio_mem_align;
++ __entry->offset_align = stat->dio_offset_align;
++ __entry->read_offset_align = stat->dio_read_offset_align;
++ ),
++ TP_printk("inode=%p flags=%s mem_align=%u offset_align=%u read_offset_align=%u",
++ __entry->inode, show_statx_mask(__entry->mask),
++ __entry->mem_align, __entry->offset_align,
++ __entry->read_offset_align
++ )
++);
++
+ TRACE_EVENT(nfsd_file_acquire,
+ TP_PROTO(
+ const struct svc_rqst *rqstp,
+diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
+index eff04959606fe5..fde3e0c11dbafb 100644
+--- a/fs/nfsd/vfs.h
++++ b/fs/nfsd/vfs.h
+@@ -185,6 +185,10 @@ static inline __be32 fh_getattr(const struct svc_fh *fh, struct kstat *stat)
+ u32 request_mask = STATX_BASIC_STATS;
+ struct path p = {.mnt = fh->fh_export->ex_path.mnt,
+ .dentry = fh->fh_dentry};
++ struct inode *inode = d_inode(p.dentry);
++
++ if (S_ISREG(inode->i_mode))
++ request_mask |= (STATX_DIOALIGN | STATX_DIO_READ_ALIGN);
+
+ if (fh->fh_maxsize == NFS4_FHSIZE)
+ request_mask |= (STATX_BTIME | STATX_CHANGE_COOKIE);
+diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
+index b192ee068a7aca..561339b4cf7525 100644
+--- a/fs/notify/fanotify/fanotify_user.c
++++ b/fs/notify/fanotify/fanotify_user.c
+@@ -1999,7 +1999,10 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
+ user_ns = path.mnt->mnt_sb->s_user_ns;
+ obj = path.mnt->mnt_sb;
+ } else if (obj_type == FSNOTIFY_OBJ_TYPE_MNTNS) {
++ ret = -EINVAL;
+ mntns = mnt_ns_from_dentry(path.dentry);
++ if (!mntns)
++ goto path_put_and_out;
+ user_ns = mntns->user_ns;
+ obj = mntns;
+ }
+diff --git a/fs/ntfs3/index.c b/fs/ntfs3/index.c
+index 1bf2a6593dec66..6d1bf890929d92 100644
+--- a/fs/ntfs3/index.c
++++ b/fs/ntfs3/index.c
+@@ -1508,6 +1508,16 @@ static int indx_add_allocate(struct ntfs_index *indx, struct ntfs_inode *ni,
+ bmp_size = bmp_size_v = le32_to_cpu(bmp->res.data_size);
+ }
+
++ /*
++ * Index blocks exist, but $BITMAP has zero valid bits.
++ * This implies an on-disk corruption and must be rejected.
++ */
++ if (in->name == I30_NAME &&
++ unlikely(bmp_size_v == 0 && indx->alloc_run.count)) {
++ err = -EINVAL;
++ goto out1;
++ }
++
+ bit = bmp_size << 3;
+ }
+
+diff --git a/fs/ntfs3/run.c b/fs/ntfs3/run.c
+index 6e86d66197ef29..88550085f74575 100644
+--- a/fs/ntfs3/run.c
++++ b/fs/ntfs3/run.c
+@@ -9,6 +9,7 @@
+ #include <linux/blkdev.h>
+ #include <linux/fs.h>
+ #include <linux/log2.h>
++#include <linux/overflow.h>
+
+ #include "debug.h"
+ #include "ntfs.h"
+@@ -982,14 +983,18 @@ int run_unpack(struct runs_tree *run, struct ntfs_sb_info *sbi, CLST ino,
+
+ if (!dlcn)
+ return -EINVAL;
+- lcn = prev_lcn + dlcn;
++
++ if (check_add_overflow(prev_lcn, dlcn, &lcn))
++ return -EINVAL;
+ prev_lcn = lcn;
+ } else {
+ /* The size of 'dlcn' can't be > 8. */
+ return -EINVAL;
+ }
+
+- next_vcn = vcn64 + len;
++ if (check_add_overflow(vcn64, len, &next_vcn))
++ return -EINVAL;
++
+ /* Check boundary. */
+ if (next_vcn > evcn + 1)
+ return -EINVAL;
+@@ -1153,7 +1158,8 @@ int run_get_highest_vcn(CLST vcn, const u8 *run_buf, u64 *highest_vcn)
+ return -EINVAL;
+
+ run_buf += size_size + offset_size;
+- vcn64 += len;
++ if (check_add_overflow(vcn64, len, &vcn64))
++ return -EINVAL;
+
+ #ifndef CONFIG_NTFS3_64BIT_CLUSTER
+ if (vcn64 > 0x100000000ull)
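
The run-list fixes above replace raw additions of attacker-controlled VCN/LCN deltas with check_add_overflow(), which in userspace GCC/Clang corresponds to __builtin_add_overflow(): the sum is only used when the builtin reports that it did not wrap. A short sketch with made-up values:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t vcn = UINT64_MAX - 4;   /* hostile on-disk starting cluster */
    uint64_t len = 16;               /* hostile run length */
    uint64_t next;

    if (__builtin_add_overflow(vcn, len, &next)) {
        fprintf(stderr, "corrupt run list: VCN addition overflows\n");
        return 1;
    }
    printf("next VCN %llu\n", (unsigned long long)next);
    return 0;
}
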
+diff --git a/fs/ocfs2/stack_user.c b/fs/ocfs2/stack_user.c
+index 0f045e45fa0c3e..439742cec3c262 100644
+--- a/fs/ocfs2/stack_user.c
++++ b/fs/ocfs2/stack_user.c
+@@ -1011,6 +1011,7 @@ static int user_cluster_connect(struct ocfs2_cluster_connection *conn)
+ printk(KERN_ERR "ocfs2: Could not determine"
+ " locking version\n");
+ user_cluster_disconnect(conn);
++ lc = NULL;
+ goto out;
+ }
+ wait_event(lc->oc_wait, (atomic_read(&lc->oc_this_node) > 0));
+diff --git a/fs/smb/client/smb2ops.c b/fs/smb/client/smb2ops.c
+index e586f3f4b5c937..68286673afc999 100644
+--- a/fs/smb/client/smb2ops.c
++++ b/fs/smb/client/smb2ops.c
+@@ -4219,7 +4219,7 @@ fill_transform_hdr(struct smb2_transform_hdr *tr_hdr, unsigned int orig_len,
+ static void *smb2_aead_req_alloc(struct crypto_aead *tfm, const struct smb_rqst *rqst,
+ int num_rqst, const u8 *sig, u8 **iv,
+ struct aead_request **req, struct sg_table *sgt,
+- unsigned int *num_sgs, size_t *sensitive_size)
++ unsigned int *num_sgs)
+ {
+ unsigned int req_size = sizeof(**req) + crypto_aead_reqsize(tfm);
+ unsigned int iv_size = crypto_aead_ivsize(tfm);
+@@ -4236,9 +4236,8 @@ static void *smb2_aead_req_alloc(struct crypto_aead *tfm, const struct smb_rqst
+ len += req_size;
+ len = ALIGN(len, __alignof__(struct scatterlist));
+ len += array_size(*num_sgs, sizeof(struct scatterlist));
+- *sensitive_size = len;
+
+- p = kvzalloc(len, GFP_NOFS);
++ p = kzalloc(len, GFP_NOFS);
+ if (!p)
+ return ERR_PTR(-ENOMEM);
+
+@@ -4252,16 +4251,14 @@ static void *smb2_aead_req_alloc(struct crypto_aead *tfm, const struct smb_rqst
+
+ static void *smb2_get_aead_req(struct crypto_aead *tfm, struct smb_rqst *rqst,
+ int num_rqst, const u8 *sig, u8 **iv,
+- struct aead_request **req, struct scatterlist **sgl,
+- size_t *sensitive_size)
++ struct aead_request **req, struct scatterlist **sgl)
+ {
+ struct sg_table sgtable = {};
+ unsigned int skip, num_sgs, i, j;
+ ssize_t rc;
+ void *p;
+
+- p = smb2_aead_req_alloc(tfm, rqst, num_rqst, sig, iv, req, &sgtable,
+- &num_sgs, sensitive_size);
++ p = smb2_aead_req_alloc(tfm, rqst, num_rqst, sig, iv, req, &sgtable, &num_sgs);
+ if (IS_ERR(p))
+ return ERR_CAST(p);
+
+@@ -4350,7 +4347,6 @@ crypt_message(struct TCP_Server_Info *server, int num_rqst,
+ DECLARE_CRYPTO_WAIT(wait);
+ unsigned int crypt_len = le32_to_cpu(tr_hdr->OriginalMessageSize);
+ void *creq;
+- size_t sensitive_size;
+
+ rc = smb2_get_enc_key(server, le64_to_cpu(tr_hdr->SessionId), enc, key);
+ if (rc) {
+@@ -4376,8 +4372,7 @@ crypt_message(struct TCP_Server_Info *server, int num_rqst,
+ return rc;
+ }
+
+- creq = smb2_get_aead_req(tfm, rqst, num_rqst, sign, &iv, &req, &sg,
+- &sensitive_size);
++ creq = smb2_get_aead_req(tfm, rqst, num_rqst, sign, &iv, &req, &sg);
+ if (IS_ERR(creq))
+ return PTR_ERR(creq);
+
+@@ -4407,7 +4402,7 @@ crypt_message(struct TCP_Server_Info *server, int num_rqst,
+ if (!rc && enc)
+ memcpy(&tr_hdr->Signature, sign, SMB2_SIGNATURE_SIZE);
+
+- kvfree_sensitive(creq, sensitive_size);
++ kfree_sensitive(creq);
+ return rc;
+ }
+
+diff --git a/fs/smb/client/smbdirect.c b/fs/smb/client/smbdirect.c
+index e0fce5033004c7..6480945c245923 100644
+--- a/fs/smb/client/smbdirect.c
++++ b/fs/smb/client/smbdirect.c
+@@ -179,6 +179,8 @@ static int smbd_conn_upcall(
+ struct smbd_connection *info = id->context;
+ struct smbdirect_socket *sc = &info->socket;
+ const char *event_name = rdma_event_msg(event->event);
++ u8 peer_initiator_depth;
++ u8 peer_responder_resources;
+
+ log_rdma_event(INFO, "event=%s status=%d\n",
+ event_name, event->status);
+@@ -204,6 +206,85 @@ static int smbd_conn_upcall(
+
+ case RDMA_CM_EVENT_ESTABLISHED:
+ log_rdma_event(INFO, "connected event=%s\n", event_name);
++
++ /*
++ * Here we work around an inconsistency between
++ * iWarp and other devices (at least rxe and irdma using RoCEv2)
++ */
++ if (rdma_protocol_iwarp(id->device, id->port_num)) {
++ /*
++ * iWarp devices report the peer's values
++ * with the perspective of the peer here.
++ * Tested with siw and irdma (in iwarp mode)
++ * We need to change to our perspective here,
++ * so we need to switch the values.
++ */
++ peer_initiator_depth = event->param.conn.responder_resources;
++ peer_responder_resources = event->param.conn.initiator_depth;
++ } else {
++ /*
++ * Non iWarp devices report the peer's values
++ * already changed to our perspective here.
++ * Tested with rxe and irdma (in roce mode).
++ */
++ peer_initiator_depth = event->param.conn.initiator_depth;
++ peer_responder_resources = event->param.conn.responder_resources;
++ }
++ if (rdma_protocol_iwarp(id->device, id->port_num) &&
++ event->param.conn.private_data_len == 8) {
++ /*
++ * Legacy clients with only iWarp MPA v1 support
++ * need a private blob in order to negotiate
++ * the IRD/ORD values.
++ */
++ const __be32 *ird_ord_hdr = event->param.conn.private_data;
++ u32 ird32 = be32_to_cpu(ird_ord_hdr[0]);
++ u32 ord32 = be32_to_cpu(ird_ord_hdr[1]);
++
++ /*
++ * cifs.ko sends the legacy IRD/ORD negotiation
++ * event if iWarp MPA v2 was used.
++ *
++ * Here we check that the values match and only
++ * mark the client as legacy if they don't match.
++ */
++ if ((u32)event->param.conn.initiator_depth != ird32 ||
++ (u32)event->param.conn.responder_resources != ord32) {
++ /*
++ * There are broken clients (old cifs.ko)
++ * using little endian and also
++ * struct rdma_conn_param only uses u8
++ * for initiator_depth and responder_resources,
++ * so we truncate the value to U8_MAX.
++ *
++ * smb_direct_accept_client() will then
++ * do the real negotiation in order to
++ * select the minimum between client and
++ * server.
++ */
++ ird32 = min_t(u32, ird32, U8_MAX);
++ ord32 = min_t(u32, ord32, U8_MAX);
++
++ info->legacy_iwarp = true;
++ peer_initiator_depth = (u8)ird32;
++ peer_responder_resources = (u8)ord32;
++ }
++ }
++
++ /*
++ * negotiate the value by using the minimum
++ * between client and server if the client provided
++ * non 0 values.
++ */
++ if (peer_initiator_depth != 0)
++ info->initiator_depth =
++ min_t(u8, info->initiator_depth,
++ peer_initiator_depth);
++ if (peer_responder_resources != 0)
++ info->responder_resources =
++ min_t(u8, info->responder_resources,
++ peer_responder_resources);
++
+ sc->status = SMBDIRECT_SOCKET_CONNECTED;
+ wake_up_interruptible(&info->status_wait);
+ break;
+@@ -1551,7 +1632,7 @@ static struct smbd_connection *_smbd_get_connection(
+ struct ib_qp_init_attr qp_attr;
+ struct sockaddr_in *addr_in = (struct sockaddr_in *) dstaddr;
+ struct ib_port_immutable port_immutable;
+- u32 ird_ord_hdr[2];
++ __be32 ird_ord_hdr[2];
+
+ info = kzalloc(sizeof(struct smbd_connection), GFP_KERNEL);
+ if (!info)
+@@ -1559,6 +1640,9 @@ static struct smbd_connection *_smbd_get_connection(
+ sc = &info->socket;
+ sp = &sc->parameters;
+
++ info->initiator_depth = 1;
++ info->responder_resources = SMBD_CM_RESPONDER_RESOURCES;
++
+ sc->status = SMBDIRECT_SOCKET_CONNECTING;
+ rc = smbd_ia_open(info, dstaddr, port);
+ if (rc) {
+@@ -1639,22 +1723,22 @@ static struct smbd_connection *_smbd_get_connection(
+ }
+ sc->ib.qp = sc->rdma.cm_id->qp;
+
+- memset(&conn_param, 0, sizeof(conn_param));
+- conn_param.initiator_depth = 0;
+-
+- conn_param.responder_resources =
+- min(sc->ib.dev->attrs.max_qp_rd_atom,
+- SMBD_CM_RESPONDER_RESOURCES);
+- info->responder_resources = conn_param.responder_resources;
++ info->responder_resources =
++ min_t(u8, info->responder_resources,
++ sc->ib.dev->attrs.max_qp_rd_atom);
+ log_rdma_mr(INFO, "responder_resources=%d\n",
+ info->responder_resources);
+
++ memset(&conn_param, 0, sizeof(conn_param));
++ conn_param.initiator_depth = info->initiator_depth;
++ conn_param.responder_resources = info->responder_resources;
++
+ /* Need to send IRD/ORD in private data for iWARP */
+ sc->ib.dev->ops.get_port_immutable(
+ sc->ib.dev, sc->rdma.cm_id->port_num, &port_immutable);
+ if (port_immutable.core_cap_flags & RDMA_CORE_PORT_IWARP) {
+- ird_ord_hdr[0] = info->responder_resources;
+- ird_ord_hdr[1] = 1;
++ ird_ord_hdr[0] = cpu_to_be32(conn_param.responder_resources);
++ ird_ord_hdr[1] = cpu_to_be32(conn_param.initiator_depth);
+ conn_param.private_data = ird_ord_hdr;
+ conn_param.private_data_len = sizeof(ird_ord_hdr);
+ } else {
+@@ -2121,6 +2205,12 @@ static int allocate_mr_list(struct smbd_connection *info)
+ atomic_set(&info->mr_used_count, 0);
+ init_waitqueue_head(&info->wait_for_mr_cleanup);
+ INIT_WORK(&info->mr_recovery_work, smbd_mr_recovery_work);
++
++ if (info->responder_resources == 0) {
++ log_rdma_mr(ERR, "responder_resources negotiated as 0\n");
++ return -EINVAL;
++ }
++
+ /* Allocate more MRs (2x) than hardware responder_resources */
+ for (i = 0; i < info->responder_resources * 2; i++) {
+ smbdirect_mr = kzalloc(sizeof(*smbdirect_mr), GFP_KERNEL);
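
The smbdirect hunks above encode the legacy iWarp MPA v1 IRD/ORD hint as two big-endian 32-bit values in the RDMA private data and then settle on the minimum of the peer's and the local values, clamped to the u8 fields of struct rdma_conn_param. A standalone sketch of that pack/parse/negotiate step; which local limit each slot constrains depends on the device's perspective (iWarp reports the peer's view), as the comments in the patch explain, and the numbers here are made up.

#include <arpa/inet.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Negotiate to the minimum, but only if the peer sent a non-zero value. */
static uint8_t negotiate(uint8_t local, uint32_t peer)
{
    if (peer == 0)
        return local;
    if (peer > UINT8_MAX)
        peer = UINT8_MAX;          /* rdma_conn_param only has u8 fields */
    return local < peer ? local : (uint8_t)peer;
}

int main(void)
{
    /* Sender packs its two 32-bit values in network byte order. */
    uint32_t out[2] = { htonl(32) /* IRD */, htonl(1) /* ORD */ };
    uint8_t blob[8];
    memcpy(blob, out, sizeof(blob));

    /* Receiver parses the 8-byte blob ... */
    uint32_t in[2];
    memcpy(in, blob, sizeof(in));
    uint32_t peer_ird = ntohl(in[0]);
    uint32_t peer_ord = ntohl(in[1]);

    /* ... and folds it into its own limits. */
    uint8_t responder_resources = negotiate(32, peer_ird);
    uint8_t initiator_depth = negotiate(1, peer_ord);

    printf("responder_resources=%u initiator_depth=%u\n",
           responder_resources, initiator_depth);
    return 0;
}
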
+diff --git a/fs/smb/client/smbdirect.h b/fs/smb/client/smbdirect.h
+index e45aa9ddd71da5..4ca9b2b2c57f93 100644
+--- a/fs/smb/client/smbdirect.h
++++ b/fs/smb/client/smbdirect.h
+@@ -67,7 +67,9 @@ struct smbd_connection {
+
+ /* Memory registrations */
+ /* Maximum number of RDMA read/write outstanding on this connection */
+- int responder_resources;
++ bool legacy_iwarp;
++ u8 initiator_depth;
++ u8 responder_resources;
+ /* Maximum number of pages in a single RDMA write/read on this connection */
+ int max_frmr_depth;
+ /*
+diff --git a/fs/smb/server/ksmbd_netlink.h b/fs/smb/server/ksmbd_netlink.h
+index 3f07a612c05b40..8ccd57fd904bc2 100644
+--- a/fs/smb/server/ksmbd_netlink.h
++++ b/fs/smb/server/ksmbd_netlink.h
+@@ -112,10 +112,11 @@ struct ksmbd_startup_request {
+ __u32 smbd_max_io_size; /* smbd read write size */
+ __u32 max_connections; /* Number of maximum simultaneous connections */
+ __s8 bind_interfaces_only;
+- __s8 reserved[503]; /* Reserved room */
++ __u32 max_ip_connections; /* Number of maximum connection per ip address */
++ __s8 reserved[499]; /* Reserved room */
+ __u32 ifc_list_sz; /* interfaces list size */
+ __s8 ____payload[];
+-};
++} __packed;
+
+ #define KSMBD_STARTUP_CONFIG_INTERFACES(s) ((s)->____payload)
+
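
The ksmbd_startup_request change above carves a 4-byte max_ip_connections field out of the reserved blob (503 -> 499 bytes) and marks the struct __packed so the layout userspace tools expect stays byte-identical. A toy model of that reserved-space-carving rule, with a _Static_assert standing in for the ABI check; the structs below are illustrative, not the real netlink layout.

#include <stdint.h>

struct startup_req_v1 {
    uint32_t smbd_max_io_size;
    uint32_t max_connections;
    int8_t   bind_interfaces_only;
    int8_t   reserved[503];
    uint32_t ifc_list_sz;
} __attribute__((packed));

struct startup_req_v2 {
    uint32_t smbd_max_io_size;
    uint32_t max_connections;
    int8_t   bind_interfaces_only;
    uint32_t max_ip_connections;   /* carved out of the reserved area */
    int8_t   reserved[499];
    uint32_t ifc_list_sz;
} __attribute__((packed));

/* The ABI contract: both versions occupy exactly the same bytes. */
_Static_assert(sizeof(struct startup_req_v1) == sizeof(struct startup_req_v2),
               "reserved-space carving must not change the struct size");

int main(void) { return 0; }

Presumably the explicit __packed in the patch is what keeps the new __u32 from picking up alignment padding after the preceding __s8, so the existing fields keep their offsets.
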
+diff --git a/fs/smb/server/mgmt/user_session.c b/fs/smb/server/mgmt/user_session.c
+index 9dec4c2940bc04..b36d0676dbe584 100644
+--- a/fs/smb/server/mgmt/user_session.c
++++ b/fs/smb/server/mgmt/user_session.c
+@@ -104,29 +104,32 @@ int ksmbd_session_rpc_open(struct ksmbd_session *sess, char *rpc_name)
+ if (!entry)
+ return -ENOMEM;
+
+- down_read(&sess->rpc_lock);
+ entry->method = method;
+ entry->id = id = ksmbd_ipc_id_alloc();
+ if (id < 0)
+ goto free_entry;
++
++ down_write(&sess->rpc_lock);
+ old = xa_store(&sess->rpc_handle_list, id, entry, KSMBD_DEFAULT_GFP);
+- if (xa_is_err(old))
++ if (xa_is_err(old)) {
++ up_write(&sess->rpc_lock);
+ goto free_id;
++ }
+
+ resp = ksmbd_rpc_open(sess, id);
+- if (!resp)
+- goto erase_xa;
++ if (!resp) {
++ xa_erase(&sess->rpc_handle_list, entry->id);
++ up_write(&sess->rpc_lock);
++ goto free_id;
++ }
+
+- up_read(&sess->rpc_lock);
++ up_write(&sess->rpc_lock);
+ kvfree(resp);
+ return id;
+-erase_xa:
+- xa_erase(&sess->rpc_handle_list, entry->id);
+ free_id:
+ ksmbd_rpc_id_free(entry->id);
+ free_entry:
+ kfree(entry);
+- up_read(&sess->rpc_lock);
+ return -EINVAL;
+ }
+
+@@ -144,9 +147,14 @@ void ksmbd_session_rpc_close(struct ksmbd_session *sess, int id)
+ int ksmbd_session_rpc_method(struct ksmbd_session *sess, int id)
+ {
+ struct ksmbd_session_rpc *entry;
++ int method;
+
++ down_read(&sess->rpc_lock);
+ entry = xa_load(&sess->rpc_handle_list, id);
+- return entry ? entry->method : 0;
++ method = entry ? entry->method : 0;
++ up_read(&sess->rpc_lock);
++
++ return method;
+ }
+
+ void ksmbd_session_destroy(struct ksmbd_session *sess)
+diff --git a/fs/smb/server/server.h b/fs/smb/server/server.h
+index 995555febe7d16..b8a7317be86b4e 100644
+--- a/fs/smb/server/server.h
++++ b/fs/smb/server/server.h
+@@ -43,6 +43,7 @@ struct ksmbd_server_config {
+ unsigned int auth_mechs;
+ unsigned int max_connections;
+ unsigned int max_inflight_req;
++ unsigned int max_ip_connections;
+
+ char *conf[SERVER_CONF_WORK_GROUP + 1];
+ struct task_struct *dh_task;
+diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c
+index a565fc36cee6df..a1db006ab6e924 100644
+--- a/fs/smb/server/smb2pdu.c
++++ b/fs/smb/server/smb2pdu.c
+@@ -5628,7 +5628,8 @@ static int smb2_get_info_filesystem(struct ksmbd_work *work,
+
+ if (!work->tcon->posix_extensions) {
+ pr_err("client doesn't negotiate with SMB3.1.1 POSIX Extensions\n");
+- rc = -EOPNOTSUPP;
++ path_put(&path);
++ return -EOPNOTSUPP;
+ } else {
+ info = (struct filesystem_posix_info *)(rsp->Buffer);
+ info->OptimalTransferSize = cpu_to_le32(stfs.f_bsize);
+diff --git a/fs/smb/server/transport_ipc.c b/fs/smb/server/transport_ipc.c
+index 2a3e2b0ce5570a..2aa1b29bea0804 100644
+--- a/fs/smb/server/transport_ipc.c
++++ b/fs/smb/server/transport_ipc.c
+@@ -335,6 +335,9 @@ static int ipc_server_config_on_startup(struct ksmbd_startup_request *req)
+ if (req->max_connections)
+ server_conf.max_connections = req->max_connections;
+
++ if (req->max_ip_connections)
++ server_conf.max_ip_connections = req->max_ip_connections;
++
+ ret = ksmbd_set_netbios_name(req->netbios_name);
+ ret |= ksmbd_set_server_string(req->server_string);
+ ret |= ksmbd_set_work_group(req->work_group);
+diff --git a/fs/smb/server/transport_rdma.c b/fs/smb/server/transport_rdma.c
+index 74dfb6496095db..e1f659d3b4cf57 100644
+--- a/fs/smb/server/transport_rdma.c
++++ b/fs/smb/server/transport_rdma.c
+@@ -153,6 +153,10 @@ struct smb_direct_transport {
+ struct work_struct disconnect_work;
+
+ bool negotiation_requested;
++
++ bool legacy_iwarp;
++ u8 initiator_depth;
++ u8 responder_resources;
+ };
+
+ #define KSMBD_TRANS(t) ((struct ksmbd_transport *)&((t)->transport))
+@@ -347,6 +351,9 @@ static struct smb_direct_transport *alloc_transport(struct rdma_cm_id *cm_id)
+ t->cm_id = cm_id;
+ cm_id->context = t;
+
++ t->initiator_depth = SMB_DIRECT_CM_INITIATOR_DEPTH;
++ t->responder_resources = 1;
++
+ t->status = SMB_DIRECT_CS_NEW;
+ init_waitqueue_head(&t->wait_status);
+
+@@ -1676,21 +1683,21 @@ static int smb_direct_send_negotiate_response(struct smb_direct_transport *t,
+ static int smb_direct_accept_client(struct smb_direct_transport *t)
+ {
+ struct rdma_conn_param conn_param;
+- struct ib_port_immutable port_immutable;
+- u32 ird_ord_hdr[2];
++ __be32 ird_ord_hdr[2];
+ int ret;
+
++ /*
++ * smb_direct_handle_connect_request()
++ * already negotiated t->initiator_depth
++ * and t->responder_resources
++ */
+ memset(&conn_param, 0, sizeof(conn_param));
+- conn_param.initiator_depth = min_t(u8, t->cm_id->device->attrs.max_qp_rd_atom,
+- SMB_DIRECT_CM_INITIATOR_DEPTH);
+- conn_param.responder_resources = 0;
+-
+- t->cm_id->device->ops.get_port_immutable(t->cm_id->device,
+- t->cm_id->port_num,
+- &port_immutable);
+- if (port_immutable.core_cap_flags & RDMA_CORE_PORT_IWARP) {
+- ird_ord_hdr[0] = conn_param.responder_resources;
+- ird_ord_hdr[1] = 1;
++ conn_param.initiator_depth = t->initiator_depth;
++ conn_param.responder_resources = t->responder_resources;
++
++ if (t->legacy_iwarp) {
++ ird_ord_hdr[0] = cpu_to_be32(conn_param.responder_resources);
++ ird_ord_hdr[1] = cpu_to_be32(conn_param.initiator_depth);
+ conn_param.private_data = ird_ord_hdr;
+ conn_param.private_data_len = sizeof(ird_ord_hdr);
+ } else {
+@@ -2081,10 +2088,13 @@ static bool rdma_frwr_is_supported(struct ib_device_attr *attrs)
+ return true;
+ }
+
+-static int smb_direct_handle_connect_request(struct rdma_cm_id *new_cm_id)
++static int smb_direct_handle_connect_request(struct rdma_cm_id *new_cm_id,
++ struct rdma_cm_event *event)
+ {
+ struct smb_direct_transport *t;
+ struct task_struct *handler;
++ u8 peer_initiator_depth;
++ u8 peer_responder_resources;
+ int ret;
+
+ if (!rdma_frwr_is_supported(&new_cm_id->device->attrs)) {
+@@ -2098,6 +2108,67 @@ static int smb_direct_handle_connect_request(struct rdma_cm_id *new_cm_id)
+ if (!t)
+ return -ENOMEM;
+
++ peer_initiator_depth = event->param.conn.initiator_depth;
++ peer_responder_resources = event->param.conn.responder_resources;
++ if (rdma_protocol_iwarp(new_cm_id->device, new_cm_id->port_num) &&
++ event->param.conn.private_data_len == 8) {
++ /*
++ * Legacy clients with only iWarp MPA v1 support
++ * need a private blob in order to negotiate
++ * the IRD/ORD values.
++ */
++ const __be32 *ird_ord_hdr = event->param.conn.private_data;
++ u32 ird32 = be32_to_cpu(ird_ord_hdr[0]);
++ u32 ord32 = be32_to_cpu(ird_ord_hdr[1]);
++
++ /*
++ * cifs.ko sends the legacy IRD/ORD negotiation
++ * event if iWarp MPA v2 was used.
++ *
++ * Here we check that the values match and only
++ * mark the client as legacy if they don't match.
++ */
++ if ((u32)event->param.conn.initiator_depth != ird32 ||
++ (u32)event->param.conn.responder_resources != ord32) {
++ /*
++ * There are broken clients (old cifs.ko)
++ * using little endian and also
++ * struct rdma_conn_param only uses u8
++ * for initiator_depth and responder_resources,
++ * so we truncate the value to U8_MAX.
++ *
++ * smb_direct_accept_client() will then
++ * do the real negotiation in order to
++ * select the minimum between client and
++ * server.
++ */
++ ird32 = min_t(u32, ird32, U8_MAX);
++ ord32 = min_t(u32, ord32, U8_MAX);
++
++ t->legacy_iwarp = true;
++ peer_initiator_depth = (u8)ird32;
++ peer_responder_resources = (u8)ord32;
++ }
++ }
++
++ /*
++ * First set what we as the server are able to support
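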
++ */
++ t->initiator_depth = min_t(u8, t->initiator_depth,
++ new_cm_id->device->attrs.max_qp_rd_atom);
++
++ /*
++ * negotiate the value by using the minimum
++ * between client and server if the client provided
++ * non 0 values.
++ */
++ if (peer_initiator_depth != 0)
++ t->initiator_depth = min_t(u8, t->initiator_depth,
++ peer_initiator_depth);
++ if (peer_responder_resources != 0)
++ t->responder_resources = min_t(u8, t->responder_resources,
++ peer_responder_resources);
++
+ ret = smb_direct_connect(t);
+ if (ret)
+ goto out_err;
+@@ -2122,7 +2193,7 @@ static int smb_direct_listen_handler(struct rdma_cm_id *cm_id,
+ {
+ switch (event->event) {
+ case RDMA_CM_EVENT_CONNECT_REQUEST: {
+- int ret = smb_direct_handle_connect_request(cm_id);
++ int ret = smb_direct_handle_connect_request(cm_id, event);
+
+ if (ret) {
+ pr_err("Can't create transport: %d\n", ret);
+diff --git a/fs/smb/server/transport_tcp.c b/fs/smb/server/transport_tcp.c
+index 4337df97987da3..1009cb324fd514 100644
+--- a/fs/smb/server/transport_tcp.c
++++ b/fs/smb/server/transport_tcp.c
+@@ -238,6 +238,7 @@ static int ksmbd_kthread_fn(void *p)
+ struct interface *iface = (struct interface *)p;
+ struct ksmbd_conn *conn;
+ int ret;
++ unsigned int max_ip_conns;
+
+ while (!kthread_should_stop()) {
+ mutex_lock(&iface->sock_release_lock);
+@@ -255,34 +256,38 @@ static int ksmbd_kthread_fn(void *p)
+ continue;
+ }
+
++ if (!server_conf.max_ip_connections)
++ goto skip_max_ip_conns_limit;
++
+ /*
+ * Limits repeated connections from clients with the same IP.
+ */
++ max_ip_conns = 0;
+ down_read(&conn_list_lock);
+- list_for_each_entry(conn, &conn_list, conns_list)
++ list_for_each_entry(conn, &conn_list, conns_list) {
+ #if IS_ENABLED(CONFIG_IPV6)
+ if (client_sk->sk->sk_family == AF_INET6) {
+ if (memcmp(&client_sk->sk->sk_v6_daddr,
+- &conn->inet6_addr, 16) == 0) {
+- ret = -EAGAIN;
+- break;
+- }
++ &conn->inet6_addr, 16) == 0)
++ max_ip_conns++;
+ } else if (inet_sk(client_sk->sk)->inet_daddr ==
+- conn->inet_addr) {
+- ret = -EAGAIN;
+- break;
+- }
++ conn->inet_addr)
++ max_ip_conns++;
+ #else
+ if (inet_sk(client_sk->sk)->inet_daddr ==
+- conn->inet_addr) {
++ conn->inet_addr)
++ max_ip_conns++;
++#endif
++ if (server_conf.max_ip_connections <= max_ip_conns) {
+ ret = -EAGAIN;
+ break;
+ }
+-#endif
++ }
+ up_read(&conn_list_lock);
+ if (ret == -EAGAIN)
+ continue;
+
++skip_max_ip_conns_limit:
+ if (server_conf.max_connections &&
+ atomic_inc_return(&active_num_conn) >= server_conf.max_connections) {
+ pr_info_ratelimited("Limit the maximum number of connections(%u)\n",
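
The ksmbd TCP listener above now counts existing connections from the same client address and only rejects the new one once the configured max_ip_connections ceiling is reached (a value of 0 keeps the old unlimited behaviour). A simplified, IPv4-only sketch of that accounting; the kernel walks conn_list under conn_list_lock instead of a plain array:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

static bool allow_connection(const uint32_t *existing, size_t n,
                             uint32_t new_addr, unsigned int max_per_ip)
{
    unsigned int same = 0;

    if (max_per_ip == 0)        /* 0 means "no per-IP limit" */
        return true;

    for (size_t i = 0; i < n; i++) {
        if (existing[i] != new_addr)
            continue;
        if (++same >= max_per_ip)
            return false;
    }
    return true;
}

int main(void)
{
    uint32_t conns[] = { 0x0a000001, 0x0a000001, 0x0a000002 };

    printf("%d\n", allow_connection(conns, 3, 0x0a000001, 2)); /* 0: limit hit */
    printf("%d\n", allow_connection(conns, 3, 0x0a000002, 2)); /* 1 */
    return 0;
}
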
+diff --git a/fs/squashfs/inode.c b/fs/squashfs/inode.c
+index d5918eba27e371..53104f25de5116 100644
+--- a/fs/squashfs/inode.c
++++ b/fs/squashfs/inode.c
+@@ -165,6 +165,7 @@ int squashfs_read_inode(struct inode *inode, long long ino)
+ squashfs_i(inode)->start = le32_to_cpu(sqsh_ino->start_block);
+ squashfs_i(inode)->block_list_start = block;
+ squashfs_i(inode)->offset = offset;
++ squashfs_i(inode)->parent = 0;
+ inode->i_data.a_ops = &squashfs_aops;
+
+ TRACE("File inode %x:%x, start_block %llx, block_list_start "
+@@ -212,6 +213,7 @@ int squashfs_read_inode(struct inode *inode, long long ino)
+ squashfs_i(inode)->start = le64_to_cpu(sqsh_ino->start_block);
+ squashfs_i(inode)->block_list_start = block;
+ squashfs_i(inode)->offset = offset;
++ squashfs_i(inode)->parent = 0;
+ inode->i_data.a_ops = &squashfs_aops;
+
+ TRACE("File inode %x:%x, start_block %llx, block_list_start "
+@@ -292,6 +294,7 @@ int squashfs_read_inode(struct inode *inode, long long ino)
+ inode->i_mode |= S_IFLNK;
+ squashfs_i(inode)->start = block;
+ squashfs_i(inode)->offset = offset;
++ squashfs_i(inode)->parent = 0;
+
+ if (type == SQUASHFS_LSYMLINK_TYPE) {
+ __le32 xattr;
+@@ -329,6 +332,7 @@ int squashfs_read_inode(struct inode *inode, long long ino)
+ set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
+ rdev = le32_to_cpu(sqsh_ino->rdev);
+ init_special_inode(inode, inode->i_mode, new_decode_dev(rdev));
++ squashfs_i(inode)->parent = 0;
+
+ TRACE("Device inode %x:%x, rdev %x\n",
+ SQUASHFS_INODE_BLK(ino), offset, rdev);
+@@ -353,6 +357,7 @@ int squashfs_read_inode(struct inode *inode, long long ino)
+ set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
+ rdev = le32_to_cpu(sqsh_ino->rdev);
+ init_special_inode(inode, inode->i_mode, new_decode_dev(rdev));
++ squashfs_i(inode)->parent = 0;
+
+ TRACE("Device inode %x:%x, rdev %x\n",
+ SQUASHFS_INODE_BLK(ino), offset, rdev);
+@@ -373,6 +378,7 @@ int squashfs_read_inode(struct inode *inode, long long ino)
+ inode->i_mode |= S_IFSOCK;
+ set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
+ init_special_inode(inode, inode->i_mode, 0);
++ squashfs_i(inode)->parent = 0;
+ break;
+ }
+ case SQUASHFS_LFIFO_TYPE:
+@@ -392,6 +398,7 @@ int squashfs_read_inode(struct inode *inode, long long ino)
+ inode->i_op = &squashfs_inode_ops;
+ set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
+ init_special_inode(inode, inode->i_mode, 0);
++ squashfs_i(inode)->parent = 0;
+ break;
+ }
+ default:
+diff --git a/fs/squashfs/squashfs_fs_i.h b/fs/squashfs/squashfs_fs_i.h
+index 2c82d6f2a4561b..8e497ac07b9a83 100644
+--- a/fs/squashfs/squashfs_fs_i.h
++++ b/fs/squashfs/squashfs_fs_i.h
+@@ -16,6 +16,7 @@ struct squashfs_inode_info {
+ u64 xattr;
+ unsigned int xattr_size;
+ int xattr_count;
++ int parent;
+ union {
+ struct {
+ u64 fragment_block;
+@@ -27,7 +28,6 @@ struct squashfs_inode_info {
+ u64 dir_idx_start;
+ int dir_idx_offset;
+ int dir_idx_cnt;
+- int parent;
+ };
+ };
+ struct inode vfs_inode;
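
The squashfs hunks above zero the parent field on every non-directory inode path because, per the squashfs_fs_i.h change directly above, parent has been moved out of the directory-only union and is now shared by all inode types, so it would otherwise be read uninitialised. A small C sketch of why hoisting a field out of a union obliges every init path to set it (types and names are illustrative):

#include <stdio.h>
#include <string.h>

/* Before: the field only existed for one variant inside the union.
 * After: it is common to every variant, so every init path must set it. */
struct inode_info {
    int parent;                 /* hoisted out of the union */
    union {
        struct { long fragment_block; } file;
        struct { long dir_idx_start; } dir;
    };
};

static void init_file_inode(struct inode_info *i, long frag)
{
    memset(i, 0, sizeof(*i));   /* or assign parent = 0 explicitly */
    i->file.fragment_block = frag;
}

int main(void)
{
    struct inode_info i;

    init_file_inode(&i, 42);
    printf("parent=%d frag=%ld\n", i.parent, i.file.fragment_block);
    return 0;
}
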
+diff --git a/fs/udf/inode.c b/fs/udf/inode.c
+index f24aa98e686917..a79d73f28aa788 100644
+--- a/fs/udf/inode.c
++++ b/fs/udf/inode.c
+@@ -2272,6 +2272,9 @@ int udf_current_aext(struct inode *inode, struct extent_position *epos,
+ if (check_add_overflow(sizeof(struct allocExtDesc),
+ le32_to_cpu(header->lengthAllocDescs), &alen))
+ return -1;
++
++ if (alen > epos->bh->b_size)
++ return -1;
+ }
+
+ switch (iinfo->i_alloc_type) {
+diff --git a/include/acpi/actbl.h b/include/acpi/actbl.h
+index 243097a3da6360..8a67d4ea6e3feb 100644
+--- a/include/acpi/actbl.h
++++ b/include/acpi/actbl.h
+@@ -73,7 +73,7 @@ struct acpi_table_header {
+ char oem_id[ACPI_OEM_ID_SIZE] ACPI_NONSTRING; /* ASCII OEM identification */
+ char oem_table_id[ACPI_OEM_TABLE_ID_SIZE] ACPI_NONSTRING; /* ASCII OEM table identification */
+ u32 oem_revision; /* OEM revision number */
+- char asl_compiler_id[ACPI_NAMESEG_SIZE]; /* ASCII ASL compiler vendor ID */
++ char asl_compiler_id[ACPI_NAMESEG_SIZE] ACPI_NONSTRING; /* ASCII ASL compiler vendor ID */
+ u32 asl_compiler_revision; /* ASL compiler version */
+ };
+
+diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
+index ae2d2359b79e9e..8efbe8c4874ee8 100644
+--- a/include/asm-generic/vmlinux.lds.h
++++ b/include/asm-generic/vmlinux.lds.h
+@@ -361,6 +361,7 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG)
+ __start_once = .; \
+ *(.data..once) \
+ __end_once = .; \
++ *(.data..do_once) \
+ STRUCT_ALIGN(); \
+ *(__tracepoints) \
+ /* implement dynamic printk debug */ \
+diff --git a/include/crypto/internal/scompress.h b/include/crypto/internal/scompress.h
+index 533d6c16a49145..6a2c5f2e90f954 100644
+--- a/include/crypto/internal/scompress.h
++++ b/include/crypto/internal/scompress.h
+@@ -18,11 +18,8 @@ struct crypto_scomp {
+ /**
+ * struct scomp_alg - synchronous compression algorithm
+ *
+- * @alloc_ctx: Function allocates algorithm specific context
+- * @free_ctx: Function frees context allocated with alloc_ctx
+ * @compress: Function performs a compress operation
+ * @decompress: Function performs a de-compress operation
+- * @base: Common crypto API algorithm data structure
+ * @streams: Per-cpu memory for algorithm
+ * @calg: Cmonn algorithm data structure shared with acomp
+ */
+@@ -34,13 +31,7 @@ struct scomp_alg {
+ unsigned int slen, u8 *dst, unsigned int *dlen,
+ void *ctx);
+
+- union {
+- struct {
+- void *(*alloc_ctx)(void);
+- void (*free_ctx)(void *ctx);
+- };
+- struct crypto_acomp_streams streams;
+- };
++ struct crypto_acomp_streams streams;
+
+ union {
+ struct COMP_ALG_COMMON;
+diff --git a/include/drm/drm_panel.h b/include/drm/drm_panel.h
+index 843fb756a2950a..2407bfa60236f8 100644
+--- a/include/drm/drm_panel.h
++++ b/include/drm/drm_panel.h
+@@ -160,6 +160,20 @@ struct drm_panel_follower_funcs {
+ * Called before the panel is powered off.
+ */
+ int (*panel_unpreparing)(struct drm_panel_follower *follower);
++
++ /**
++ * @panel_enabled:
++ *
++ * Called after the panel and the backlight have been enabled.
++ */
++ int (*panel_enabled)(struct drm_panel_follower *follower);
++
++ /**
++ * @panel_disabling:
++ *
++ * Called before the panel and the backlight are disabled.
++ */
++ int (*panel_disabling)(struct drm_panel_follower *follower);
+ };
+
+ struct drm_panel_follower {
+diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
+index 09b99d52fd365f..f78145be77df5c 100644
+--- a/include/linux/blk_types.h
++++ b/include/linux/blk_types.h
+@@ -198,10 +198,6 @@ static inline bool blk_path_error(blk_status_t error)
+ return true;
+ }
+
+-struct bio_issue {
+- u64 value;
+-};
+-
+ typedef __u32 __bitwise blk_opf_t;
+
+ typedef unsigned int blk_qc_t;
+@@ -242,7 +238,8 @@ struct bio {
+ * on release of the bio.
+ */
+ struct blkcg_gq *bi_blkg;
+- struct bio_issue bi_issue;
++ /* Time that this bio was issued. */
++ u64 issue_time_ns;
+ #ifdef CONFIG_BLK_CGROUP_IOCOST
+ u64 bi_iocost_cost;
+ #endif
+diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
+index fe1797bbec420c..cc221318712e7a 100644
+--- a/include/linux/blkdev.h
++++ b/include/linux/blkdev.h
+@@ -999,6 +999,8 @@ extern int blk_register_queue(struct gendisk *disk);
+ extern void blk_unregister_queue(struct gendisk *disk);
+ void submit_bio_noacct(struct bio *bio);
+ struct bio *bio_split_to_limits(struct bio *bio);
++struct bio *bio_submit_split_bioset(struct bio *bio, unsigned int split_sectors,
++ struct bio_set *bs);
+
+ extern int blk_lld_busy(struct request_queue *q);
+ extern int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flags);
+diff --git a/include/linux/bpf.h b/include/linux/bpf.h
+index cc700925b802fe..84826dc0a3268e 100644
+--- a/include/linux/bpf.h
++++ b/include/linux/bpf.h
+@@ -285,6 +285,7 @@ struct bpf_map_owner {
+ bool xdp_has_frags;
+ u64 storage_cookie[MAX_BPF_CGROUP_STORAGE_TYPE];
+ const struct btf_type *attach_func_proto;
++ enum bpf_attach_type expected_attach_type;
+ };
+
+ struct bpf_map {
+diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
+index 94defa405c85e3..fe9a841fdf0cfe 100644
+--- a/include/linux/bpf_verifier.h
++++ b/include/linux/bpf_verifier.h
+@@ -875,13 +875,15 @@ __printf(3, 4) void verbose_linfo(struct bpf_verifier_env *env,
+ #define verifier_bug_if(cond, env, fmt, args...) \
+ ({ \
+ bool __cond = (cond); \
+- if (unlikely(__cond)) { \
+- BPF_WARN_ONCE(1, "verifier bug: " fmt "(" #cond ")\n", ##args); \
+- bpf_log(&env->log, "verifier bug: " fmt "(" #cond ")\n", ##args); \
+- } \
++ if (unlikely(__cond)) \
++ verifier_bug(env, fmt " (" #cond ")", ##args); \
+ (__cond); \
+ })
+-#define verifier_bug(env, fmt, args...) verifier_bug_if(1, env, fmt, ##args)
++#define verifier_bug(env, fmt, args...) \
++ ({ \
++ BPF_WARN_ONCE(1, "verifier bug: " fmt "\n", ##args); \
++ bpf_log(&env->log, "verifier bug: " fmt "\n", ##args); \
++ })
+
+ static inline struct bpf_func_state *cur_func(struct bpf_verifier_env *env)
+ {
+diff --git a/include/linux/btf.h b/include/linux/btf.h
+index 9eda6b113f9b48..f06976ffb63f94 100644
+--- a/include/linux/btf.h
++++ b/include/linux/btf.h
+@@ -86,7 +86,7 @@
+ * as to avoid issues such as the compiler inlining or eliding either a static
+ * kfunc, or a global kfunc in an LTO build.
+ */
+-#define __bpf_kfunc __used __retain noinline
++#define __bpf_kfunc __used __retain __noclone noinline
+
+ #define __bpf_kfunc_start_defs() \
+ __diag_push(); \
+diff --git a/include/linux/coresight.h b/include/linux/coresight.h
+index 4ac65c68bbf44b..bb49080ec8f96b 100644
+--- a/include/linux/coresight.h
++++ b/include/linux/coresight.h
+@@ -6,6 +6,7 @@
+ #ifndef _LINUX_CORESIGHT_H
+ #define _LINUX_CORESIGHT_H
+
++#include <linux/acpi.h>
+ #include <linux/amba/bus.h>
+ #include <linux/clk.h>
+ #include <linux/device.h>
+@@ -480,26 +481,24 @@ static inline bool is_coresight_device(void __iomem *base)
+ * Returns:
+ *
+ * clk - Clock is found and enabled
+- * NULL - clock is not found
++ * NULL - Clock is controlled by firmware (ACPI device only) or when managed
++ * by the AMBA bus driver instead
+ * ERROR - Clock is found but failed to enable
+ */
+ static inline struct clk *coresight_get_enable_apb_pclk(struct device *dev)
+ {
+- struct clk *pclk;
+- int ret;
+-
+- pclk = clk_get(dev, "apb_pclk");
+- if (IS_ERR(pclk)) {
+- pclk = clk_get(dev, "apb");
+- if (IS_ERR(pclk))
+- return NULL;
+- }
++ struct clk *pclk = NULL;
++
++ /* Firmware controls clocks for an ACPI device. */
++ if (has_acpi_companion(dev))
++ return NULL;
+
+- ret = clk_prepare_enable(pclk);
+- if (ret) {
+- clk_put(pclk);
+- return ERR_PTR(ret);
++ if (!dev_is_amba(dev)) {
++ pclk = devm_clk_get_optional_enabled(dev, "apb_pclk");
++ if (!pclk)
++ pclk = devm_clk_get_optional_enabled(dev, "apb");
+ }
++
+ return pclk;
+ }
+
+diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
+index 6de7c05d6bd8c9..99efe2b9b4ea98 100644
+--- a/include/linux/dmaengine.h
++++ b/include/linux/dmaengine.h
+@@ -594,9 +594,9 @@ struct dma_descriptor_metadata_ops {
+ * @phys: physical address of the descriptor
+ * @chan: target channel for this operation
+ * @tx_submit: accept the descriptor, assign ordered cookie and mark the
++ * descriptor pending. To be pushed on .issue_pending() call
+ * @desc_free: driver's callback function to free a resusable descriptor
+ * after completion
+- * descriptor pending. To be pushed on .issue_pending() call
+ * @callback: routine to call after this operation is complete
+ * @callback_result: error result from a DMA transaction
+ * @callback_param: general parameter to pass to the callback routine
+diff --git a/include/linux/hid.h b/include/linux/hid.h
+index 2cc4f1e4ea9637..c32425b5d0119c 100644
+--- a/include/linux/hid.h
++++ b/include/linux/hid.h
+@@ -364,6 +364,7 @@ struct hid_item {
+ * | @HID_QUIRK_HAVE_SPECIAL_DRIVER:
+ * | @HID_QUIRK_INCREMENT_USAGE_ON_DUPLICATE:
+ * | @HID_QUIRK_IGNORE_SPECIAL_DRIVER
++ * | @HID_QUIRK_POWER_ON_AFTER_BACKLIGHT
+ * | @HID_QUIRK_FULLSPEED_INTERVAL:
+ * | @HID_QUIRK_NO_INIT_REPORTS:
+ * | @HID_QUIRK_NO_IGNORE:
+@@ -391,6 +392,7 @@ struct hid_item {
+ #define HID_QUIRK_INCREMENT_USAGE_ON_DUPLICATE BIT(20)
+ #define HID_QUIRK_NOINVERT BIT(21)
+ #define HID_QUIRK_IGNORE_SPECIAL_DRIVER BIT(22)
++#define HID_QUIRK_POWER_ON_AFTER_BACKLIGHT BIT(23)
+ #define HID_QUIRK_FULLSPEED_INTERVAL BIT(28)
+ #define HID_QUIRK_NO_INIT_REPORTS BIT(29)
+ #define HID_QUIRK_NO_IGNORE BIT(30)
+diff --git a/include/linux/irq.h b/include/linux/irq.h
+index 1d6b606a81efe5..890e1371f5d4c2 100644
+--- a/include/linux/irq.h
++++ b/include/linux/irq.h
+@@ -669,6 +669,8 @@ extern int irq_chip_set_parent_state(struct irq_data *data,
+ extern int irq_chip_get_parent_state(struct irq_data *data,
+ enum irqchip_irq_state which,
+ bool *state);
++extern void irq_chip_shutdown_parent(struct irq_data *data);
++extern unsigned int irq_chip_startup_parent(struct irq_data *data);
+ extern void irq_chip_enable_parent(struct irq_data *data);
+ extern void irq_chip_disable_parent(struct irq_data *data);
+ extern void irq_chip_ack_parent(struct irq_data *data);
+diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
+index 785173aa0739cc..25921fbec68566 100644
+--- a/include/linux/memcontrol.h
++++ b/include/linux/memcontrol.h
+@@ -1604,6 +1604,7 @@ extern struct static_key_false memcg_sockets_enabled_key;
+ #define mem_cgroup_sockets_enabled static_branch_unlikely(&memcg_sockets_enabled_key)
+ void mem_cgroup_sk_alloc(struct sock *sk);
+ void mem_cgroup_sk_free(struct sock *sk);
++void mem_cgroup_sk_inherit(const struct sock *sk, struct sock *newsk);
+
+ #if BITS_PER_LONG < 64
+ static inline void mem_cgroup_set_socket_pressure(struct mem_cgroup *memcg)
+@@ -1661,6 +1662,11 @@ void reparent_shrinker_deferred(struct mem_cgroup *memcg);
+ #define mem_cgroup_sockets_enabled 0
+ static inline void mem_cgroup_sk_alloc(struct sock *sk) { };
+ static inline void mem_cgroup_sk_free(struct sock *sk) { };
++
++static inline void mem_cgroup_sk_inherit(const struct sock *sk, struct sock *newsk)
++{
++}
++
+ static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
+ {
+ return false;
+diff --git a/include/linux/mm.h b/include/linux/mm.h
+index 1ae97a0b8ec756..c6794d0e24eb6c 100644
+--- a/include/linux/mm.h
++++ b/include/linux/mm.h
+@@ -296,7 +296,7 @@ extern unsigned int kobjsize(const void *objp);
+ #define VM_MIXEDMAP 0x10000000 /* Can contain "struct page" and pure PFN pages */
+ #define VM_HUGEPAGE 0x20000000 /* MADV_HUGEPAGE marked this vma */
+ #define VM_NOHUGEPAGE 0x40000000 /* MADV_NOHUGEPAGE marked this vma */
+-#define VM_MERGEABLE 0x80000000 /* KSM may merge identical pages */
++#define VM_MERGEABLE BIT(31) /* KSM may merge identical pages */
+
+ #ifdef CONFIG_ARCH_USES_HIGH_VMA_FLAGS
+ #define VM_HIGH_ARCH_BIT_0 32 /* bit only usable on 64-bit architectures */
+diff --git a/include/linux/mmc/sdio_ids.h b/include/linux/mmc/sdio_ids.h
+index fe3d6d98f8da41..673cbdf4345330 100644
+--- a/include/linux/mmc/sdio_ids.h
++++ b/include/linux/mmc/sdio_ids.h
+@@ -77,7 +77,7 @@
+ #define SDIO_DEVICE_ID_BROADCOM_43439 0xa9af
+ #define SDIO_DEVICE_ID_BROADCOM_43455 0xa9bf
+ #define SDIO_DEVICE_ID_BROADCOM_43751 0xaae7
+-#define SDIO_DEVICE_ID_BROADCOM_CYPRESS_43752 0xaae8
++#define SDIO_DEVICE_ID_BROADCOM_43752 0xaae8
+
+ #define SDIO_VENDOR_ID_CYPRESS 0x04b4
+ #define SDIO_DEVICE_ID_BROADCOM_CYPRESS_43439 0xbd3d
+diff --git a/include/linux/msi.h b/include/linux/msi.h
+index e5e86a8529fb6f..3111ba95fbde49 100644
+--- a/include/linux/msi.h
++++ b/include/linux/msi.h
+@@ -568,6 +568,8 @@ enum {
+ MSI_FLAG_PARENT_PM_DEV = (1 << 8),
+ /* Support for parent mask/unmask */
+ MSI_FLAG_PCI_MSI_MASK_PARENT = (1 << 9),
++ /* Support for parent startup/shutdown */
++ MSI_FLAG_PCI_MSI_STARTUP_PARENT = (1 << 10),
+
+ /* Mask for the generic functionality */
+ MSI_GENERIC_FLAGS_MASK = GENMASK(15, 0),
+diff --git a/include/linux/nfslocalio.h b/include/linux/nfslocalio.h
+index 5c7c92659e736f..7ca2715edccca3 100644
+--- a/include/linux/nfslocalio.h
++++ b/include/linux/nfslocalio.h
+@@ -65,6 +65,8 @@ struct nfsd_localio_operations {
+ struct net *(*nfsd_file_put_local)(struct nfsd_file __rcu **);
+ struct nfsd_file *(*nfsd_file_get_local)(struct nfsd_file *);
+ struct file *(*nfsd_file_file)(struct nfsd_file *);
++ void (*nfsd_file_dio_alignment)(struct nfsd_file *,
++ u32 *, u32 *, u32 *);
+ } ____cacheline_aligned;
+
+ extern void nfsd_localio_ops_init(void);
+diff --git a/include/linux/once.h b/include/linux/once.h
+index 30346fcdc7995d..449a0e34ad5ad9 100644
+--- a/include/linux/once.h
++++ b/include/linux/once.h
+@@ -46,7 +46,7 @@ void __do_once_sleepable_done(bool *done, struct static_key_true *once_key,
+ #define DO_ONCE(func, ...) \
+ ({ \
+ bool ___ret = false; \
+- static bool __section(".data..once") ___done = false; \
++ static bool __section(".data..do_once") ___done = false; \
+ static DEFINE_STATIC_KEY_TRUE(___once_key); \
+ if (static_branch_unlikely(&___once_key)) { \
+ unsigned long ___flags; \
+@@ -64,7 +64,7 @@ void __do_once_sleepable_done(bool *done, struct static_key_true *once_key,
+ #define DO_ONCE_SLEEPABLE(func, ...) \
+ ({ \
+ bool ___ret = false; \
+- static bool __section(".data..once") ___done = false; \
++ static bool __section(".data..do_once") ___done = false; \
+ static DEFINE_STATIC_KEY_TRUE(___once_key); \
+ if (static_branch_unlikely(&___once_key)) { \
+ ___ret = __do_once_sleepable_start(&___done); \
+diff --git a/include/linux/phy.h b/include/linux/phy.h
+index bb45787d868484..04553419adc3ff 100644
+--- a/include/linux/phy.h
++++ b/include/linux/phy.h
+@@ -1273,9 +1273,13 @@ struct phy_driver {
+ #define to_phy_driver(d) container_of_const(to_mdio_common_driver(d), \
+ struct phy_driver, mdiodrv)
+
+-#define PHY_ID_MATCH_EXACT(id) .phy_id = (id), .phy_id_mask = GENMASK(31, 0)
+-#define PHY_ID_MATCH_MODEL(id) .phy_id = (id), .phy_id_mask = GENMASK(31, 4)
+-#define PHY_ID_MATCH_VENDOR(id) .phy_id = (id), .phy_id_mask = GENMASK(31, 10)
++#define PHY_ID_MATCH_EXTACT_MASK GENMASK(31, 0)
++#define PHY_ID_MATCH_MODEL_MASK GENMASK(31, 4)
++#define PHY_ID_MATCH_VENDOR_MASK GENMASK(31, 10)
++
++#define PHY_ID_MATCH_EXACT(id) .phy_id = (id), .phy_id_mask = PHY_ID_MATCH_EXTACT_MASK
++#define PHY_ID_MATCH_MODEL(id) .phy_id = (id), .phy_id_mask = PHY_ID_MATCH_MODEL_MASK
++#define PHY_ID_MATCH_VENDOR(id) .phy_id = (id), .phy_id_mask = PHY_ID_MATCH_VENDOR_MASK
+
+ /**
+ * phy_id_compare - compare @id1 with @id2 taking account of @mask
+@@ -1291,6 +1295,19 @@ static inline bool phy_id_compare(u32 id1, u32 id2, u32 mask)
+ return !((id1 ^ id2) & mask);
+ }
+
++/**
++ * phy_id_compare_vendor - compare @id with @vendor_mask
++ * @id: PHY ID
++ * @vendor_mask: PHY Vendor mask
++ *
++ * Return: true if the bits from @id match @vendor_mask using the
++ * generic PHY Vendor mask.
++ */
++static inline bool phy_id_compare_vendor(u32 id, u32 vendor_mask)
++{
++ return phy_id_compare(id, vendor_mask, PHY_ID_MATCH_VENDOR_MASK);
++}
++
+ /**
+ * phydev_id_compare - compare @id with the PHY's Clause 22 ID
+ * @phydev: the PHY device
+diff --git a/include/linux/power/max77705_charger.h b/include/linux/power/max77705_charger.h
+index fdec9af9c54183..a612795577b621 100644
+--- a/include/linux/power/max77705_charger.h
++++ b/include/linux/power/max77705_charger.h
+@@ -9,6 +9,8 @@
+ #ifndef __MAX77705_CHARGER_H
+ #define __MAX77705_CHARGER_H __FILE__
+
++#include <linux/regmap.h>
++
+ /* MAX77705_CHG_REG_CHG_INT */
+ #define MAX77705_BYP_I BIT(0)
+ #define MAX77705_INP_LIMIT_I BIT(1)
+@@ -63,7 +65,6 @@
+ #define MAX77705_BUCK_SHIFT 2
+ #define MAX77705_BOOST_SHIFT 3
+ #define MAX77705_WDTEN_SHIFT 4
+-#define MAX77705_MODE_MASK GENMASK(3, 0)
+ #define MAX77705_CHG_MASK BIT(MAX77705_CHG_SHIFT)
+ #define MAX77705_UNO_MASK BIT(MAX77705_UNO_SHIFT)
+ #define MAX77705_OTG_MASK BIT(MAX77705_OTG_SHIFT)
+@@ -74,34 +75,19 @@
+ #define MAX77705_OTG_CTRL (MAX77705_OTG_MASK | MAX77705_BOOST_MASK)
+
+ /* MAX77705_CHG_REG_CNFG_01 */
+-#define MAX77705_FCHGTIME_SHIFT 0
+-#define MAX77705_FCHGTIME_MASK GENMASK(2, 0)
+-#define MAX77705_CHG_RSTRT_SHIFT 4
+-#define MAX77705_CHG_RSTRT_MASK GENMASK(5, 4)
+ #define MAX77705_FCHGTIME_DISABLE 0
+ #define MAX77705_CHG_RSTRT_DISABLE 0x3
+
+-#define MAX77705_PQEN_SHIFT 7
+-#define MAX77705_PQEN_MASK BIT(7)
+ #define MAX77705_CHG_PQEN_DISABLE 0
+ #define MAX77705_CHG_PQEN_ENABLE 1
+
+ /* MAX77705_CHG_REG_CNFG_02 */
+-#define MAX77705_OTG_ILIM_SHIFT 6
+-#define MAX77705_OTG_ILIM_MASK GENMASK(7, 6)
+ #define MAX77705_OTG_ILIM_500 0
+ #define MAX77705_OTG_ILIM_900 1
+ #define MAX77705_OTG_ILIM_1200 2
+ #define MAX77705_OTG_ILIM_1500 3
+-#define MAX77705_CHG_CC GENMASK(5, 0)
+
+ /* MAX77705_CHG_REG_CNFG_03 */
+-#define MAX77705_TO_ITH_SHIFT 0
+-#define MAX77705_TO_ITH_MASK GENMASK(2, 0)
+-#define MAX77705_TO_TIME_SHIFT 3
+-#define MAX77705_TO_TIME_MASK GENMASK(5, 3)
+-#define MAX77705_SYS_TRACK_DIS_SHIFT 7
+-#define MAX77705_SYS_TRACK_DIS_MASK BIT(7)
+ #define MAX77705_TO_ITH_150MA 0
+ #define MAX77705_TO_TIME_30M 3
+ #define MAX77705_SYS_TRACK_ENABLE 0
+@@ -110,15 +96,8 @@
+ /* MAX77705_CHG_REG_CNFG_04 */
+ #define MAX77705_CHG_MINVSYS_SHIFT 6
+ #define MAX77705_CHG_MINVSYS_MASK GENMASK(7, 6)
+-#define MAX77705_CHG_PRM_SHIFT 0
+-#define MAX77705_CHG_PRM_MASK GENMASK(5, 0)
+-
+-#define MAX77705_CHG_CV_PRM_SHIFT 0
+-#define MAX77705_CHG_CV_PRM_MASK GENMASK(5, 0)
+
+ /* MAX77705_CHG_REG_CNFG_05 */
+-#define MAX77705_REG_B2SOVRC_SHIFT 0
+-#define MAX77705_REG_B2SOVRC_MASK GENMASK(3, 0)
+ #define MAX77705_B2SOVRC_DISABLE 0
+ #define MAX77705_B2SOVRC_4_5A 6
+ #define MAX77705_B2SOVRC_4_8A 8
+@@ -128,9 +107,8 @@
+ #define MAX77705_WDTCLR_SHIFT 0
+ #define MAX77705_WDTCLR_MASK GENMASK(1, 0)
+ #define MAX77705_WDTCLR 1
+-#define MAX77705_CHGPROT_MASK GENMASK(3, 2)
+-#define MAX77705_CHGPROT_UNLOCKED GENMASK(3, 2)
+-#define MAX77705_SLOWEST_LX_SLOPE GENMASK(6, 5)
++#define MAX77705_CHGPROT_UNLOCKED 3
++#define MAX77705_SLOWEST_LX_SLOPE 3
+
+ /* MAX77705_CHG_REG_CNFG_07 */
+ #define MAX77705_CHG_FMBST 4
+@@ -140,36 +118,14 @@
+ #define MAX77705_REG_FGSRC_MASK BIT(MAX77705_REG_FGSRC_SHIFT)
+
+ /* MAX77705_CHG_REG_CNFG_08 */
+-#define MAX77705_REG_FSW_SHIFT 0
+-#define MAX77705_REG_FSW_MASK GENMASK(1, 0)
+ #define MAX77705_CHG_FSW_3MHz 0
+ #define MAX77705_CHG_FSW_2MHz 1
+ #define MAX77705_CHG_FSW_1_5MHz 2
+
+ /* MAX77705_CHG_REG_CNFG_09 */
+-#define MAX77705_CHG_CHGIN_LIM_MASK GENMASK(6, 0)
+-#define MAX77705_CHG_EN_MASK BIT(7)
+ #define MAX77705_CHG_DISABLE 0
+-#define MAX77705_CHARGER_CHG_CHARGING(_reg) \
+- (((_reg) & MAX77705_CHG_EN_MASK) > 1)
+-
+-
+-/* MAX77705_CHG_REG_CNFG_10 */
+-#define MAX77705_CHG_WCIN_LIM GENMASK(5, 0)
+-
+-/* MAX77705_CHG_REG_CNFG_11 */
+-#define MAX77705_VBYPSET_SHIFT 0
+-#define MAX77705_VBYPSET_MASK GENMASK(6, 0)
+
+ /* MAX77705_CHG_REG_CNFG_12 */
+-#define MAX77705_CHGINSEL_SHIFT 5
+-#define MAX77705_CHGINSEL_MASK BIT(MAX77705_CHGINSEL_SHIFT)
+-#define MAX77705_WCINSEL_SHIFT 6
+-#define MAX77705_WCINSEL_MASK BIT(MAX77705_WCINSEL_SHIFT)
+-#define MAX77705_VCHGIN_REG_MASK GENMASK(4, 3)
+-#define MAX77705_WCIN_REG_MASK GENMASK(2, 1)
+-#define MAX77705_REG_DISKIP_SHIFT 0
+-#define MAX77705_REG_DISKIP_MASK BIT(MAX77705_REG_DISKIP_SHIFT)
+ /* REG=4.5V, UVLO=4.7V */
+ #define MAX77705_VCHGIN_4_5 0
+ /* REG=4.5V, UVLO=4.7V */
+@@ -183,9 +139,59 @@
+ #define MAX77705_CURRENT_CHGIN_MIN 100000
+ #define MAX77705_CURRENT_CHGIN_MAX 3200000
+
++enum max77705_field_idx {
++ MAX77705_CHGPROT,
++ MAX77705_CHG_EN,
++ MAX77705_CHG_CC_LIM,
++ MAX77705_CHG_CHGIN_LIM,
++ MAX77705_CHG_CV_PRM,
++ MAX77705_CHG_PQEN,
++ MAX77705_CHG_RSTRT,
++ MAX77705_CHG_WCIN,
++ MAX77705_FCHGTIME,
++ MAX77705_LX_SLOPE,
++ MAX77705_MODE,
++ MAX77705_OTG_ILIM,
++ MAX77705_REG_B2SOVRC,
++ MAX77705_REG_DISKIP,
++ MAX77705_REG_FSW,
++ MAX77705_SYS_TRACK,
++ MAX77705_TO,
++ MAX77705_TO_TIME,
++ MAX77705_VBYPSET,
++ MAX77705_VCHGIN,
++ MAX77705_WCIN,
++ MAX77705_N_REGMAP_FIELDS,
++};
++
++static const struct reg_field max77705_reg_field[MAX77705_N_REGMAP_FIELDS] = {
++ [MAX77705_MODE] = REG_FIELD(MAX77705_CHG_REG_CNFG_00, 0, 3),
++ [MAX77705_FCHGTIME] = REG_FIELD(MAX77705_CHG_REG_CNFG_01, 0, 2),
++ [MAX77705_CHG_RSTRT] = REG_FIELD(MAX77705_CHG_REG_CNFG_01, 4, 5),
++ [MAX77705_CHG_PQEN] = REG_FIELD(MAX77705_CHG_REG_CNFG_01, 7, 7),
++ [MAX77705_CHG_CC_LIM] = REG_FIELD(MAX77705_CHG_REG_CNFG_02, 0, 5),
++ [MAX77705_OTG_ILIM] = REG_FIELD(MAX77705_CHG_REG_CNFG_02, 6, 7),
++ [MAX77705_TO] = REG_FIELD(MAX77705_CHG_REG_CNFG_03, 0, 2),
++ [MAX77705_TO_TIME] = REG_FIELD(MAX77705_CHG_REG_CNFG_03, 3, 5),
++ [MAX77705_SYS_TRACK] = REG_FIELD(MAX77705_CHG_REG_CNFG_03, 7, 7),
++ [MAX77705_CHG_CV_PRM] = REG_FIELD(MAX77705_CHG_REG_CNFG_04, 0, 5),
++ [MAX77705_REG_B2SOVRC] = REG_FIELD(MAX77705_CHG_REG_CNFG_05, 0, 3),
++ [MAX77705_CHGPROT] = REG_FIELD(MAX77705_CHG_REG_CNFG_06, 2, 3),
++ [MAX77705_LX_SLOPE] = REG_FIELD(MAX77705_CHG_REG_CNFG_06, 5, 6),
++ [MAX77705_REG_FSW] = REG_FIELD(MAX77705_CHG_REG_CNFG_08, 0, 1),
++ [MAX77705_CHG_CHGIN_LIM] = REG_FIELD(MAX77705_CHG_REG_CNFG_09, 0, 6),
++ [MAX77705_CHG_EN] = REG_FIELD(MAX77705_CHG_REG_CNFG_09, 7, 7),
++ [MAX77705_CHG_WCIN] = REG_FIELD(MAX77705_CHG_REG_CNFG_10, 0, 5),
++ [MAX77705_VBYPSET] = REG_FIELD(MAX77705_CHG_REG_CNFG_11, 0, 6),
++ [MAX77705_REG_DISKIP] = REG_FIELD(MAX77705_CHG_REG_CNFG_12, 0, 0),
++ [MAX77705_WCIN] = REG_FIELD(MAX77705_CHG_REG_CNFG_12, 1, 2),
++ [MAX77705_VCHGIN] = REG_FIELD(MAX77705_CHG_REG_CNFG_12, 3, 4),
++};
++
+ struct max77705_charger_data {
+ struct device *dev;
+ struct regmap *regmap;
++ struct regmap_field *rfield[MAX77705_N_REGMAP_FIELDS];
+ struct power_supply_battery_info *bat_info;
+ struct workqueue_struct *wqueue;
+ struct work_struct chgin_work;
+diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
+index 5263746b63e8c3..a3a24e115d4468 100644
+--- a/include/linux/sched/topology.h
++++ b/include/linux/sched/topology.h
+@@ -30,11 +30,19 @@ struct sd_flag_debug {
+ };
+ extern const struct sd_flag_debug sd_flag_debug[];
+
++struct sched_domain_topology_level;
++
+ #ifdef CONFIG_SCHED_SMT
+ static inline int cpu_smt_flags(void)
+ {
+ return SD_SHARE_CPUCAPACITY | SD_SHARE_LLC;
+ }
++
++static inline const
++struct cpumask *tl_smt_mask(struct sched_domain_topology_level *tl, int cpu)
++{
++ return cpu_smt_mask(cpu);
++}
+ #endif
+
+ #ifdef CONFIG_SCHED_CLUSTER
+@@ -42,6 +50,12 @@ static inline int cpu_cluster_flags(void)
+ {
+ return SD_CLUSTER | SD_SHARE_LLC;
+ }
++
++static inline const
++struct cpumask *tl_cls_mask(struct sched_domain_topology_level *tl, int cpu)
++{
++ return cpu_clustergroup_mask(cpu);
++}
+ #endif
+
+ #ifdef CONFIG_SCHED_MC
+@@ -49,8 +63,20 @@ static inline int cpu_core_flags(void)
+ {
+ return SD_SHARE_LLC;
+ }
++
++static inline const
++struct cpumask *tl_mc_mask(struct sched_domain_topology_level *tl, int cpu)
++{
++ return cpu_coregroup_mask(cpu);
++}
+ #endif
+
++static inline const
++struct cpumask *tl_pkg_mask(struct sched_domain_topology_level *tl, int cpu)
++{
++ return cpu_node_mask(cpu);
++}
++
+ #ifdef CONFIG_NUMA
+ static inline int cpu_numa_flags(void)
+ {
+@@ -172,7 +198,7 @@ bool cpus_equal_capacity(int this_cpu, int that_cpu);
+ bool cpus_share_cache(int this_cpu, int that_cpu);
+ bool cpus_share_resources(int this_cpu, int that_cpu);
+
+-typedef const struct cpumask *(*sched_domain_mask_f)(int cpu);
++typedef const struct cpumask *(*sched_domain_mask_f)(struct sched_domain_topology_level *tl, int cpu);
+ typedef int (*sched_domain_flags_f)(void);
+
+ struct sd_data {
+diff --git a/include/linux/topology.h b/include/linux/topology.h
+index 33b7fda97d3902..6575af39fd10f7 100644
+--- a/include/linux/topology.h
++++ b/include/linux/topology.h
+@@ -260,7 +260,7 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
+
+ #endif
+
+-static inline const struct cpumask *cpu_cpu_mask(int cpu)
++static inline const struct cpumask *cpu_node_mask(int cpu)
+ {
+ return cpumask_of_node(cpu_to_node(cpu));
+ }
+diff --git a/include/net/bonding.h b/include/net/bonding.h
+index e06f0d63b2c176..bd56ad976cfb02 100644
+--- a/include/net/bonding.h
++++ b/include/net/bonding.h
+@@ -711,6 +711,7 @@ struct bond_vlan_tag *bond_verify_device_path(struct net_device *start_dev,
+ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave);
+ void bond_slave_arr_work_rearm(struct bonding *bond, unsigned long delay);
+ void bond_work_init_all(struct bonding *bond);
++void bond_work_cancel_all(struct bonding *bond);
+
+ #ifdef CONFIG_PROC_FS
+ void bond_create_proc_entry(struct bonding *bond);
+diff --git a/include/net/dst.h b/include/net/dst.h
+index bab01363bb975d..f8aa1239b4db63 100644
+--- a/include/net/dst.h
++++ b/include/net/dst.h
+@@ -24,7 +24,10 @@
+ struct sk_buff;
+
+ struct dst_entry {
+- struct net_device *dev;
++ union {
++ struct net_device *dev;
++ struct net_device __rcu *dev_rcu;
++ };
+ struct dst_ops *ops;
+ unsigned long _metrics;
+ unsigned long expires;
+@@ -570,9 +573,12 @@ static inline struct net_device *dst_dev(const struct dst_entry *dst)
+
+ static inline struct net_device *dst_dev_rcu(const struct dst_entry *dst)
+ {
+- /* In the future, use rcu_dereference(dst->dev) */
+- WARN_ON_ONCE(!rcu_read_lock_held());
+- return READ_ONCE(dst->dev);
++ return rcu_dereference(dst->dev_rcu);
++}
++
++static inline struct net *dst_dev_net_rcu(const struct dst_entry *dst)
++{
++ return dev_net_rcu(dst_dev_rcu(dst));
+ }
+
+ static inline struct net_device *skb_dst_dev(const struct sk_buff *skb)
+@@ -592,7 +598,7 @@ static inline struct net *skb_dst_dev_net(const struct sk_buff *skb)
+
+ static inline struct net *skb_dst_dev_net_rcu(const struct sk_buff *skb)
+ {
+- return dev_net_rcu(skb_dst_dev(skb));
++ return dev_net_rcu(skb_dst_dev_rcu(skb));
+ }
+
+ struct dst_entry *dst_blackhole_check(struct dst_entry *dst, u32 cookie);
+diff --git a/include/net/ip.h b/include/net/ip.h
+index befcba575129ac..a1624e8db1abdc 100644
+--- a/include/net/ip.h
++++ b/include/net/ip.h
+@@ -338,6 +338,19 @@ static inline u64 snmp_fold_field64(void __percpu *mib, int offt, size_t syncp_o
+ } \
+ }
+
++#define snmp_get_cpu_field64_batch_cnt(buff64, stats_list, cnt, \
++ mib_statistic, offset) \
++{ \
++ int i, c; \
++ for_each_possible_cpu(c) { \
++ for (i = 0; i < cnt; i++) \
++ buff64[i] += snmp_get_cpu_field64( \
++ mib_statistic, \
++ c, stats_list[i].entry, \
++ offset); \
++ } \
++}
++
+ #define snmp_get_cpu_field_batch(buff, stats_list, mib_statistic) \
+ { \
+ int i, c; \
+@@ -349,6 +362,17 @@ static inline u64 snmp_fold_field64(void __percpu *mib, int offt, size_t syncp_o
+ } \
+ }
+
++#define snmp_get_cpu_field_batch_cnt(buff, stats_list, cnt, mib_statistic) \
++{ \
++ int i, c; \
++ for_each_possible_cpu(c) { \
++ for (i = 0; i < cnt; i++) \
++ buff[i] += snmp_get_cpu_field( \
++ mib_statistic, \
++ c, stats_list[i].entry); \
++ } \
++}
++
+ static inline void inet_get_local_port_range(const struct net *net, int *low, int *high)
+ {
+ u32 range = READ_ONCE(net->ipv4.ip_local_ports.range);
+@@ -467,12 +491,14 @@ static inline unsigned int ip_dst_mtu_maybe_forward(const struct dst_entry *dst,
+ bool forwarding)
+ {
+ const struct rtable *rt = dst_rtable(dst);
++ const struct net_device *dev;
+ unsigned int mtu, res;
+ struct net *net;
+
+ rcu_read_lock();
+
+- net = dev_net_rcu(dst_dev(dst));
++ dev = dst_dev_rcu(dst);
++ net = dev_net_rcu(dev);
+ if (READ_ONCE(net->ipv4.sysctl_ip_fwd_use_pmtu) ||
+ ip_mtu_locked(dst) ||
+ !forwarding) {
+@@ -486,7 +512,7 @@ static inline unsigned int ip_dst_mtu_maybe_forward(const struct dst_entry *dst,
+ if (mtu)
+ goto out;
+
+- mtu = READ_ONCE(dst_dev(dst)->mtu);
++ mtu = READ_ONCE(dev->mtu);
+
+ if (unlikely(ip_mtu_locked(dst))) {
+ if (rt->rt_uses_gateway && mtu > 576)
+diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h
+index 9255f21818ee7b..59f48ca3abdf5a 100644
+--- a/include/net/ip6_route.h
++++ b/include/net/ip6_route.h
+@@ -337,7 +337,7 @@ static inline unsigned int ip6_dst_mtu_maybe_forward(const struct dst_entry *dst
+
+ mtu = IPV6_MIN_MTU;
+ rcu_read_lock();
+- idev = __in6_dev_get(dst_dev(dst));
++ idev = __in6_dev_get(dst_dev_rcu(dst));
+ if (idev)
+ mtu = READ_ONCE(idev->cnf.mtu6);
+ rcu_read_unlock();
+diff --git a/include/net/route.h b/include/net/route.h
+index 7ea840daa775b2..c916bbe25a774a 100644
+--- a/include/net/route.h
++++ b/include/net/route.h
+@@ -390,7 +390,7 @@ static inline int ip4_dst_hoplimit(const struct dst_entry *dst)
+ const struct net *net;
+
+ rcu_read_lock();
+- net = dev_net_rcu(dst_dev(dst));
++ net = dst_dev_net_rcu(dst);
+ hoplimit = READ_ONCE(net->ipv4.sysctl_ip_default_ttl);
+ rcu_read_unlock();
+ }
+diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h
+index ba460b6c0374de..8d38565e99fa1c 100644
+--- a/include/scsi/libsas.h
++++ b/include/scsi/libsas.h
+@@ -203,6 +203,14 @@ static inline bool dev_is_expander(enum sas_device_type type)
+ type == SAS_FANOUT_EXPANDER_DEVICE;
+ }
+
++static inline bool dev_parent_is_expander(struct domain_device *dev)
++{
++ if (!dev->parent)
++ return false;
++
++ return dev_is_expander(dev->parent->dev_type);
++}
++
+ static inline void INIT_SAS_WORK(struct sas_work *sw, void (*fn)(struct work_struct *))
+ {
+ INIT_WORK(&sw->work, fn);
+diff --git a/include/trace/events/filelock.h b/include/trace/events/filelock.h
+index b8d1e00a7982c9..2dfeb158e848a5 100644
+--- a/include/trace/events/filelock.h
++++ b/include/trace/events/filelock.h
+@@ -27,7 +27,8 @@
+ { FL_SLEEP, "FL_SLEEP" }, \
+ { FL_DOWNGRADE_PENDING, "FL_DOWNGRADE_PENDING" }, \
+ { FL_UNLOCK_PENDING, "FL_UNLOCK_PENDING" }, \
+- { FL_OFDLCK, "FL_OFDLCK" })
++ { FL_OFDLCK, "FL_OFDLCK" }, \
++ { FL_RECLAIM, "FL_RECLAIM"})
+
+ #define show_fl_type(val) \
+ __print_symbolic(val, \
+diff --git a/include/trace/misc/fs.h b/include/trace/misc/fs.h
+index 0406ebe2a80a49..7ead1c61f0cb13 100644
+--- a/include/trace/misc/fs.h
++++ b/include/trace/misc/fs.h
+@@ -141,3 +141,25 @@
+ { ATTR_TIMES_SET, "TIMES_SET" }, \
+ { ATTR_TOUCH, "TOUCH"}, \
+ { ATTR_DELEG, "DELEG"})
++
++#define show_statx_mask(flags) \
++ __print_flags(flags, "|", \
++ { STATX_TYPE, "TYPE" }, \
++ { STATX_MODE, "MODE" }, \
++ { STATX_NLINK, "NLINK" }, \
++ { STATX_UID, "UID" }, \
++ { STATX_GID, "GID" }, \
++ { STATX_ATIME, "ATIME" }, \
++ { STATX_MTIME, "MTIME" }, \
++ { STATX_CTIME, "CTIME" }, \
++ { STATX_INO, "INO" }, \
++ { STATX_SIZE, "SIZE" }, \
++ { STATX_BLOCKS, "BLOCKS" }, \
++ { STATX_BASIC_STATS, "BASIC_STATS" }, \
++ { STATX_BTIME, "BTIME" }, \
++ { STATX_MNT_ID, "MNT_ID" }, \
++ { STATX_DIOALIGN, "DIOALIGN" }, \
++ { STATX_MNT_ID_UNIQUE, "MNT_ID_UNIQUE" }, \
++ { STATX_SUBVOL, "SUBVOL" }, \
++ { STATX_WRITE_ATOMIC, "WRITE_ATOMIC" }, \
++ { STATX_DIO_READ_ALIGN, "DIO_READ_ALIGN" })
+diff --git a/include/uapi/linux/hidraw.h b/include/uapi/linux/hidraw.h
+index d5ee269864e07f..ebd701b3c18d9d 100644
+--- a/include/uapi/linux/hidraw.h
++++ b/include/uapi/linux/hidraw.h
+@@ -48,6 +48,8 @@ struct hidraw_devinfo {
+ #define HIDIOCGOUTPUT(len) _IOC(_IOC_WRITE|_IOC_READ, 'H', 0x0C, len)
+ #define HIDIOCREVOKE _IOW('H', 0x0D, int) /* Revoke device access */
+
++#define HIDIOCTL_LAST _IOC_NR(HIDIOCREVOKE)
++
+ #define HIDRAW_FIRST_MINOR 0
+ #define HIDRAW_MAX_DEVICES 64
+ /* number of reports to buffer */
+diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
+index 1d394377758424..a3fa98540d1845 100644
+--- a/include/ufs/ufshcd.h
++++ b/include/ufs/ufshcd.h
+@@ -963,6 +963,7 @@ enum ufshcd_mcq_opr {
+ * @ufs_rtc_update_work: A work for UFS RTC periodic update
+ * @pm_qos_req: PM QoS request handle
+ * @pm_qos_enabled: flag to check if pm qos is enabled
++ * @pm_qos_mutex: synchronizes PM QoS request and status updates
+ * @critical_health_count: count of critical health exceptions
+ * @dev_lvl_exception_count: count of device level exceptions since last reset
+ * @dev_lvl_exception_id: vendor specific information about the
+@@ -1136,6 +1137,8 @@ struct ufs_hba {
+ struct delayed_work ufs_rtc_update_work;
+ struct pm_qos_request pm_qos_req;
+ bool pm_qos_enabled;
++ /* synchronizes PM QoS request and status updates */
++ struct mutex pm_qos_mutex;
+
+ int critical_health_count;
+ atomic_t dev_lvl_exception_count;
+diff --git a/include/vdso/gettime.h b/include/vdso/gettime.h
+index c50d152e7b3e06..9ac161866653a0 100644
+--- a/include/vdso/gettime.h
++++ b/include/vdso/gettime.h
+@@ -5,6 +5,7 @@
+ #include <linux/types.h>
+
+ struct __kernel_timespec;
++struct __kernel_old_timeval;
+ struct timezone;
+
+ #if !defined(CONFIG_64BIT) || defined(BUILD_VDSO32_64)
+diff --git a/init/Kconfig b/init/Kconfig
+index ecddb94db8dc01..87c868f86a0605 100644
+--- a/init/Kconfig
++++ b/init/Kconfig
+@@ -102,7 +102,7 @@ config CC_HAS_ASM_GOTO_OUTPUT
+ # Detect basic support
+ depends on $(success,echo 'int foo(int x) { asm goto ("": "=r"(x) ::: bar); return x; bar: return 0; }' | $(CC) -x c - -c -o /dev/null)
+ # Detect clang (< v17) scoped label issues
+- depends on $(success,echo 'void b(void **);void* c(void);int f(void){{asm goto("jmp %l0"::::l0);return 0;l0:return 1;}void *x __attribute__((cleanup(b)))=c();{asm goto("jmp %l0"::::l1);return 2;l1:return 3;}}' | $(CC) -x c - -c -o /dev/null)
++ depends on $(success,echo 'void b(void **);void* c(void);int f(void){{asm goto(""::::l0);return 0;l0:return 1;}void *x __attribute__((cleanup(b)))=c();{asm goto(""::::l1);return 2;l1:return 3;}}' | $(CC) -x c - -c -o /dev/null)
+
+ config CC_HAS_ASM_GOTO_TIED_OUTPUT
+ depends on CC_HAS_ASM_GOTO_OUTPUT
+@@ -1504,6 +1504,7 @@ config BOOT_CONFIG_EMBED_FILE
+
+ config INITRAMFS_PRESERVE_MTIME
+ bool "Preserve cpio archive mtimes in initramfs"
++ depends on BLK_DEV_INITRD
+ default y
+ help
+ Each entry in an initramfs cpio archive carries an mtime value. When
+diff --git a/io_uring/waitid.c b/io_uring/waitid.c
+index e07a9469439737..3101ad8ec0cf62 100644
+--- a/io_uring/waitid.c
++++ b/io_uring/waitid.c
+@@ -232,13 +232,14 @@ static int io_waitid_wait(struct wait_queue_entry *wait, unsigned mode,
+ if (!pid_child_should_wake(wo, p))
+ return 0;
+
++ list_del_init(&wait->entry);
++
+ /* cancel is in progress */
+ if (atomic_fetch_inc(&iw->refs) & IO_WAITID_REF_MASK)
+ return 1;
+
+ req->io_task_work.func = io_waitid_cb;
+ io_req_task_work_add(req);
+- list_del_init(&wait->entry);
+ return 1;
+ }
+
+diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
+index e5ff49f3425e07..643a69f9ffe2ae 100644
+--- a/io_uring/zcrx.c
++++ b/io_uring/zcrx.c
+@@ -1154,12 +1154,16 @@ io_zcrx_recv_skb(read_descriptor_t *desc, struct sk_buff *skb,
+
+ end = start + frag_iter->len;
+ if (offset < end) {
++ size_t count;
++
+ copy = end - offset;
+ if (copy > len)
+ copy = len;
+
+ off = offset - start;
++ count = desc->count;
+ ret = io_zcrx_recv_skb(desc, frag_iter, off, copy);
++ desc->count = count;
+ if (ret < 0)
+ goto out;
+
+diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
+index e4568d44e82790..f6dd071f5e38c8 100644
+--- a/kernel/bpf/core.c
++++ b/kernel/bpf/core.c
+@@ -2393,6 +2393,7 @@ static bool __bpf_prog_map_compatible(struct bpf_map *map,
+ map->owner->type = prog_type;
+ map->owner->jited = fp->jited;
+ map->owner->xdp_has_frags = aux->xdp_has_frags;
++ map->owner->expected_attach_type = fp->expected_attach_type;
+ map->owner->attach_func_proto = aux->attach_func_proto;
+ for_each_cgroup_storage_type(i) {
+ map->owner->storage_cookie[i] =
+@@ -2404,6 +2405,10 @@ static bool __bpf_prog_map_compatible(struct bpf_map *map,
+ ret = map->owner->type == prog_type &&
+ map->owner->jited == fp->jited &&
+ map->owner->xdp_has_frags == aux->xdp_has_frags;
++ if (ret &&
++ map->map_type == BPF_MAP_TYPE_PROG_ARRAY &&
++ map->owner->expected_attach_type != fp->expected_attach_type)
++ ret = false;
+ for_each_cgroup_storage_type(i) {
+ if (!ret)
+ break;
+diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
+index 8af62cb243d9ec..9c750a6a895bf7 100644
+--- a/kernel/bpf/helpers.c
++++ b/kernel/bpf/helpers.c
+@@ -774,11 +774,9 @@ int bpf_try_get_buffers(struct bpf_bprintf_buffers **bufs)
+ {
+ int nest_level;
+
+- preempt_disable();
+ nest_level = this_cpu_inc_return(bpf_bprintf_nest_level);
+ if (WARN_ON_ONCE(nest_level > MAX_BPRINTF_NEST_LEVEL)) {
+ this_cpu_dec(bpf_bprintf_nest_level);
+- preempt_enable();
+ return -EBUSY;
+ }
+ *bufs = this_cpu_ptr(&bpf_bprintf_bufs[nest_level - 1]);
+@@ -791,7 +789,6 @@ void bpf_put_buffers(void)
+ if (WARN_ON_ONCE(this_cpu_read(bpf_bprintf_nest_level) == 0))
+ return;
+ this_cpu_dec(bpf_bprintf_nest_level);
+- preempt_enable();
+ }
+
+ void bpf_bprintf_cleanup(struct bpf_bprintf_data *data)
+diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
+index 9fb1f957a09374..ed1457c2734092 100644
+--- a/kernel/bpf/verifier.c
++++ b/kernel/bpf/verifier.c
+@@ -1946,9 +1946,24 @@ static int maybe_exit_scc(struct bpf_verifier_env *env, struct bpf_verifier_stat
+ return 0;
+ visit = scc_visit_lookup(env, callchain);
+ if (!visit) {
+- verifier_bug(env, "scc exit: no visit info for call chain %s",
+- format_callchain(env, callchain));
+- return -EFAULT;
++ /*
++ * If path traversal stops inside an SCC, corresponding bpf_scc_visit
++ * must exist for non-speculative paths. For non-speculative paths
++ * traversal stops when:
++ * a. Verification error is found, maybe_exit_scc() is not called.
++ * b. Top level BPF_EXIT is reached. Top level BPF_EXIT is not a member
++ * of any SCC.
++ * c. A checkpoint is reached and matched. Checkpoints are created by
++ * is_state_visited(), which calls maybe_enter_scc(), which allocates
++ * bpf_scc_visit instances for checkpoints within SCCs.
++ * (c) is the only case that can reach this point.
++ */
++ if (!st->speculative) {
++ verifier_bug(env, "scc exit: no visit info for call chain %s",
++ format_callchain(env, callchain));
++ return -EFAULT;
++ }
++ return 0;
+ }
+ if (visit->entry_state != st)
+ return 0;
+@@ -15577,7 +15592,8 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
+ }
+
+ /* check dest operand */
+- if (opcode == BPF_NEG) {
++ if (opcode == BPF_NEG &&
++ regs[insn->dst_reg].type == SCALAR_VALUE) {
+ err = check_reg_arg(env, insn->dst_reg, DST_OP_NO_MARK);
+ err = err ?: adjust_scalar_min_max_vals(env, insn,
+ ®s[insn->dst_reg],
+@@ -15739,7 +15755,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
+ } else { /* all other ALU ops: and, sub, xor, add, ... */
+
+ if (BPF_SRC(insn->code) == BPF_X) {
+- if (insn->imm != 0 || insn->off > 1 ||
++ if (insn->imm != 0 || (insn->off != 0 && insn->off != 1) ||
+ (insn->off == 1 && opcode != BPF_MOD && opcode != BPF_DIV)) {
+ verbose(env, "BPF_ALU uses reserved fields\n");
+ return -EINVAL;
+@@ -15749,7 +15765,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
+ if (err)
+ return err;
+ } else {
+- if (insn->src_reg != BPF_REG_0 || insn->off > 1 ||
++ if (insn->src_reg != BPF_REG_0 || (insn->off != 0 && insn->off != 1) ||
+ (insn->off == 1 && opcode != BPF_MOD && opcode != BPF_DIV)) {
+ verbose(env, "BPF_ALU uses reserved fields\n");
+ return -EINVAL;
+diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
+index 27adb04df675d4..fef93032fe7e4d 100644
+--- a/kernel/cgroup/cpuset.c
++++ b/kernel/cgroup/cpuset.c
+@@ -1716,6 +1716,7 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
+ xcpus = tmp->delmask;
+ if (compute_effective_exclusive_cpumask(cs, xcpus, NULL))
+ WARN_ON_ONCE(!cpumask_empty(cs->exclusive_cpus));
++ new_prs = (cmd == partcmd_enable) ? PRS_ROOT : PRS_ISOLATED;
+
+ /*
+ * Enabling partition root is not allowed if its
+@@ -1748,7 +1749,6 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
+
+ deleting = true;
+ subparts_delta++;
+- new_prs = (cmd == partcmd_enable) ? PRS_ROOT : PRS_ISOLATED;
+ } else if (cmd == partcmd_disable) {
+ /*
+ * May need to add cpus back to parent's effective_cpus
+diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
+index 7ca1940607bd83..4b97d16f731c1a 100644
+--- a/kernel/events/uprobes.c
++++ b/kernel/events/uprobes.c
+@@ -121,7 +121,7 @@ struct xol_area {
+
+ static void uprobe_warn(struct task_struct *t, const char *msg)
+ {
+- pr_warn("uprobe: %s:%d failed to %s\n", current->comm, current->pid, msg);
++ pr_warn("uprobe: %s:%d failed to %s\n", t->comm, t->pid, msg);
+ }
+
+ /*
+diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
+index 1da5e9d9da7193..a75df2bb9db66f 100644
+--- a/kernel/irq/Kconfig
++++ b/kernel/irq/Kconfig
+@@ -147,7 +147,9 @@ config GENERIC_IRQ_KEXEC_CLEAR_VM_FORWARD
+ config IRQ_KUNIT_TEST
+ bool "KUnit tests for IRQ management APIs" if !KUNIT_ALL_TESTS
+ depends on KUNIT=y
++ depends on SPARSE_IRQ
+ default KUNIT_ALL_TESTS
++ select IRQ_DOMAIN
+ imply SMP
+ help
+ This option enables KUnit tests for the IRQ subsystem API. These are
+diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
+index 0d0276378c707c..3ffa0d80ddd19c 100644
+--- a/kernel/irq/chip.c
++++ b/kernel/irq/chip.c
+@@ -1259,6 +1259,43 @@ int irq_chip_get_parent_state(struct irq_data *data,
+ }
+ EXPORT_SYMBOL_GPL(irq_chip_get_parent_state);
+
++/**
++ * irq_chip_shutdown_parent - Shutdown the parent interrupt
++ * @data: Pointer to interrupt specific data
++ *
++ * Invokes the irq_shutdown() callback of the parent if available or falls
++ * back to irq_chip_disable_parent().
++ */
++void irq_chip_shutdown_parent(struct irq_data *data)
++{
++ struct irq_data *parent = data->parent_data;
++
++ if (parent->chip->irq_shutdown)
++ parent->chip->irq_shutdown(parent);
++ else
++ irq_chip_disable_parent(data);
++}
++EXPORT_SYMBOL_GPL(irq_chip_shutdown_parent);
++
++/**
++ * irq_chip_startup_parent - Startup the parent interrupt
++ * @data: Pointer to interrupt specific data
++ *
++ * Invokes the irq_startup() callback of the parent if available or falls
++ * back to irq_chip_enable_parent().
++ */
++unsigned int irq_chip_startup_parent(struct irq_data *data)
++{
++ struct irq_data *parent = data->parent_data;
++
++ if (parent->chip->irq_startup)
++ return parent->chip->irq_startup(parent);
++
++ irq_chip_enable_parent(data);
++ return 0;
++}
++EXPORT_SYMBOL_GPL(irq_chip_startup_parent);
++
+ /**
+ * irq_chip_enable_parent - Enable the parent interrupt (defaults to unmask if
+ * NULL)
+diff --git a/kernel/irq/irq_test.c b/kernel/irq/irq_test.c
+index a75abebed7f248..f71f46fdcfd5e6 100644
+--- a/kernel/irq/irq_test.c
++++ b/kernel/irq/irq_test.c
+@@ -54,6 +54,9 @@ static void irq_disable_depth_test(struct kunit *test)
+ desc = irq_to_desc(virq);
+ KUNIT_ASSERT_PTR_NE(test, desc, NULL);
+
++ /* On some architectures, IRQs are NOREQUEST | NOPROBE by default. */
++ irq_settings_clr_norequest(desc);
++
+ ret = request_irq(virq, noop_handler, 0, "test_irq", NULL);
+ KUNIT_EXPECT_EQ(test, ret, 0);
+
+@@ -81,6 +84,9 @@ static void irq_free_disabled_test(struct kunit *test)
+ desc = irq_to_desc(virq);
+ KUNIT_ASSERT_PTR_NE(test, desc, NULL);
+
++ /* On some architectures, IRQs are NOREQUEST | NOPROBE by default. */
++ irq_settings_clr_norequest(desc);
++
+ ret = request_irq(virq, noop_handler, 0, "test_irq", NULL);
+ KUNIT_EXPECT_EQ(test, ret, 0);
+
+@@ -120,6 +126,9 @@ static void irq_shutdown_depth_test(struct kunit *test)
+ desc = irq_to_desc(virq);
+ KUNIT_ASSERT_PTR_NE(test, desc, NULL);
+
++ /* On some architectures, IRQs are NOREQUEST | NOPROBE by default. */
++ irq_settings_clr_norequest(desc);
++
+ data = irq_desc_get_irq_data(desc);
+ KUNIT_ASSERT_PTR_NE(test, data, NULL);
+
+@@ -169,6 +178,8 @@ static void irq_cpuhotplug_test(struct kunit *test)
+ kunit_skip(test, "requires more than 1 CPU for CPU hotplug");
+ if (!cpu_is_hotpluggable(1))
+ kunit_skip(test, "CPU 1 must be hotpluggable");
++ if (!cpu_online(1))
++ kunit_skip(test, "CPU 1 must be online");
+
+ cpumask_copy(&affinity.mask, cpumask_of(1));
+
+@@ -180,6 +191,9 @@ static void irq_cpuhotplug_test(struct kunit *test)
+ desc = irq_to_desc(virq);
+ KUNIT_ASSERT_PTR_NE(test, desc, NULL);
+
++ /* On some architectures, IRQs are NOREQUEST | NOPROBE by default. */
++ irq_settings_clr_norequest(desc);
++
+ data = irq_desc_get_irq_data(desc);
+ KUNIT_ASSERT_PTR_NE(test, data, NULL);
+
+@@ -196,13 +210,9 @@ static void irq_cpuhotplug_test(struct kunit *test)
+ KUNIT_EXPECT_EQ(test, desc->depth, 1);
+
+ KUNIT_EXPECT_EQ(test, remove_cpu(1), 0);
+- KUNIT_EXPECT_FALSE(test, irqd_is_activated(data));
+- KUNIT_EXPECT_FALSE(test, irqd_is_started(data));
+ KUNIT_EXPECT_GE(test, desc->depth, 1);
+ KUNIT_EXPECT_EQ(test, add_cpu(1), 0);
+
+- KUNIT_EXPECT_FALSE(test, irqd_is_activated(data));
+- KUNIT_EXPECT_FALSE(test, irqd_is_started(data));
+ KUNIT_EXPECT_EQ(test, desc->depth, 1);
+
+ enable_irq(virq);
+diff --git a/kernel/pid.c b/kernel/pid.c
+index c45a28c16cd256..d94ce025050127 100644
+--- a/kernel/pid.c
++++ b/kernel/pid.c
+@@ -680,7 +680,7 @@ static int pid_table_root_permissions(struct ctl_table_header *head,
+ container_of(head->set, struct pid_namespace, set);
+ int mode = table->mode;
+
+- if (ns_capable(pidns->user_ns, CAP_SYS_ADMIN) ||
++ if (ns_capable_noaudit(pidns->user_ns, CAP_SYS_ADMIN) ||
+ uid_eq(current_euid(), make_kuid(pidns->user_ns, 0)))
+ mode = (mode & S_IRWXU) >> 6;
+ else if (in_egroup_p(make_kgid(pidns->user_ns, 0)))
+diff --git a/kernel/rcu/srcutiny.c b/kernel/rcu/srcutiny.c
+index 6e9fe2ce1075d5..e3b64a5e0ec7e1 100644
+--- a/kernel/rcu/srcutiny.c
++++ b/kernel/rcu/srcutiny.c
+@@ -176,10 +176,9 @@ static void srcu_gp_start_if_needed(struct srcu_struct *ssp)
+ {
+ unsigned long cookie;
+
+- preempt_disable(); // Needed for PREEMPT_LAZY
++ lockdep_assert_preemption_disabled(); // Needed for PREEMPT_LAZY
+ cookie = get_state_synchronize_srcu(ssp);
+ if (ULONG_CMP_GE(READ_ONCE(ssp->srcu_idx_max), cookie)) {
+- preempt_enable();
+ return;
+ }
+ WRITE_ONCE(ssp->srcu_idx_max, cookie);
+@@ -189,7 +188,6 @@ static void srcu_gp_start_if_needed(struct srcu_struct *ssp)
+ else if (list_empty(&ssp->srcu_work.entry))
+ list_add(&ssp->srcu_work.entry, &srcu_boot_list);
+ }
+- preempt_enable();
+ }
+
+ /*
+diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
+index 6e2f54169e66c0..36d4f9f063516f 100644
+--- a/kernel/sched/topology.c
++++ b/kernel/sched/topology.c
+@@ -1591,7 +1591,6 @@ static void claim_allocations(int cpu, struct sched_domain *sd)
+ enum numa_topology_type sched_numa_topology_type;
+
+ static int sched_domains_numa_levels;
+-static int sched_domains_curr_level;
+
+ int sched_max_numa_distance;
+ static int *sched_domains_numa_distance;
+@@ -1632,14 +1631,7 @@ sd_init(struct sched_domain_topology_level *tl,
+ int sd_id, sd_weight, sd_flags = 0;
+ struct cpumask *sd_span;
+
+-#ifdef CONFIG_NUMA
+- /*
+- * Ugly hack to pass state to sd_numa_mask()...
+- */
+- sched_domains_curr_level = tl->numa_level;
+-#endif
+-
+- sd_weight = cpumask_weight(tl->mask(cpu));
++ sd_weight = cpumask_weight(tl->mask(tl, cpu));
+
+ if (tl->sd_flags)
+ sd_flags = (*tl->sd_flags)();
+@@ -1677,7 +1669,7 @@ sd_init(struct sched_domain_topology_level *tl,
+ };
+
+ sd_span = sched_domain_span(sd);
+- cpumask_and(sd_span, cpu_map, tl->mask(cpu));
++ cpumask_and(sd_span, cpu_map, tl->mask(tl, cpu));
+ sd_id = cpumask_first(sd_span);
+
+ sd->flags |= asym_cpu_capacity_classify(sd_span, cpu_map);
+@@ -1737,17 +1729,17 @@ sd_init(struct sched_domain_topology_level *tl,
+ */
+ static struct sched_domain_topology_level default_topology[] = {
+ #ifdef CONFIG_SCHED_SMT
+- SDTL_INIT(cpu_smt_mask, cpu_smt_flags, SMT),
++ SDTL_INIT(tl_smt_mask, cpu_smt_flags, SMT),
+ #endif
+
+ #ifdef CONFIG_SCHED_CLUSTER
+- SDTL_INIT(cpu_clustergroup_mask, cpu_cluster_flags, CLS),
++ SDTL_INIT(tl_cls_mask, cpu_cluster_flags, CLS),
+ #endif
+
+ #ifdef CONFIG_SCHED_MC
+- SDTL_INIT(cpu_coregroup_mask, cpu_core_flags, MC),
++ SDTL_INIT(tl_mc_mask, cpu_core_flags, MC),
+ #endif
+- SDTL_INIT(cpu_cpu_mask, NULL, PKG),
++ SDTL_INIT(tl_pkg_mask, NULL, PKG),
+ { NULL, },
+ };
+
+@@ -1769,9 +1761,9 @@ void __init set_sched_topology(struct sched_domain_topology_level *tl)
+
+ #ifdef CONFIG_NUMA
+
+-static const struct cpumask *sd_numa_mask(int cpu)
++static const struct cpumask *sd_numa_mask(struct sched_domain_topology_level *tl, int cpu)
+ {
+- return sched_domains_numa_masks[sched_domains_curr_level][cpu_to_node(cpu)];
++ return sched_domains_numa_masks[tl->numa_level][cpu_to_node(cpu)];
+ }
+
+ static void sched_numa_warn(const char *str)
+@@ -2413,7 +2405,7 @@ static bool topology_span_sane(const struct cpumask *cpu_map)
+ * breaks the linking done for an earlier span.
+ */
+ for_each_cpu(cpu, cpu_map) {
+- const struct cpumask *tl_cpu_mask = tl->mask(cpu);
++ const struct cpumask *tl_cpu_mask = tl->mask(tl, cpu);
+ int id;
+
+ /* lowest bit set in this mask is used as a unique id */
+@@ -2421,7 +2413,7 @@ static bool topology_span_sane(const struct cpumask *cpu_map)
+
+ if (cpumask_test_cpu(id, id_seen)) {
+ /* First CPU has already been seen, ensure identical spans */
+- if (!cpumask_equal(tl->mask(id), tl_cpu_mask))
++ if (!cpumask_equal(tl->mask(tl, id), tl_cpu_mask))
+ return false;
+ } else {
+ /* First CPU hasn't been seen before, ensure it's a completely new span */
+diff --git a/kernel/seccomp.c b/kernel/seccomp.c
+index 41aa761c7738ce..3bbfba30a777a1 100644
+--- a/kernel/seccomp.c
++++ b/kernel/seccomp.c
+@@ -1139,7 +1139,7 @@ static void seccomp_handle_addfd(struct seccomp_kaddfd *addfd, struct seccomp_kn
+ static bool should_sleep_killable(struct seccomp_filter *match,
+ struct seccomp_knotif *n)
+ {
+- return match->wait_killable_recv && n->state == SECCOMP_NOTIFY_SENT;
++ return match->wait_killable_recv && n->state >= SECCOMP_NOTIFY_SENT;
+ }
+
+ static int seccomp_do_user_notification(int this_syscall,
+@@ -1186,13 +1186,11 @@ static int seccomp_do_user_notification(int this_syscall,
+
+ if (err != 0) {
+ /*
+- * Check to see if the notifcation got picked up and
+- * whether we should switch to wait killable.
++ * Check to see whether we should switch to wait
++ * killable. Only return the interrupted error if not.
+ */
+- if (!wait_killable && should_sleep_killable(match, &n))
+- continue;
+-
+- goto interrupted;
++ if (!(!wait_killable && should_sleep_killable(match, &n)))
++ goto interrupted;
+ }
+
+ addfd = list_first_entry_or_null(&n.addfd,
+diff --git a/kernel/smp.c b/kernel/smp.c
+index 56f83aa58ec82f..02f52291fae425 100644
+--- a/kernel/smp.c
++++ b/kernel/smp.c
+@@ -884,16 +884,15 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
+ * @mask: The set of cpus to run on (only runs on online subset).
+ * @func: The function to run. This must be fast and non-blocking.
+ * @info: An arbitrary pointer to pass to the function.
+- * @wait: Bitmask that controls the operation. If %SCF_WAIT is set, wait
+- * (atomically) until function has completed on other CPUs. If
+- * %SCF_RUN_LOCAL is set, the function will also be run locally
+- * if the local CPU is set in the @cpumask.
+- *
+- * If @wait is true, then returns once @func has returned.
++ * @wait: If true, wait (atomically) until function has completed
++ * on other CPUs.
+ *
+ * You must not call this function with disabled interrupts or from a
+ * hardware interrupt handler or from a bottom half handler. Preemption
+ * must be disabled when calling this function.
++ *
++ * @func is not called on the local CPU even if @mask contains it. Consider
++ * using on_each_cpu_cond_mask() instead if this is not desirable.
+ */
+ void smp_call_function_many(const struct cpumask *mask,
+ smp_call_func_t func, void *info, bool wait)
+diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
+index f3e831f62906f1..a59bc75ab7c5b4 100644
+--- a/kernel/time/clockevents.c
++++ b/kernel/time/clockevents.c
+@@ -633,7 +633,7 @@ void tick_offline_cpu(unsigned int cpu)
+ raw_spin_lock(&clockevents_lock);
+
+ tick_broadcast_offline(cpu);
+- tick_shutdown(cpu);
++ tick_shutdown();
+
+ /*
+ * Unregister the clock event devices which were
+diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
+index 9a3859443c042c..7e33d3f2e889b1 100644
+--- a/kernel/time/tick-common.c
++++ b/kernel/time/tick-common.c
+@@ -411,24 +411,18 @@ int tick_cpu_dying(unsigned int dying_cpu)
+ }
+
+ /*
+- * Shutdown an event device on a given cpu:
++ * Shutdown an event device on the outgoing CPU:
+ *
+- * This is called on a life CPU, when a CPU is dead. So we cannot
+- * access the hardware device itself.
+- * We just set the mode and remove it from the lists.
++ * Called by the dying CPU during teardown, with clockevents_lock held
++ * and interrupts disabled.
+ */
+-void tick_shutdown(unsigned int cpu)
++void tick_shutdown(void)
+ {
+- struct tick_device *td = &per_cpu(tick_cpu_device, cpu);
++ struct tick_device *td = this_cpu_ptr(&tick_cpu_device);
+ struct clock_event_device *dev = td->evtdev;
+
+ td->mode = TICKDEV_MODE_PERIODIC;
+ if (dev) {
+- /*
+- * Prevent that the clock events layer tries to call
+- * the set mode function!
+- */
+- clockevent_set_state(dev, CLOCK_EVT_STATE_DETACHED);
+ clockevents_exchange_device(dev, NULL);
+ dev->event_handler = clockevents_handle_noop;
+ td->evtdev = NULL;
+diff --git a/kernel/time/tick-internal.h b/kernel/time/tick-internal.h
+index faac36de35b9ef..4e4f7bbe2a64bc 100644
+--- a/kernel/time/tick-internal.h
++++ b/kernel/time/tick-internal.h
+@@ -26,7 +26,7 @@ extern void tick_setup_periodic(struct clock_event_device *dev, int broadcast);
+ extern void tick_handle_periodic(struct clock_event_device *dev);
+ extern void tick_check_new_device(struct clock_event_device *dev);
+ extern void tick_offline_cpu(unsigned int cpu);
+-extern void tick_shutdown(unsigned int cpu);
++extern void tick_shutdown(void);
+ extern void tick_suspend(void);
+ extern void tick_resume(void);
+ extern bool tick_check_replacement(struct clock_event_device *curdev,
+diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
+index 3ae52978cae61a..606007c387c52f 100644
+--- a/kernel/trace/bpf_trace.c
++++ b/kernel/trace/bpf_trace.c
+@@ -2728,20 +2728,25 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
+ struct pt_regs *regs;
+ int err;
+
++ /*
++ * graph tracer framework ensures we won't migrate, so there is no need
++ * to use migrate_disable for bpf_prog_run again. The check here is just for
++ * __this_cpu_inc_return.
++ */
++ cant_sleep();
++
+ if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
+ bpf_prog_inc_misses_counter(link->link.prog);
+ err = 1;
+ goto out;
+ }
+
+- migrate_disable();
+ rcu_read_lock();
+ regs = ftrace_partial_regs(fregs, bpf_kprobe_multi_pt_regs_ptr());
+ old_run_ctx = bpf_set_run_ctx(&run_ctx.session_ctx.run_ctx);
+ err = bpf_prog_run(link->link.prog, regs);
+ bpf_reset_run_ctx(old_run_ctx);
+ rcu_read_unlock();
+- migrate_enable();
+
+ out:
+ __this_cpu_dec(bpf_prog_active);
+diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
+index b3c94fbaf002ff..eb256378e65bae 100644
+--- a/kernel/trace/trace.c
++++ b/kernel/trace/trace.c
+@@ -4791,12 +4791,6 @@ int tracing_single_release_file_tr(struct inode *inode, struct file *filp)
+ return single_release(inode, filp);
+ }
+
+-static int tracing_mark_open(struct inode *inode, struct file *filp)
+-{
+- stream_open(inode, filp);
+- return tracing_open_generic_tr(inode, filp);
+-}
+-
+ static int tracing_release(struct inode *inode, struct file *file)
+ {
+ struct trace_array *tr = inode->i_private;
+@@ -7163,7 +7157,7 @@ tracing_free_buffer_release(struct inode *inode, struct file *filp)
+
+ #define TRACE_MARKER_MAX_SIZE 4096
+
+-static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user *ubuf,
++static ssize_t write_marker_to_buffer(struct trace_array *tr, const char *buf,
+ size_t cnt, unsigned long ip)
+ {
+ struct ring_buffer_event *event;
+@@ -7173,20 +7167,11 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
+ int meta_size;
+ ssize_t written;
+ size_t size;
+- int len;
+-
+-/* Used in tracing_mark_raw_write() as well */
+-#define FAULTED_STR "<faulted>"
+-#define FAULTED_SIZE (sizeof(FAULTED_STR) - 1) /* '\0' is already accounted for */
+
+ meta_size = sizeof(*entry) + 2; /* add '\0' and possible '\n' */
+ again:
+ size = cnt + meta_size;
+
+- /* If less than "<faulted>", then make sure we can still add that */
+- if (cnt < FAULTED_SIZE)
+- size += FAULTED_SIZE - cnt;
+-
+ buffer = tr->array_buffer.buffer;
+ event = __trace_buffer_lock_reserve(buffer, TRACE_PRINT, size,
+ tracing_gen_ctx());
+@@ -7196,9 +7181,6 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
+ * make it smaller and try again.
+ */
+ if (size > ring_buffer_max_event_size(buffer)) {
+- /* cnt < FAULTED size should never be bigger than max */
+- if (WARN_ON_ONCE(cnt < FAULTED_SIZE))
+- return -EBADF;
+ cnt = ring_buffer_max_event_size(buffer) - meta_size;
+ /* The above should only happen once */
+ if (WARN_ON_ONCE(cnt + meta_size == size))
+@@ -7212,14 +7194,8 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
+
+ entry = ring_buffer_event_data(event);
+ entry->ip = ip;
+-
+- len = copy_from_user_nofault(&entry->buf, ubuf, cnt);
+- if (len) {
+- memcpy(&entry->buf, FAULTED_STR, FAULTED_SIZE);
+- cnt = FAULTED_SIZE;
+- written = -EFAULT;
+- } else
+- written = cnt;
++ memcpy(&entry->buf, buf, cnt);
++ written = cnt;
+
+ if (tr->trace_marker_file && !list_empty(&tr->trace_marker_file->triggers)) {
+ /* do not add \n before testing triggers, but add \0 */
+@@ -7243,6 +7219,169 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
+ return written;
+ }
+
++struct trace_user_buf {
++ char *buf;
++};
++
++struct trace_user_buf_info {
++ struct trace_user_buf __percpu *tbuf;
++ int ref;
++};
++
++
++static DEFINE_MUTEX(trace_user_buffer_mutex);
++static struct trace_user_buf_info *trace_user_buffer;
++
++static void trace_user_fault_buffer_free(struct trace_user_buf_info *tinfo)
++{
++ char *buf;
++ int cpu;
++
++ for_each_possible_cpu(cpu) {
++ buf = per_cpu_ptr(tinfo->tbuf, cpu)->buf;
++ kfree(buf);
++ }
++ free_percpu(tinfo->tbuf);
++ kfree(tinfo);
++}
++
++static int trace_user_fault_buffer_enable(void)
++{
++ struct trace_user_buf_info *tinfo;
++ char *buf;
++ int cpu;
++
++ guard(mutex)(&trace_user_buffer_mutex);
++
++ if (trace_user_buffer) {
++ trace_user_buffer->ref++;
++ return 0;
++ }
++
++ tinfo = kmalloc(sizeof(*tinfo), GFP_KERNEL);
++ if (!tinfo)
++ return -ENOMEM;
++
++ tinfo->tbuf = alloc_percpu(struct trace_user_buf);
++ if (!tinfo->tbuf) {
++ kfree(tinfo);
++ return -ENOMEM;
++ }
++
++ tinfo->ref = 1;
++
++ /* Clear each buffer in case of error */
++ for_each_possible_cpu(cpu) {
++ per_cpu_ptr(tinfo->tbuf, cpu)->buf = NULL;
++ }
++
++ for_each_possible_cpu(cpu) {
++ buf = kmalloc_node(TRACE_MARKER_MAX_SIZE, GFP_KERNEL,
++ cpu_to_node(cpu));
++ if (!buf) {
++ trace_user_fault_buffer_free(tinfo);
++ return -ENOMEM;
++ }
++ per_cpu_ptr(tinfo->tbuf, cpu)->buf = buf;
++ }
++
++ trace_user_buffer = tinfo;
++
++ return 0;
++}
++
++static void trace_user_fault_buffer_disable(void)
++{
++ struct trace_user_buf_info *tinfo;
++
++ guard(mutex)(&trace_user_buffer_mutex);
++
++ tinfo = trace_user_buffer;
++
++ if (WARN_ON_ONCE(!tinfo))
++ return;
++
++ if (--tinfo->ref)
++ return;
++
++ trace_user_fault_buffer_free(tinfo);
++ trace_user_buffer = NULL;
++}
++
++/* Must be called with preemption disabled */
++static char *trace_user_fault_read(struct trace_user_buf_info *tinfo,
++ const char __user *ptr, size_t size,
++ size_t *read_size)
++{
++ int cpu = smp_processor_id();
++ char *buffer = per_cpu_ptr(tinfo->tbuf, cpu)->buf;
++ unsigned int cnt;
++ int trys = 0;
++ int ret;
++
++ if (size > TRACE_MARKER_MAX_SIZE)
++ size = TRACE_MARKER_MAX_SIZE;
++ *read_size = 0;
++
++ /*
++ * This acts similar to a seqcount. The per CPU context switches are
++ * recorded, migration is disabled and preemption is enabled. The
++ * read of the user space memory is copied into the per CPU buffer.
++ * Preemption is disabled again, and if the per CPU context switches count
++ * is still the same, it means the buffer has not been corrupted.
++ * If the count is different, it is assumed the buffer is corrupted
++ * and reading must be tried again.
++ */
++
++ do {
++ /*
++ * If for some reason, copy_from_user() always causes a context
++ * switch, this would then cause an infinite loop.
++ * If this task is preempted by another user space task, it
++ * will cause this task to try again. But just in case something
++ * changes where the copying from user space causes another task
++ * to run, prevent this from going into an infinite loop.
++ * 100 tries should be plenty.
++ */
++ if (WARN_ONCE(trys++ > 100, "Error: Too many tries to read user space"))
++ return NULL;
++
++ /* Read the current CPU context switch counter */
++ cnt = nr_context_switches_cpu(cpu);
++
++ /*
++ * Preemption is going to be enabled, but this task must
++ * remain on this CPU.
++ */
++ migrate_disable();
++
++ /*
++ * Now preemption is being enabled and another task can come in
++ * and use the same buffer and corrupt our data.
++ */
++ preempt_enable_notrace();
++
++ ret = __copy_from_user(buffer, ptr, size);
++
++ preempt_disable_notrace();
++ migrate_enable();
++
++ /* if it faulted, no need to test if the buffer was corrupted */
++ if (ret)
++ return NULL;
++
++ /*
++ * Preemption is disabled again, now check the per CPU context
++ * switch counter. If it doesn't match, then another user space
++ * process may have scheduled in and corrupted our buffer. In that
++ * case the copying must be retried.
++ */
++ } while (nr_context_switches_cpu(cpu) != cnt);
++
++ *read_size = size;
++ return buffer;
++}
++
+ static ssize_t
+ tracing_mark_write(struct file *filp, const char __user *ubuf,
+ size_t cnt, loff_t *fpos)
+@@ -7250,6 +7389,8 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
+ struct trace_array *tr = filp->private_data;
+ ssize_t written = -ENODEV;
+ unsigned long ip;
++ size_t size;
++ char *buf;
+
+ if (tracing_disabled)
+ return -EINVAL;
+@@ -7263,6 +7404,16 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
+ if (cnt > TRACE_MARKER_MAX_SIZE)
+ cnt = TRACE_MARKER_MAX_SIZE;
+
++ /* Must have preemption disabled while having access to the buffer */
++ guard(preempt_notrace)();
++
++ buf = trace_user_fault_read(trace_user_buffer, ubuf, cnt, &size);
++ if (!buf)
++ return -EFAULT;
++
++ if (cnt > size)
++ cnt = size;
++
+ /* The selftests expect this function to be the IP address */
+ ip = _THIS_IP_;
+
+@@ -7270,32 +7421,28 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
+ if (tr == &global_trace) {
+ guard(rcu)();
+ list_for_each_entry_rcu(tr, &marker_copies, marker_list) {
+- written = write_marker_to_buffer(tr, ubuf, cnt, ip);
++ written = write_marker_to_buffer(tr, buf, cnt, ip);
+ if (written < 0)
+ break;
+ }
+ } else {
+- written = write_marker_to_buffer(tr, ubuf, cnt, ip);
++ written = write_marker_to_buffer(tr, buf, cnt, ip);
+ }
+
+ return written;
+ }
+
+ static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
+- const char __user *ubuf, size_t cnt)
++ const char *buf, size_t cnt)
+ {
+ struct ring_buffer_event *event;
+ struct trace_buffer *buffer;
+ struct raw_data_entry *entry;
+ ssize_t written;
+- int size;
+- int len;
+-
+-#define FAULT_SIZE_ID (FAULTED_SIZE + sizeof(int))
++ size_t size;
+
+- size = sizeof(*entry) + cnt;
+- if (cnt < FAULT_SIZE_ID)
+- size += FAULT_SIZE_ID - cnt;
++ /* cnt includes both the entry->id and the data behind it. */
++ size = struct_size(entry, buf, cnt - sizeof(entry->id));
+
+ buffer = tr->array_buffer.buffer;
+
+@@ -7309,14 +7456,11 @@ static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
+ return -EBADF;
+
+ entry = ring_buffer_event_data(event);
+-
+- len = copy_from_user_nofault(&entry->id, ubuf, cnt);
+- if (len) {
+- entry->id = -1;
+- memcpy(&entry->buf, FAULTED_STR, FAULTED_SIZE);
+- written = -EFAULT;
+- } else
+- written = cnt;
++ unsafe_memcpy(&entry->id, buf, cnt,
++ "id and content already reserved on ring buffer"
++ "'buf' includes the 'id' and the data."
++ "'entry' was allocated with cnt from 'id'.");
++ written = cnt;
+
+ __buffer_unlock_commit(buffer, event);
+
+@@ -7329,8 +7473,8 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
+ {
+ struct trace_array *tr = filp->private_data;
+ ssize_t written = -ENODEV;
+-
+-#define FAULT_SIZE_ID (FAULTED_SIZE + sizeof(int))
++ size_t size;
++ char *buf;
+
+ if (tracing_disabled)
+ return -EINVAL;
+@@ -7342,21 +7486,53 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
+ if (cnt < sizeof(unsigned int))
+ return -EINVAL;
+
++ /* Must have preemption disabled while having access to the buffer */
++ guard(preempt_notrace)();
++
++ buf = trace_user_fault_read(trace_user_buffer, ubuf, cnt, &size);
++ if (!buf)
++ return -EFAULT;
++
++ /* raw write is all or nothing */
++ if (cnt > size)
++ return -EINVAL;
++
+ /* The global trace_marker_raw can go to multiple instances */
+ if (tr == &global_trace) {
+ guard(rcu)();
+ list_for_each_entry_rcu(tr, &marker_copies, marker_list) {
+- written = write_raw_marker_to_buffer(tr, ubuf, cnt);
++ written = write_raw_marker_to_buffer(tr, buf, cnt);
+ if (written < 0)
+ break;
+ }
+ } else {
+- written = write_raw_marker_to_buffer(tr, ubuf, cnt);
++ written = write_raw_marker_to_buffer(tr, buf, cnt);
+ }
+
+ return written;
+ }
+
++static int tracing_mark_open(struct inode *inode, struct file *filp)
++{
++ int ret;
++
++ ret = trace_user_fault_buffer_enable();
++ if (ret < 0)
++ return ret;
++
++ stream_open(inode, filp);
++ ret = tracing_open_generic_tr(inode, filp);
++ if (ret < 0)
++ trace_user_fault_buffer_disable();
++ return ret;
++}
++
++static int tracing_mark_release(struct inode *inode, struct file *file)
++{
++ trace_user_fault_buffer_disable();
++ return tracing_release_generic_tr(inode, file);
++}
++
+ static int tracing_clock_show(struct seq_file *m, void *v)
+ {
+ struct trace_array *tr = m->private;
+@@ -7764,13 +7940,13 @@ static const struct file_operations tracing_free_buffer_fops = {
+ static const struct file_operations tracing_mark_fops = {
+ .open = tracing_mark_open,
+ .write = tracing_mark_write,
+- .release = tracing_release_generic_tr,
++ .release = tracing_mark_release,
+ };
+
+ static const struct file_operations tracing_mark_raw_fops = {
+ .open = tracing_mark_open,
+ .write = tracing_mark_raw_write,
+- .release = tracing_release_generic_tr,
++ .release = tracing_mark_release,
+ };
+
+ static const struct file_operations trace_clock_fops = {
+diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
+index 9f3e9537417d55..e00da4182deb78 100644
+--- a/kernel/trace/trace_events.c
++++ b/kernel/trace/trace_events.c
+@@ -1629,11 +1629,10 @@ static void *s_start(struct seq_file *m, loff_t *pos)
+ loff_t l;
+
+ iter = kzalloc(sizeof(*iter), GFP_KERNEL);
++ mutex_lock(&event_mutex);
+ if (!iter)
+ return NULL;
+
+- mutex_lock(&event_mutex);
+-
+ iter->type = SET_EVENT_FILE;
+ iter->file = list_entry(&tr->events, struct trace_event_file, list);
+
+diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
+index b36ade43d4b3bf..ad9d6347b5fa03 100644
+--- a/kernel/trace/trace_fprobe.c
++++ b/kernel/trace/trace_fprobe.c
+@@ -522,13 +522,14 @@ static int fentry_dispatcher(struct fprobe *fp, unsigned long entry_ip,
+ void *entry_data)
+ {
+ struct trace_fprobe *tf = container_of(fp, struct trace_fprobe, fp);
++ unsigned int flags = trace_probe_load_flag(&tf->tp);
+ int ret = 0;
+
+- if (trace_probe_test_flag(&tf->tp, TP_FLAG_TRACE))
++ if (flags & TP_FLAG_TRACE)
+ fentry_trace_func(tf, entry_ip, fregs);
+
+ #ifdef CONFIG_PERF_EVENTS
+- if (trace_probe_test_flag(&tf->tp, TP_FLAG_PROFILE))
++ if (flags & TP_FLAG_PROFILE)
+ ret = fentry_perf_func(tf, entry_ip, fregs);
+ #endif
+ return ret;
+@@ -540,11 +541,12 @@ static void fexit_dispatcher(struct fprobe *fp, unsigned long entry_ip,
+ void *entry_data)
+ {
+ struct trace_fprobe *tf = container_of(fp, struct trace_fprobe, fp);
++ unsigned int flags = trace_probe_load_flag(&tf->tp);
+
+- if (trace_probe_test_flag(&tf->tp, TP_FLAG_TRACE))
++ if (flags & TP_FLAG_TRACE)
+ fexit_trace_func(tf, entry_ip, ret_ip, fregs, entry_data);
+ #ifdef CONFIG_PERF_EVENTS
+- if (trace_probe_test_flag(&tf->tp, TP_FLAG_PROFILE))
++ if (flags & TP_FLAG_PROFILE)
+ fexit_perf_func(tf, entry_ip, ret_ip, fregs, entry_data);
+ #endif
+ }
+diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c
+index 5496758b6c760f..4c45c49b06c8d7 100644
+--- a/kernel/trace/trace_irqsoff.c
++++ b/kernel/trace/trace_irqsoff.c
+@@ -184,7 +184,7 @@ static int irqsoff_graph_entry(struct ftrace_graph_ent *trace,
+ unsigned long flags;
+ unsigned int trace_ctx;
+ u64 *calltime;
+- int ret;
++ int ret = 0;
+
+ if (ftrace_graph_ignore_func(gops, trace))
+ return 0;
+@@ -202,13 +202,11 @@ static int irqsoff_graph_entry(struct ftrace_graph_ent *trace,
+ return 0;
+
+ calltime = fgraph_reserve_data(gops->idx, sizeof(*calltime));
+- if (!calltime)
+- return 0;
+-
+- *calltime = trace_clock_local();
+-
+- trace_ctx = tracing_gen_ctx_flags(flags);
+- ret = __trace_graph_entry(tr, trace, trace_ctx);
++ if (calltime) {
++ *calltime = trace_clock_local();
++ trace_ctx = tracing_gen_ctx_flags(flags);
++ ret = __trace_graph_entry(tr, trace, trace_ctx);
++ }
+ local_dec(&data->disabled);
+
+ return ret;
+@@ -233,11 +231,10 @@ static void irqsoff_graph_return(struct ftrace_graph_ret *trace,
+
+ rettime = trace_clock_local();
+ calltime = fgraph_retrieve_data(gops->idx, &size);
+- if (!calltime)
+- return;
+-
+- trace_ctx = tracing_gen_ctx_flags(flags);
+- __trace_graph_return(tr, trace, trace_ctx, *calltime, rettime);
++ if (calltime) {
++ trace_ctx = tracing_gen_ctx_flags(flags);
++ __trace_graph_return(tr, trace, trace_ctx, *calltime, rettime);
++ }
+ local_dec(&data->disabled);
+ }
+
+diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
+index fa60362a3f31bd..ee8171b19bee20 100644
+--- a/kernel/trace/trace_kprobe.c
++++ b/kernel/trace/trace_kprobe.c
+@@ -1815,14 +1815,15 @@ static int kprobe_register(struct trace_event_call *event,
+ static int kprobe_dispatcher(struct kprobe *kp, struct pt_regs *regs)
+ {
+ struct trace_kprobe *tk = container_of(kp, struct trace_kprobe, rp.kp);
++ unsigned int flags = trace_probe_load_flag(&tk->tp);
+ int ret = 0;
+
+ raw_cpu_inc(*tk->nhit);
+
+- if (trace_probe_test_flag(&tk->tp, TP_FLAG_TRACE))
++ if (flags & TP_FLAG_TRACE)
+ kprobe_trace_func(tk, regs);
+ #ifdef CONFIG_PERF_EVENTS
+- if (trace_probe_test_flag(&tk->tp, TP_FLAG_PROFILE))
++ if (flags & TP_FLAG_PROFILE)
+ ret = kprobe_perf_func(tk, regs);
+ #endif
+ return ret;
+@@ -1834,6 +1835,7 @@ kretprobe_dispatcher(struct kretprobe_instance *ri, struct pt_regs *regs)
+ {
+ struct kretprobe *rp = get_kretprobe(ri);
+ struct trace_kprobe *tk;
++ unsigned int flags;
+
+ /*
+ * There is a small chance that get_kretprobe(ri) returns NULL when
+@@ -1846,10 +1848,11 @@ kretprobe_dispatcher(struct kretprobe_instance *ri, struct pt_regs *regs)
+ tk = container_of(rp, struct trace_kprobe, rp);
+ raw_cpu_inc(*tk->nhit);
+
+- if (trace_probe_test_flag(&tk->tp, TP_FLAG_TRACE))
++ flags = trace_probe_load_flag(&tk->tp);
++ if (flags & TP_FLAG_TRACE)
+ kretprobe_trace_func(tk, ri, regs);
+ #ifdef CONFIG_PERF_EVENTS
+- if (trace_probe_test_flag(&tk->tp, TP_FLAG_PROFILE))
++ if (flags & TP_FLAG_PROFILE)
+ kretprobe_perf_func(tk, ri, regs);
+ #endif
+ return 0; /* We don't tweak kernel, so just return 0 */
+diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
+index 842383fbc03b9c..08b5bda24da225 100644
+--- a/kernel/trace/trace_probe.h
++++ b/kernel/trace/trace_probe.h
+@@ -271,16 +271,21 @@ struct event_file_link {
+ struct list_head list;
+ };
+
++static inline unsigned int trace_probe_load_flag(struct trace_probe *tp)
++{
++ return smp_load_acquire(&tp->event->flags);
++}
++
+ static inline bool trace_probe_test_flag(struct trace_probe *tp,
+ unsigned int flag)
+ {
+- return !!(tp->event->flags & flag);
++ return !!(trace_probe_load_flag(tp) & flag);
+ }
+
+ static inline void trace_probe_set_flag(struct trace_probe *tp,
+ unsigned int flag)
+ {
+- tp->event->flags |= flag;
++ smp_store_release(&tp->event->flags, tp->event->flags | flag);
+ }
+
+ static inline void trace_probe_clear_flag(struct trace_probe *tp,
+diff --git a/kernel/trace/trace_sched_wakeup.c b/kernel/trace/trace_sched_wakeup.c
+index bf1cb80742aed7..e3f2e4f56faa42 100644
+--- a/kernel/trace/trace_sched_wakeup.c
++++ b/kernel/trace/trace_sched_wakeup.c
+@@ -138,12 +138,10 @@ static int wakeup_graph_entry(struct ftrace_graph_ent *trace,
+ return 0;
+
+ calltime = fgraph_reserve_data(gops->idx, sizeof(*calltime));
+- if (!calltime)
+- return 0;
+-
+- *calltime = trace_clock_local();
+-
+- ret = __trace_graph_entry(tr, trace, trace_ctx);
++ if (calltime) {
++ *calltime = trace_clock_local();
++ ret = __trace_graph_entry(tr, trace, trace_ctx);
++ }
+ local_dec(&data->disabled);
+ preempt_enable_notrace();
+
+@@ -169,12 +167,10 @@ static void wakeup_graph_return(struct ftrace_graph_ret *trace,
+ rettime = trace_clock_local();
+
+ calltime = fgraph_retrieve_data(gops->idx, &size);
+- if (!calltime)
+- return;
++ if (calltime)
++ __trace_graph_return(tr, trace, trace_ctx, *calltime, rettime);
+
+- __trace_graph_return(tr, trace, trace_ctx, *calltime, rettime);
+ local_dec(&data->disabled);
+-
+ preempt_enable_notrace();
+ return;
+ }
+diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
+index 8b0bcc0d8f41b2..430d09c49462dd 100644
+--- a/kernel/trace/trace_uprobe.c
++++ b/kernel/trace/trace_uprobe.c
+@@ -1547,6 +1547,7 @@ static int uprobe_dispatcher(struct uprobe_consumer *con, struct pt_regs *regs,
+ struct trace_uprobe *tu;
+ struct uprobe_dispatch_data udd;
+ struct uprobe_cpu_buffer *ucb = NULL;
++ unsigned int flags;
+ int ret = 0;
+
+ tu = container_of(con, struct trace_uprobe, consumer);
+@@ -1561,11 +1562,12 @@ static int uprobe_dispatcher(struct uprobe_consumer *con, struct pt_regs *regs,
+ if (WARN_ON_ONCE(!uprobe_cpu_buffer))
+ return 0;
+
+- if (trace_probe_test_flag(&tu->tp, TP_FLAG_TRACE))
++ flags = trace_probe_load_flag(&tu->tp);
++ if (flags & TP_FLAG_TRACE)
+ ret |= uprobe_trace_func(tu, regs, &ucb);
+
+ #ifdef CONFIG_PERF_EVENTS
+- if (trace_probe_test_flag(&tu->tp, TP_FLAG_PROFILE))
++ if (flags & TP_FLAG_PROFILE)
+ ret |= uprobe_perf_func(tu, regs, &ucb);
+ #endif
+ uprobe_buffer_put(ucb);
+@@ -1579,6 +1581,7 @@ static int uretprobe_dispatcher(struct uprobe_consumer *con,
+ struct trace_uprobe *tu;
+ struct uprobe_dispatch_data udd;
+ struct uprobe_cpu_buffer *ucb = NULL;
++ unsigned int flags;
+
+ tu = container_of(con, struct trace_uprobe, consumer);
+
+@@ -1590,11 +1593,12 @@ static int uretprobe_dispatcher(struct uprobe_consumer *con,
+ if (WARN_ON_ONCE(!uprobe_cpu_buffer))
+ return 0;
+
+- if (trace_probe_test_flag(&tu->tp, TP_FLAG_TRACE))
++ flags = trace_probe_load_flag(&tu->tp);
++ if (flags & TP_FLAG_TRACE)
+ uretprobe_trace_func(tu, func, regs, &ucb);
+
+ #ifdef CONFIG_PERF_EVENTS
+- if (trace_probe_test_flag(&tu->tp, TP_FLAG_PROFILE))
++ if (flags & TP_FLAG_PROFILE)
+ uretprobe_perf_func(tu, func, regs, &ucb);
+ #endif
+ uprobe_buffer_put(ucb);
+diff --git a/lib/raid6/recov_rvv.c b/lib/raid6/recov_rvv.c
+index 5d54c4b437df78..5f779719c3d34c 100644
+--- a/lib/raid6/recov_rvv.c
++++ b/lib/raid6/recov_rvv.c
+@@ -4,9 +4,7 @@
+ * Author: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
+ */
+
+-#include <asm/simd.h>
+ #include <asm/vector.h>
+-#include <crypto/internal/simd.h>
+ #include <linux/raid/pq.h>
+
+ static int rvv_has_vector(void)
+diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c
+index 7d82efa5b14f9e..b193ea176d5d33 100644
+--- a/lib/raid6/rvv.c
++++ b/lib/raid6/rvv.c
+@@ -9,11 +9,8 @@
+ * Copyright 2002-2004 H. Peter Anvin
+ */
+
+-#include <asm/simd.h>
+ #include <asm/vector.h>
+-#include <crypto/internal/simd.h>
+ #include <linux/raid/pq.h>
+-#include <linux/types.h>
+ #include "rvv.h"
+
+ #define NSIZE (riscv_v_vsize / 32) /* NSIZE = vlenb */
+diff --git a/lib/vdso/datastore.c b/lib/vdso/datastore.c
+index 3693c6caf2c4d4..a565c30c71a04f 100644
+--- a/lib/vdso/datastore.c
++++ b/lib/vdso/datastore.c
+@@ -11,14 +11,14 @@
+ /*
+ * The vDSO data page.
+ */
+-#ifdef CONFIG_HAVE_GENERIC_VDSO
++#ifdef CONFIG_GENERIC_GETTIMEOFDAY
+ static union {
+ struct vdso_time_data data;
+ u8 page[PAGE_SIZE];
+ } vdso_time_data_store __page_aligned_data;
+ struct vdso_time_data *vdso_k_time_data = &vdso_time_data_store.data;
+ static_assert(sizeof(vdso_time_data_store) == PAGE_SIZE);
+-#endif /* CONFIG_HAVE_GENERIC_VDSO */
++#endif /* CONFIG_GENERIC_GETTIMEOFDAY */
+
+ #ifdef CONFIG_VDSO_GETRANDOM
+ static union {
+@@ -46,7 +46,7 @@ static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
+
+ switch (vmf->pgoff) {
+ case VDSO_TIME_PAGE_OFFSET:
+- if (!IS_ENABLED(CONFIG_HAVE_GENERIC_VDSO))
++ if (!IS_ENABLED(CONFIG_GENERIC_GETTIMEOFDAY))
+ return VM_FAULT_SIGBUS;
+ pfn = __phys_to_pfn(__pa_symbol(vdso_k_time_data));
+ if (timens_page) {
+diff --git a/mm/hugetlb.c b/mm/hugetlb.c
+index 6cfe0b43ab8f96..8f19d0f293e090 100644
+--- a/mm/hugetlb.c
++++ b/mm/hugetlb.c
+@@ -7203,6 +7203,8 @@ long hugetlb_change_protection(struct vm_area_struct *vma,
+ psize);
+ }
+ spin_unlock(ptl);
++
++ cond_resched();
+ }
+ /*
+ * Must flush TLB before releasing i_mmap_rwsem: x86's huge_pmd_unshare
+diff --git a/mm/memcontrol.c b/mm/memcontrol.c
+index 8dd7fbed5a9427..46713b9ece0638 100644
+--- a/mm/memcontrol.c
++++ b/mm/memcontrol.c
+@@ -5024,6 +5024,19 @@ void mem_cgroup_sk_free(struct sock *sk)
+ css_put(&sk->sk_memcg->css);
+ }
+
++void mem_cgroup_sk_inherit(const struct sock *sk, struct sock *newsk)
++{
++ if (sk->sk_memcg == newsk->sk_memcg)
++ return;
++
++ mem_cgroup_sk_free(newsk);
++
++ if (sk->sk_memcg)
++ css_get(&sk->sk_memcg->css);
++
++ newsk->sk_memcg = sk->sk_memcg;
++}
++
+ /**
+ * mem_cgroup_charge_skmem - charge socket memory
+ * @memcg: memcg to charge
+diff --git a/mm/slub.c b/mm/slub.c
+index d257141896c953..264fc76455d739 100644
+--- a/mm/slub.c
++++ b/mm/slub.c
+@@ -7731,10 +7731,7 @@ static int cmp_loc_by_count(const void *a, const void *b, const void *data)
+ struct location *loc1 = (struct location *)a;
+ struct location *loc2 = (struct location *)b;
+
+- if (loc1->count > loc2->count)
+- return -1;
+- else
+- return 1;
++ return cmp_int(loc2->count, loc1->count);
+ }
+
+ static void *slab_debugfs_start(struct seq_file *seq, loff_t *ppos)
+diff --git a/net/9p/trans_usbg.c b/net/9p/trans_usbg.c
+index 6b694f117aef29..468f7e8f0277b9 100644
+--- a/net/9p/trans_usbg.c
++++ b/net/9p/trans_usbg.c
+@@ -231,6 +231,8 @@ static void usb9pfs_rx_complete(struct usb_ep *ep, struct usb_request *req)
+ struct f_usb9pfs *usb9pfs = ep->driver_data;
+ struct usb_composite_dev *cdev = usb9pfs->function.config->cdev;
+ struct p9_req_t *p9_rx_req;
++ unsigned int req_size = req->actual;
++ int status = REQ_STATUS_RCVD;
+
+ if (req->status) {
+ dev_err(&cdev->gadget->dev, "%s usb9pfs complete --> %d, %d/%d\n",
+@@ -242,11 +244,19 @@ static void usb9pfs_rx_complete(struct usb_ep *ep, struct usb_request *req)
+ if (!p9_rx_req)
+ return;
+
+- memcpy(p9_rx_req->rc.sdata, req->buf, req->actual);
++ if (req_size > p9_rx_req->rc.capacity) {
++ dev_err(&cdev->gadget->dev,
++ "%s received data size %u exceeds buffer capacity %zu\n",
++ ep->name, req_size, p9_rx_req->rc.capacity);
++ req_size = 0;
++ status = REQ_STATUS_ERROR;
++ }
++
++ memcpy(p9_rx_req->rc.sdata, req->buf, req_size);
+
+- p9_rx_req->rc.size = req->actual;
++ p9_rx_req->rc.size = req_size;
+
+- p9_client_cb(usb9pfs->client, p9_rx_req, REQ_STATUS_RCVD);
++ p9_client_cb(usb9pfs->client, p9_rx_req, status);
+ p9_req_put(usb9pfs->client, p9_rx_req);
+
+ complete(&usb9pfs->received);
+diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c
+index 7a7d4989085848..eefdb6134ca53b 100644
+--- a/net/bluetooth/hci_sync.c
++++ b/net/bluetooth/hci_sync.c
+@@ -1325,7 +1325,7 @@ int hci_setup_ext_adv_instance_sync(struct hci_dev *hdev, u8 instance)
+ {
+ struct hci_cp_le_set_ext_adv_params cp;
+ struct hci_rp_le_set_ext_adv_params rp;
+- bool connectable;
++ bool connectable, require_privacy;
+ u32 flags;
+ bdaddr_t random_addr;
+ u8 own_addr_type;
+@@ -1363,10 +1363,12 @@ int hci_setup_ext_adv_instance_sync(struct hci_dev *hdev, u8 instance)
+ return -EPERM;
+
+ /* Set require_privacy to true only when non-connectable
+- * advertising is used. In that case it is fine to use a
+- * non-resolvable private address.
++ * advertising is used and it is not periodic.
++ * In that case it is fine to use a non-resolvable private address.
+ */
+- err = hci_get_random_address(hdev, !connectable,
++ require_privacy = !connectable && !(adv && adv->periodic);
++
++ err = hci_get_random_address(hdev, require_privacy,
+ adv_use_rpa(hdev, flags), adv,
+ &own_addr_type, &random_addr);
+ if (err < 0)
+diff --git a/net/bluetooth/iso.c b/net/bluetooth/iso.c
+index 5ce823ca3aaf44..88602f19decacd 100644
+--- a/net/bluetooth/iso.c
++++ b/net/bluetooth/iso.c
+@@ -111,6 +111,8 @@ static void iso_conn_free(struct kref *ref)
+ /* Ensure no more work items will run since hci_conn has been dropped */
+ disable_delayed_work_sync(&conn->timeout_work);
+
++ kfree_skb(conn->rx_skb);
++
+ kfree(conn);
+ }
+
+@@ -750,6 +752,13 @@ static void iso_sock_kill(struct sock *sk)
+
+ BT_DBG("sk %p state %d", sk, sk->sk_state);
+
++ /* Sock is dead, so set conn->sk to NULL to avoid possible UAF */
++ if (iso_pi(sk)->conn) {
++ iso_conn_lock(iso_pi(sk)->conn);
++ iso_pi(sk)->conn->sk = NULL;
++ iso_conn_unlock(iso_pi(sk)->conn);
++ }
++
+ /* Kill poor orphan */
+ bt_sock_unlink(&iso_sk_list, sk);
+ sock_set_flag(sk, SOCK_DEAD);
+@@ -2407,7 +2416,7 @@ void iso_recv(struct hci_conn *hcon, struct sk_buff *skb, u16 flags)
+ skb_copy_from_linear_data(skb, skb_put(conn->rx_skb, skb->len),
+ skb->len);
+ conn->rx_len -= skb->len;
+- return;
++ break;
+
+ case ISO_END:
+ skb_copy_from_linear_data(skb, skb_put(conn->rx_skb, skb->len),
+diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c
+index 225140fcb3d6c8..a3d16eece0d236 100644
+--- a/net/bluetooth/mgmt.c
++++ b/net/bluetooth/mgmt.c
+@@ -4542,13 +4542,11 @@ static int read_exp_features_info(struct sock *sk, struct hci_dev *hdev,
+ return -ENOMEM;
+
+ #ifdef CONFIG_BT_FEATURE_DEBUG
+- if (!hdev) {
+- flags = bt_dbg_get() ? BIT(0) : 0;
++ flags = bt_dbg_get() ? BIT(0) : 0;
+
+- memcpy(rp->features[idx].uuid, debug_uuid, 16);
+- rp->features[idx].flags = cpu_to_le32(flags);
+- idx++;
+- }
++ memcpy(rp->features[idx].uuid, debug_uuid, 16);
++ rp->features[idx].flags = cpu_to_le32(flags);
++ idx++;
+ #endif
+
+ if (hdev && hci_dev_le_state_simultaneous(hdev)) {
+diff --git a/net/core/dst.c b/net/core/dst.c
+index e2de8b68c41d3f..e9d35f49c9e780 100644
+--- a/net/core/dst.c
++++ b/net/core/dst.c
+@@ -150,7 +150,7 @@ void dst_dev_put(struct dst_entry *dst)
+ dst->ops->ifdown(dst, dev);
+ WRITE_ONCE(dst->input, dst_discard);
+ WRITE_ONCE(dst->output, dst_discard_out);
+- WRITE_ONCE(dst->dev, blackhole_netdev);
++ rcu_assign_pointer(dst->dev_rcu, blackhole_netdev);
+ netdev_ref_replace(dev, blackhole_netdev, &dst->dev_tracker,
+ GFP_ATOMIC);
+ }
+diff --git a/net/core/filter.c b/net/core/filter.c
+index da391e2b0788d0..2d326d35c38716 100644
+--- a/net/core/filter.c
++++ b/net/core/filter.c
+@@ -9284,13 +9284,17 @@ static bool sock_addr_is_valid_access(int off, int size,
+ return false;
+ info->reg_type = PTR_TO_SOCKET;
+ break;
+- default:
+- if (type == BPF_READ) {
+- if (size != size_default)
+- return false;
+- } else {
++ case bpf_ctx_range(struct bpf_sock_addr, user_family):
++ case bpf_ctx_range(struct bpf_sock_addr, family):
++ case bpf_ctx_range(struct bpf_sock_addr, type):
++ case bpf_ctx_range(struct bpf_sock_addr, protocol):
++ if (type != BPF_READ)
+ return false;
+- }
++ if (size != size_default)
++ return false;
++ break;
++ default:
++ return false;
+ }
+
+ return true;
+diff --git a/net/core/sock.c b/net/core/sock.c
+index 158bddd23134c4..e21348ead7e764 100644
+--- a/net/core/sock.c
++++ b/net/core/sock.c
+@@ -2584,7 +2584,7 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
+ }
+ EXPORT_SYMBOL_GPL(sk_clone_lock);
+
+-static u32 sk_dst_gso_max_size(struct sock *sk, struct dst_entry *dst)
++static u32 sk_dst_gso_max_size(struct sock *sk, const struct net_device *dev)
+ {
+ bool is_ipv6 = false;
+ u32 max_size;
+@@ -2594,8 +2594,8 @@ static u32 sk_dst_gso_max_size(struct sock *sk, struct dst_entry *dst)
+ !ipv6_addr_v4mapped(&sk->sk_v6_rcv_saddr));
+ #endif
+ /* pairs with the WRITE_ONCE() in netif_set_gso(_ipv4)_max_size() */
+- max_size = is_ipv6 ? READ_ONCE(dst_dev(dst)->gso_max_size) :
+- READ_ONCE(dst_dev(dst)->gso_ipv4_max_size);
++ max_size = is_ipv6 ? READ_ONCE(dev->gso_max_size) :
++ READ_ONCE(dev->gso_ipv4_max_size);
+ if (max_size > GSO_LEGACY_MAX_SIZE && !sk_is_tcp(sk))
+ max_size = GSO_LEGACY_MAX_SIZE;
+
+@@ -2604,9 +2604,12 @@ static u32 sk_dst_gso_max_size(struct sock *sk, struct dst_entry *dst)
+
+ void sk_setup_caps(struct sock *sk, struct dst_entry *dst)
+ {
++ const struct net_device *dev;
+ u32 max_segs = 1;
+
+- sk->sk_route_caps = dst_dev(dst)->features;
++ rcu_read_lock();
++ dev = dst_dev_rcu(dst);
++ sk->sk_route_caps = dev->features;
+ if (sk_is_tcp(sk)) {
+ struct inet_connection_sock *icsk = inet_csk(sk);
+
+@@ -2622,13 +2625,14 @@ void sk_setup_caps(struct sock *sk, struct dst_entry *dst)
+ sk->sk_route_caps &= ~NETIF_F_GSO_MASK;
+ } else {
+ sk->sk_route_caps |= NETIF_F_SG | NETIF_F_HW_CSUM;
+- sk->sk_gso_max_size = sk_dst_gso_max_size(sk, dst);
++ sk->sk_gso_max_size = sk_dst_gso_max_size(sk, dev);
+ /* pairs with the WRITE_ONCE() in netif_set_gso_max_segs() */
+- max_segs = max_t(u32, READ_ONCE(dst_dev(dst)->gso_max_segs), 1);
++ max_segs = max_t(u32, READ_ONCE(dev->gso_max_segs), 1);
+ }
+ }
+ sk->sk_gso_max_segs = max_segs;
+ sk_dst_set(sk, dst);
++ rcu_read_unlock();
+ }
+ EXPORT_SYMBOL_GPL(sk_setup_caps);
+
+diff --git a/net/ethtool/tsconfig.c b/net/ethtool/tsconfig.c
+index 2be356bdfe8737..169b413b31fc5f 100644
+--- a/net/ethtool/tsconfig.c
++++ b/net/ethtool/tsconfig.c
+@@ -423,13 +423,11 @@ static int ethnl_set_tsconfig(struct ethnl_req_info *req_base,
+ return ret;
+ }
+
+- if (hwprov_mod || config_mod) {
+- ret = tsconfig_send_reply(dev, info);
+- if (ret && ret != -EOPNOTSUPP) {
+- NL_SET_ERR_MSG(info->extack,
+- "error while reading the new configuration set");
+- return ret;
+- }
++ ret = tsconfig_send_reply(dev, info);
++ if (ret && ret != -EOPNOTSUPP) {
++ NL_SET_ERR_MSG(info->extack,
++ "error while reading the new configuration set");
++ return ret;
+ }
+
+ /* tsconfig has no notification */
+diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
+index c48c572f024da8..1be0d91620a38b 100644
+--- a/net/ipv4/icmp.c
++++ b/net/ipv4/icmp.c
+@@ -318,17 +318,17 @@ static bool icmpv4_xrlim_allow(struct net *net, struct rtable *rt,
+ return true;
+
+ /* No rate limit on loopback */
+- dev = dst_dev(dst);
++ rcu_read_lock();
++ dev = dst_dev_rcu(dst);
+ if (dev && (dev->flags & IFF_LOOPBACK))
+ goto out;
+
+- rcu_read_lock();
+ peer = inet_getpeer_v4(net->ipv4.peers, fl4->daddr,
+ l3mdev_master_ifindex_rcu(dev));
+ rc = inet_peer_xrlim_allow(peer,
+ READ_ONCE(net->ipv4.sysctl_icmp_ratelimit));
+- rcu_read_unlock();
+ out:
++ rcu_read_unlock();
+ if (!rc)
+ __ICMP_INC_STATS(net, ICMP_MIB_RATELIMITHOST);
+ else
+diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
+index b2584cce90ae1c..f7012479713ba6 100644
+--- a/net/ipv4/ip_fragment.c
++++ b/net/ipv4/ip_fragment.c
+@@ -476,14 +476,16 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb,
+ /* Process an incoming IP datagram fragment. */
+ int ip_defrag(struct net *net, struct sk_buff *skb, u32 user)
+ {
+- struct net_device *dev = skb->dev ? : skb_dst_dev(skb);
+- int vif = l3mdev_master_ifindex_rcu(dev);
++ struct net_device *dev;
+ struct ipq *qp;
++ int vif;
+
+ __IP_INC_STATS(net, IPSTATS_MIB_REASMREQDS);
+
+ /* Lookup (or create) queue header */
+ rcu_read_lock();
++ dev = skb->dev ? : skb_dst_dev_rcu(skb);
++ vif = l3mdev_master_ifindex_rcu(dev);
+ qp = ip_find(net, ip_hdr(skb), user, vif);
+ if (qp) {
+ int ret, refs = 0;
+diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
+index e86a8a862c4117..8c568fbddb5fb5 100644
+--- a/net/ipv4/ipmr.c
++++ b/net/ipv4/ipmr.c
+@@ -1904,7 +1904,7 @@ static int ipmr_prepare_xmit(struct net *net, struct mr_table *mrt,
+ return -1;
+ }
+
+- encap += LL_RESERVED_SPACE(rt->dst.dev) + rt->dst.header_len;
++ encap += LL_RESERVED_SPACE(dst_dev_rcu(&rt->dst)) + rt->dst.header_len;
+
+ if (skb_cow(skb, encap)) {
+ ip_rt_put(rt);
+@@ -1957,7 +1957,7 @@ static void ipmr_queue_fwd_xmit(struct net *net, struct mr_table *mrt,
+ * result in receiving multiple packets.
+ */
+ NF_HOOK(NFPROTO_IPV4, NF_INET_FORWARD,
+- net, NULL, skb, skb->dev, rt->dst.dev,
++ net, NULL, skb, skb->dev, dst_dev_rcu(&rt->dst),
+ ipmr_forward_finish);
+ return;
+
+@@ -2301,7 +2301,7 @@ int ip_mr_output(struct net *net, struct sock *sk, struct sk_buff *skb)
+
+ guard(rcu)();
+
+- dev = rt->dst.dev;
++ dev = dst_dev_rcu(&rt->dst);
+
+ if (IPCB(skb)->flags & IPSKB_FORWARDED)
+ goto mc_output;
+diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
+index 031df4c19fcc5c..d2c3480df8f77d 100644
+--- a/net/ipv4/ping.c
++++ b/net/ipv4/ping.c
+@@ -77,6 +77,7 @@ static inline struct hlist_head *ping_hashslot(struct ping_table *table,
+
+ int ping_get_port(struct sock *sk, unsigned short ident)
+ {
++ struct net *net = sock_net(sk);
+ struct inet_sock *isk, *isk2;
+ struct hlist_head *hlist;
+ struct sock *sk2 = NULL;
+@@ -90,9 +91,10 @@ int ping_get_port(struct sock *sk, unsigned short ident)
+ for (i = 0; i < (1L << 16); i++, result++) {
+ if (!result)
+ result++; /* avoid zero */
+- hlist = ping_hashslot(&ping_table, sock_net(sk),
+- result);
++ hlist = ping_hashslot(&ping_table, net, result);
+ sk_for_each(sk2, hlist) {
++ if (!net_eq(sock_net(sk2), net))
++ continue;
+ isk2 = inet_sk(sk2);
+
+ if (isk2->inet_num == result)
+@@ -108,8 +110,10 @@ int ping_get_port(struct sock *sk, unsigned short ident)
+ if (i >= (1L << 16))
+ goto fail;
+ } else {
+- hlist = ping_hashslot(&ping_table, sock_net(sk), ident);
++ hlist = ping_hashslot(&ping_table, net, ident);
+ sk_for_each(sk2, hlist) {
++ if (!net_eq(sock_net(sk2), net))
++ continue;
+ isk2 = inet_sk(sk2);
+
+ /* BUG? Why is this reuse and not reuseaddr? ping.c
+@@ -129,7 +133,7 @@ int ping_get_port(struct sock *sk, unsigned short ident)
+ pr_debug("was not hashed\n");
+ sk_add_node_rcu(sk, hlist);
+ sock_set_flag(sk, SOCK_RCU_FREE);
+- sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
++ sock_prot_inuse_add(net, sk->sk_prot, 1);
+ }
+ spin_unlock(&ping_table.lock);
+ return 0;
+@@ -188,6 +192,8 @@ static struct sock *ping_lookup(struct net *net, struct sk_buff *skb, u16 ident)
+ }
+
+ sk_for_each_rcu(sk, hslot) {
++ if (!net_eq(sock_net(sk), net))
++ continue;
+ isk = inet_sk(sk);
+
+ pr_debug("iterate\n");
+diff --git a/net/ipv4/route.c b/net/ipv4/route.c
+index baa43e5966b19b..5582ccd673eebb 100644
+--- a/net/ipv4/route.c
++++ b/net/ipv4/route.c
+@@ -413,11 +413,11 @@ static struct neighbour *ipv4_neigh_lookup(const struct dst_entry *dst,
+ const void *daddr)
+ {
+ const struct rtable *rt = container_of(dst, struct rtable, dst);
+- struct net_device *dev = dst_dev(dst);
++ struct net_device *dev;
+ struct neighbour *n;
+
+ rcu_read_lock();
+-
++ dev = dst_dev_rcu(dst);
+ if (likely(rt->rt_gw_family == AF_INET)) {
+ n = ip_neigh_gw4(dev, rt->rt_gw4);
+ } else if (rt->rt_gw_family == AF_INET6) {
+@@ -1026,7 +1026,7 @@ static void __ip_rt_update_pmtu(struct rtable *rt, struct flowi4 *fl4, u32 mtu)
+ return;
+
+ rcu_read_lock();
+- net = dev_net_rcu(dst_dev(dst));
++ net = dst_dev_net_rcu(dst);
+ if (mtu < net->ipv4.ip_rt_min_pmtu) {
+ lock = true;
+ mtu = min(old_mtu, net->ipv4.ip_rt_min_pmtu);
+@@ -1326,7 +1326,7 @@ static unsigned int ipv4_default_advmss(const struct dst_entry *dst)
+ struct net *net;
+
+ rcu_read_lock();
+- net = dev_net_rcu(dst_dev(dst));
++ net = dst_dev_net_rcu(dst);
+ advmss = max_t(unsigned int, ipv4_mtu(dst) - header_size,
+ net->ipv4.ip_rt_min_advmss);
+ rcu_read_unlock();
+diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
+index ad76556800f2b2..89040007c7b709 100644
+--- a/net/ipv4/tcp.c
++++ b/net/ipv4/tcp.c
+@@ -3099,8 +3099,8 @@ bool tcp_check_oom(const struct sock *sk, int shift)
+
+ void __tcp_close(struct sock *sk, long timeout)
+ {
++ bool data_was_unread = false;
+ struct sk_buff *skb;
+- int data_was_unread = 0;
+ int state;
+
+ WRITE_ONCE(sk->sk_shutdown, SHUTDOWN_MASK);
+@@ -3119,11 +3119,12 @@ void __tcp_close(struct sock *sk, long timeout)
+ * reader process may not have drained the data yet!
+ */
+ while ((skb = __skb_dequeue(&sk->sk_receive_queue)) != NULL) {
+- u32 len = TCP_SKB_CB(skb)->end_seq - TCP_SKB_CB(skb)->seq;
++ u32 end_seq = TCP_SKB_CB(skb)->end_seq;
+
+ if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
+- len--;
+- data_was_unread += len;
++ end_seq--;
++ if (after(end_seq, tcp_sk(sk)->copied_seq))
++ data_was_unread = true;
+ __kfree_skb(skb);
+ }
+
+diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
+index 71b76e98371a66..64f93668a8452b 100644
+--- a/net/ipv4/tcp_input.c
++++ b/net/ipv4/tcp_input.c
+@@ -4890,12 +4890,23 @@ static int tcp_prune_queue(struct sock *sk, const struct sk_buff *in_skb);
+
+ /* Check if this incoming skb can be added to socket receive queues
+ * while satisfying sk->sk_rcvbuf limit.
++ *
++ * In theory we should use skb->truesize, but this can cause problems
++ * when applications use too small SO_RCVBUF values.
++ * When LRO / hw gro is used, the socket might have a high tp->scaling_ratio,
++ * allowing RWIN to be close to available space.
++ * Whenever the receive queue gets full, we can receive a small packet
++ * filling RWIN, but with a high skb->truesize, because most NIC use 4K page
++ * plus sk_buff metadata even when receiving less than 1500 bytes of payload.
++ *
++ * Note that we use skb->len to decide to accept or drop this packet,
++ * but sk->sk_rmem_alloc is the sum of all skb->truesize.
+ */
+ static bool tcp_can_ingest(const struct sock *sk, const struct sk_buff *skb)
+ {
+- unsigned int new_mem = atomic_read(&sk->sk_rmem_alloc) + skb->truesize;
++ unsigned int rmem = atomic_read(&sk->sk_rmem_alloc);
+
+- return new_mem <= sk->sk_rcvbuf;
++ return rmem + skb->len <= sk->sk_rcvbuf;
+ }
+
+ static int tcp_try_rmem_schedule(struct sock *sk, const struct sk_buff *skb,
+diff --git a/net/ipv4/tcp_metrics.c b/net/ipv4/tcp_metrics.c
+index 03c068ea27b6ad..10e86f1008e9d9 100644
+--- a/net/ipv4/tcp_metrics.c
++++ b/net/ipv4/tcp_metrics.c
+@@ -170,7 +170,7 @@ static struct tcp_metrics_block *tcpm_new(struct dst_entry *dst,
+ struct net *net;
+
+ spin_lock_bh(&tcp_metrics_lock);
+- net = dev_net_rcu(dst_dev(dst));
++ net = dst_dev_net_rcu(dst);
+
+ /* While waiting for the spin-lock the cache might have been populated
+ * with this entry and so we have to check again.
+@@ -273,7 +273,7 @@ static struct tcp_metrics_block *__tcp_get_metrics_req(struct request_sock *req,
+ return NULL;
+ }
+
+- net = dev_net_rcu(dst_dev(dst));
++ net = dst_dev_net_rcu(dst);
+ hash ^= net_hash_mix(net);
+ hash = hash_32(hash, tcp_metrics_hash_log);
+
+@@ -318,7 +318,7 @@ static struct tcp_metrics_block *tcp_get_metrics(struct sock *sk,
+ else
+ return NULL;
+
+- net = dev_net_rcu(dst_dev(dst));
++ net = dst_dev_net_rcu(dst);
+ hash ^= net_hash_mix(net);
+ hash = hash_32(hash, tcp_metrics_hash_log);
+
+diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
+index f8a8e46286b8ee..52599584422bf4 100644
+--- a/net/ipv6/anycast.c
++++ b/net/ipv6/anycast.c
+@@ -104,7 +104,7 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr)
+ rcu_read_lock();
+ rt = rt6_lookup(net, addr, NULL, 0, NULL, 0);
+ if (rt) {
+- dev = dst_dev(&rt->dst);
++ dev = dst_dev_rcu(&rt->dst);
+ netdev_hold(dev, &dev_tracker, GFP_ATOMIC);
+ ip6_rt_put(rt);
+ } else if (ishost) {
+diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
+index 44550957fd4e36..56c974cf75d151 100644
+--- a/net/ipv6/icmp.c
++++ b/net/ipv6/icmp.c
+@@ -209,7 +209,8 @@ static bool icmpv6_xrlim_allow(struct sock *sk, u8 type,
+ * this lookup should be more aggressive (not longer than timeout).
+ */
+ dst = ip6_route_output(net, sk, fl6);
+- dev = dst_dev(dst);
++ rcu_read_lock();
++ dev = dst_dev_rcu(dst);
+ if (dst->error) {
+ IP6_INC_STATS(net, ip6_dst_idev(dst),
+ IPSTATS_MIB_OUTNOROUTES);
+@@ -224,14 +225,12 @@ static bool icmpv6_xrlim_allow(struct sock *sk, u8 type,
+ if (rt->rt6i_dst.plen < 128)
+ tmo >>= ((128 - rt->rt6i_dst.plen)>>5);
+
+- rcu_read_lock();
+ peer = inet_getpeer_v6(net->ipv6.peers, &fl6->daddr);
+ res = inet_peer_xrlim_allow(peer, tmo);
+- rcu_read_unlock();
+ }
++ rcu_read_unlock();
+ if (!res)
+- __ICMP6_INC_STATS(net, ip6_dst_idev(dst),
+- ICMP6_MIB_RATELIMITHOST);
++ __ICMP6_INC_STATS(net, NULL, ICMP6_MIB_RATELIMITHOST);
+ else
+ icmp_global_consume(net);
+ dst_release(dst);
+diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
+index 1e1410237b6ef0..9d64c13bab5eac 100644
+--- a/net/ipv6/ip6_output.c
++++ b/net/ipv6/ip6_output.c
+@@ -60,7 +60,7 @@
+ static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff *skb)
+ {
+ struct dst_entry *dst = skb_dst(skb);
+- struct net_device *dev = dst_dev(dst);
++ struct net_device *dev = dst_dev_rcu(dst);
+ struct inet6_dev *idev = ip6_dst_idev(dst);
+ unsigned int hh_len = LL_RESERVED_SPACE(dev);
+ const struct in6_addr *daddr, *nexthop;
+@@ -70,15 +70,12 @@ static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff *
+
+ /* Be paranoid, rather than too clever. */
+ if (unlikely(hh_len > skb_headroom(skb)) && dev->header_ops) {
+- /* Make sure idev stays alive */
+- rcu_read_lock();
++ /* idev stays alive because we hold rcu_read_lock(). */
+ skb = skb_expand_head(skb, hh_len);
+ if (!skb) {
+ IP6_INC_STATS(net, idev, IPSTATS_MIB_OUTDISCARDS);
+- rcu_read_unlock();
+ return -ENOMEM;
+ }
+- rcu_read_unlock();
+ }
+
+ hdr = ipv6_hdr(skb);
+@@ -123,7 +120,6 @@ static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff *
+
+ IP6_UPD_PO_STATS(net, idev, IPSTATS_MIB_OUT, skb->len);
+
+- rcu_read_lock();
+ nexthop = rt6_nexthop(dst_rt6_info(dst), daddr);
+ neigh = __ipv6_neigh_lookup_noref(dev, nexthop);
+
+@@ -131,7 +127,6 @@ static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff *
+ if (unlikely(!neigh))
+ neigh = __neigh_create(&nd_tbl, nexthop, dev, false);
+ if (IS_ERR(neigh)) {
+- rcu_read_unlock();
+ IP6_INC_STATS(net, idev, IPSTATS_MIB_OUTNOROUTES);
+ kfree_skb_reason(skb, SKB_DROP_REASON_NEIGH_CREATEFAIL);
+ return -EINVAL;
+@@ -139,7 +134,6 @@ static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff *
+ }
+ sock_confirm_neigh(skb, neigh);
+ ret = neigh_output(neigh, skb, false);
+- rcu_read_unlock();
+ return ret;
+ }
+
+@@ -233,22 +227,29 @@ static int ip6_finish_output(struct net *net, struct sock *sk, struct sk_buff *s
+ int ip6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
+ {
+ struct dst_entry *dst = skb_dst(skb);
+- struct net_device *dev = dst_dev(dst), *indev = skb->dev;
+- struct inet6_dev *idev = ip6_dst_idev(dst);
++ struct net_device *dev, *indev = skb->dev;
++ struct inet6_dev *idev;
++ int ret;
+
+ skb->protocol = htons(ETH_P_IPV6);
++ rcu_read_lock();
++ dev = dst_dev_rcu(dst);
++ idev = ip6_dst_idev(dst);
+ skb->dev = dev;
+
+ if (unlikely(!idev || READ_ONCE(idev->cnf.disable_ipv6))) {
+ IP6_INC_STATS(net, idev, IPSTATS_MIB_OUTDISCARDS);
++ rcu_read_unlock();
+ kfree_skb_reason(skb, SKB_DROP_REASON_IPV6DISABLED);
+ return 0;
+ }
+
+- return NF_HOOK_COND(NFPROTO_IPV6, NF_INET_POST_ROUTING,
+- net, sk, skb, indev, dev,
+- ip6_finish_output,
+- !(IP6CB(skb)->flags & IP6SKB_REROUTED));
++ ret = NF_HOOK_COND(NFPROTO_IPV6, NF_INET_POST_ROUTING,
++ net, sk, skb, indev, dev,
++ ip6_finish_output,
++ !(IP6CB(skb)->flags & IP6SKB_REROUTED));
++ rcu_read_unlock();
++ return ret;
+ }
+ EXPORT_SYMBOL(ip6_output);
+
+@@ -268,35 +269,36 @@ bool ip6_autoflowlabel(struct net *net, const struct sock *sk)
+ int ip6_xmit(const struct sock *sk, struct sk_buff *skb, struct flowi6 *fl6,
+ __u32 mark, struct ipv6_txoptions *opt, int tclass, u32 priority)
+ {
+- struct net *net = sock_net(sk);
+ const struct ipv6_pinfo *np = inet6_sk(sk);
+ struct in6_addr *first_hop = &fl6->daddr;
+ struct dst_entry *dst = skb_dst(skb);
+- struct net_device *dev = dst_dev(dst);
+ struct inet6_dev *idev = ip6_dst_idev(dst);
+ struct hop_jumbo_hdr *hop_jumbo;
+ int hoplen = sizeof(*hop_jumbo);
++ struct net *net = sock_net(sk);
+ unsigned int head_room;
++ struct net_device *dev;
+ struct ipv6hdr *hdr;
+ u8 proto = fl6->flowi6_proto;
+ int seg_len = skb->len;
+- int hlimit = -1;
++ int ret, hlimit = -1;
+ u32 mtu;
+
++ rcu_read_lock();
++
++ dev = dst_dev_rcu(dst);
+ head_room = sizeof(struct ipv6hdr) + hoplen + LL_RESERVED_SPACE(dev);
+ if (opt)
+ head_room += opt->opt_nflen + opt->opt_flen;
+
+ if (unlikely(head_room > skb_headroom(skb))) {
+- /* Make sure idev stays alive */
+- rcu_read_lock();
++ /* idev stays alive while we hold rcu_read_lock(). */
+ skb = skb_expand_head(skb, head_room);
+ if (!skb) {
+ IP6_INC_STATS(net, idev, IPSTATS_MIB_OUTDISCARDS);
+- rcu_read_unlock();
+- return -ENOBUFS;
++ ret = -ENOBUFS;
++ goto unlock;
+ }
+- rcu_read_unlock();
+ }
+
+ if (opt) {
+@@ -358,17 +360,21 @@ int ip6_xmit(const struct sock *sk, struct sk_buff *skb, struct flowi6 *fl6,
+ * skb to its handler for processing
+ */
+ skb = l3mdev_ip6_out((struct sock *)sk, skb);
+- if (unlikely(!skb))
+- return 0;
++ if (unlikely(!skb)) {
++ ret = 0;
++ goto unlock;
++ }
+
+ /* hooks should never assume socket lock is held.
+ * we promote our socket to non const
+ */
+- return NF_HOOK(NFPROTO_IPV6, NF_INET_LOCAL_OUT,
+- net, (struct sock *)sk, skb, NULL, dev,
+- dst_output);
++ ret = NF_HOOK(NFPROTO_IPV6, NF_INET_LOCAL_OUT,
++ net, (struct sock *)sk, skb, NULL, dev,
++ dst_output);
++ goto unlock;
+ }
+
++ ret = -EMSGSIZE;
+ skb->dev = dev;
+ /* ipv6_local_error() does not require socket lock,
+ * we promote our socket to non const
+@@ -377,7 +383,9 @@ int ip6_xmit(const struct sock *sk, struct sk_buff *skb, struct flowi6 *fl6,
+
+ IP6_INC_STATS(net, idev, IPSTATS_MIB_FRAGFAILS);
+ kfree_skb(skb);
+- return -EMSGSIZE;
++unlock:
++ rcu_read_unlock();
++ return ret;
+ }
+ EXPORT_SYMBOL(ip6_xmit);
+
+diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
+index 36ca27496b3c04..016b572e7d6f02 100644
+--- a/net/ipv6/mcast.c
++++ b/net/ipv6/mcast.c
+@@ -169,6 +169,29 @@ static int unsolicited_report_interval(struct inet6_dev *idev)
+ return iv > 0 ? iv : 1;
+ }
+
++static struct net_device *ip6_mc_find_dev(struct net *net,
++ const struct in6_addr *group,
++ int ifindex)
++{
++ struct net_device *dev = NULL;
++ struct rt6_info *rt;
++
++ if (ifindex == 0) {
++ rcu_read_lock();
++ rt = rt6_lookup(net, group, NULL, 0, NULL, 0);
++ if (rt) {
++ dev = dst_dev_rcu(&rt->dst);
++ dev_hold(dev);
++ ip6_rt_put(rt);
++ }
++ rcu_read_unlock();
++ } else {
++ dev = dev_get_by_index(net, ifindex);
++ }
++
++ return dev;
++}
++
+ /*
+ * socket join on multicast group
+ */
+@@ -191,28 +214,13 @@ static int __ipv6_sock_mc_join(struct sock *sk, int ifindex,
+ }
+
+ mc_lst = sock_kmalloc(sk, sizeof(struct ipv6_mc_socklist), GFP_KERNEL);
+-
+ if (!mc_lst)
+ return -ENOMEM;
+
+ mc_lst->next = NULL;
+ mc_lst->addr = *addr;
+
+- if (ifindex == 0) {
+- struct rt6_info *rt;
+-
+- rcu_read_lock();
+- rt = rt6_lookup(net, addr, NULL, 0, NULL, 0);
+- if (rt) {
+- dev = dst_dev(&rt->dst);
+- dev_hold(dev);
+- ip6_rt_put(rt);
+- }
+- rcu_read_unlock();
+- } else {
+- dev = dev_get_by_index(net, ifindex);
+- }
+-
++ dev = ip6_mc_find_dev(net, addr, ifindex);
+ if (!dev) {
+ sock_kfree_s(sk, mc_lst, sizeof(*mc_lst));
+ return -ENODEV;
+@@ -302,27 +310,14 @@ int ipv6_sock_mc_drop(struct sock *sk, int ifindex, const struct in6_addr *addr)
+ }
+ EXPORT_SYMBOL(ipv6_sock_mc_drop);
+
+-static struct inet6_dev *ip6_mc_find_dev(struct net *net,
+- const struct in6_addr *group,
+- int ifindex)
++static struct inet6_dev *ip6_mc_find_idev(struct net *net,
++ const struct in6_addr *group,
++ int ifindex)
+ {
+- struct net_device *dev = NULL;
++ struct net_device *dev;
+ struct inet6_dev *idev;
+
+- if (ifindex == 0) {
+- struct rt6_info *rt;
+-
+- rcu_read_lock();
+- rt = rt6_lookup(net, group, NULL, 0, NULL, 0);
+- if (rt) {
+- dev = dst_dev(&rt->dst);
+- dev_hold(dev);
+- ip6_rt_put(rt);
+- }
+- rcu_read_unlock();
+- } else {
+- dev = dev_get_by_index(net, ifindex);
+- }
++ dev = ip6_mc_find_dev(net, group, ifindex);
+ if (!dev)
+ return NULL;
+
+@@ -374,7 +369,7 @@ int ip6_mc_source(int add, int omode, struct sock *sk,
+ if (!ipv6_addr_is_multicast(group))
+ return -EINVAL;
+
+- idev = ip6_mc_find_dev(net, group, pgsr->gsr_interface);
++ idev = ip6_mc_find_idev(net, group, pgsr->gsr_interface);
+ if (!idev)
+ return -ENODEV;
+
+@@ -509,7 +504,7 @@ int ip6_mc_msfilter(struct sock *sk, struct group_filter *gsf,
+ gsf->gf_fmode != MCAST_EXCLUDE)
+ return -EINVAL;
+
+- idev = ip6_mc_find_dev(net, group, gsf->gf_interface);
++ idev = ip6_mc_find_idev(net, group, gsf->gf_interface);
+ if (!idev)
+ return -ENODEV;
+
+diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
+index 7d5abb3158ec96..d6bb1e2f6192ed 100644
+--- a/net/ipv6/ndisc.c
++++ b/net/ipv6/ndisc.c
+@@ -505,7 +505,7 @@ void ndisc_send_skb(struct sk_buff *skb, const struct in6_addr *daddr,
+
+ ip6_nd_hdr(skb, saddr, daddr, READ_ONCE(inet6_sk(sk)->hop_limit), skb->len);
+
+- dev = dst_dev(dst);
++ dev = dst_dev_rcu(dst);
+ idev = __in6_dev_get(dev);
+ IP6_INC_STATS(net, idev, IPSTATS_MIB_OUTREQUESTS);
+
+diff --git a/net/ipv6/output_core.c b/net/ipv6/output_core.c
+index d21fe27fe21e34..1c9b283a4132dc 100644
+--- a/net/ipv6/output_core.c
++++ b/net/ipv6/output_core.c
+@@ -104,18 +104,20 @@ EXPORT_SYMBOL(ip6_find_1stfragopt);
+ int ip6_dst_hoplimit(struct dst_entry *dst)
+ {
+ int hoplimit = dst_metric_raw(dst, RTAX_HOPLIMIT);
++
++ rcu_read_lock();
+ if (hoplimit == 0) {
+- struct net_device *dev = dst_dev(dst);
++ struct net_device *dev = dst_dev_rcu(dst);
+ struct inet6_dev *idev;
+
+- rcu_read_lock();
+ idev = __in6_dev_get(dev);
+ if (idev)
+ hoplimit = READ_ONCE(idev->cnf.hop_limit);
+ else
+ hoplimit = READ_ONCE(dev_net(dev)->ipv6.devconf_all->hop_limit);
+- rcu_read_unlock();
+ }
++ rcu_read_unlock();
++
+ return hoplimit;
+ }
+ EXPORT_SYMBOL(ip6_dst_hoplimit);
+diff --git a/net/ipv6/proc.c b/net/ipv6/proc.c
+index 752327b10dde74..eb268b07002589 100644
+--- a/net/ipv6/proc.c
++++ b/net/ipv6/proc.c
+@@ -85,7 +85,6 @@ static const struct snmp_mib snmp6_ipstats_list[] = {
+ SNMP_MIB_ITEM("Ip6InECT0Pkts", IPSTATS_MIB_ECT0PKTS),
+ SNMP_MIB_ITEM("Ip6InCEPkts", IPSTATS_MIB_CEPKTS),
+ SNMP_MIB_ITEM("Ip6OutTransmits", IPSTATS_MIB_OUTPKTS),
+- SNMP_MIB_SENTINEL
+ };
+
+ static const struct snmp_mib snmp6_icmp6_list[] = {
+@@ -95,8 +94,8 @@ static const struct snmp_mib snmp6_icmp6_list[] = {
+ SNMP_MIB_ITEM("Icmp6OutMsgs", ICMP6_MIB_OUTMSGS),
+ SNMP_MIB_ITEM("Icmp6OutErrors", ICMP6_MIB_OUTERRORS),
+ SNMP_MIB_ITEM("Icmp6InCsumErrors", ICMP6_MIB_CSUMERRORS),
++/* ICMP6_MIB_RATELIMITHOST needs to be last, see snmp6_dev_seq_show(). */
+ SNMP_MIB_ITEM("Icmp6OutRateLimitHost", ICMP6_MIB_RATELIMITHOST),
+- SNMP_MIB_SENTINEL
+ };
+
+ /* RFC 4293 v6 ICMPMsgStatsTable; named items for RFC 2466 compatibility */
+@@ -129,7 +128,6 @@ static const struct snmp_mib snmp6_udp6_list[] = {
+ SNMP_MIB_ITEM("Udp6InCsumErrors", UDP_MIB_CSUMERRORS),
+ SNMP_MIB_ITEM("Udp6IgnoredMulti", UDP_MIB_IGNOREDMULTI),
+ SNMP_MIB_ITEM("Udp6MemErrors", UDP_MIB_MEMERRORS),
+- SNMP_MIB_SENTINEL
+ };
+
+ static const struct snmp_mib snmp6_udplite6_list[] = {
+@@ -141,7 +139,6 @@ static const struct snmp_mib snmp6_udplite6_list[] = {
+ SNMP_MIB_ITEM("UdpLite6SndbufErrors", UDP_MIB_SNDBUFERRORS),
+ SNMP_MIB_ITEM("UdpLite6InCsumErrors", UDP_MIB_CSUMERRORS),
+ SNMP_MIB_ITEM("UdpLite6MemErrors", UDP_MIB_MEMERRORS),
+- SNMP_MIB_SENTINEL
+ };
+
+ static void snmp6_seq_show_icmpv6msg(struct seq_file *seq, atomic_long_t *smib)
+@@ -182,35 +179,37 @@ static void snmp6_seq_show_icmpv6msg(struct seq_file *seq, atomic_long_t *smib)
+ */
+ static void snmp6_seq_show_item(struct seq_file *seq, void __percpu *pcpumib,
+ atomic_long_t *smib,
+- const struct snmp_mib *itemlist)
++ const struct snmp_mib *itemlist,
++ int cnt)
+ {
+ unsigned long buff[SNMP_MIB_MAX];
+ int i;
+
+ if (pcpumib) {
+- memset(buff, 0, sizeof(unsigned long) * SNMP_MIB_MAX);
++ memset(buff, 0, sizeof(unsigned long) * cnt);
+
+- snmp_get_cpu_field_batch(buff, itemlist, pcpumib);
+- for (i = 0; itemlist[i].name; i++)
++ snmp_get_cpu_field_batch_cnt(buff, itemlist, cnt, pcpumib);
++ for (i = 0; i < cnt; i++)
+ seq_printf(seq, "%-32s\t%lu\n",
+ itemlist[i].name, buff[i]);
+ } else {
+- for (i = 0; itemlist[i].name; i++)
++ for (i = 0; i < cnt; i++)
+ seq_printf(seq, "%-32s\t%lu\n", itemlist[i].name,
+ atomic_long_read(smib + itemlist[i].entry));
+ }
+ }
+
+ static void snmp6_seq_show_item64(struct seq_file *seq, void __percpu *mib,
+- const struct snmp_mib *itemlist, size_t syncpoff)
++ const struct snmp_mib *itemlist,
++ int cnt, size_t syncpoff)
+ {
+ u64 buff64[SNMP_MIB_MAX];
+ int i;
+
+- memset(buff64, 0, sizeof(u64) * SNMP_MIB_MAX);
++ memset(buff64, 0, sizeof(u64) * cnt);
+
+- snmp_get_cpu_field64_batch(buff64, itemlist, mib, syncpoff);
+- for (i = 0; itemlist[i].name; i++)
++ snmp_get_cpu_field64_batch_cnt(buff64, itemlist, cnt, mib, syncpoff);
++ for (i = 0; i < cnt; i++)
+ seq_printf(seq, "%-32s\t%llu\n", itemlist[i].name, buff64[i]);
+ }
+
+@@ -219,14 +218,19 @@ static int snmp6_seq_show(struct seq_file *seq, void *v)
+ struct net *net = (struct net *)seq->private;
+
+ snmp6_seq_show_item64(seq, net->mib.ipv6_statistics,
+- snmp6_ipstats_list, offsetof(struct ipstats_mib, syncp));
++ snmp6_ipstats_list,
++ ARRAY_SIZE(snmp6_ipstats_list),
++ offsetof(struct ipstats_mib, syncp));
+ snmp6_seq_show_item(seq, net->mib.icmpv6_statistics,
+- NULL, snmp6_icmp6_list);
++ NULL, snmp6_icmp6_list,
++ ARRAY_SIZE(snmp6_icmp6_list));
+ snmp6_seq_show_icmpv6msg(seq, net->mib.icmpv6msg_statistics->mibs);
+ snmp6_seq_show_item(seq, net->mib.udp_stats_in6,
+- NULL, snmp6_udp6_list);
++ NULL, snmp6_udp6_list,
++ ARRAY_SIZE(snmp6_udp6_list));
+ snmp6_seq_show_item(seq, net->mib.udplite_stats_in6,
+- NULL, snmp6_udplite6_list);
++ NULL, snmp6_udplite6_list,
++ ARRAY_SIZE(snmp6_udplite6_list));
+ return 0;
+ }
+
+@@ -236,9 +240,14 @@ static int snmp6_dev_seq_show(struct seq_file *seq, void *v)
+
+ seq_printf(seq, "%-32s\t%u\n", "ifIndex", idev->dev->ifindex);
+ snmp6_seq_show_item64(seq, idev->stats.ipv6,
+- snmp6_ipstats_list, offsetof(struct ipstats_mib, syncp));
++ snmp6_ipstats_list,
++ ARRAY_SIZE(snmp6_ipstats_list),
++ offsetof(struct ipstats_mib, syncp));
++
++ /* Per idev icmp stats do not have ICMP6_MIB_RATELIMITHOST */
+ snmp6_seq_show_item(seq, NULL, idev->stats.icmpv6dev->mibs,
+- snmp6_icmp6_list);
++ snmp6_icmp6_list, ARRAY_SIZE(snmp6_icmp6_list) - 1);
++
+ snmp6_seq_show_icmpv6msg(seq, idev->stats.icmpv6msgdev->mibs);
+ return 0;
+ }
+diff --git a/net/ipv6/route.c b/net/ipv6/route.c
+index 3299cfa12e21c9..3371f16b7a3e61 100644
+--- a/net/ipv6/route.c
++++ b/net/ipv6/route.c
+@@ -2943,7 +2943,7 @@ static void __ip6_rt_update_pmtu(struct dst_entry *dst, const struct sock *sk,
+
+ if (res.f6i->nh) {
+ struct fib6_nh_match_arg arg = {
+- .dev = dst_dev(dst),
++ .dev = dst_dev_rcu(dst),
+ .gw = &rt6->rt6i_gateway,
+ };
+
+@@ -3238,7 +3238,6 @@ EXPORT_SYMBOL_GPL(ip6_sk_redirect);
+
+ static unsigned int ip6_default_advmss(const struct dst_entry *dst)
+ {
+- struct net_device *dev = dst_dev(dst);
+ unsigned int mtu = dst_mtu(dst);
+ struct net *net;
+
+@@ -3246,7 +3245,7 @@ static unsigned int ip6_default_advmss(const struct dst_entry *dst)
+
+ rcu_read_lock();
+
+- net = dev_net_rcu(dev);
++ net = dst_dev_net_rcu(dst);
+ if (mtu < net->ipv6.sysctl.ip6_rt_min_advmss)
+ mtu = net->ipv6.sysctl.ip6_rt_min_advmss;
+
+@@ -4301,7 +4300,7 @@ static void rt6_do_redirect(struct dst_entry *dst, struct sock *sk, struct sk_bu
+
+ if (res.f6i->nh) {
+ struct fib6_nh_match_arg arg = {
+- .dev = dst_dev(dst),
++ .dev = dst_dev_rcu(dst),
+ .gw = &rt->rt6i_gateway,
+ };
+
+diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c
+index 2ed07fa121ab73..7609c7c31df740 100644
+--- a/net/mac80211/cfg.c
++++ b/net/mac80211/cfg.c
+@@ -3001,6 +3001,9 @@ static int ieee80211_scan(struct wiphy *wiphy,
+ struct cfg80211_scan_request *req)
+ {
+ struct ieee80211_sub_if_data *sdata;
++ struct ieee80211_link_data *link;
++ struct ieee80211_channel *chan;
++ int radio_idx;
+
+ sdata = IEEE80211_WDEV_TO_SUB_IF(req->wdev);
+
+@@ -3028,10 +3031,20 @@ static int ieee80211_scan(struct wiphy *wiphy,
+ * the frames sent while scanning on other channel will be
+ * lost)
+ */
+- if (ieee80211_num_beaconing_links(sdata) &&
+- (!(wiphy->features & NL80211_FEATURE_AP_SCAN) ||
+- !(req->flags & NL80211_SCAN_FLAG_AP)))
+- return -EOPNOTSUPP;
++ for_each_link_data(sdata, link) {
++ /* if the link is not beaconing, ignore it */
++ if (!sdata_dereference(link->u.ap.beacon, sdata))
++ continue;
++
++ chan = link->conf->chanreq.oper.chan;
++ radio_idx = cfg80211_get_radio_idx_by_chan(wiphy, chan);
++
++ if (ieee80211_is_radio_idx_in_scan_req(wiphy, req,
++ radio_idx) &&
++ (!(wiphy->features & NL80211_FEATURE_AP_SCAN) ||
++ !(req->flags & NL80211_SCAN_FLAG_AP)))
++ return -EOPNOTSUPP;
++ }
+ break;
+ case NL80211_IFTYPE_NAN:
+ default:
+diff --git a/net/mac80211/main.c b/net/mac80211/main.c
+index 3ae6104e5cb201..78f862f79aa824 100644
+--- a/net/mac80211/main.c
++++ b/net/mac80211/main.c
+@@ -1164,9 +1164,6 @@ int ieee80211_register_hw(struct ieee80211_hw *hw)
+ if (WARN_ON(!ieee80211_hw_check(hw, MFP_CAPABLE)))
+ return -EINVAL;
+
+- if (WARN_ON(!ieee80211_hw_check(hw, CONNECTION_MONITOR)))
+- return -EINVAL;
+-
+ if (WARN_ON(ieee80211_hw_check(hw, NEED_DTIM_BEFORE_ASSOC)))
+ return -EINVAL;
+
+diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
+index 4d4ff4d4917a25..59baca24aa6b90 100644
+--- a/net/mac80211/rx.c
++++ b/net/mac80211/rx.c
+@@ -5230,12 +5230,20 @@ static void __ieee80211_rx_handle_packet(struct ieee80211_hw *hw,
+ }
+
+ rx.sdata = prev_sta->sdata;
++ if (!status->link_valid && prev_sta->sta.mlo) {
++ struct link_sta_info *link_sta;
++
++ link_sta = link_sta_info_get_bss(rx.sdata,
++ hdr->addr2);
++ if (!link_sta)
++ continue;
++
++ link_id = link_sta->link_id;
++ }
++
+ if (!ieee80211_rx_data_set_sta(&rx, prev_sta, link_id))
+ goto out;
+
+- if (!status->link_valid && prev_sta->sta.mlo)
+- continue;
+-
+ ieee80211_prepare_and_rx_handle(&rx, skb, false);
+
+ prev_sta = sta;
+@@ -5243,10 +5251,18 @@ static void __ieee80211_rx_handle_packet(struct ieee80211_hw *hw,
+
+ if (prev_sta) {
+ rx.sdata = prev_sta->sdata;
+- if (!ieee80211_rx_data_set_sta(&rx, prev_sta, link_id))
+- goto out;
++ if (!status->link_valid && prev_sta->sta.mlo) {
++ struct link_sta_info *link_sta;
+
+- if (!status->link_valid && prev_sta->sta.mlo)
++ link_sta = link_sta_info_get_bss(rx.sdata,
++ hdr->addr2);
++ if (!link_sta)
++ goto out;
++
++ link_id = link_sta->link_id;
++ }
++
++ if (!ieee80211_rx_data_set_sta(&rx, prev_sta, link_id))
+ goto out;
+
+ if (ieee80211_prepare_and_rx_handle(&rx, skb, true))
+diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
+index 8c550aab9bdce0..ebcec5241a944d 100644
+--- a/net/mac80211/sta_info.c
++++ b/net/mac80211/sta_info.c
+@@ -3206,16 +3206,20 @@ void sta_set_sinfo(struct sta_info *sta, struct station_info *sinfo,
+ struct link_sta_info *link_sta;
+
+ ether_addr_copy(sinfo->mld_addr, sta->addr);
++
++ /* assign valid links first for iteration */
++ sinfo->valid_links = sta->sta.valid_links;
++
+ for_each_valid_link(sinfo, link_id) {
+ link_sta = wiphy_dereference(sta->local->hw.wiphy,
+ sta->link[link_id]);
+ link = wiphy_dereference(sdata->local->hw.wiphy,
+ sdata->link[link_id]);
+
+- if (!link_sta || !sinfo->links[link_id] || !link)
++ if (!link_sta || !sinfo->links[link_id] || !link) {
++ sinfo->valid_links &= ~BIT(link_id);
+ continue;
+-
+- sinfo->valid_links = sta->sta.valid_links;
++ }
+ sta_set_link_sinfo(sta, sinfo->links[link_id],
+ link, tidstats);
+ }
+diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
+index fed40dae5583a3..e8ffa62ec183f3 100644
+--- a/net/mptcp/ctrl.c
++++ b/net/mptcp/ctrl.c
+@@ -501,10 +501,15 @@ void mptcp_active_enable(struct sock *sk)
+ struct mptcp_pernet *pernet = mptcp_get_pernet(sock_net(sk));
+
+ if (atomic_read(&pernet->active_disable_times)) {
+- struct dst_entry *dst = sk_dst_get(sk);
++ struct net_device *dev;
++ struct dst_entry *dst;
+
+- if (dst && dst->dev && (dst->dev->flags & IFF_LOOPBACK))
++ rcu_read_lock();
++ dst = __sk_dst_get(sk);
++ dev = dst ? dst_dev_rcu(dst) : NULL;
++ if (dev && (dev->flags & IFF_LOOPBACK))
+ atomic_set(&pernet->active_disable_times, 0);
++ rcu_read_unlock();
+ }
+ }
+
+diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
+index f31a3a79531a2e..e8325890a32238 100644
+--- a/net/mptcp/subflow.c
++++ b/net/mptcp/subflow.c
+@@ -1721,19 +1721,14 @@ static void mptcp_attach_cgroup(struct sock *parent, struct sock *child)
+ /* only the additional subflows created by kworkers have to be modified */
+ if (cgroup_id(sock_cgroup_ptr(parent_skcd)) !=
+ cgroup_id(sock_cgroup_ptr(child_skcd))) {
+-#ifdef CONFIG_MEMCG
+- struct mem_cgroup *memcg = parent->sk_memcg;
+-
+- mem_cgroup_sk_free(child);
+- if (memcg && css_tryget(&memcg->css))
+- child->sk_memcg = memcg;
+-#endif /* CONFIG_MEMCG */
+-
+ cgroup_sk_free(child_skcd);
+ *child_skcd = *parent_skcd;
+ cgroup_sk_clone(child_skcd);
+ }
+ #endif /* CONFIG_SOCK_CGROUP_DATA */
++
++ if (mem_cgroup_sockets_enabled)
++ mem_cgroup_sk_inherit(parent, child);
+ }
+
+ static void mptcp_subflow_ops_override(struct sock *ssk)
+diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
+index 5251524b96afac..5e4453e9ef8e73 100644
+--- a/net/netfilter/ipset/ip_set_hash_gen.h
++++ b/net/netfilter/ipset/ip_set_hash_gen.h
+@@ -63,7 +63,7 @@ struct hbucket {
+ : jhash_size((htable_bits) - HTABLE_REGION_BITS))
+ #define ahash_sizeof_regions(htable_bits) \
+ (ahash_numof_locks(htable_bits) * sizeof(struct ip_set_region))
+-#define ahash_region(n, htable_bits) \
++#define ahash_region(n) \
+ ((n) / jhash_size(HTABLE_REGION_BITS))
+ #define ahash_bucket_start(h, htable_bits) \
+ ((htable_bits) < HTABLE_REGION_BITS ? 0 \
+@@ -702,7 +702,7 @@ mtype_resize(struct ip_set *set, bool retried)
+ #endif
+ key = HKEY(data, h->initval, htable_bits);
+ m = __ipset_dereference(hbucket(t, key));
+- nr = ahash_region(key, htable_bits);
++ nr = ahash_region(key);
+ if (!m) {
+ m = kzalloc(sizeof(*m) +
+ AHASH_INIT_SIZE * dsize,
+@@ -852,7 +852,7 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
+ rcu_read_lock_bh();
+ t = rcu_dereference_bh(h->table);
+ key = HKEY(value, h->initval, t->htable_bits);
+- r = ahash_region(key, t->htable_bits);
++ r = ahash_region(key);
+ atomic_inc(&t->uref);
+ elements = t->hregion[r].elements;
+ maxelem = t->maxelem;
+@@ -1050,7 +1050,7 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext,
+ rcu_read_lock_bh();
+ t = rcu_dereference_bh(h->table);
+ key = HKEY(value, h->initval, t->htable_bits);
+- r = ahash_region(key, t->htable_bits);
++ r = ahash_region(key);
+ atomic_inc(&t->uref);
+ rcu_read_unlock_bh();
+
+diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
+index 965f3c8e5089d3..37ebb0cb62b8b6 100644
+--- a/net/netfilter/ipvs/ip_vs_conn.c
++++ b/net/netfilter/ipvs/ip_vs_conn.c
+@@ -885,7 +885,7 @@ static void ip_vs_conn_expire(struct timer_list *t)
+ * conntrack cleanup for the net.
+ */
+ smp_rmb();
+- if (ipvs->enable)
++ if (READ_ONCE(ipvs->enable))
+ ip_vs_conn_drop_conntrack(cp);
+ }
+
+@@ -1439,7 +1439,7 @@ void ip_vs_expire_nodest_conn_flush(struct netns_ipvs *ipvs)
+ cond_resched_rcu();
+
+ /* netns clean up started, abort delayed work */
+- if (!ipvs->enable)
++ if (!READ_ONCE(ipvs->enable))
+ break;
+ }
+ rcu_read_unlock();
+diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
+index c7a8a08b730891..5ea7ab8bf4dcc2 100644
+--- a/net/netfilter/ipvs/ip_vs_core.c
++++ b/net/netfilter/ipvs/ip_vs_core.c
+@@ -1353,9 +1353,6 @@ ip_vs_out_hook(void *priv, struct sk_buff *skb, const struct nf_hook_state *stat
+ if (unlikely(!skb_dst(skb)))
+ return NF_ACCEPT;
+
+- if (!ipvs->enable)
+- return NF_ACCEPT;
+-
+ ip_vs_fill_iph_skb(af, skb, false, &iph);
+ #ifdef CONFIG_IP_VS_IPV6
+ if (af == AF_INET6) {
+@@ -1940,7 +1937,7 @@ ip_vs_in_hook(void *priv, struct sk_buff *skb, const struct nf_hook_state *state
+ return NF_ACCEPT;
+ }
+ /* ipvs enabled in this netns ? */
+- if (unlikely(sysctl_backup_only(ipvs) || !ipvs->enable))
++ if (unlikely(sysctl_backup_only(ipvs)))
+ return NF_ACCEPT;
+
+ ip_vs_fill_iph_skb(af, skb, false, &iph);
+@@ -2108,7 +2105,7 @@ ip_vs_forward_icmp(void *priv, struct sk_buff *skb,
+ int r;
+
+ /* ipvs enabled in this netns ? */
+- if (unlikely(sysctl_backup_only(ipvs) || !ipvs->enable))
++ if (unlikely(sysctl_backup_only(ipvs)))
+ return NF_ACCEPT;
+
+ if (state->pf == NFPROTO_IPV4) {
+@@ -2295,7 +2292,7 @@ static int __net_init __ip_vs_init(struct net *net)
+ return -ENOMEM;
+
+ /* Hold the beast until a service is registered */
+- ipvs->enable = 0;
++ WRITE_ONCE(ipvs->enable, 0);
+ ipvs->net = net;
+ /* Counters used for creating unique names */
+ ipvs->gen = atomic_read(&ipvs_netns_cnt);
+@@ -2367,7 +2364,7 @@ static void __net_exit __ip_vs_dev_cleanup_batch(struct list_head *net_list)
+ ipvs = net_ipvs(net);
+ ip_vs_unregister_hooks(ipvs, AF_INET);
+ ip_vs_unregister_hooks(ipvs, AF_INET6);
+- ipvs->enable = 0; /* Disable packet reception */
++ WRITE_ONCE(ipvs->enable, 0); /* Disable packet reception */
+ smp_wmb();
+ ip_vs_sync_net_cleanup(ipvs);
+ }
+diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
+index 6a6fc447853372..4c8fa22be88ade 100644
+--- a/net/netfilter/ipvs/ip_vs_ctl.c
++++ b/net/netfilter/ipvs/ip_vs_ctl.c
+@@ -256,7 +256,7 @@ static void est_reload_work_handler(struct work_struct *work)
+ struct ip_vs_est_kt_data *kd = ipvs->est_kt_arr[id];
+
+ /* netns clean up started, abort delayed work */
+- if (!ipvs->enable)
++ if (!READ_ONCE(ipvs->enable))
+ goto unlock;
+ if (!kd)
+ continue;
+@@ -1483,9 +1483,9 @@ ip_vs_add_service(struct netns_ipvs *ipvs, struct ip_vs_service_user_kern *u,
+
+ *svc_p = svc;
+
+- if (!ipvs->enable) {
++ if (!READ_ONCE(ipvs->enable)) {
+ /* Now there is a service - full throttle */
+- ipvs->enable = 1;
++ WRITE_ONCE(ipvs->enable, 1);
+
+ /* Start estimation for first time */
+ ip_vs_est_reload_start(ipvs);
+diff --git a/net/netfilter/ipvs/ip_vs_est.c b/net/netfilter/ipvs/ip_vs_est.c
+index 15049b82673272..93a925f1ed9b81 100644
+--- a/net/netfilter/ipvs/ip_vs_est.c
++++ b/net/netfilter/ipvs/ip_vs_est.c
+@@ -231,7 +231,7 @@ static int ip_vs_estimation_kthread(void *data)
+ void ip_vs_est_reload_start(struct netns_ipvs *ipvs)
+ {
+ /* Ignore reloads before first service is added */
+- if (!ipvs->enable)
++ if (!READ_ONCE(ipvs->enable))
+ return;
+ ip_vs_est_stopped_recalc(ipvs);
+ /* Bump the kthread configuration genid */
+@@ -306,7 +306,7 @@ static int ip_vs_est_add_kthread(struct netns_ipvs *ipvs)
+ int i;
+
+ if ((unsigned long)ipvs->est_kt_count >= ipvs->est_max_threads &&
+- ipvs->enable && ipvs->est_max_threads)
++ READ_ONCE(ipvs->enable) && ipvs->est_max_threads)
+ return -EINVAL;
+
+ mutex_lock(&ipvs->est_mutex);
+@@ -343,7 +343,7 @@ static int ip_vs_est_add_kthread(struct netns_ipvs *ipvs)
+ }
+
+ /* Start kthread tasks only when services are present */
+- if (ipvs->enable && !ip_vs_est_stopped(ipvs)) {
++ if (READ_ONCE(ipvs->enable) && !ip_vs_est_stopped(ipvs)) {
+ ret = ip_vs_est_kthread_start(ipvs, kd);
+ if (ret < 0)
+ goto out;
+@@ -486,7 +486,7 @@ int ip_vs_start_estimator(struct netns_ipvs *ipvs, struct ip_vs_stats *stats)
+ struct ip_vs_estimator *est = &stats->est;
+ int ret;
+
+- if (!ipvs->est_max_threads && ipvs->enable)
++ if (!ipvs->est_max_threads && READ_ONCE(ipvs->enable))
+ ipvs->est_max_threads = ip_vs_est_max_threads(ipvs);
+
+ est->ktid = -1;
+@@ -663,7 +663,7 @@ static int ip_vs_est_calc_limits(struct netns_ipvs *ipvs, int *chain_max)
+ /* Wait for cpufreq frequency transition */
+ wait_event_idle_timeout(wq, kthread_should_stop(),
+ HZ / 50);
+- if (!ipvs->enable || kthread_should_stop())
++ if (!READ_ONCE(ipvs->enable) || kthread_should_stop())
+ goto stop;
+ }
+
+@@ -681,7 +681,7 @@ static int ip_vs_est_calc_limits(struct netns_ipvs *ipvs, int *chain_max)
+ rcu_read_unlock();
+ local_bh_enable();
+
+- if (!ipvs->enable || kthread_should_stop())
++ if (!READ_ONCE(ipvs->enable) || kthread_should_stop())
+ goto stop;
+ cond_resched();
+
+@@ -757,7 +757,7 @@ static void ip_vs_est_calc_phase(struct netns_ipvs *ipvs)
+ mutex_lock(&ipvs->est_mutex);
+ for (id = 1; id < ipvs->est_kt_count; id++) {
+ /* netns clean up started, abort */
+- if (!ipvs->enable)
++ if (!READ_ONCE(ipvs->enable))
+ goto unlock2;
+ kd = ipvs->est_kt_arr[id];
+ if (!kd)
+@@ -787,7 +787,7 @@ static void ip_vs_est_calc_phase(struct netns_ipvs *ipvs)
+ id = ipvs->est_kt_count;
+
+ next_kt:
+- if (!ipvs->enable || kthread_should_stop())
++ if (!READ_ONCE(ipvs->enable) || kthread_should_stop())
+ goto unlock;
+ id--;
+ if (id < 0)
+diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c
+index d8a284999544b0..206c6700e2006e 100644
+--- a/net/netfilter/ipvs/ip_vs_ftp.c
++++ b/net/netfilter/ipvs/ip_vs_ftp.c
+@@ -53,6 +53,7 @@ enum {
+ IP_VS_FTP_EPSV,
+ };
+
++static bool exiting_module;
+ /*
+ * List of ports (up to IP_VS_APP_MAX_PORTS) to be handled by helper
+ * First port is set to the default port.
+@@ -605,7 +606,7 @@ static void __ip_vs_ftp_exit(struct net *net)
+ {
+ struct netns_ipvs *ipvs = net_ipvs(net);
+
+- if (!ipvs)
++ if (!ipvs || !exiting_module)
+ return;
+
+ unregister_ip_vs_app(ipvs, &ip_vs_ftp);
+@@ -627,6 +628,7 @@ static int __init ip_vs_ftp_init(void)
+ */
+ static void __exit ip_vs_ftp_exit(void)
+ {
++ exiting_module = true;
+ unregister_pernet_subsys(&ip_vs_ftp_ops);
+ /* rcu_barrier() is called by netns */
+ }
+diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c
+index 1f14ef0436c65f..708b79380f047f 100644
+--- a/net/netfilter/nf_conntrack_standalone.c
++++ b/net/netfilter/nf_conntrack_standalone.c
+@@ -317,6 +317,9 @@ static int ct_seq_show(struct seq_file *s, void *v)
+ smp_acquire__after_ctrl_dep();
+
+ if (nf_ct_should_gc(ct)) {
++ struct ct_iter_state *st = s->private;
++
++ st->skip_elems--;
+ nf_ct_kill(ct);
+ goto release;
+ }
+diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c
+index e598a2a252b0a5..811d02b4c4f7cf 100644
+--- a/net/netfilter/nfnetlink.c
++++ b/net/netfilter/nfnetlink.c
+@@ -376,6 +376,7 @@ static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh,
+ const struct nfnetlink_subsystem *ss;
+ const struct nfnl_callback *nc;
+ struct netlink_ext_ack extack;
++ struct nlmsghdr *onlh = nlh;
+ LIST_HEAD(err_list);
+ u32 status;
+ int err;
+@@ -386,6 +387,7 @@ static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh,
+ status = 0;
+ replay_abort:
+ skb = netlink_skb_clone(oskb, GFP_KERNEL);
++ nlh = onlh;
+ if (!skb)
+ return netlink_ack(oskb, nlh, -ENOMEM, NULL);
+
+diff --git a/net/nfc/nci/ntf.c b/net/nfc/nci/ntf.c
+index a818eff27e6bc2..418b84e2b2605f 100644
+--- a/net/nfc/nci/ntf.c
++++ b/net/nfc/nci/ntf.c
+@@ -27,11 +27,16 @@
+
+ /* Handle NCI Notification packets */
+
+-static void nci_core_reset_ntf_packet(struct nci_dev *ndev,
+- const struct sk_buff *skb)
++static int nci_core_reset_ntf_packet(struct nci_dev *ndev,
++ const struct sk_buff *skb)
+ {
+ /* Handle NCI 2.x core reset notification */
+- const struct nci_core_reset_ntf *ntf = (void *)skb->data;
++ const struct nci_core_reset_ntf *ntf;
++
++ if (skb->len < sizeof(struct nci_core_reset_ntf))
++ return -EINVAL;
++
++ ntf = (struct nci_core_reset_ntf *)skb->data;
+
+ ndev->nci_ver = ntf->nci_ver;
+ pr_debug("nci_ver 0x%x, config_status 0x%x\n",
+@@ -42,15 +47,22 @@ static void nci_core_reset_ntf_packet(struct nci_dev *ndev,
+ __le32_to_cpu(ntf->manufact_specific_info);
+
+ nci_req_complete(ndev, NCI_STATUS_OK);
++
++ return 0;
+ }
+
+-static void nci_core_conn_credits_ntf_packet(struct nci_dev *ndev,
+- struct sk_buff *skb)
++static int nci_core_conn_credits_ntf_packet(struct nci_dev *ndev,
++ struct sk_buff *skb)
+ {
+- struct nci_core_conn_credit_ntf *ntf = (void *) skb->data;
++ struct nci_core_conn_credit_ntf *ntf;
+ struct nci_conn_info *conn_info;
+ int i;
+
++ if (skb->len < sizeof(struct nci_core_conn_credit_ntf))
++ return -EINVAL;
++
++ ntf = (struct nci_core_conn_credit_ntf *)skb->data;
++
+ pr_debug("num_entries %d\n", ntf->num_entries);
+
+ if (ntf->num_entries > NCI_MAX_NUM_CONN)
+@@ -68,7 +80,7 @@ static void nci_core_conn_credits_ntf_packet(struct nci_dev *ndev,
+ conn_info = nci_get_conn_info_by_conn_id(ndev,
+ ntf->conn_entries[i].conn_id);
+ if (!conn_info)
+- return;
++ return 0;
+
+ atomic_add(ntf->conn_entries[i].credits,
+ &conn_info->credits_cnt);
+@@ -77,12 +89,19 @@ static void nci_core_conn_credits_ntf_packet(struct nci_dev *ndev,
+ /* trigger the next tx */
+ if (!skb_queue_empty(&ndev->tx_q))
+ queue_work(ndev->tx_wq, &ndev->tx_work);
++
++ return 0;
+ }
+
+-static void nci_core_generic_error_ntf_packet(struct nci_dev *ndev,
+- const struct sk_buff *skb)
++static int nci_core_generic_error_ntf_packet(struct nci_dev *ndev,
++ const struct sk_buff *skb)
+ {
+- __u8 status = skb->data[0];
++ __u8 status;
++
++ if (skb->len < 1)
++ return -EINVAL;
++
++ status = skb->data[0];
+
+ pr_debug("status 0x%x\n", status);
+
+@@ -91,12 +110,19 @@ static void nci_core_generic_error_ntf_packet(struct nci_dev *ndev,
+ (the state remains the same) */
+ nci_req_complete(ndev, status);
+ }
++
++ return 0;
+ }
+
+-static void nci_core_conn_intf_error_ntf_packet(struct nci_dev *ndev,
+- struct sk_buff *skb)
++static int nci_core_conn_intf_error_ntf_packet(struct nci_dev *ndev,
++ struct sk_buff *skb)
+ {
+- struct nci_core_intf_error_ntf *ntf = (void *) skb->data;
++ struct nci_core_intf_error_ntf *ntf;
++
++ if (skb->len < sizeof(struct nci_core_intf_error_ntf))
++ return -EINVAL;
++
++ ntf = (struct nci_core_intf_error_ntf *)skb->data;
+
+ ntf->conn_id = nci_conn_id(&ntf->conn_id);
+
+@@ -105,6 +131,8 @@ static void nci_core_conn_intf_error_ntf_packet(struct nci_dev *ndev,
+ /* complete the data exchange transaction, if exists */
+ if (test_bit(NCI_DATA_EXCHANGE, &ndev->flags))
+ nci_data_exchange_complete(ndev, NULL, ntf->conn_id, -EIO);
++
++ return 0;
+ }
+
+ static const __u8 *
+@@ -329,13 +357,18 @@ void nci_clear_target_list(struct nci_dev *ndev)
+ ndev->n_targets = 0;
+ }
+
+-static void nci_rf_discover_ntf_packet(struct nci_dev *ndev,
+- const struct sk_buff *skb)
++static int nci_rf_discover_ntf_packet(struct nci_dev *ndev,
++ const struct sk_buff *skb)
+ {
+ struct nci_rf_discover_ntf ntf;
+- const __u8 *data = skb->data;
++ const __u8 *data;
+ bool add_target = true;
+
++ if (skb->len < sizeof(struct nci_rf_discover_ntf))
++ return -EINVAL;
++
++ data = skb->data;
++
+ ntf.rf_discovery_id = *data++;
+ ntf.rf_protocol = *data++;
+ ntf.rf_tech_and_mode = *data++;
+@@ -390,6 +423,8 @@ static void nci_rf_discover_ntf_packet(struct nci_dev *ndev,
+ nfc_targets_found(ndev->nfc_dev, ndev->targets,
+ ndev->n_targets);
+ }
++
++ return 0;
+ }
+
+ static int nci_extract_activation_params_iso_dep(struct nci_dev *ndev,
+@@ -553,14 +588,19 @@ static int nci_store_ats_nfc_iso_dep(struct nci_dev *ndev,
+ return NCI_STATUS_OK;
+ }
+
+-static void nci_rf_intf_activated_ntf_packet(struct nci_dev *ndev,
+- const struct sk_buff *skb)
++static int nci_rf_intf_activated_ntf_packet(struct nci_dev *ndev,
++ const struct sk_buff *skb)
+ {
+ struct nci_conn_info *conn_info;
+ struct nci_rf_intf_activated_ntf ntf;
+- const __u8 *data = skb->data;
++ const __u8 *data;
+ int err = NCI_STATUS_OK;
+
++ if (skb->len < sizeof(struct nci_rf_intf_activated_ntf))
++ return -EINVAL;
++
++ data = skb->data;
++
+ ntf.rf_discovery_id = *data++;
+ ntf.rf_interface = *data++;
+ ntf.rf_protocol = *data++;
+@@ -667,7 +707,7 @@ static void nci_rf_intf_activated_ntf_packet(struct nci_dev *ndev,
+ if (err == NCI_STATUS_OK) {
+ conn_info = ndev->rf_conn_info;
+ if (!conn_info)
+- return;
++ return 0;
+
+ conn_info->max_pkt_payload_len = ntf.max_data_pkt_payload_size;
+ conn_info->initial_num_credits = ntf.initial_num_credits;
+@@ -721,19 +761,26 @@ static void nci_rf_intf_activated_ntf_packet(struct nci_dev *ndev,
+ pr_err("error when signaling tm activation\n");
+ }
+ }
++
++ return 0;
+ }
+
+-static void nci_rf_deactivate_ntf_packet(struct nci_dev *ndev,
+- const struct sk_buff *skb)
++static int nci_rf_deactivate_ntf_packet(struct nci_dev *ndev,
++ const struct sk_buff *skb)
+ {
+ const struct nci_conn_info *conn_info;
+- const struct nci_rf_deactivate_ntf *ntf = (void *)skb->data;
++ const struct nci_rf_deactivate_ntf *ntf;
++
++ if (skb->len < sizeof(struct nci_rf_deactivate_ntf))
++ return -EINVAL;
++
++ ntf = (struct nci_rf_deactivate_ntf *)skb->data;
+
+ pr_debug("entry, type 0x%x, reason 0x%x\n", ntf->type, ntf->reason);
+
+ conn_info = ndev->rf_conn_info;
+ if (!conn_info)
+- return;
++ return 0;
+
+ /* drop tx data queue */
+ skb_queue_purge(&ndev->tx_q);
+@@ -765,14 +812,20 @@ static void nci_rf_deactivate_ntf_packet(struct nci_dev *ndev,
+ }
+
+ nci_req_complete(ndev, NCI_STATUS_OK);
++
++ return 0;
+ }
+
+-static void nci_nfcee_discover_ntf_packet(struct nci_dev *ndev,
+- const struct sk_buff *skb)
++static int nci_nfcee_discover_ntf_packet(struct nci_dev *ndev,
++ const struct sk_buff *skb)
+ {
+ u8 status = NCI_STATUS_OK;
+- const struct nci_nfcee_discover_ntf *nfcee_ntf =
+- (struct nci_nfcee_discover_ntf *)skb->data;
++ const struct nci_nfcee_discover_ntf *nfcee_ntf;
++
++ if (skb->len < sizeof(struct nci_nfcee_discover_ntf))
++ return -EINVAL;
++
++ nfcee_ntf = (struct nci_nfcee_discover_ntf *)skb->data;
+
+ /* NFCForum NCI 9.2.1 HCI Network Specific Handling
+ * If the NFCC supports the HCI Network, it SHALL return one,
+@@ -783,6 +836,8 @@ static void nci_nfcee_discover_ntf_packet(struct nci_dev *ndev,
+ ndev->cur_params.id = nfcee_ntf->nfcee_id;
+
+ nci_req_complete(ndev, status);
++
++ return 0;
+ }
+
+ void nci_ntf_packet(struct nci_dev *ndev, struct sk_buff *skb)
+@@ -809,35 +864,43 @@ void nci_ntf_packet(struct nci_dev *ndev, struct sk_buff *skb)
+
+ switch (ntf_opcode) {
+ case NCI_OP_CORE_RESET_NTF:
+- nci_core_reset_ntf_packet(ndev, skb);
++ if (nci_core_reset_ntf_packet(ndev, skb))
++ goto end;
+ break;
+
+ case NCI_OP_CORE_CONN_CREDITS_NTF:
+- nci_core_conn_credits_ntf_packet(ndev, skb);
++ if (nci_core_conn_credits_ntf_packet(ndev, skb))
++ goto end;
+ break;
+
+ case NCI_OP_CORE_GENERIC_ERROR_NTF:
+- nci_core_generic_error_ntf_packet(ndev, skb);
++ if (nci_core_generic_error_ntf_packet(ndev, skb))
++ goto end;
+ break;
+
+ case NCI_OP_CORE_INTF_ERROR_NTF:
+- nci_core_conn_intf_error_ntf_packet(ndev, skb);
++ if (nci_core_conn_intf_error_ntf_packet(ndev, skb))
++ goto end;
+ break;
+
+ case NCI_OP_RF_DISCOVER_NTF:
+- nci_rf_discover_ntf_packet(ndev, skb);
++ if (nci_rf_discover_ntf_packet(ndev, skb))
++ goto end;
+ break;
+
+ case NCI_OP_RF_INTF_ACTIVATED_NTF:
+- nci_rf_intf_activated_ntf_packet(ndev, skb);
++ if (nci_rf_intf_activated_ntf_packet(ndev, skb))
++ goto end;
+ break;
+
+ case NCI_OP_RF_DEACTIVATE_NTF:
+- nci_rf_deactivate_ntf_packet(ndev, skb);
++ if (nci_rf_deactivate_ntf_packet(ndev, skb))
++ goto end;
+ break;
+
+ case NCI_OP_NFCEE_DISCOVER_NTF:
+- nci_nfcee_discover_ntf_packet(ndev, skb);
++ if (nci_nfcee_discover_ntf_packet(ndev, skb))
++ goto end;
+ break;
+
+ case NCI_OP_RF_NFCEE_ACTION_NTF:
+diff --git a/net/smc/smc_clc.c b/net/smc/smc_clc.c
+index 08be56dfb3f24e..09745baa101700 100644
+--- a/net/smc/smc_clc.c
++++ b/net/smc/smc_clc.c
+@@ -509,10 +509,10 @@ static bool smc_clc_msg_hdr_valid(struct smc_clc_msg_hdr *clcm, bool check_trl)
+ }
+
+ /* find ipv4 addr on device and get the prefix len, fill CLC proposal msg */
+-static int smc_clc_prfx_set4_rcu(struct dst_entry *dst, __be32 ipv4,
++static int smc_clc_prfx_set4_rcu(struct net_device *dev, __be32 ipv4,
+ struct smc_clc_msg_proposal_prefix *prop)
+ {
+- struct in_device *in_dev = __in_dev_get_rcu(dst->dev);
++ struct in_device *in_dev = __in_dev_get_rcu(dev);
+ const struct in_ifaddr *ifa;
+
+ if (!in_dev)
+@@ -530,12 +530,12 @@ static int smc_clc_prfx_set4_rcu(struct dst_entry *dst, __be32 ipv4,
+ }
+
+ /* fill CLC proposal msg with ipv6 prefixes from device */
+-static int smc_clc_prfx_set6_rcu(struct dst_entry *dst,
++static int smc_clc_prfx_set6_rcu(struct net_device *dev,
+ struct smc_clc_msg_proposal_prefix *prop,
+ struct smc_clc_ipv6_prefix *ipv6_prfx)
+ {
+ #if IS_ENABLED(CONFIG_IPV6)
+- struct inet6_dev *in6_dev = __in6_dev_get(dst->dev);
++ struct inet6_dev *in6_dev = __in6_dev_get(dev);
+ struct inet6_ifaddr *ifa;
+ int cnt = 0;
+
+@@ -564,41 +564,44 @@ static int smc_clc_prfx_set(struct socket *clcsock,
+ struct smc_clc_msg_proposal_prefix *prop,
+ struct smc_clc_ipv6_prefix *ipv6_prfx)
+ {
+- struct dst_entry *dst = sk_dst_get(clcsock->sk);
+ struct sockaddr_storage addrs;
+ struct sockaddr_in6 *addr6;
+ struct sockaddr_in *addr;
++ struct net_device *dev;
++ struct dst_entry *dst;
+ int rc = -ENOENT;
+
+- if (!dst) {
+- rc = -ENOTCONN;
+- goto out;
+- }
+- if (!dst->dev) {
+- rc = -ENODEV;
+- goto out_rel;
+- }
+ /* get address to which the internal TCP socket is bound */
+ if (kernel_getsockname(clcsock, (struct sockaddr *)&addrs) < 0)
+- goto out_rel;
++ goto out;
++
+ /* analyze IP specific data of net_device belonging to TCP socket */
+ addr6 = (struct sockaddr_in6 *)&addrs;
++
+ rcu_read_lock();
++
++ dst = __sk_dst_get(clcsock->sk);
++ dev = dst ? dst_dev_rcu(dst) : NULL;
++ if (!dev) {
++ rc = -ENODEV;
++ goto out_unlock;
++ }
++
+ if (addrs.ss_family == PF_INET) {
+ /* IPv4 */
+ addr = (struct sockaddr_in *)&addrs;
+- rc = smc_clc_prfx_set4_rcu(dst, addr->sin_addr.s_addr, prop);
++ rc = smc_clc_prfx_set4_rcu(dev, addr->sin_addr.s_addr, prop);
+ } else if (ipv6_addr_v4mapped(&addr6->sin6_addr)) {
+ /* mapped IPv4 address - peer is IPv4 only */
+- rc = smc_clc_prfx_set4_rcu(dst, addr6->sin6_addr.s6_addr32[3],
++ rc = smc_clc_prfx_set4_rcu(dev, addr6->sin6_addr.s6_addr32[3],
+ prop);
+ } else {
+ /* IPv6 */
+- rc = smc_clc_prfx_set6_rcu(dst, prop, ipv6_prfx);
++ rc = smc_clc_prfx_set6_rcu(dev, prop, ipv6_prfx);
+ }
++
++out_unlock:
+ rcu_read_unlock();
+-out_rel:
+- dst_release(dst);
+ out:
+ return rc;
+ }
+@@ -654,26 +657,26 @@ static int smc_clc_prfx_match6_rcu(struct net_device *dev,
+ int smc_clc_prfx_match(struct socket *clcsock,
+ struct smc_clc_msg_proposal_prefix *prop)
+ {
+- struct dst_entry *dst = sk_dst_get(clcsock->sk);
++ struct net_device *dev;
++ struct dst_entry *dst;
+ int rc;
+
+- if (!dst) {
+- rc = -ENOTCONN;
+- goto out;
+- }
+- if (!dst->dev) {
++ rcu_read_lock();
++
++ dst = __sk_dst_get(clcsock->sk);
++ dev = dst ? dst_dev_rcu(dst) : NULL;
++ if (!dev) {
+ rc = -ENODEV;
+- goto out_rel;
++ goto out;
+ }
+- rcu_read_lock();
++
+ if (!prop->ipv6_prefixes_cnt)
+- rc = smc_clc_prfx_match4_rcu(dst->dev, prop);
++ rc = smc_clc_prfx_match4_rcu(dev, prop);
+ else
+- rc = smc_clc_prfx_match6_rcu(dst->dev, prop);
+- rcu_read_unlock();
+-out_rel:
+- dst_release(dst);
++ rc = smc_clc_prfx_match6_rcu(dev, prop);
+ out:
++ rcu_read_unlock();
++
+ return rc;
+ }
+
+diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
+index 262746e304ddae..2a559a98541c75 100644
+--- a/net/smc/smc_core.c
++++ b/net/smc/smc_core.c
+@@ -1883,35 +1883,32 @@ static int smc_vlan_by_tcpsk_walk(struct net_device *lower_dev,
+ /* Determine vlan of internal TCP socket. */
+ int smc_vlan_by_tcpsk(struct socket *clcsock, struct smc_init_info *ini)
+ {
+- struct dst_entry *dst = sk_dst_get(clcsock->sk);
+ struct netdev_nested_priv priv;
+ struct net_device *ndev;
++ struct dst_entry *dst;
+ int rc = 0;
+
+ ini->vlan_id = 0;
+- if (!dst) {
+- rc = -ENOTCONN;
+- goto out;
+- }
+- if (!dst->dev) {
++
++ rcu_read_lock();
++
++ dst = __sk_dst_get(clcsock->sk);
++ ndev = dst ? dst_dev_rcu(dst) : NULL;
++ if (!ndev) {
+ rc = -ENODEV;
+- goto out_rel;
++ goto out;
+ }
+
+- ndev = dst->dev;
+ if (is_vlan_dev(ndev)) {
+ ini->vlan_id = vlan_dev_vlan_id(ndev);
+- goto out_rel;
++ goto out;
+ }
+
+ priv.data = (void *)&ini->vlan_id;
+- rtnl_lock();
+- netdev_walk_all_lower_dev(ndev, smc_vlan_by_tcpsk_walk, &priv);
+- rtnl_unlock();
+-
+-out_rel:
+- dst_release(dst);
++ netdev_walk_all_lower_dev_rcu(ndev, smc_vlan_by_tcpsk_walk, &priv);
+ out:
++ rcu_read_unlock();
++
+ return rc;
+ }
+
+diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
+index 76ad29e31d605d..db3043b1e3fdb1 100644
+--- a/net/smc/smc_pnet.c
++++ b/net/smc/smc_pnet.c
+@@ -1126,37 +1126,38 @@ static void smc_pnet_find_ism_by_pnetid(struct net_device *ndev,
+ */
+ void smc_pnet_find_roce_resource(struct sock *sk, struct smc_init_info *ini)
+ {
+- struct dst_entry *dst = sk_dst_get(sk);
+-
+- if (!dst)
+- goto out;
+- if (!dst->dev)
+- goto out_rel;
++ struct net_device *dev;
++ struct dst_entry *dst;
+
+- smc_pnet_find_roce_by_pnetid(dst->dev, ini);
++ rcu_read_lock();
++ dst = __sk_dst_get(sk);
++ dev = dst ? dst_dev_rcu(dst) : NULL;
++ dev_hold(dev);
++ rcu_read_unlock();
+
+-out_rel:
+- dst_release(dst);
+-out:
+- return;
++ if (dev) {
++ smc_pnet_find_roce_by_pnetid(dev, ini);
++ dev_put(dev);
++ }
+ }
+
+ void smc_pnet_find_ism_resource(struct sock *sk, struct smc_init_info *ini)
+ {
+- struct dst_entry *dst = sk_dst_get(sk);
++ struct net_device *dev;
++ struct dst_entry *dst;
+
+ ini->ism_dev[0] = NULL;
+- if (!dst)
+- goto out;
+- if (!dst->dev)
+- goto out_rel;
+
+- smc_pnet_find_ism_by_pnetid(dst->dev, ini);
++ rcu_read_lock();
++ dst = __sk_dst_get(sk);
++ dev = dst ? dst_dev_rcu(dst) : NULL;
++ dev_hold(dev);
++ rcu_read_unlock();
+
+-out_rel:
+- dst_release(dst);
+-out:
+- return;
++ if (dev) {
++ smc_pnet_find_ism_by_pnetid(dev, ini);
++ dev_put(dev);
++ }
+ }
+
+ /* Lookup and apply a pnet table entry to the given ib device.
+diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
+index e82212f6b5620b..a8ec30759a184e 100644
+--- a/net/sunrpc/auth_gss/svcauth_gss.c
++++ b/net/sunrpc/auth_gss/svcauth_gss.c
+@@ -724,7 +724,7 @@ svcauth_gss_verify_header(struct svc_rqst *rqstp, struct rsc *rsci,
+ rqstp->rq_auth_stat = rpc_autherr_badverf;
+ return SVC_DENIED;
+ }
+- if (flavor != RPC_AUTH_GSS) {
++ if (flavor != RPC_AUTH_GSS || checksum.len < XDR_UNIT) {
+ rqstp->rq_auth_stat = rpc_autherr_badverf;
+ return SVC_DENIED;
+ }
+diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
+index f672a62a9a52f6..a82fdcf199690f 100644
+--- a/net/tls/tls_device.c
++++ b/net/tls/tls_device.c
+@@ -123,17 +123,19 @@ static void tls_device_queue_ctx_destruction(struct tls_context *ctx)
+ /* We assume that the socket is already connected */
+ static struct net_device *get_netdev_for_sock(struct sock *sk)
+ {
+- struct dst_entry *dst = sk_dst_get(sk);
+- struct net_device *netdev = NULL;
++ struct net_device *dev, *lowest_dev = NULL;
++ struct dst_entry *dst;
+
+- if (likely(dst)) {
+- netdev = netdev_sk_get_lowest_dev(dst->dev, sk);
+- dev_hold(netdev);
++ rcu_read_lock();
++ dst = __sk_dst_get(sk);
++ dev = dst ? dst_dev_rcu(dst) : NULL;
++ if (likely(dev)) {
++ lowest_dev = netdev_sk_get_lowest_dev(dev, sk);
++ dev_hold(lowest_dev);
+ }
++ rcu_read_unlock();
+
+- dst_release(dst);
+-
+- return netdev;
++ return lowest_dev;
+ }
+
+ static void destroy_record(struct tls_record_info *record)
+diff --git a/net/wireless/util.c b/net/wireless/util.c
+index 240c68baa3d1f7..341dbf642181bb 100644
+--- a/net/wireless/util.c
++++ b/net/wireless/util.c
+@@ -2992,7 +2992,7 @@ bool cfg80211_radio_chandef_valid(const struct wiphy_radio *radio,
+ u32 freq, width;
+
+ freq = ieee80211_chandef_to_khz(chandef);
+- width = cfg80211_chandef_get_width(chandef);
++ width = MHZ_TO_KHZ(cfg80211_chandef_get_width(chandef));
+ if (!ieee80211_radio_freq_range_valid(radio, freq, width))
+ return false;
+
+diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
+index 84d60635e8a9ba..7c47875cd2ae41 100644
+--- a/rust/bindings/bindings_helper.h
++++ b/rust/bindings/bindings_helper.h
+@@ -99,3 +99,4 @@ const xa_mark_t RUST_CONST_HELPER_XA_PRESENT = XA_PRESENT;
+
+ const gfp_t RUST_CONST_HELPER_XA_FLAGS_ALLOC = XA_FLAGS_ALLOC;
+ const gfp_t RUST_CONST_HELPER_XA_FLAGS_ALLOC1 = XA_FLAGS_ALLOC1;
++const vm_flags_t RUST_CONST_HELPER_VM_MERGEABLE = VM_MERGEABLE;
+diff --git a/rust/kernel/cpumask.rs b/rust/kernel/cpumask.rs
+index 3fcbff43867054..05e1c882404e45 100644
+--- a/rust/kernel/cpumask.rs
++++ b/rust/kernel/cpumask.rs
+@@ -212,6 +212,7 @@ pub fn copy(&self, dstp: &mut Self) {
+ /// }
+ /// assert_eq!(mask2.weight(), count);
+ /// ```
++#[repr(transparent)]
+ pub struct CpumaskVar {
+ #[cfg(CONFIG_CPUMASK_OFFSTACK)]
+ ptr: NonNull<Cpumask>,
+diff --git a/scripts/misc-check b/scripts/misc-check
+index 84f08da17b2c05..40e5a4b01ff473 100755
+--- a/scripts/misc-check
++++ b/scripts/misc-check
+@@ -45,7 +45,7 @@ check_tracked_ignored_files () {
+ # does not automatically fix it.
+ check_missing_include_linux_export_h () {
+
+- git -C "${srctree:-.}" grep --files-with-matches -E 'EXPORT_SYMBOL((_NS)?(_GPL)?|_GPL_FOR_MODULES)\(.*\)' \
++ git -C "${srctree:-.}" grep --files-with-matches -E 'EXPORT_SYMBOL((_NS)?(_GPL)?|_FOR_MODULES)\(.*\)' \
+ -- '*.[ch]' :^tools/ :^include/linux/export.h |
+ xargs -r git -C "${srctree:-.}" grep --files-without-match '#include[[:space:]]*<linux/export\.h>' |
+ xargs -r printf "%s: warning: EXPORT_SYMBOL() is used, but #include <linux/export.h> is missing\n" >&2
+@@ -58,7 +58,7 @@ check_unnecessary_include_linux_export_h () {
+
+ git -C "${srctree:-.}" grep --files-with-matches '#include[[:space:]]*<linux/export\.h>' \
+ -- '*.[c]' :^tools/ |
+- xargs -r git -C "${srctree:-.}" grep --files-without-match -E 'EXPORT_SYMBOL((_NS)?(_GPL)?|_GPL_FOR_MODULES)\(.*\)' |
++ xargs -r git -C "${srctree:-.}" grep --files-without-match -E 'EXPORT_SYMBOL((_NS)?(_GPL)?|_FOR_MODULES)\(.*\)' |
+ xargs -r printf "%s: warning: EXPORT_SYMBOL() is not used, but #include <linux/export.h> is present\n" >&2
+ }
+
+diff --git a/security/Kconfig b/security/Kconfig
+index 4816fc74f81ebe..285f284dfcac44 100644
+--- a/security/Kconfig
++++ b/security/Kconfig
+@@ -269,6 +269,7 @@ endchoice
+
+ config LSM
+ string "Ordered list of enabled LSMs"
++ depends on SECURITY
+ default "landlock,lockdown,yama,loadpin,safesetid,smack,selinux,tomoyo,apparmor,ipe,bpf" if DEFAULT_SECURITY_SMACK
+ default "landlock,lockdown,yama,loadpin,safesetid,apparmor,selinux,smack,tomoyo,ipe,bpf" if DEFAULT_SECURITY_APPARMOR
+ default "landlock,lockdown,yama,loadpin,safesetid,tomoyo,ipe,bpf" if DEFAULT_SECURITY_TOMOYO
+diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
+index 1eab940fa2e5ac..68bee40c9adafd 100644
+--- a/sound/core/pcm_native.c
++++ b/sound/core/pcm_native.c
+@@ -84,19 +84,24 @@ void snd_pcm_group_init(struct snd_pcm_group *group)
+ }
+
+ /* define group lock helpers */
+-#define DEFINE_PCM_GROUP_LOCK(action, mutex_action) \
++#define DEFINE_PCM_GROUP_LOCK(action, bh_lock, bh_unlock, mutex_action) \
+ static void snd_pcm_group_ ## action(struct snd_pcm_group *group, bool nonatomic) \
+ { \
+- if (nonatomic) \
++ if (nonatomic) { \
+ mutex_ ## mutex_action(&group->mutex); \
+- else \
+- spin_ ## action(&group->lock); \
+-}
+-
+-DEFINE_PCM_GROUP_LOCK(lock, lock);
+-DEFINE_PCM_GROUP_LOCK(unlock, unlock);
+-DEFINE_PCM_GROUP_LOCK(lock_irq, lock);
+-DEFINE_PCM_GROUP_LOCK(unlock_irq, unlock);
++ } else { \
++ if (IS_ENABLED(CONFIG_PREEMPT_RT) && bh_lock) \
++ local_bh_disable(); \
++ spin_ ## action(&group->lock); \
++ if (IS_ENABLED(CONFIG_PREEMPT_RT) && bh_unlock) \
++ local_bh_enable(); \
++ } \
++}
++
++DEFINE_PCM_GROUP_LOCK(lock, false, false, lock);
++DEFINE_PCM_GROUP_LOCK(unlock, false, false, unlock);
++DEFINE_PCM_GROUP_LOCK(lock_irq, true, false, lock);
++DEFINE_PCM_GROUP_LOCK(unlock_irq, false, true, unlock);
+
+ /**
+ * snd_pcm_stream_lock - Lock the PCM stream
+diff --git a/sound/hda/codecs/hdmi/hdmi.c b/sound/hda/codecs/hdmi/hdmi.c
+index 44576b30f69951..774969dbfde457 100644
+--- a/sound/hda/codecs/hdmi/hdmi.c
++++ b/sound/hda/codecs/hdmi/hdmi.c
+@@ -1583,6 +1583,7 @@ static const struct snd_pci_quirk force_connect_list[] = {
+ SND_PCI_QUIRK(0x103c, 0x83e2, "HP EliteDesk 800 G4", 1),
+ SND_PCI_QUIRK(0x103c, 0x83ef, "HP MP9 G4 Retail System AMS", 1),
+ SND_PCI_QUIRK(0x103c, 0x845a, "HP EliteDesk 800 G4 DM 65W", 1),
++ SND_PCI_QUIRK(0x103c, 0x83f3, "HP ProDesk 400", 1),
+ SND_PCI_QUIRK(0x103c, 0x870f, "HP", 1),
+ SND_PCI_QUIRK(0x103c, 0x871a, "HP", 1),
+ SND_PCI_QUIRK(0x103c, 0x8711, "HP", 1),
+diff --git a/sound/hda/codecs/realtek/alc269.c b/sound/hda/codecs/realtek/alc269.c
+index f267437c96981e..07ea76efa5de8f 100644
+--- a/sound/hda/codecs/realtek/alc269.c
++++ b/sound/hda/codecs/realtek/alc269.c
+@@ -6487,6 +6487,7 @@ static const struct hda_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x103c, 0x89c6, "Zbook Fury 17 G9", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x89ca, "HP", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF),
+ SND_PCI_QUIRK(0x103c, 0x89d3, "HP EliteBook 645 G9 (MB 89D2)", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF),
++ SND_PCI_QUIRK(0x103c, 0x89da, "HP Spectre x360 14t-ea100", ALC245_FIXUP_HP_SPECTRE_X360_EU0XXX),
+ SND_PCI_QUIRK(0x103c, 0x89e7, "HP Elite x2 G9", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x8a0f, "HP Pavilion 14-ec1xxx", ALC287_FIXUP_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x8a20, "HP Laptop 15s-fq5xxx", ALC236_FIXUP_HP_MUTE_LED_COEFBIT2),
+diff --git a/sound/pci/lx6464es/lx_core.c b/sound/pci/lx6464es/lx_core.c
+index 9d95ecb299aed8..a99acd1125e74f 100644
+--- a/sound/pci/lx6464es/lx_core.c
++++ b/sound/pci/lx6464es/lx_core.c
+@@ -316,7 +316,7 @@ static int lx_message_send_atomic(struct lx6464es *chip, struct lx_rmh *rmh)
+ /* low-level dsp access */
+ int lx_dsp_get_version(struct lx6464es *chip, u32 *rdsp_version)
+ {
+- u16 ret;
++ int ret;
+
+ mutex_lock(&chip->msg_lock);
+
+@@ -330,10 +330,10 @@ int lx_dsp_get_version(struct lx6464es *chip, u32 *rdsp_version)
+
+ int lx_dsp_get_clock_frequency(struct lx6464es *chip, u32 *rfreq)
+ {
+- u16 ret = 0;
+ u32 freq_raw = 0;
+ u32 freq = 0;
+ u32 frequency = 0;
++ int ret;
+
+ mutex_lock(&chip->msg_lock);
+
+diff --git a/sound/soc/codecs/wcd934x.c b/sound/soc/codecs/wcd934x.c
+index 1bb7e1dc7e6b0a..e92939068bf75e 100644
+--- a/sound/soc/codecs/wcd934x.c
++++ b/sound/soc/codecs/wcd934x.c
+@@ -5831,6 +5831,13 @@ static const struct snd_soc_component_driver wcd934x_component_drv = {
+ .endianness = 1,
+ };
+
++static void wcd934x_put_device_action(void *data)
++{
++ struct device *dev = data;
++
++ put_device(dev);
++}
++
+ static int wcd934x_codec_parse_data(struct wcd934x_codec *wcd)
+ {
+ struct device *dev = &wcd->sdev->dev;
+@@ -5847,11 +5854,13 @@ static int wcd934x_codec_parse_data(struct wcd934x_codec *wcd)
+ return dev_err_probe(dev, -EINVAL, "Unable to get SLIM Interface device\n");
+
+ slim_get_logical_addr(wcd->sidev);
+- wcd->if_regmap = regmap_init_slimbus(wcd->sidev,
++ wcd->if_regmap = devm_regmap_init_slimbus(wcd->sidev,
+ &wcd934x_ifc_regmap_config);
+- if (IS_ERR(wcd->if_regmap))
++ if (IS_ERR(wcd->if_regmap)) {
++ put_device(&wcd->sidev->dev);
+ return dev_err_probe(dev, PTR_ERR(wcd->if_regmap),
+ "Failed to allocate ifc register map\n");
++ }
+
+ of_property_read_u32(dev->parent->of_node, "qcom,dmic-sample-rate",
+ &wcd->dmic_sample_rate);
+@@ -5893,6 +5902,10 @@ static int wcd934x_codec_probe(struct platform_device *pdev)
+ if (ret)
+ return ret;
+
++ ret = devm_add_action_or_reset(dev, wcd934x_put_device_action, &wcd->sidev->dev);
++ if (ret)
++ return ret;
++
+ /* set default rate 9P6MHz */
+ regmap_update_bits(wcd->regmap, WCD934X_CODEC_RPM_CLK_MCLK_CFG,
+ WCD934X_CODEC_RPM_CLK_MCLK_CFG_MCLK_MASK,
+diff --git a/sound/soc/codecs/wcd937x.c b/sound/soc/codecs/wcd937x.c
+index 3b0a8cc314e059..de2dff3c56d328 100644
+--- a/sound/soc/codecs/wcd937x.c
++++ b/sound/soc/codecs/wcd937x.c
+@@ -2046,9 +2046,9 @@ static const struct snd_kcontrol_new wcd937x_snd_controls[] = {
+ SOC_ENUM_EXT("RX HPH Mode", rx_hph_mode_mux_enum,
+ wcd937x_rx_hph_mode_get, wcd937x_rx_hph_mode_put),
+
+- SOC_SINGLE_EXT("HPHL_COMP Switch", SND_SOC_NOPM, 0, 1, 0,
++ SOC_SINGLE_EXT("HPHL_COMP Switch", WCD937X_COMP_L, 0, 1, 0,
+ wcd937x_get_compander, wcd937x_set_compander),
+- SOC_SINGLE_EXT("HPHR_COMP Switch", SND_SOC_NOPM, 1, 1, 0,
++ SOC_SINGLE_EXT("HPHR_COMP Switch", WCD937X_COMP_R, 1, 1, 0,
+ wcd937x_get_compander, wcd937x_set_compander),
+
+ SOC_SINGLE_TLV("HPHL Volume", WCD937X_HPH_L_EN, 0, 20, 1, line_gain),
+diff --git a/sound/soc/codecs/wcd937x.h b/sound/soc/codecs/wcd937x.h
+index 3ab21bb5846e2c..d20886a2803a4c 100644
+--- a/sound/soc/codecs/wcd937x.h
++++ b/sound/soc/codecs/wcd937x.h
+@@ -552,21 +552,21 @@ int wcd937x_sdw_hw_params(struct wcd937x_sdw_priv *wcd,
+ struct device *wcd937x_sdw_device_get(struct device_node *np);
+
+ #else
+-int wcd937x_sdw_free(struct wcd937x_sdw_priv *wcd,
++static inline int wcd937x_sdw_free(struct wcd937x_sdw_priv *wcd,
+ struct snd_pcm_substream *substream,
+ struct snd_soc_dai *dai)
+ {
+ return -EOPNOTSUPP;
+ }
+
+-int wcd937x_sdw_set_sdw_stream(struct wcd937x_sdw_priv *wcd,
++static inline int wcd937x_sdw_set_sdw_stream(struct wcd937x_sdw_priv *wcd,
+ struct snd_soc_dai *dai,
+ void *stream, int direction)
+ {
+ return -EOPNOTSUPP;
+ }
+
+-int wcd937x_sdw_hw_params(struct wcd937x_sdw_priv *wcd,
++static inline int wcd937x_sdw_hw_params(struct wcd937x_sdw_priv *wcd,
+ struct snd_pcm_substream *substream,
+ struct snd_pcm_hw_params *params,
+ struct snd_soc_dai *dai)
+diff --git a/sound/soc/intel/boards/bytcht_es8316.c b/sound/soc/intel/boards/bytcht_es8316.c
+index 62594e7966ab0f..b384d38654e658 100644
+--- a/sound/soc/intel/boards/bytcht_es8316.c
++++ b/sound/soc/intel/boards/bytcht_es8316.c
+@@ -47,7 +47,8 @@ enum {
+ BYT_CHT_ES8316_INTMIC_IN2_MAP,
+ };
+
+-#define BYT_CHT_ES8316_MAP(quirk) ((quirk) & GENMASK(3, 0))
++#define BYT_CHT_ES8316_MAP_MASK GENMASK(3, 0)
++#define BYT_CHT_ES8316_MAP(quirk) ((quirk) & BYT_CHT_ES8316_MAP_MASK)
+ #define BYT_CHT_ES8316_SSP0 BIT(16)
+ #define BYT_CHT_ES8316_MONO_SPEAKER BIT(17)
+ #define BYT_CHT_ES8316_JD_INVERTED BIT(18)
+@@ -60,10 +61,23 @@ MODULE_PARM_DESC(quirk, "Board-specific quirk override");
+
+ static void log_quirks(struct device *dev)
+ {
+- if (BYT_CHT_ES8316_MAP(quirk) == BYT_CHT_ES8316_INTMIC_IN1_MAP)
++ int map;
++
++ map = BYT_CHT_ES8316_MAP(quirk);
++ switch (map) {
++ case BYT_CHT_ES8316_INTMIC_IN1_MAP:
+ dev_info(dev, "quirk IN1_MAP enabled");
+- if (BYT_CHT_ES8316_MAP(quirk) == BYT_CHT_ES8316_INTMIC_IN2_MAP)
++ break;
++ case BYT_CHT_ES8316_INTMIC_IN2_MAP:
+ dev_info(dev, "quirk IN2_MAP enabled");
++ break;
++ default:
++ dev_warn_once(dev, "quirk sets invalid input map: 0x%x, default to INTMIC_IN1_MAP\n", map);
++ quirk &= ~BYT_CHT_ES8316_MAP_MASK;
++ quirk |= BYT_CHT_ES8316_INTMIC_IN1_MAP;
++ break;
++ }
++
+ if (quirk & BYT_CHT_ES8316_SSP0)
+ dev_info(dev, "quirk SSP0 enabled");
+ if (quirk & BYT_CHT_ES8316_MONO_SPEAKER)
+diff --git a/sound/soc/intel/boards/bytcr_rt5640.c b/sound/soc/intel/boards/bytcr_rt5640.c
+index 0f3b8f44e70112..bc846558480e41 100644
+--- a/sound/soc/intel/boards/bytcr_rt5640.c
++++ b/sound/soc/intel/boards/bytcr_rt5640.c
+@@ -68,7 +68,8 @@ enum {
+ BYT_RT5640_OVCD_SF_1P5 = (RT5640_OVCD_SF_1P5 << 13),
+ };
+
+-#define BYT_RT5640_MAP(quirk) ((quirk) & GENMASK(3, 0))
++#define BYT_RT5640_MAP_MASK GENMASK(3, 0)
++#define BYT_RT5640_MAP(quirk) ((quirk) & BYT_RT5640_MAP_MASK)
+ #define BYT_RT5640_JDSRC(quirk) (((quirk) & GENMASK(7, 4)) >> 4)
+ #define BYT_RT5640_OVCD_TH(quirk) (((quirk) & GENMASK(12, 8)) >> 8)
+ #define BYT_RT5640_OVCD_SF(quirk) (((quirk) & GENMASK(14, 13)) >> 13)
+@@ -140,7 +141,9 @@ static void log_quirks(struct device *dev)
+ dev_info(dev, "quirk NO_INTERNAL_MIC_MAP enabled\n");
+ break;
+ default:
+- dev_err(dev, "quirk map 0x%x is not supported, microphone input will not work\n", map);
++ dev_warn_once(dev, "quirk sets invalid input map: 0x%x, default to DMIC1_MAP\n", map);
++ byt_rt5640_quirk &= ~BYT_RT5640_MAP_MASK;
++ byt_rt5640_quirk |= BYT_RT5640_DMIC1_MAP;
+ break;
+ }
+ if (byt_rt5640_quirk & BYT_RT5640_HSMIC2_ON_IN1)
+diff --git a/sound/soc/intel/boards/bytcr_rt5651.c b/sound/soc/intel/boards/bytcr_rt5651.c
+index 67c62844ca2a91..604a35d380e9ab 100644
+--- a/sound/soc/intel/boards/bytcr_rt5651.c
++++ b/sound/soc/intel/boards/bytcr_rt5651.c
+@@ -58,7 +58,8 @@ enum {
+ BYT_RT5651_OVCD_SF_1P5 = (RT5651_OVCD_SF_1P5 << 13),
+ };
+
+-#define BYT_RT5651_MAP(quirk) ((quirk) & GENMASK(3, 0))
++#define BYT_RT5651_MAP_MASK GENMASK(3, 0)
++#define BYT_RT5651_MAP(quirk) ((quirk) & BYT_RT5651_MAP_MASK)
+ #define BYT_RT5651_JDSRC(quirk) (((quirk) & GENMASK(7, 4)) >> 4)
+ #define BYT_RT5651_OVCD_TH(quirk) (((quirk) & GENMASK(12, 8)) >> 8)
+ #define BYT_RT5651_OVCD_SF(quirk) (((quirk) & GENMASK(14, 13)) >> 13)
+@@ -100,14 +101,29 @@ MODULE_PARM_DESC(quirk, "Board-specific quirk override");
+
+ static void log_quirks(struct device *dev)
+ {
+- if (BYT_RT5651_MAP(byt_rt5651_quirk) == BYT_RT5651_DMIC_MAP)
++ int map;
++
++ map = BYT_RT5651_MAP(byt_rt5651_quirk);
++ switch (map) {
++ case BYT_RT5651_DMIC_MAP:
+ dev_info(dev, "quirk DMIC_MAP enabled");
+- if (BYT_RT5651_MAP(byt_rt5651_quirk) == BYT_RT5651_IN1_MAP)
++ break;
++ case BYT_RT5651_IN1_MAP:
+ dev_info(dev, "quirk IN1_MAP enabled");
+- if (BYT_RT5651_MAP(byt_rt5651_quirk) == BYT_RT5651_IN2_MAP)
++ break;
++ case BYT_RT5651_IN2_MAP:
+ dev_info(dev, "quirk IN2_MAP enabled");
+- if (BYT_RT5651_MAP(byt_rt5651_quirk) == BYT_RT5651_IN1_IN2_MAP)
++ break;
++ case BYT_RT5651_IN1_IN2_MAP:
+ dev_info(dev, "quirk IN1_IN2_MAP enabled");
++ break;
++ default:
++ dev_warn_once(dev, "quirk sets invalid input map: 0x%x, default to DMIC_MAP\n", map);
++ byt_rt5651_quirk &= ~BYT_RT5651_MAP_MASK;
++ byt_rt5651_quirk |= BYT_RT5651_DMIC_MAP;
++ break;
++ }
++
+ if (BYT_RT5651_JDSRC(byt_rt5651_quirk)) {
+ dev_info(dev, "quirk realtek,jack-detect-source %ld\n",
+ BYT_RT5651_JDSRC(byt_rt5651_quirk));
+diff --git a/sound/soc/intel/boards/sof_sdw.c b/sound/soc/intel/boards/sof_sdw.c
+index 28f03a5f29f741..c013e31d098e71 100644
+--- a/sound/soc/intel/boards/sof_sdw.c
++++ b/sound/soc/intel/boards/sof_sdw.c
+@@ -841,7 +841,7 @@ static int create_sdw_dailink(struct snd_soc_card *card,
+ (*codec_conf)++;
+ }
+
+- if (sof_end->include_sidecar) {
++ if (sof_end->include_sidecar && sof_end->codec_info->add_sidecar) {
+ ret = sof_end->codec_info->add_sidecar(card, dai_links, codec_conf);
+ if (ret)
+ return ret;
+diff --git a/sound/soc/qcom/sc8280xp.c b/sound/soc/qcom/sc8280xp.c
+index 288ccd7f8866a6..6847ae4acbd183 100644
+--- a/sound/soc/qcom/sc8280xp.c
++++ b/sound/soc/qcom/sc8280xp.c
+@@ -191,8 +191,8 @@ static const struct of_device_id snd_sc8280xp_dt_match[] = {
+ {.compatible = "qcom,qcm6490-idp-sndcard", "qcm6490"},
+ {.compatible = "qcom,qcs6490-rb3gen2-sndcard", "qcs6490"},
+ {.compatible = "qcom,qcs8275-sndcard", "qcs8300"},
+- {.compatible = "qcom,qcs9075-sndcard", "qcs9075"},
+- {.compatible = "qcom,qcs9100-sndcard", "qcs9100"},
++ {.compatible = "qcom,qcs9075-sndcard", "sa8775p"},
++ {.compatible = "qcom,qcs9100-sndcard", "sa8775p"},
+ {.compatible = "qcom,sc8280xp-sndcard", "sc8280xp"},
+ {.compatible = "qcom,sm8450-sndcard", "sm8450"},
+ {.compatible = "qcom,sm8550-sndcard", "sm8550"},
+diff --git a/sound/soc/sof/intel/hda-sdw-bpt.c b/sound/soc/sof/intel/hda-sdw-bpt.c
+index 1327f1cad0bcd9..ff5abccf0d88b6 100644
+--- a/sound/soc/sof/intel/hda-sdw-bpt.c
++++ b/sound/soc/sof/intel/hda-sdw-bpt.c
+@@ -150,7 +150,7 @@ static int hda_sdw_bpt_dma_deprepare(struct device *dev, struct hdac_ext_stream
+ u32 mask;
+ int ret;
+
+- ret = hda_cl_cleanup(sdev->dev, dmab_bdl, true, sdw_bpt_stream);
++ ret = hda_cl_cleanup(sdev->dev, dmab_bdl, false, sdw_bpt_stream);
+ if (ret < 0) {
+ dev_err(sdev->dev, "%s: SDW BPT DMA cleanup failed\n",
+ __func__);
+diff --git a/sound/soc/sof/ipc3-topology.c b/sound/soc/sof/ipc3-topology.c
+index 473d416bc91064..f449362a2905a3 100644
+--- a/sound/soc/sof/ipc3-topology.c
++++ b/sound/soc/sof/ipc3-topology.c
+@@ -2473,11 +2473,6 @@ static int sof_ipc3_tear_down_all_pipelines(struct snd_sof_dev *sdev, bool verif
+ if (ret < 0)
+ return ret;
+
+- /* free all the scheduler widgets now */
+- ret = sof_ipc3_free_widgets_in_list(sdev, true, &dyn_widgets, verify);
+- if (ret < 0)
+- return ret;
+-
+ /*
+ * Tear down all pipelines associated with PCMs that did not get suspended
+ * and unset the prepare flag so that they can be set up again during resume.
+@@ -2493,6 +2488,11 @@ static int sof_ipc3_tear_down_all_pipelines(struct snd_sof_dev *sdev, bool verif
+ }
+ }
+
++ /* free all the scheduler widgets now. This will also power down the secondary cores */
++ ret = sof_ipc3_free_widgets_in_list(sdev, true, &dyn_widgets, verify);
++ if (ret < 0)
++ return ret;
++
+ list_for_each_entry(sroute, &sdev->route_list, list)
+ sroute->setup = false;
+
+diff --git a/sound/soc/sof/ipc4-pcm.c b/sound/soc/sof/ipc4-pcm.c
+index 374dc10d10fd52..37d72a50c12721 100644
+--- a/sound/soc/sof/ipc4-pcm.c
++++ b/sound/soc/sof/ipc4-pcm.c
+@@ -19,12 +19,14 @@
+ * struct sof_ipc4_timestamp_info - IPC4 timestamp info
+ * @host_copier: the host copier of the pcm stream
+ * @dai_copier: the dai copier of the pcm stream
+- * @stream_start_offset: reported by fw in memory window (converted to frames)
+- * @stream_end_offset: reported by fw in memory window (converted to frames)
++ * @stream_start_offset: reported by fw in memory window (converted to
++ * frames at host_copier sampling rate)
++ * @stream_end_offset: reported by fw in memory window (converted to
++ * frames at host_copier sampling rate)
+ * @llp_offset: llp offset in memory window
+- * @boundary: wrap boundary should be used for the LLP frame counter
+ * @delay: Calculated and stored in pointer callback. The stored value is
+- * returned in the delay callback.
++ * returned in the delay callback. Expressed in frames at host copier
++ * sampling rate.
+ */
+ struct sof_ipc4_timestamp_info {
+ struct sof_ipc4_copier *host_copier;
+@@ -33,7 +35,6 @@ struct sof_ipc4_timestamp_info {
+ u64 stream_end_offset;
+ u32 llp_offset;
+
+- u64 boundary;
+ snd_pcm_sframes_t delay;
+ };
+
+@@ -48,6 +49,16 @@ struct sof_ipc4_pcm_stream_priv {
+ bool chain_dma_allocated;
+ };
+
++/*
++ * Modulus to use to compare host and link position counters. The sampling
++ * rates may be different, so the raw hardware counters will wrap
++ * around at different times. To calculate differences, use
++ * DELAY_BOUNDARY as a common modulus. This value must be smaller than
++ * the wrap-around point of any hardware counter, and larger than any
++ * valid delay measurement.
++ */
++#define DELAY_BOUNDARY U32_MAX
++
+ static inline struct sof_ipc4_timestamp_info *
+ sof_ipc4_sps_to_time_info(struct snd_sof_pcm_stream *sps)
+ {
+@@ -639,14 +650,14 @@ static int ipc4_ssp_dai_config_pcm_params_match(struct snd_sof_dev *sdev,
+
+ if (params_rate(params) == le32_to_cpu(hw_config->fsync_rate) &&
+ params_width(params) == le32_to_cpu(hw_config->tdm_slot_width) &&
+- params_channels(params) == le32_to_cpu(hw_config->tdm_slots)) {
++ params_channels(params) <= le32_to_cpu(hw_config->tdm_slots)) {
+ current_config = le32_to_cpu(hw_config->id);
+ partial_match = false;
+ /* best match found */
+ break;
+ } else if (current_config < 0 &&
+ params_rate(params) == le32_to_cpu(hw_config->fsync_rate) &&
+- params_channels(params) == le32_to_cpu(hw_config->tdm_slots)) {
++ params_channels(params) <= le32_to_cpu(hw_config->tdm_slots)) {
+ current_config = le32_to_cpu(hw_config->id);
+ partial_match = true;
+ /* keep looking for better match */
+@@ -993,6 +1004,35 @@ static int sof_ipc4_pcm_hw_params(struct snd_soc_component *component,
+ return 0;
+ }
+
++static u64 sof_ipc4_frames_dai_to_host(struct sof_ipc4_timestamp_info *time_info, u64 value)
++{
++ u64 dai_rate, host_rate;
++
++ if (!time_info->dai_copier || !time_info->host_copier)
++ return value;
++
++ /*
++ * copiers do not change sampling rate, so we can use the
++ * out_format independently of stream direction
++ */
++ dai_rate = time_info->dai_copier->data.out_format.sampling_frequency;
++ host_rate = time_info->host_copier->data.out_format.sampling_frequency;
++
++ if (!dai_rate || !host_rate || dai_rate == host_rate)
++ return value;
++
++ /* take care not to overflow u64, rates can be up to 768000 */
++ if (value > U32_MAX) {
++ value = div64_u64(value, dai_rate);
++ value *= host_rate;
++ } else {
++ value *= host_rate;
++ value = div64_u64(value, dai_rate);
++ }
++
++ return value;
++}
++
+ static int sof_ipc4_get_stream_start_offset(struct snd_sof_dev *sdev,
+ struct snd_pcm_substream *substream,
+ struct snd_sof_pcm_stream *sps,
+@@ -1012,7 +1052,7 @@ static int sof_ipc4_get_stream_start_offset(struct snd_sof_dev *sdev,
+ return -EINVAL;
+ } else if (host_copier->data.gtw_cfg.node_id == SOF_IPC4_CHAIN_DMA_NODE_ID) {
+ /*
+- * While the firmware does not supports time_info reporting for
++ * While the firmware does not support time_info reporting for
+ * streams using ChainDMA, it is granted that ChainDMA can only
+ * be used on Host+Link pairs where the link position is
+ * accessible from the host side.
+@@ -1020,10 +1060,16 @@ static int sof_ipc4_get_stream_start_offset(struct snd_sof_dev *sdev,
+ * Enable delay calculation in case of ChainDMA via host
+ * accessible registers.
+ *
+- * The ChainDMA uses 2x 1ms ping-pong buffer, dai side starts
+- * when 1ms data is available
++ * The ChainDMA prefills the link DMA with a preamble
++ * of zero samples. Set the stream start offset based
++ * on size of the preamble (driver provided fifo size
++ * multiplied by 2.5). We add 1ms of margin as the FW
++ * will align the buffer size to DMA hardware
++ * alignment that is not known to host.
+ */
+- time_info->stream_start_offset = substream->runtime->rate / MSEC_PER_SEC;
++ int pre_ms = SOF_IPC4_CHAIN_DMA_BUF_SIZE_MS * 5 / 2 + 1;
++
++ time_info->stream_start_offset = pre_ms * substream->runtime->rate / MSEC_PER_SEC;
+ goto out;
+ }
+
+@@ -1043,14 +1089,13 @@ static int sof_ipc4_get_stream_start_offset(struct snd_sof_dev *sdev,
+ time_info->stream_end_offset = ppl_reg.stream_end_offset;
+ do_div(time_info->stream_end_offset, dai_sample_size);
+
++ /* convert to host frame time */
++ time_info->stream_start_offset =
++ sof_ipc4_frames_dai_to_host(time_info, time_info->stream_start_offset);
++ time_info->stream_end_offset =
++ sof_ipc4_frames_dai_to_host(time_info, time_info->stream_end_offset);
++
+ out:
+- /*
+- * Calculate the wrap boundary need to be used for delay calculation
+- * The host counter is in bytes, it will wrap earlier than the frames
+- * based link counter.
+- */
+- time_info->boundary = div64_u64(~((u64)0),
+- frames_to_bytes(substream->runtime, 1));
+ /* Initialize the delay value to 0 (no delay) */
+ time_info->delay = 0;
+
+@@ -1093,6 +1138,8 @@ static int sof_ipc4_pcm_pointer(struct snd_soc_component *component,
+
+ /* For delay calculation we need the host counter */
+ host_cnt = snd_sof_pcm_get_host_byte_counter(sdev, component, substream);
++
++ /* Store the original value to host_ptr */
+ host_ptr = host_cnt;
+
+ /* convert the host_cnt to frames */
+@@ -1111,6 +1158,8 @@ static int sof_ipc4_pcm_pointer(struct snd_soc_component *component,
+ sof_mailbox_read(sdev, time_info->llp_offset, &llp, sizeof(llp));
+ dai_cnt = ((u64)llp.reading.llp_u << 32) | llp.reading.llp_l;
+ }
++
++ dai_cnt = sof_ipc4_frames_dai_to_host(time_info, dai_cnt);
+ dai_cnt += time_info->stream_end_offset;
+
+ /* In two cases dai dma counter is not accurate
+@@ -1144,8 +1193,9 @@ static int sof_ipc4_pcm_pointer(struct snd_soc_component *component,
+ dai_cnt -= time_info->stream_start_offset;
+ }
+
+- /* Wrap the dai counter at the boundary where the host counter wraps */
+- div64_u64_rem(dai_cnt, time_info->boundary, &dai_cnt);
++ /* Convert to a common base before comparisons */
++ dai_cnt &= DELAY_BOUNDARY;
++ host_cnt &= DELAY_BOUNDARY;
+
+ if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) {
+ head_cnt = host_cnt;
+@@ -1155,14 +1205,11 @@ static int sof_ipc4_pcm_pointer(struct snd_soc_component *component,
+ tail_cnt = host_cnt;
+ }
+
+- if (head_cnt < tail_cnt) {
+- time_info->delay = time_info->boundary - tail_cnt + head_cnt;
+- goto out;
+- }
+-
+- time_info->delay = head_cnt - tail_cnt;
++ if (unlikely(head_cnt < tail_cnt))
++ time_info->delay = DELAY_BOUNDARY - tail_cnt + head_cnt;
++ else
++ time_info->delay = head_cnt - tail_cnt;
+
+-out:
+ /*
+ * Convert the host byte counter to PCM pointer which wraps in buffer
+ * and it is in frames
+diff --git a/sound/soc/sof/ipc4-topology.c b/sound/soc/sof/ipc4-topology.c
+index 591ee30551baa8..c93db452bbc07a 100644
+--- a/sound/soc/sof/ipc4-topology.c
++++ b/sound/soc/sof/ipc4-topology.c
+@@ -33,7 +33,6 @@ MODULE_PARM_DESC(ipc4_ignore_cpc,
+
+ #define SOF_IPC4_GAIN_PARAM_ID 0
+ #define SOF_IPC4_TPLG_ABI_SIZE 6
+-#define SOF_IPC4_CHAIN_DMA_BUF_SIZE_MS 2
+
+ static DEFINE_IDA(alh_group_ida);
+ static DEFINE_IDA(pipeline_ida);
+diff --git a/sound/soc/sof/ipc4-topology.h b/sound/soc/sof/ipc4-topology.h
+index 14ba58d2be03f8..659e1ae0a85f95 100644
+--- a/sound/soc/sof/ipc4-topology.h
++++ b/sound/soc/sof/ipc4-topology.h
+@@ -247,6 +247,8 @@ struct sof_ipc4_dma_stream_ch_map {
+ #define SOF_IPC4_DMA_METHOD_HDA 1
+ #define SOF_IPC4_DMA_METHOD_GPDMA 2 /* defined for consistency but not used */
+
++#define SOF_IPC4_CHAIN_DMA_BUF_SIZE_MS 2
++
+ /**
+ * struct sof_ipc4_dma_config: DMA configuration
+ * @dma_method: HDAudio or GPDMA
+diff --git a/tools/include/nolibc/nolibc.h b/tools/include/nolibc/nolibc.h
+index c199ade200c240..d2f5aa085f8e36 100644
+--- a/tools/include/nolibc/nolibc.h
++++ b/tools/include/nolibc/nolibc.h
+@@ -116,6 +116,7 @@
+ #include "sched.h"
+ #include "signal.h"
+ #include "unistd.h"
++#include "stdbool.h"
+ #include "stdio.h"
+ #include "stdlib.h"
+ #include "string.h"
+diff --git a/tools/include/nolibc/std.h b/tools/include/nolibc/std.h
+index ba950f0e733843..2c1ad23b9b5c17 100644
+--- a/tools/include/nolibc/std.h
++++ b/tools/include/nolibc/std.h
+@@ -29,6 +29,6 @@ typedef unsigned long nlink_t;
+ typedef signed long off_t;
+ typedef signed long blksize_t;
+ typedef signed long blkcnt_t;
+-typedef __kernel_old_time_t time_t;
++typedef __kernel_time_t time_t;
+
+ #endif /* _NOLIBC_STD_H */
+diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h
+index 295e71d34abadb..90aadad31f6cb5 100644
+--- a/tools/include/nolibc/sys.h
++++ b/tools/include/nolibc/sys.h
+@@ -238,6 +238,19 @@ static __attribute__((unused))
+ int sys_dup2(int old, int new)
+ {
+ #if defined(__NR_dup3)
++ int ret, nr_fcntl;
++
++#ifdef __NR_fcntl64
++ nr_fcntl = __NR_fcntl64;
++#else
++ nr_fcntl = __NR_fcntl;
++#endif
++
++ if (old == new) {
++ ret = my_syscall2(nr_fcntl, old, F_GETFD);
++ return ret < 0 ? ret : old;
++ }
++
+ return my_syscall3(__NR_dup3, old, new, 0);
+ #elif defined(__NR_dup2)
+ return my_syscall2(__NR_dup2, old, new);
+diff --git a/tools/include/nolibc/time.h b/tools/include/nolibc/time.h
+index d02bc44d2643a5..e9c1b976791a65 100644
+--- a/tools/include/nolibc/time.h
++++ b/tools/include/nolibc/time.h
+@@ -133,7 +133,8 @@ static __attribute__((unused))
+ int clock_nanosleep(clockid_t clockid, int flags, const struct timespec *rqtp,
+ struct timespec *rmtp)
+ {
+- return __sysret(sys_clock_nanosleep(clockid, flags, rqtp, rmtp));
++ /* Directly return a positive error number */
++ return -sys_clock_nanosleep(clockid, flags, rqtp, rmtp);
+ }
+
+ static __inline__
+@@ -145,7 +146,7 @@ double difftime(time_t time1, time_t time2)
+ static __inline__
+ int nanosleep(const struct timespec *rqtp, struct timespec *rmtp)
+ {
+- return clock_nanosleep(CLOCK_REALTIME, 0, rqtp, rmtp);
++ return __sysret(sys_clock_nanosleep(CLOCK_REALTIME, 0, rqtp, rmtp));
+ }
+
+
+diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
+index 8f5a81b672e1b8..8f9261279b9212 100644
+--- a/tools/lib/bpf/libbpf.c
++++ b/tools/lib/bpf/libbpf.c
+@@ -1013,35 +1013,33 @@ find_struct_ops_kern_types(struct bpf_object *obj, const char *tname_raw,
+ const struct btf_member *kern_data_member;
+ struct btf *btf = NULL;
+ __s32 kern_vtype_id, kern_type_id;
+- char tname[256];
++ char tname[192], stname[256];
+ __u32 i;
+
+ snprintf(tname, sizeof(tname), "%.*s",
+ (int)bpf_core_essential_name_len(tname_raw), tname_raw);
+
+- kern_type_id = find_ksym_btf_id(obj, tname, BTF_KIND_STRUCT,
+- &btf, mod_btf);
+- if (kern_type_id < 0) {
+- pr_warn("struct_ops init_kern: struct %s is not found in kernel BTF\n",
+- tname);
+- return kern_type_id;
+- }
+- kern_type = btf__type_by_id(btf, kern_type_id);
++ snprintf(stname, sizeof(stname), "%s%s", STRUCT_OPS_VALUE_PREFIX, tname);
+
+- /* Find the corresponding "map_value" type that will be used
+- * in map_update(BPF_MAP_TYPE_STRUCT_OPS). For example,
+- * find "struct bpf_struct_ops_tcp_congestion_ops" from the
+- * btf_vmlinux.
++ /* Look for the corresponding "map_value" type that will be used
++ * in map_update(BPF_MAP_TYPE_STRUCT_OPS) first, figure out the btf
++ * and the mod_btf.
++ * For example, find "struct bpf_struct_ops_tcp_congestion_ops".
+ */
+- kern_vtype_id = find_btf_by_prefix_kind(btf, STRUCT_OPS_VALUE_PREFIX,
+- tname, BTF_KIND_STRUCT);
++ kern_vtype_id = find_ksym_btf_id(obj, stname, BTF_KIND_STRUCT, &btf, mod_btf);
+ if (kern_vtype_id < 0) {
+- pr_warn("struct_ops init_kern: struct %s%s is not found in kernel BTF\n",
+- STRUCT_OPS_VALUE_PREFIX, tname);
++ pr_warn("struct_ops init_kern: struct %s is not found in kernel BTF\n", stname);
+ return kern_vtype_id;
+ }
+ kern_vtype = btf__type_by_id(btf, kern_vtype_id);
+
++ kern_type_id = btf__find_by_name_kind(btf, tname, BTF_KIND_STRUCT);
++ if (kern_type_id < 0) {
++ pr_warn("struct_ops init_kern: struct %s is not found in kernel BTF\n", tname);
++ return kern_type_id;
++ }
++ kern_type = btf__type_by_id(btf, kern_type_id);
++
+ /* Find "struct tcp_congestion_ops" from
+ * struct bpf_struct_ops_tcp_congestion_ops {
+ * [ ... ]
+@@ -1054,8 +1052,8 @@ find_struct_ops_kern_types(struct bpf_object *obj, const char *tname_raw,
+ break;
+ }
+ if (i == btf_vlen(kern_vtype)) {
+- pr_warn("struct_ops init_kern: struct %s data is not found in struct %s%s\n",
+- tname, STRUCT_OPS_VALUE_PREFIX, tname);
++ pr_warn("struct_ops init_kern: struct %s data is not found in struct %s\n",
++ tname, stname);
+ return -EINVAL;
+ }
+
+@@ -5093,6 +5091,16 @@ static bool map_is_reuse_compat(const struct bpf_map *map, int map_fd)
+ return false;
+ }
+
++ /*
++ * bpf_get_map_info_by_fd() for DEVMAP will always return flags with
++ * BPF_F_RDONLY_PROG set, but it generally is not set at map creation time.
++ * Thus, ignore the BPF_F_RDONLY_PROG flag in the flags returned from
++ * bpf_get_map_info_by_fd() when checking for compatibility with an
++ * existing DEVMAP.
++ */
++ if (map->def.type == BPF_MAP_TYPE_DEVMAP || map->def.type == BPF_MAP_TYPE_DEVMAP_HASH)
++ map_info.map_flags &= ~BPF_F_RDONLY_PROG;
++
+ return (map_info.type == map->def.type &&
+ map_info.key_size == map->def.key_size &&
+ map_info.value_size == map->def.value_size &&
+diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
+index 455a957cb702ca..2b86e21190d37b 100644
+--- a/tools/lib/bpf/libbpf.h
++++ b/tools/lib/bpf/libbpf.h
+@@ -252,7 +252,7 @@ bpf_object__open_mem(const void *obj_buf, size_t obj_buf_sz,
+ * @return 0, on success; negative error code, otherwise, error code is
+ * stored in errno
+ */
+-int bpf_object__prepare(struct bpf_object *obj);
++LIBBPF_API int bpf_object__prepare(struct bpf_object *obj);
+
+ /**
+ * @brief **bpf_object__load()** loads BPF object into kernel.
+diff --git a/tools/net/ynl/pyynl/lib/ynl.py b/tools/net/ynl/pyynl/lib/ynl.py
+index 8244a5f440b2be..15ddb0b1adb63f 100644
+--- a/tools/net/ynl/pyynl/lib/ynl.py
++++ b/tools/net/ynl/pyynl/lib/ynl.py
+@@ -746,7 +746,7 @@ class YnlFamily(SpecFamily):
+ subdict = self._decode(NlAttrs(attr.raw, offset), msg_format.attr_set)
+ decoded.update(subdict)
+ else:
+- raise Exception(f"Unknown attribute-set '{attr_space}' when decoding '{attr_spec.name}'")
++ raise Exception(f"Unknown attribute-set '{msg_format.attr_set}' when decoding '{attr_spec.name}'")
+ return decoded
+
+ def _decode(self, attrs, space, outer_attrs = None):
+diff --git a/tools/power/acpi/os_specific/service_layers/oslinuxtbl.c b/tools/power/acpi/os_specific/service_layers/oslinuxtbl.c
+index 9741e7503591c1..de93067a5da320 100644
+--- a/tools/power/acpi/os_specific/service_layers/oslinuxtbl.c
++++ b/tools/power/acpi/os_specific/service_layers/oslinuxtbl.c
+@@ -995,7 +995,7 @@ static acpi_status osl_list_customized_tables(char *directory)
+ {
+ void *table_dir;
+ u32 instance;
+- char temp_name[ACPI_NAMESEG_SIZE];
++ char temp_name[ACPI_NAMESEG_SIZE] ACPI_NONSTRING;
+ char *filename;
+ acpi_status status = AE_OK;
+
+@@ -1312,7 +1312,7 @@ osl_get_customized_table(char *pathname,
+ {
+ void *table_dir;
+ u32 current_instance = 0;
+- char temp_name[ACPI_NAMESEG_SIZE];
++ char temp_name[ACPI_NAMESEG_SIZE] ACPI_NONSTRING;
+ char table_filename[PATH_MAX];
+ char *filename;
+ acpi_status status;
+diff --git a/tools/testing/nvdimm/test/ndtest.c b/tools/testing/nvdimm/test/ndtest.c
+index 68a064ce598c93..8e3b6be53839be 100644
+--- a/tools/testing/nvdimm/test/ndtest.c
++++ b/tools/testing/nvdimm/test/ndtest.c
+@@ -850,11 +850,22 @@ static int ndtest_probe(struct platform_device *pdev)
+
+ p->dcr_dma = devm_kcalloc(&p->pdev.dev, NUM_DCR,
+ sizeof(dma_addr_t), GFP_KERNEL);
++ if (!p->dcr_dma) {
++ rc = -ENOMEM;
++ goto err;
++ }
+ p->label_dma = devm_kcalloc(&p->pdev.dev, NUM_DCR,
+ sizeof(dma_addr_t), GFP_KERNEL);
++ if (!p->label_dma) {
++ rc = -ENOMEM;
++ goto err;
++ }
+ p->dimm_dma = devm_kcalloc(&p->pdev.dev, NUM_DCR,
+ sizeof(dma_addr_t), GFP_KERNEL);
+-
++ if (!p->dimm_dma) {
++ rc = -ENOMEM;
++ goto err;
++ }
+ rc = ndtest_nvdimm_init(p);
+ if (rc)
+ goto err;
+diff --git a/tools/testing/selftests/arm64/abi/tpidr2.c b/tools/testing/selftests/arm64/abi/tpidr2.c
+index f58a9f89b952c4..4c89ab0f101018 100644
+--- a/tools/testing/selftests/arm64/abi/tpidr2.c
++++ b/tools/testing/selftests/arm64/abi/tpidr2.c
+@@ -227,10 +227,10 @@ int main(int argc, char **argv)
+ ret = open("/proc/sys/abi/sme_default_vector_length", O_RDONLY, 0);
+ if (ret >= 0) {
+ ksft_test_result(default_value(), "default_value\n");
+- ksft_test_result(write_read, "write_read\n");
+- ksft_test_result(write_sleep_read, "write_sleep_read\n");
+- ksft_test_result(write_fork_read, "write_fork_read\n");
+- ksft_test_result(write_clone_read, "write_clone_read\n");
++ ksft_test_result(write_read(), "write_read\n");
++ ksft_test_result(write_sleep_read(), "write_sleep_read\n");
++ ksft_test_result(write_fork_read(), "write_fork_read\n");
++ ksft_test_result(write_clone_read(), "write_clone_read\n");
+
+ } else {
+ ksft_print_msg("SME support not present\n");
+diff --git a/tools/testing/selftests/arm64/gcs/basic-gcs.c b/tools/testing/selftests/arm64/gcs/basic-gcs.c
+index 54f9c888249d74..100d2a983155f7 100644
+--- a/tools/testing/selftests/arm64/gcs/basic-gcs.c
++++ b/tools/testing/selftests/arm64/gcs/basic-gcs.c
+@@ -410,7 +410,7 @@ int main(void)
+ }
+
+ /* One last test: disable GCS, we can do this one time */
+- my_syscall5(__NR_prctl, PR_SET_SHADOW_STACK_STATUS, 0, 0, 0, 0);
++ ret = my_syscall5(__NR_prctl, PR_SET_SHADOW_STACK_STATUS, 0, 0, 0, 0);
+ if (ret != 0)
+ ksft_print_msg("Failed to disable GCS: %d\n", ret);
+
+diff --git a/tools/testing/selftests/arm64/pauth/exec_target.c b/tools/testing/selftests/arm64/pauth/exec_target.c
+index 4435600ca400dd..e597861b26d6bf 100644
+--- a/tools/testing/selftests/arm64/pauth/exec_target.c
++++ b/tools/testing/selftests/arm64/pauth/exec_target.c
+@@ -13,7 +13,12 @@ int main(void)
+ unsigned long hwcaps;
+ size_t val;
+
+- fread(&val, sizeof(size_t), 1, stdin);
++ size_t size = fread(&val, sizeof(size_t), 1, stdin);
++
++ if (size != 1) {
++ fprintf(stderr, "Could not read input from stdin\n");
++ return EXIT_FAILURE;
++ }
+
+ /* don't try to execute illegal (unimplemented) instructions) caller
+ * should have checked this and keep worker simple
+diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
+index 4863106034dfbc..fd6b370c816984 100644
+--- a/tools/testing/selftests/bpf/Makefile
++++ b/tools/testing/selftests/bpf/Makefile
+@@ -137,7 +137,7 @@ TEST_GEN_PROGS_EXTENDED = \
+ xdping \
+ xskxceiver
+
+-TEST_GEN_FILES += liburandom_read.so urandom_read sign-file uprobe_multi
++TEST_GEN_FILES += $(TEST_KMODS) liburandom_read.so urandom_read sign-file uprobe_multi
+
+ ifneq ($(V),1)
+ submake_extras := feature_display=0
+@@ -398,7 +398,7 @@ $(HOST_BPFOBJ): $(wildcard $(BPFDIR)/*.[ch] $(BPFDIR)/Makefile) \
+ DESTDIR=$(HOST_SCRATCH_DIR)/ prefix= all install_headers
+ endif
+
+-# vmlinux.h is first dumped to a temprorary file and then compared to
++# vmlinux.h is first dumped to a temporary file and then compared to
+ # the previous version. This helps to avoid unnecessary re-builds of
+ # $(TRUNNER_BPF_OBJS)
+ $(INCLUDE_DIR)/vmlinux.h: $(VMLINUX_BTF) $(BPFTOOL) | $(INCLUDE_DIR)
+diff --git a/tools/testing/selftests/bpf/bench.c b/tools/testing/selftests/bpf/bench.c
+index ddd73d06a1eb27..3ecc226ea7b25d 100644
+--- a/tools/testing/selftests/bpf/bench.c
++++ b/tools/testing/selftests/bpf/bench.c
+@@ -499,7 +499,7 @@ extern const struct bench bench_rename_rawtp;
+ extern const struct bench bench_rename_fentry;
+ extern const struct bench bench_rename_fexit;
+
+-/* pure counting benchmarks to establish theoretical lmits */
++/* pure counting benchmarks to establish theoretical limits */
+ extern const struct bench bench_trig_usermode_count;
+ extern const struct bench bench_trig_syscall_count;
+ extern const struct bench bench_trig_kernel_count;
+diff --git a/tools/testing/selftests/bpf/prog_tests/btf_dump.c b/tools/testing/selftests/bpf/prog_tests/btf_dump.c
+index 82903585c8700c..10cba526d3e631 100644
+--- a/tools/testing/selftests/bpf/prog_tests/btf_dump.c
++++ b/tools/testing/selftests/bpf/prog_tests/btf_dump.c
+@@ -63,7 +63,7 @@ static int test_btf_dump_case(int n, struct btf_dump_test_case *t)
+
+ /* tests with t->known_ptr_sz have no "long" or "unsigned long" type,
+ * so it's impossible to determine correct pointer size; but if they
+- * do, it should be 8 regardless of host architecture, becaues BPF
++ * do, it should be 8 regardless of host architecture, because BPF
+ * target is always 64-bit
+ */
+ if (!t->known_ptr_sz) {
+diff --git a/tools/testing/selftests/bpf/prog_tests/fd_array.c b/tools/testing/selftests/bpf/prog_tests/fd_array.c
+index 241b2c8c6e0f15..c534b4d5f9da80 100644
+--- a/tools/testing/selftests/bpf/prog_tests/fd_array.c
++++ b/tools/testing/selftests/bpf/prog_tests/fd_array.c
+@@ -293,7 +293,7 @@ static int get_btf_id_by_fd(int btf_fd, __u32 *id)
+ * 1) Create a new btf, it's referenced only by a file descriptor, so refcnt=1
+ * 2) Load a BPF prog with fd_array[0] = btf_fd; now btf's refcnt=2
+ * 3) Close the btf_fd, now refcnt=1
+- * Wait and check that BTF stil exists.
++ * Wait and check that BTF still exists.
+ */
+ static void check_fd_array_cnt__referenced_btfs(void)
+ {
+diff --git a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
+index e19ef509ebf85e..171706e78da88c 100644
+--- a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
++++ b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
+@@ -422,220 +422,6 @@ static void test_unique_match(void)
+ kprobe_multi__destroy(skel);
+ }
+
+-static size_t symbol_hash(long key, void *ctx __maybe_unused)
+-{
+- return str_hash((const char *) key);
+-}
+-
+-static bool symbol_equal(long key1, long key2, void *ctx __maybe_unused)
+-{
+- return strcmp((const char *) key1, (const char *) key2) == 0;
+-}
+-
+-static bool is_invalid_entry(char *buf, bool kernel)
+-{
+- if (kernel && strchr(buf, '['))
+- return true;
+- if (!kernel && !strchr(buf, '['))
+- return true;
+- return false;
+-}
+-
+-static bool skip_entry(char *name)
+-{
+- /*
+- * We attach to almost all kernel functions and some of them
+- * will cause 'suspicious RCU usage' when fprobe is attached
+- * to them. Filter out the current culprits - arch_cpu_idle
+- * default_idle and rcu_* functions.
+- */
+- if (!strcmp(name, "arch_cpu_idle"))
+- return true;
+- if (!strcmp(name, "default_idle"))
+- return true;
+- if (!strncmp(name, "rcu_", 4))
+- return true;
+- if (!strcmp(name, "bpf_dispatcher_xdp_func"))
+- return true;
+- if (!strncmp(name, "__ftrace_invalid_address__",
+- sizeof("__ftrace_invalid_address__") - 1))
+- return true;
+- return false;
+-}
+-
+-/* Do comparision by ignoring '.llvm.<hash>' suffixes. */
+-static int compare_name(const char *name1, const char *name2)
+-{
+- const char *res1, *res2;
+- int len1, len2;
+-
+- res1 = strstr(name1, ".llvm.");
+- res2 = strstr(name2, ".llvm.");
+- len1 = res1 ? res1 - name1 : strlen(name1);
+- len2 = res2 ? res2 - name2 : strlen(name2);
+-
+- if (len1 == len2)
+- return strncmp(name1, name2, len1);
+- if (len1 < len2)
+- return strncmp(name1, name2, len1) <= 0 ? -1 : 1;
+- return strncmp(name1, name2, len2) >= 0 ? 1 : -1;
+-}
+-
+-static int load_kallsyms_compare(const void *p1, const void *p2)
+-{
+- return compare_name(((const struct ksym *)p1)->name, ((const struct ksym *)p2)->name);
+-}
+-
+-static int search_kallsyms_compare(const void *p1, const struct ksym *p2)
+-{
+- return compare_name(p1, p2->name);
+-}
+-
+-static int get_syms(char ***symsp, size_t *cntp, bool kernel)
+-{
+- size_t cap = 0, cnt = 0;
+- char *name = NULL, *ksym_name, **syms = NULL;
+- struct hashmap *map;
+- struct ksyms *ksyms;
+- struct ksym *ks;
+- char buf[256];
+- FILE *f;
+- int err = 0;
+-
+- ksyms = load_kallsyms_custom_local(load_kallsyms_compare);
+- if (!ASSERT_OK_PTR(ksyms, "load_kallsyms_custom_local"))
+- return -EINVAL;
+-
+- /*
+- * The available_filter_functions contains many duplicates,
+- * but other than that all symbols are usable in kprobe multi
+- * interface.
+- * Filtering out duplicates by using hashmap__add, which won't
+- * add existing entry.
+- */
+-
+- if (access("/sys/kernel/tracing/trace", F_OK) == 0)
+- f = fopen("/sys/kernel/tracing/available_filter_functions", "r");
+- else
+- f = fopen("/sys/kernel/debug/tracing/available_filter_functions", "r");
+-
+- if (!f)
+- return -EINVAL;
+-
+- map = hashmap__new(symbol_hash, symbol_equal, NULL);
+- if (IS_ERR(map)) {
+- err = libbpf_get_error(map);
+- goto error;
+- }
+-
+- while (fgets(buf, sizeof(buf), f)) {
+- if (is_invalid_entry(buf, kernel))
+- continue;
+-
+- free(name);
+- if (sscanf(buf, "%ms$*[^\n]\n", &name) != 1)
+- continue;
+- if (skip_entry(name))
+- continue;
+-
+- ks = search_kallsyms_custom_local(ksyms, name, search_kallsyms_compare);
+- if (!ks) {
+- err = -EINVAL;
+- goto error;
+- }
+-
+- ksym_name = ks->name;
+- err = hashmap__add(map, ksym_name, 0);
+- if (err == -EEXIST) {
+- err = 0;
+- continue;
+- }
+- if (err)
+- goto error;
+-
+- err = libbpf_ensure_mem((void **) &syms, &cap,
+- sizeof(*syms), cnt + 1);
+- if (err)
+- goto error;
+-
+- syms[cnt++] = ksym_name;
+- }
+-
+- *symsp = syms;
+- *cntp = cnt;
+-
+-error:
+- free(name);
+- fclose(f);
+- hashmap__free(map);
+- if (err)
+- free(syms);
+- return err;
+-}
+-
+-static int get_addrs(unsigned long **addrsp, size_t *cntp, bool kernel)
+-{
+- unsigned long *addr, *addrs, *tmp_addrs;
+- int err = 0, max_cnt, inc_cnt;
+- char *name = NULL;
+- size_t cnt = 0;
+- char buf[256];
+- FILE *f;
+-
+- if (access("/sys/kernel/tracing/trace", F_OK) == 0)
+- f = fopen("/sys/kernel/tracing/available_filter_functions_addrs", "r");
+- else
+- f = fopen("/sys/kernel/debug/tracing/available_filter_functions_addrs", "r");
+-
+- if (!f)
+- return -ENOENT;
+-
+- /* In my local setup, the number of entries is 50k+ so Let us initially
+- * allocate space to hold 64k entries. If 64k is not enough, incrementally
+- * increase 1k each time.
+- */
+- max_cnt = 65536;
+- inc_cnt = 1024;
+- addrs = malloc(max_cnt * sizeof(long));
+- if (addrs == NULL) {
+- err = -ENOMEM;
+- goto error;
+- }
+-
+- while (fgets(buf, sizeof(buf), f)) {
+- if (is_invalid_entry(buf, kernel))
+- continue;
+-
+- free(name);
+- if (sscanf(buf, "%p %ms$*[^\n]\n", &addr, &name) != 2)
+- continue;
+- if (skip_entry(name))
+- continue;
+-
+- if (cnt == max_cnt) {
+- max_cnt += inc_cnt;
+- tmp_addrs = realloc(addrs, max_cnt);
+- if (!tmp_addrs) {
+- err = -ENOMEM;
+- goto error;
+- }
+- addrs = tmp_addrs;
+- }
+-
+- addrs[cnt++] = (unsigned long)addr;
+- }
+-
+- *addrsp = addrs;
+- *cntp = cnt;
+-
+-error:
+- free(name);
+- fclose(f);
+- if (err)
+- free(addrs);
+- return err;
+-}
+-
+ static void do_bench_test(struct kprobe_multi_empty *skel, struct bpf_kprobe_multi_opts *opts)
+ {
+ long attach_start_ns, attach_end_ns;
+@@ -670,7 +456,7 @@ static void test_kprobe_multi_bench_attach(bool kernel)
+ char **syms = NULL;
+ size_t cnt = 0;
+
+- if (!ASSERT_OK(get_syms(&syms, &cnt, kernel), "get_syms"))
++ if (!ASSERT_OK(bpf_get_ksyms(&syms, &cnt, kernel), "bpf_get_ksyms"))
+ return;
+
+ skel = kprobe_multi_empty__open_and_load();
+@@ -696,13 +482,13 @@ static void test_kprobe_multi_bench_attach_addr(bool kernel)
+ size_t cnt = 0;
+ int err;
+
+- err = get_addrs(&addrs, &cnt, kernel);
++ err = bpf_get_addrs(&addrs, &cnt, kernel);
+ if (err == -ENOENT) {
+ test__skip();
+ return;
+ }
+
+- if (!ASSERT_OK(err, "get_addrs"))
++ if (!ASSERT_OK(err, "bpf_get_addrs"))
+ return;
+
+ skel = kprobe_multi_empty__open_and_load();
+diff --git a/tools/testing/selftests/bpf/prog_tests/module_attach.c b/tools/testing/selftests/bpf/prog_tests/module_attach.c
+index 6d391d95f96e00..70fa7ae93173b6 100644
+--- a/tools/testing/selftests/bpf/prog_tests/module_attach.c
++++ b/tools/testing/selftests/bpf/prog_tests/module_attach.c
+@@ -90,7 +90,7 @@ void test_module_attach(void)
+
+ test_module_attach__detach(skel);
+
+- /* attach fentry/fexit and make sure it get's module reference */
++ /* attach fentry/fexit and make sure it gets module reference */
+ link = bpf_program__attach(skel->progs.handle_fentry);
+ if (!ASSERT_OK_PTR(link, "attach_fentry"))
+ goto cleanup;
+diff --git a/tools/testing/selftests/bpf/prog_tests/reg_bounds.c b/tools/testing/selftests/bpf/prog_tests/reg_bounds.c
+index e261b0e872dbba..d93a0c7b1786f1 100644
+--- a/tools/testing/selftests/bpf/prog_tests/reg_bounds.c
++++ b/tools/testing/selftests/bpf/prog_tests/reg_bounds.c
+@@ -623,7 +623,7 @@ static void range_cond(enum num_t t, struct range x, struct range y,
+ *newx = range(t, x.a, x.b);
+ *newy = range(t, y.a + 1, y.b);
+ } else if (x.a == x.b && x.b == y.b) {
+- /* X is a constant matching rigth side of Y */
++ /* X is a constant matching right side of Y */
+ *newx = range(t, x.a, x.b);
+ *newy = range(t, y.a, y.b - 1);
+ } else if (y.a == y.b && x.a == y.a) {
+@@ -631,7 +631,7 @@ static void range_cond(enum num_t t, struct range x, struct range y,
+ *newx = range(t, x.a + 1, x.b);
+ *newy = range(t, y.a, y.b);
+ } else if (y.a == y.b && x.b == y.b) {
+- /* Y is a constant matching rigth side of X */
++ /* Y is a constant matching right side of X */
+ *newx = range(t, x.a, x.b - 1);
+ *newy = range(t, y.a, y.b);
+ } else {
+diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id.c
+index b7ba5cd47d96fa..271b5cc9fc0153 100644
+--- a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id.c
++++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id.c
+@@ -39,7 +39,7 @@ void test_stacktrace_build_id(void)
+ bpf_map_update_elem(control_map_fd, &key, &val, 0);
+
+ /* for every element in stackid_hmap, we can find a corresponding one
+- * in stackmap, and vise versa.
++ * in stackmap, and vice versa.
+ */
+ err = compare_map_keys(stackid_hmap_fd, stackmap_fd);
+ if (CHECK(err, "compare_map_keys stackid_hmap vs. stackmap",
+diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c
+index 0832fd7874575c..b277dddd5af7ff 100644
+--- a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c
++++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c
+@@ -66,7 +66,7 @@ void test_stacktrace_build_id_nmi(void)
+ bpf_map_update_elem(control_map_fd, &key, &val, 0);
+
+ /* for every element in stackid_hmap, we can find a corresponding one
+- * in stackmap, and vise versa.
++ * in stackmap, and vice versa.
+ */
+ err = compare_map_keys(stackid_hmap_fd, stackmap_fd);
+ if (CHECK(err, "compare_map_keys stackid_hmap vs. stackmap",
+diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_map.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_map.c
+index df59e4ae295100..84a7e405e9129d 100644
+--- a/tools/testing/selftests/bpf/prog_tests/stacktrace_map.c
++++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_map.c
+@@ -50,7 +50,7 @@ void test_stacktrace_map(void)
+ bpf_map_update_elem(control_map_fd, &key, &val, 0);
+
+ /* for every element in stackid_hmap, we can find a corresponding one
+- * in stackmap, and vise versa.
++ * in stackmap, and vice versa.
+ */
+ err = compare_map_keys(stackid_hmap_fd, stackmap_fd);
+ if (CHECK(err, "compare_map_keys stackid_hmap vs. stackmap",
+diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_map_raw_tp.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_map_raw_tp.c
+index c6ef06f55cdb46..e0cb4697b4b3cc 100644
+--- a/tools/testing/selftests/bpf/prog_tests/stacktrace_map_raw_tp.c
++++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_map_raw_tp.c
+@@ -46,7 +46,7 @@ void test_stacktrace_map_raw_tp(void)
+ bpf_map_update_elem(control_map_fd, &key, &val, 0);
+
+ /* for every element in stackid_hmap, we can find a corresponding one
+- * in stackmap, and vise versa.
++ * in stackmap, and vice versa.
+ */
+ err = compare_map_keys(stackid_hmap_fd, stackmap_fd);
+ if (CHECK(err, "compare_map_keys stackid_hmap vs. stackmap",
+diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_map_skip.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_map_skip.c
+index 1932b1e0685cfd..dc2ccf6a14d133 100644
+--- a/tools/testing/selftests/bpf/prog_tests/stacktrace_map_skip.c
++++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_map_skip.c
+@@ -40,7 +40,7 @@ void test_stacktrace_map_skip(void)
+ skel->bss->control = 1;
+
+ /* for every element in stackid_hmap, we can find a corresponding one
+- * in stackmap, and vise versa.
++ * in stackmap, and vice versa.
+ */
+ err = compare_map_keys(stackid_hmap_fd, stackmap_fd);
+ if (!ASSERT_OK(err, "compare_map_keys stackid_hmap vs. stackmap"))
+diff --git a/tools/testing/selftests/bpf/progs/bpf_cc_cubic.c b/tools/testing/selftests/bpf/progs/bpf_cc_cubic.c
+index 1654a530aa3dc6..4e51785e7606e7 100644
+--- a/tools/testing/selftests/bpf/progs/bpf_cc_cubic.c
++++ b/tools/testing/selftests/bpf/progs/bpf_cc_cubic.c
+@@ -101,7 +101,7 @@ static void tcp_cwnd_reduction(struct sock *sk, int newly_acked_sacked,
+ tp->snd_cwnd = pkts_in_flight + sndcnt;
+ }
+
+-/* Decide wheather to run the increase function of congestion control. */
++/* Decide whether to run the increase function of congestion control. */
+ static bool tcp_may_raise_cwnd(const struct sock *sk, const int flag)
+ {
+ if (tcp_sk(sk)->reordering > TCP_REORDERING)
+diff --git a/tools/testing/selftests/bpf/progs/bpf_dctcp.c b/tools/testing/selftests/bpf/progs/bpf_dctcp.c
+index 7cd73e75f52a2b..32c511bcd60b3a 100644
+--- a/tools/testing/selftests/bpf/progs/bpf_dctcp.c
++++ b/tools/testing/selftests/bpf/progs/bpf_dctcp.c
+@@ -1,7 +1,7 @@
+ // SPDX-License-Identifier: GPL-2.0
+ /* Copyright (c) 2019 Facebook */
+
+-/* WARNING: This implemenation is not necessarily the same
++/* WARNING: This implementation is not necessarily the same
+ * as the tcp_dctcp.c. The purpose is mainly for testing
+ * the kernel BPF logic.
+ */
+diff --git a/tools/testing/selftests/bpf/progs/freplace_connect_v4_prog.c b/tools/testing/selftests/bpf/progs/freplace_connect_v4_prog.c
+index 544e5ac9046106..d09bbd8ae8a85b 100644
+--- a/tools/testing/selftests/bpf/progs/freplace_connect_v4_prog.c
++++ b/tools/testing/selftests/bpf/progs/freplace_connect_v4_prog.c
+@@ -12,7 +12,7 @@
+ SEC("freplace/connect_v4_prog")
+ int new_connect_v4_prog(struct bpf_sock_addr *ctx)
+ {
+- // return value thats in invalid range
++ // return value that's in invalid range
+ return 255;
+ }
+
+diff --git a/tools/testing/selftests/bpf/progs/iters_state_safety.c b/tools/testing/selftests/bpf/progs/iters_state_safety.c
+index f41257eadbb258..b381ac0c736cf0 100644
+--- a/tools/testing/selftests/bpf/progs/iters_state_safety.c
++++ b/tools/testing/selftests/bpf/progs/iters_state_safety.c
+@@ -345,7 +345,7 @@ int __naked read_from_iter_slot_fail(void)
+ "r3 = 1000;"
+ "call %[bpf_iter_num_new];"
+
+- /* attemp to leak bpf_iter_num state */
++ /* attempt to leak bpf_iter_num state */
+ "r7 = *(u64 *)(r6 + 0);"
+ "r8 = *(u64 *)(r6 + 8);"
+
+diff --git a/tools/testing/selftests/bpf/progs/rbtree_search.c b/tools/testing/selftests/bpf/progs/rbtree_search.c
+index 098ef970fac160..b05565d1db0d47 100644
+--- a/tools/testing/selftests/bpf/progs/rbtree_search.c
++++ b/tools/testing/selftests/bpf/progs/rbtree_search.c
+@@ -183,7 +183,7 @@ long test_##op##_spinlock_##dolock(void *ctx) \
+ }
+
+ /*
+- * Use a spearate MSG macro instead of passing to TEST_XXX(..., MSG)
++ * Use a separate MSG macro instead of passing to TEST_XXX(..., MSG)
+ * to ensure the message itself is not in the bpf prog lineinfo
+ * which the verifier includes in its log.
+ * Otherwise, the test_loader will incorrectly match the prog lineinfo
+diff --git a/tools/testing/selftests/bpf/progs/struct_ops_kptr_return.c b/tools/testing/selftests/bpf/progs/struct_ops_kptr_return.c
+index 36386b3c23a1f6..2b98b7710816dc 100644
+--- a/tools/testing/selftests/bpf/progs/struct_ops_kptr_return.c
++++ b/tools/testing/selftests/bpf/progs/struct_ops_kptr_return.c
+@@ -9,7 +9,7 @@ void bpf_task_release(struct task_struct *p) __ksym;
+
+ /* This test struct_ops BPF programs returning referenced kptr. The verifier should
+ * allow a referenced kptr or a NULL pointer to be returned. A referenced kptr to task
+- * here is acquried automatically as the task argument is tagged with "__ref".
++ * here is acquired automatically as the task argument is tagged with "__ref".
+ */
+ SEC("struct_ops/test_return_ref_kptr")
+ struct task_struct *BPF_PROG(kptr_return, int dummy,
+diff --git a/tools/testing/selftests/bpf/progs/struct_ops_refcounted.c b/tools/testing/selftests/bpf/progs/struct_ops_refcounted.c
+index 76dcb6089d7f8e..9c0a65466356c9 100644
+--- a/tools/testing/selftests/bpf/progs/struct_ops_refcounted.c
++++ b/tools/testing/selftests/bpf/progs/struct_ops_refcounted.c
+@@ -9,7 +9,7 @@ __attribute__((nomerge)) extern void bpf_task_release(struct task_struct *p) __k
+
+ /* This is a test BPF program that uses struct_ops to access a referenced
+ * kptr argument. This is a test for the verifier to ensure that it
+- * 1) recongnizes the task as a referenced object (i.e., ref_obj_id > 0), and
++ * 1) recognizes the task as a referenced object (i.e., ref_obj_id > 0), and
+ * 2) the same reference can be acquired from multiple paths as long as it
+ * has not been released.
+ */
+diff --git a/tools/testing/selftests/bpf/progs/test_cls_redirect.c b/tools/testing/selftests/bpf/progs/test_cls_redirect.c
+index f344c6835e84e7..823169fb6e4c7f 100644
+--- a/tools/testing/selftests/bpf/progs/test_cls_redirect.c
++++ b/tools/testing/selftests/bpf/progs/test_cls_redirect.c
+@@ -129,7 +129,7 @@ typedef uint8_t *net_ptr __attribute__((align_value(8)));
+ typedef struct buf {
+ struct __sk_buff *skb;
+ net_ptr head;
+- /* NB: tail musn't have alignment other than 1, otherwise
++ /* NB: tail mustn't have alignment other than 1, otherwise
+ * LLVM will go and eliminate code, e.g. when checking packet lengths.
+ */
+ uint8_t *const tail;
+diff --git a/tools/testing/selftests/bpf/progs/test_cls_redirect_dynptr.c b/tools/testing/selftests/bpf/progs/test_cls_redirect_dynptr.c
+index d0f7670351e587..dfd4a2710391d9 100644
+--- a/tools/testing/selftests/bpf/progs/test_cls_redirect_dynptr.c
++++ b/tools/testing/selftests/bpf/progs/test_cls_redirect_dynptr.c
+@@ -494,7 +494,7 @@ static ret_t get_next_hop(struct bpf_dynptr *dynptr, __u64 *offset, encap_header
+
+ *offset += sizeof(*next_hop);
+
+- /* Skip the remainig next hops (may be zero). */
++ /* Skip the remaining next hops (may be zero). */
+ return skip_next_hops(offset, encap->unigue.hop_count - encap->unigue.next_hop - 1);
+ }
+
+diff --git a/tools/testing/selftests/bpf/progs/test_tcpnotify_kern.c b/tools/testing/selftests/bpf/progs/test_tcpnotify_kern.c
+index 540181c115a85a..ef00d38b0a8d24 100644
+--- a/tools/testing/selftests/bpf/progs/test_tcpnotify_kern.c
++++ b/tools/testing/selftests/bpf/progs/test_tcpnotify_kern.c
+@@ -23,7 +23,6 @@ struct {
+
+ struct {
+ __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
+- __uint(max_entries, 2);
+ __type(key, int);
+ __type(value, __u32);
+ } perf_event_map SEC(".maps");
+diff --git a/tools/testing/selftests/bpf/progs/uretprobe_stack.c b/tools/testing/selftests/bpf/progs/uretprobe_stack.c
+index 9fdcf396b8f467..a2951e2f1711b8 100644
+--- a/tools/testing/selftests/bpf/progs/uretprobe_stack.c
++++ b/tools/testing/selftests/bpf/progs/uretprobe_stack.c
+@@ -26,8 +26,8 @@ int usdt_len;
+ SEC("uprobe//proc/self/exe:target_1")
+ int BPF_UPROBE(uprobe_1)
+ {
+- /* target_1 is recursive wit depth of 2, so we capture two separate
+- * stack traces, depending on which occurence it is
++ /* target_1 is recursive with depth of 2, so we capture two separate
++ * stack traces, depending on which occurrence it is
+ */
+ static bool recur = false;
+
+diff --git a/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c b/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c
+index 7c5e5e6d10ebc2..dba3ca728f6e6f 100644
+--- a/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c
++++ b/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c
+@@ -349,7 +349,7 @@ __naked void precision_two_ids(void)
+ SEC("socket")
+ __success __log_level(2)
+ __flag(BPF_F_TEST_STATE_FREQ)
+-/* check thar r0 and r6 have different IDs after 'if',
++/* check that r0 and r6 have different IDs after 'if',
+ * collect_linked_regs() can't tie more than 6 registers for a single insn.
+ */
+ __msg("8: (25) if r0 > 0x7 goto pc+0 ; R0=scalar(id=1")
+diff --git a/tools/testing/selftests/bpf/progs/verifier_var_off.c b/tools/testing/selftests/bpf/progs/verifier_var_off.c
+index 1d36d01b746e78..f345466bca6868 100644
+--- a/tools/testing/selftests/bpf/progs/verifier_var_off.c
++++ b/tools/testing/selftests/bpf/progs/verifier_var_off.c
+@@ -114,8 +114,8 @@ __naked void stack_write_priv_vs_unpriv(void)
+ }
+
+ /* Similar to the previous test, but this time also perform a read from the
+- * address written to with a variable offset. The read is allowed, showing that,
+- * after a variable-offset write, a priviledged program can read the slots that
++ * address written to with a variable offset. The read is allowed, showing that,
++ * after a variable-offset write, a privileged program can read the slots that
+ * were in the range of that write (even if the verifier doesn't actually know if
+ * the slot being read was really written to or not.
+ *
+@@ -157,7 +157,7 @@ __naked void stack_write_followed_by_read(void)
+ SEC("socket")
+ __description("variable-offset stack write clobbers spilled regs")
+ __failure
+-/* In the priviledged case, dereferencing a spilled-and-then-filled
++/* In the privileged case, dereferencing a spilled-and-then-filled
+ * register is rejected because the previous variable offset stack
+ * write might have overwritten the spilled pointer (i.e. we lose track
+ * of the spilled register when we analyze the write).
+diff --git a/tools/testing/selftests/bpf/test_sockmap.c b/tools/testing/selftests/bpf/test_sockmap.c
+index fd2da2234cc9b4..76568db7a66422 100644
+--- a/tools/testing/selftests/bpf/test_sockmap.c
++++ b/tools/testing/selftests/bpf/test_sockmap.c
+@@ -1372,7 +1372,7 @@ static int run_options(struct sockmap_options *options, int cg_fd, int test)
+ } else
+ fprintf(stderr, "unknown test\n");
+ out:
+- /* Detatch and zero all the maps */
++ /* Detach and zero all the maps */
+ bpf_prog_detach2(bpf_program__fd(progs[3]), cg_fd, BPF_CGROUP_SOCK_OPS);
+
+ for (i = 0; i < ARRAY_SIZE(links); i++) {
+diff --git a/tools/testing/selftests/bpf/test_tcpnotify_user.c b/tools/testing/selftests/bpf/test_tcpnotify_user.c
+index 595194453ff8f8..35b4893ccdf8ae 100644
+--- a/tools/testing/selftests/bpf/test_tcpnotify_user.c
++++ b/tools/testing/selftests/bpf/test_tcpnotify_user.c
+@@ -15,20 +15,18 @@
+ #include <bpf/libbpf.h>
+ #include <sys/ioctl.h>
+ #include <linux/rtnetlink.h>
+-#include <signal.h>
+ #include <linux/perf_event.h>
+-#include <linux/err.h>
+
+-#include "bpf_util.h"
+ #include "cgroup_helpers.h"
+
+ #include "test_tcpnotify.h"
+-#include "trace_helpers.h"
+ #include "testing_helpers.h"
+
+ #define SOCKET_BUFFER_SIZE (getpagesize() < 8192L ? getpagesize() : 8192L)
+
+ pthread_t tid;
++static bool exit_thread;
++
+ int rx_callbacks;
+
+ static void dummyfn(void *ctx, int cpu, void *data, __u32 size)
+@@ -45,7 +43,7 @@ void tcp_notifier_poller(struct perf_buffer *pb)
+ {
+ int err;
+
+- while (1) {
++ while (!exit_thread) {
+ err = perf_buffer__poll(pb, 100);
+ if (err < 0 && err != -EINTR) {
+ printf("failed perf_buffer__poll: %d\n", err);
+@@ -78,15 +76,10 @@ int main(int argc, char **argv)
+ int error = EXIT_FAILURE;
+ struct bpf_object *obj;
+ char test_script[80];
+- cpu_set_t cpuset;
+ __u32 key = 0;
+
+ libbpf_set_strict_mode(LIBBPF_STRICT_ALL);
+
+- CPU_ZERO(&cpuset);
+- CPU_SET(0, &cpuset);
+- pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
+-
+ cg_fd = cgroup_setup_and_join(cg_path);
+ if (cg_fd < 0)
+ goto err;
+@@ -151,6 +144,13 @@ int main(int argc, char **argv)
+
+ sleep(10);
+
++ exit_thread = true;
++ int ret = pthread_join(tid, NULL);
++ if (ret) {
++ printf("FAILED: pthread_join\n");
++ goto err;
++ }
++
+ if (verify_result(&g)) {
+ printf("FAILED: Wrong stats Expected %d calls, got %d\n",
+ g.ncalls, rx_callbacks);
+diff --git a/tools/testing/selftests/bpf/trace_helpers.c b/tools/testing/selftests/bpf/trace_helpers.c
+index 81943c6254e6bc..03f223333aa4ae 100644
+--- a/tools/testing/selftests/bpf/trace_helpers.c
++++ b/tools/testing/selftests/bpf/trace_helpers.c
+@@ -17,6 +17,7 @@
+ #include <linux/limits.h>
+ #include <libelf.h>
+ #include <gelf.h>
++#include "bpf/hashmap.h"
+ #include "bpf/libbpf_internal.h"
+
+ #define TRACEFS_PIPE "/sys/kernel/tracing/trace_pipe"
+@@ -519,3 +520,216 @@ void read_trace_pipe(void)
+ {
+ read_trace_pipe_iter(trace_pipe_cb, NULL, 0);
+ }
++
++static size_t symbol_hash(long key, void *ctx __maybe_unused)
++{
++ return str_hash((const char *) key);
++}
++
++static bool symbol_equal(long key1, long key2, void *ctx __maybe_unused)
++{
++ return strcmp((const char *) key1, (const char *) key2) == 0;
++}
++
++static bool is_invalid_entry(char *buf, bool kernel)
++{
++ if (kernel && strchr(buf, '['))
++ return true;
++ if (!kernel && !strchr(buf, '['))
++ return true;
++ return false;
++}
++
++static bool skip_entry(char *name)
++{
++ /*
++ * We attach to almost all kernel functions and some of them
++ * will cause 'suspicious RCU usage' when fprobe is attached
++ * to them. Filter out the current culprits - arch_cpu_idle
++ * default_idle and rcu_* functions.
++ */
++ if (!strcmp(name, "arch_cpu_idle"))
++ return true;
++ if (!strcmp(name, "default_idle"))
++ return true;
++ if (!strncmp(name, "rcu_", 4))
++ return true;
++ if (!strcmp(name, "bpf_dispatcher_xdp_func"))
++ return true;
++ if (!strncmp(name, "__ftrace_invalid_address__",
++ sizeof("__ftrace_invalid_address__") - 1))
++ return true;
++ return false;
++}
++
++/* Do comparison by ignoring '.llvm.<hash>' suffixes. */
++static int compare_name(const char *name1, const char *name2)
++{
++ const char *res1, *res2;
++ int len1, len2;
++
++ res1 = strstr(name1, ".llvm.");
++ res2 = strstr(name2, ".llvm.");
++ len1 = res1 ? res1 - name1 : strlen(name1);
++ len2 = res2 ? res2 - name2 : strlen(name2);
++
++ if (len1 == len2)
++ return strncmp(name1, name2, len1);
++ if (len1 < len2)
++ return strncmp(name1, name2, len1) <= 0 ? -1 : 1;
++ return strncmp(name1, name2, len2) >= 0 ? 1 : -1;
++}
++
++static int load_kallsyms_compare(const void *p1, const void *p2)
++{
++ return compare_name(((const struct ksym *)p1)->name, ((const struct ksym *)p2)->name);
++}
++
++static int search_kallsyms_compare(const void *p1, const struct ksym *p2)
++{
++ return compare_name(p1, p2->name);
++}
++
++int bpf_get_ksyms(char ***symsp, size_t *cntp, bool kernel)
++{
++ size_t cap = 0, cnt = 0;
++ char *name = NULL, *ksym_name, **syms = NULL;
++ struct hashmap *map;
++ struct ksyms *ksyms;
++ struct ksym *ks;
++ char buf[256];
++ FILE *f;
++ int err = 0;
++
++ ksyms = load_kallsyms_custom_local(load_kallsyms_compare);
++ if (!ksyms)
++ return -EINVAL;
++
++ /*
++ * The available_filter_functions contains many duplicates,
++ * but other than that all symbols are usable to trace.
++ * Filtering out duplicates by using hashmap__add, which won't
++ * add existing entry.
++ */
++
++ if (access("/sys/kernel/tracing/trace", F_OK) == 0)
++ f = fopen("/sys/kernel/tracing/available_filter_functions", "r");
++ else
++ f = fopen("/sys/kernel/debug/tracing/available_filter_functions", "r");
++
++ if (!f)
++ return -EINVAL;
++
++ map = hashmap__new(symbol_hash, symbol_equal, NULL);
++ if (IS_ERR(map)) {
++ err = libbpf_get_error(map);
++ goto error;
++ }
++
++ while (fgets(buf, sizeof(buf), f)) {
++ if (is_invalid_entry(buf, kernel))
++ continue;
++
++ free(name);
++ if (sscanf(buf, "%ms$*[^\n]\n", &name) != 1)
++ continue;
++ if (skip_entry(name))
++ continue;
++
++ ks = search_kallsyms_custom_local(ksyms, name, search_kallsyms_compare);
++ if (!ks) {
++ err = -EINVAL;
++ goto error;
++ }
++
++ ksym_name = ks->name;
++ err = hashmap__add(map, ksym_name, 0);
++ if (err == -EEXIST) {
++ err = 0;
++ continue;
++ }
++ if (err)
++ goto error;
++
++ err = libbpf_ensure_mem((void **) &syms, &cap,
++ sizeof(*syms), cnt + 1);
++ if (err)
++ goto error;
++
++ syms[cnt++] = ksym_name;
++ }
++
++ *symsp = syms;
++ *cntp = cnt;
++
++error:
++ free(name);
++ fclose(f);
++ hashmap__free(map);
++ if (err)
++ free(syms);
++ return err;
++}
++
++int bpf_get_addrs(unsigned long **addrsp, size_t *cntp, bool kernel)
++{
++ unsigned long *addr, *addrs, *tmp_addrs;
++ int err = 0, max_cnt, inc_cnt;
++ char *name = NULL;
++ size_t cnt = 0;
++ char buf[256];
++ FILE *f;
++
++ if (access("/sys/kernel/tracing/trace", F_OK) == 0)
++ f = fopen("/sys/kernel/tracing/available_filter_functions_addrs", "r");
++ else
++ f = fopen("/sys/kernel/debug/tracing/available_filter_functions_addrs", "r");
++
++ if (!f)
++ return -ENOENT;
++
++ /* In my local setup, the number of entries is 50k+ so Let us initially
++ * allocate space to hold 64k entries. If 64k is not enough, incrementally
++ * increase 1k each time.
++ */
++ max_cnt = 65536;
++ inc_cnt = 1024;
++ addrs = malloc(max_cnt * sizeof(long));
++ if (addrs == NULL) {
++ err = -ENOMEM;
++ goto error;
++ }
++
++ while (fgets(buf, sizeof(buf), f)) {
++ if (is_invalid_entry(buf, kernel))
++ continue;
++
++ free(name);
++ if (sscanf(buf, "%p %ms$*[^\n]\n", &addr, &name) != 2)
++ continue;
++ if (skip_entry(name))
++ continue;
++
++ if (cnt == max_cnt) {
++ max_cnt += inc_cnt;
++ tmp_addrs = realloc(addrs, max_cnt * sizeof(long));
++ if (!tmp_addrs) {
++ err = -ENOMEM;
++ goto error;
++ }
++ addrs = tmp_addrs;
++ }
++
++ addrs[cnt++] = (unsigned long)addr;
++ }
++
++ *addrsp = addrs;
++ *cntp = cnt;
++
++error:
++ free(name);
++ fclose(f);
++ if (err)
++ free(addrs);
++ return err;
++}
+diff --git a/tools/testing/selftests/bpf/trace_helpers.h b/tools/testing/selftests/bpf/trace_helpers.h
+index 2ce873c9f9aad6..9437bdd4afa505 100644
+--- a/tools/testing/selftests/bpf/trace_helpers.h
++++ b/tools/testing/selftests/bpf/trace_helpers.h
+@@ -41,4 +41,7 @@ ssize_t get_rel_offset(uintptr_t addr);
+
+ int read_build_id(const char *path, char *build_id, size_t size);
+
++int bpf_get_ksyms(char ***symsp, size_t *cntp, bool kernel);
++int bpf_get_addrs(unsigned long **addrsp, size_t *cntp, bool kernel);
++
+ #endif
+diff --git a/tools/testing/selftests/bpf/verifier/calls.c b/tools/testing/selftests/bpf/verifier/calls.c
+index f3492efc88346e..c8d640802cce41 100644
+--- a/tools/testing/selftests/bpf/verifier/calls.c
++++ b/tools/testing/selftests/bpf/verifier/calls.c
+@@ -1375,7 +1375,7 @@
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1),
+ /* write into map value */
+ BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 0),
+- /* fetch secound map_value_ptr from the stack */
++ /* fetch second map_value_ptr from the stack */
+ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_10, -16),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1),
+ /* write into map value */
+@@ -1439,7 +1439,7 @@
+ /* second time with fp-16 */
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 4),
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 1, 2),
+- /* fetch secound map_value_ptr from the stack */
++ /* fetch second map_value_ptr from the stack */
+ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_7, 0),
+ /* write into map value */
+ BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 0),
+@@ -1493,7 +1493,7 @@
+ /* second time with fp-16 */
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 4),
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 2),
+- /* fetch secound map_value_ptr from the stack */
++ /* fetch second map_value_ptr from the stack */
+ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_7, 0),
+ /* write into map value */
+ BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 0),
+@@ -2380,7 +2380,7 @@
+ */
+ BPF_JMP_REG(BPF_JGT, BPF_REG_6, BPF_REG_7, 1),
+ BPF_MOV64_REG(BPF_REG_9, BPF_REG_8),
+- /* r9 = *r9 ; verifier get's to this point via two paths:
++ /* r9 = *r9 ; verifier gets to this point via two paths:
+ * ; (I) one including r9 = r8, verified first;
+ * ; (II) one excluding r9 = r8, verified next.
+ * ; After load of *r9 to r9 the frame[0].fp[-24].id == r9.id.
+diff --git a/tools/testing/selftests/bpf/xdping.c b/tools/testing/selftests/bpf/xdping.c
+index 1503a1d2faa090..9ed8c796645d00 100644
+--- a/tools/testing/selftests/bpf/xdping.c
++++ b/tools/testing/selftests/bpf/xdping.c
+@@ -155,7 +155,7 @@ int main(int argc, char **argv)
+ }
+
+ if (!server) {
+- /* Only supports IPv4; see hints initiailization above. */
++ /* Only supports IPv4; see hints initialization above. */
+ if (getaddrinfo(argv[optind], NULL, &hints, &a) || !a) {
+ fprintf(stderr, "Could not resolve %s\n", argv[optind]);
+ return 1;
+diff --git a/tools/testing/selftests/bpf/xsk.h b/tools/testing/selftests/bpf/xsk.h
+index 93c2cc413cfcd0..48729da142c249 100644
+--- a/tools/testing/selftests/bpf/xsk.h
++++ b/tools/testing/selftests/bpf/xsk.h
+@@ -93,8 +93,8 @@ static inline __u32 xsk_prod_nb_free(struct xsk_ring_prod *r, __u32 nb)
+ /* Refresh the local tail pointer.
+ * cached_cons is r->size bigger than the real consumer pointer so
+ * that this addition can be avoided in the more frequently
+- * executed code that computs free_entries in the beginning of
+- * this function. Without this optimization it whould have been
++ * executed code that computes free_entries in the beginning of
++ * this function. Without this optimization it would have been
+ * free_entries = r->cached_prod - r->cached_cons + r->size.
+ */
+ r->cached_cons = __atomic_load_n(r->consumer, __ATOMIC_ACQUIRE);
+diff --git a/tools/testing/selftests/bpf/xskxceiver.c b/tools/testing/selftests/bpf/xskxceiver.c
+index a29de0713f19f0..352adc8df2d1cd 100644
+--- a/tools/testing/selftests/bpf/xskxceiver.c
++++ b/tools/testing/selftests/bpf/xskxceiver.c
+@@ -2276,25 +2276,13 @@ static int testapp_xdp_metadata_copy(struct test_spec *test)
+ {
+ struct xsk_xdp_progs *skel_rx = test->ifobj_rx->xdp_progs;
+ struct xsk_xdp_progs *skel_tx = test->ifobj_tx->xdp_progs;
+- struct bpf_map *data_map;
+- int count = 0;
+- int key = 0;
+
+ test_spec_set_xdp_prog(test, skel_rx->progs.xsk_xdp_populate_metadata,
+ skel_tx->progs.xsk_xdp_populate_metadata,
+ skel_rx->maps.xsk, skel_tx->maps.xsk);
+ test->ifobj_rx->use_metadata = true;
+
+- data_map = bpf_object__find_map_by_name(skel_rx->obj, "xsk_xdp_.bss");
+- if (!data_map || !bpf_map__is_internal(data_map)) {
+- ksft_print_msg("Error: could not find bss section of XDP program\n");
+- return TEST_FAILURE;
+- }
+-
+- if (bpf_map_update_elem(bpf_map__fd(data_map), &key, &count, BPF_ANY)) {
+- ksft_print_msg("Error: could not update count element\n");
+- return TEST_FAILURE;
+- }
++ skel_rx->bss->count = 0;
+
+ return testapp_validate_traffic(test);
+ }
+diff --git a/tools/testing/selftests/cgroup/lib/cgroup_util.c b/tools/testing/selftests/cgroup/lib/cgroup_util.c
+index 0e89fcff4d05d3..44c52f620fda17 100644
+--- a/tools/testing/selftests/cgroup/lib/cgroup_util.c
++++ b/tools/testing/selftests/cgroup/lib/cgroup_util.c
+@@ -522,6 +522,18 @@ int proc_mount_contains(const char *option)
+ return strstr(buf, option) != NULL;
+ }
+
++int cgroup_feature(const char *feature)
++{
++ char buf[PAGE_SIZE];
++ ssize_t read;
++
++ read = read_text("/sys/kernel/cgroup/features", buf, sizeof(buf));
++ if (read < 0)
++ return read;
++
++ return strstr(buf, feature) != NULL;
++}
++
+ ssize_t proc_read_text(int pid, bool thread, const char *item, char *buf, size_t size)
+ {
+ char path[PATH_MAX];
+diff --git a/tools/testing/selftests/cgroup/lib/include/cgroup_util.h b/tools/testing/selftests/cgroup/lib/include/cgroup_util.h
+index c69cab66254b41..9dc90a1b386d77 100644
+--- a/tools/testing/selftests/cgroup/lib/include/cgroup_util.h
++++ b/tools/testing/selftests/cgroup/lib/include/cgroup_util.h
+@@ -60,6 +60,7 @@ extern int cg_run_nowait(const char *cgroup,
+ extern int cg_wait_for_proc_count(const char *cgroup, int count);
+ extern int cg_killall(const char *cgroup);
+ int proc_mount_contains(const char *option);
++int cgroup_feature(const char *feature);
+ extern ssize_t proc_read_text(int pid, bool thread, const char *item, char *buf, size_t size);
+ extern int proc_read_strstr(int pid, bool thread, const char *item, const char *needle);
+ extern pid_t clone_into_cgroup(int cgroup_fd);
+diff --git a/tools/testing/selftests/cgroup/test_pids.c b/tools/testing/selftests/cgroup/test_pids.c
+index 9ecb83c6cc5cbf..d8a1d1cd500727 100644
+--- a/tools/testing/selftests/cgroup/test_pids.c
++++ b/tools/testing/selftests/cgroup/test_pids.c
+@@ -77,6 +77,9 @@ static int test_pids_events(const char *root)
+ char *cg_parent = NULL, *cg_child = NULL;
+ int pid;
+
++ if (cgroup_feature("pids_localevents") <= 0)
++ return KSFT_SKIP;
++
+ cg_parent = cg_name(root, "pids_parent");
+ cg_child = cg_name(cg_parent, "pids_child");
+ if (!cg_parent || !cg_child)
+diff --git a/tools/testing/selftests/futex/functional/Makefile b/tools/testing/selftests/futex/functional/Makefile
+index 8cfb87f7f7c505..bd50aecfca8a31 100644
+--- a/tools/testing/selftests/futex/functional/Makefile
++++ b/tools/testing/selftests/futex/functional/Makefile
+@@ -1,6 +1,9 @@
+ # SPDX-License-Identifier: GPL-2.0
++PKG_CONFIG ?= pkg-config
++LIBNUMA_TEST = $(shell sh -c "$(PKG_CONFIG) numa --atleast-version 2.0.16 > /dev/null 2>&1 && echo SUFFICIENT || echo NO")
++
+ INCLUDES := -I../include -I../../ $(KHDR_INCLUDES)
+-CFLAGS := $(CFLAGS) -g -O2 -Wall -pthread $(INCLUDES) $(KHDR_INCLUDES)
++CFLAGS := $(CFLAGS) -g -O2 -Wall -pthread -D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64 $(INCLUDES) $(KHDR_INCLUDES) -DLIBNUMA_VER_$(LIBNUMA_TEST)=1
+ LDLIBS := -lpthread -lrt -lnuma
+
+ LOCAL_HDRS := \
+diff --git a/tools/testing/selftests/futex/functional/futex_numa_mpol.c b/tools/testing/selftests/futex/functional/futex_numa_mpol.c
+index a9ecfb2d3932ad..7f2b2e1ff9f8a3 100644
+--- a/tools/testing/selftests/futex/functional/futex_numa_mpol.c
++++ b/tools/testing/selftests/futex/functional/futex_numa_mpol.c
+@@ -77,7 +77,7 @@ static void join_max_threads(void)
+ }
+ }
+
+-static void __test_futex(void *futex_ptr, int must_fail, unsigned int futex_flags)
++static void __test_futex(void *futex_ptr, int err_value, unsigned int futex_flags)
+ {
+ int to_wake, ret, i, need_exit = 0;
+
+@@ -88,11 +88,17 @@ static void __test_futex(void *futex_ptr, int must_fail, unsigned int futex_flag
+
+ do {
+ ret = futex2_wake(futex_ptr, to_wake, futex_flags);
+- if (must_fail) {
+- if (ret < 0)
+- break;
+- ksft_exit_fail_msg("futex2_wake(%d, 0x%x) should fail, but didn't\n",
+- to_wake, futex_flags);
++
++ if (err_value) {
++ if (ret >= 0)
++ ksft_exit_fail_msg("futex2_wake(%d, 0x%x) should fail, but didn't\n",
++ to_wake, futex_flags);
++
++ if (errno != err_value)
++ ksft_exit_fail_msg("futex2_wake(%d, 0x%x) expected error was %d, but returned %d (%s)\n",
++ to_wake, futex_flags, err_value, errno, strerror(errno));
++
++ break;
+ }
+ if (ret < 0) {
+ ksft_exit_fail_msg("Failed futex2_wake(%d, 0x%x): %m\n",
+@@ -106,12 +112,12 @@ static void __test_futex(void *futex_ptr, int must_fail, unsigned int futex_flag
+ join_max_threads();
+
+ for (i = 0; i < MAX_THREADS; i++) {
+- if (must_fail && thread_args[i].result != -1) {
++ if (err_value && thread_args[i].result != -1) {
+ ksft_print_msg("Thread %d should fail but succeeded (%d)\n",
+ i, thread_args[i].result);
+ need_exit = 1;
+ }
+- if (!must_fail && thread_args[i].result != 0) {
++ if (!err_value && thread_args[i].result != 0) {
+ ksft_print_msg("Thread %d failed (%d)\n", i, thread_args[i].result);
+ need_exit = 1;
+ }
+@@ -120,14 +126,9 @@ static void __test_futex(void *futex_ptr, int must_fail, unsigned int futex_flag
+ ksft_exit_fail_msg("Aborting due to earlier errors.\n");
+ }
+
+-static void test_futex(void *futex_ptr, int must_fail)
++static void test_futex(void *futex_ptr, int err_value)
+ {
+- __test_futex(futex_ptr, must_fail, FUTEX2_SIZE_U32 | FUTEX_PRIVATE_FLAG | FUTEX2_NUMA);
+-}
+-
+-static void test_futex_mpol(void *futex_ptr, int must_fail)
+-{
+- __test_futex(futex_ptr, must_fail, FUTEX2_SIZE_U32 | FUTEX_PRIVATE_FLAG | FUTEX2_NUMA | FUTEX2_MPOL);
++ __test_futex(futex_ptr, err_value, FUTEX2_SIZE_U32 | FUTEX_PRIVATE_FLAG | FUTEX2_NUMA);
+ }
+
+ static void usage(char *prog)
+@@ -142,7 +143,7 @@ static void usage(char *prog)
+ int main(int argc, char *argv[])
+ {
+ struct futex32_numa *futex_numa;
+- int mem_size, i;
++ int mem_size;
+ void *futex_ptr;
+ int c;
+
+@@ -165,7 +166,7 @@ int main(int argc, char *argv[])
+ }
+
+ ksft_print_header();
+- ksft_set_plan(1);
++ ksft_set_plan(2);
+
+ mem_size = sysconf(_SC_PAGE_SIZE);
+ futex_ptr = mmap(NULL, mem_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
+@@ -182,27 +183,28 @@ int main(int argc, char *argv[])
+ if (futex_numa->numa == FUTEX_NO_NODE)
+ ksft_exit_fail_msg("NUMA node is left uninitialized\n");
+
+- ksft_print_msg("Memory too small\n");
+- test_futex(futex_ptr + mem_size - 4, 1);
+-
+- ksft_print_msg("Memory out of range\n");
+- test_futex(futex_ptr + mem_size, 1);
++ /* FUTEX2_NUMA futex must be 8-byte aligned */
++ ksft_print_msg("Mis-aligned futex\n");
++ test_futex(futex_ptr + mem_size - 4, EINVAL);
+
+ futex_numa->numa = FUTEX_NO_NODE;
+ mprotect(futex_ptr, mem_size, PROT_READ);
+ ksft_print_msg("Memory, RO\n");
+- test_futex(futex_ptr, 1);
++ test_futex(futex_ptr, EFAULT);
+
+ mprotect(futex_ptr, mem_size, PROT_NONE);
+ ksft_print_msg("Memory, no access\n");
+- test_futex(futex_ptr, 1);
++ test_futex(futex_ptr, EFAULT);
+
+ mprotect(futex_ptr, mem_size, PROT_READ | PROT_WRITE);
+ ksft_print_msg("Memory back to RW\n");
+ test_futex(futex_ptr, 0);
+
++	ksft_test_result_pass("futex2 memory boundary tests passed\n");
++
+ /* MPOL test. Does not work as expected */
+- for (i = 0; i < 4; i++) {
++#ifdef LIBNUMA_VER_SUFFICIENT
++ for (int i = 0; i < 4; i++) {
+ unsigned long nodemask;
+ int ret;
+
+@@ -221,15 +223,16 @@ int main(int argc, char *argv[])
+ ret = futex2_wake(futex_ptr, 0, FUTEX2_SIZE_U32 | FUTEX_PRIVATE_FLAG | FUTEX2_NUMA | FUTEX2_MPOL);
+ if (ret < 0)
+ ksft_test_result_fail("Failed to wake 0 with MPOL: %m\n");
+- if (0)
+- test_futex_mpol(futex_numa, 0);
+ if (futex_numa->numa != i) {
+ ksft_exit_fail_msg("Returned NUMA node is %d expected %d\n",
+ futex_numa->numa, i);
+ }
+ }
+ }
+- ksft_test_result_pass("NUMA MPOL tests passed\n");
++ ksft_test_result_pass("futex2 MPOL hints test passed\n");
++#else
++ ksft_test_result_skip("futex2 MPOL hints test requires libnuma 2.0.16+\n");
++#endif
+ ksft_finished();
+ return 0;
+ }
+diff --git a/tools/testing/selftests/futex/functional/futex_priv_hash.c b/tools/testing/selftests/futex/functional/futex_priv_hash.c
+index aea001ac494604..ec032faca6a91f 100644
+--- a/tools/testing/selftests/futex/functional/futex_priv_hash.c
++++ b/tools/testing/selftests/futex/functional/futex_priv_hash.c
+@@ -132,7 +132,6 @@ static void usage(char *prog)
+ {
+ printf("Usage: %s\n", prog);
+ printf(" -c Use color\n");
+- printf(" -g Test global hash instead intead local immutable \n");
+ printf(" -h Display this help message\n");
+ printf(" -v L Verbosity level: %d=QUIET %d=CRITICAL %d=INFO\n",
+ VQUIET, VCRITICAL, VINFO);
+diff --git a/tools/testing/selftests/futex/functional/run.sh b/tools/testing/selftests/futex/functional/run.sh
+index 81739849f2994d..5470088dc4dfb6 100755
+--- a/tools/testing/selftests/futex/functional/run.sh
++++ b/tools/testing/selftests/futex/functional/run.sh
+@@ -85,7 +85,6 @@ echo
+
+ echo
+ ./futex_priv_hash $COLOR
+-./futex_priv_hash -g $COLOR
+
+ echo
+ ./futex_numa_mpol $COLOR
+diff --git a/tools/testing/selftests/futex/include/futextest.h b/tools/testing/selftests/futex/include/futextest.h
+index 7a5fd1d5355e7e..3d48e9789d9fe6 100644
+--- a/tools/testing/selftests/futex/include/futextest.h
++++ b/tools/testing/selftests/futex/include/futextest.h
+@@ -58,6 +58,17 @@ typedef volatile u_int32_t futex_t;
+ #define SYS_futex SYS_futex_time64
+ #endif
+
++/*
++ * On 32bit systems if we use "-D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64" or if
++ * we are using a newer compiler then the size of the timestamps will be 64bit,
++ * however, the SYS_futex will still point to the 32bit futex system call.
++ */
++#if __SIZEOF_POINTER__ == 4 && defined(SYS_futex_time64) && \
++ defined(_TIME_BITS) && _TIME_BITS == 64
++# undef SYS_futex
++# define SYS_futex SYS_futex_time64
++#endif
++
+ /**
+ * futex() - SYS_futex syscall wrapper
+ * @uaddr: address of first futex
+diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h
+index 3c3e08b8c90eb3..772ca1db6e5971 100644
+--- a/tools/testing/selftests/iommu/iommufd_utils.h
++++ b/tools/testing/selftests/iommu/iommufd_utils.h
+@@ -1042,15 +1042,13 @@ static int _test_cmd_trigger_vevents(int fd, __u32 dev_id, __u32 nvevents)
+ .dev_id = dev_id,
+ },
+ };
+- int ret;
+
+ while (nvevents--) {
+- ret = ioctl(fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_TRIGGER_VEVENT),
+- &trigger_vevent_cmd);
+- if (ret < 0)
++ if (!ioctl(fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_TRIGGER_VEVENT),
++ &trigger_vevent_cmd))
+ return -1;
+ }
+- return ret;
++ return 0;
+ }
+
+ #define test_cmd_trigger_vevents(dev_id, nvevents) \
+diff --git a/tools/testing/selftests/kselftest_harness/Makefile b/tools/testing/selftests/kselftest_harness/Makefile
+index 0617535a6ce424..d2369c01701a09 100644
+--- a/tools/testing/selftests/kselftest_harness/Makefile
++++ b/tools/testing/selftests/kselftest_harness/Makefile
+@@ -2,6 +2,7 @@
+
+ TEST_GEN_PROGS_EXTENDED := harness-selftest
+ TEST_PROGS := harness-selftest.sh
++TEST_FILES := harness-selftest.expected
+ EXTRA_CLEAN := harness-selftest.seen
+
+ include ../lib.mk
+diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk
+index 5303900339292e..a448fae57831d8 100644
+--- a/tools/testing/selftests/lib.mk
++++ b/tools/testing/selftests/lib.mk
+@@ -228,7 +228,10 @@ $(OUTPUT)/%:%.S
+ $(LINK.S) $^ $(LDLIBS) -o $@
+ endif
+
++# Extract the expected header directory
++khdr_output := $(patsubst %/usr/include,%,$(filter %/usr/include,$(KHDR_INCLUDES)))
++
+ headers:
+- $(Q)$(MAKE) -C $(top_srcdir) headers
++ $(Q)$(MAKE) -f $(top_srcdir)/Makefile -C $(khdr_output) headers
+
+ .PHONY: run_tests all clean install emit_tests gen_mods_dir clean_mods_dir headers
+diff --git a/tools/testing/selftests/mm/madv_populate.c b/tools/testing/selftests/mm/madv_populate.c
+index b6fabd5c27ed61..d8d11bc67ddced 100644
+--- a/tools/testing/selftests/mm/madv_populate.c
++++ b/tools/testing/selftests/mm/madv_populate.c
+@@ -264,23 +264,6 @@ static void test_softdirty(void)
+ munmap(addr, SIZE);
+ }
+
+-static int system_has_softdirty(void)
+-{
+- /*
+- * There is no way to check if the kernel supports soft-dirty, other
+- * than by writing to a page and seeing if the bit was set. But the
+- * tests are intended to check that the bit gets set when it should, so
+- * doing that check would turn a potentially legitimate fail into a
+- * skip. Fortunately, we know for sure that arm64 does not support
+- * soft-dirty. So for now, let's just use the arch as a corse guide.
+- */
+-#if defined(__aarch64__)
+- return 0;
+-#else
+- return 1;
+-#endif
+-}
+-
+ int main(int argc, char **argv)
+ {
+ int nr_tests = 16;
+@@ -288,7 +271,7 @@ int main(int argc, char **argv)
+
+ pagesize = getpagesize();
+
+- if (system_has_softdirty())
++ if (softdirty_supported())
+ nr_tests += 5;
+
+ ksft_print_header();
+@@ -300,7 +283,7 @@ int main(int argc, char **argv)
+ test_holes();
+ test_populate_read();
+ test_populate_write();
+- if (system_has_softdirty())
++ if (softdirty_supported())
+ test_softdirty();
+
+ err = ksft_get_fail_cnt();
+diff --git a/tools/testing/selftests/mm/soft-dirty.c b/tools/testing/selftests/mm/soft-dirty.c
+index 8a3f2b4b218698..4ee4db3750c16c 100644
+--- a/tools/testing/selftests/mm/soft-dirty.c
++++ b/tools/testing/selftests/mm/soft-dirty.c
+@@ -200,8 +200,11 @@ int main(int argc, char **argv)
+ int pagesize;
+
+ ksft_print_header();
+- ksft_set_plan(15);
+
++ if (!softdirty_supported())
++ ksft_exit_skip("soft-dirty is not support\n");
++
++ ksft_set_plan(15);
+ pagemap_fd = open(PAGEMAP_FILE_PATH, O_RDONLY);
+ if (pagemap_fd < 0)
+ ksft_exit_fail_msg("Failed to open %s\n", PAGEMAP_FILE_PATH);
+diff --git a/tools/testing/selftests/mm/va_high_addr_switch.c b/tools/testing/selftests/mm/va_high_addr_switch.c
+index 896b3f73fc53bf..306eba8251077d 100644
+--- a/tools/testing/selftests/mm/va_high_addr_switch.c
++++ b/tools/testing/selftests/mm/va_high_addr_switch.c
+@@ -230,10 +230,10 @@ void testcases_init(void)
+ .msg = "mmap(-1, MAP_HUGETLB) again",
+ },
+ {
+- .addr = (void *)(addr_switch_hint - pagesize),
++ .addr = (void *)(addr_switch_hint - hugepagesize),
+ .size = 2 * hugepagesize,
+ .flags = MAP_HUGETLB | MAP_PRIVATE | MAP_ANONYMOUS,
+- .msg = "mmap(addr_switch_hint - pagesize, 2*hugepagesize, MAP_HUGETLB)",
++ .msg = "mmap(addr_switch_hint - hugepagesize, 2*hugepagesize, MAP_HUGETLB)",
+ .low_addr_required = 1,
+ .keep_mapped = 1,
+ },
+diff --git a/tools/testing/selftests/mm/vm_util.c b/tools/testing/selftests/mm/vm_util.c
+index 9dafa7669ef9c3..79ec33efcd570a 100644
+--- a/tools/testing/selftests/mm/vm_util.c
++++ b/tools/testing/selftests/mm/vm_util.c
+@@ -426,6 +426,23 @@ bool check_vmflag_io(void *addr)
+ }
+ }
+
++bool softdirty_supported(void)
++{
++ char *addr;
++ bool supported = false;
++ const size_t pagesize = getpagesize();
++
++ /* New mappings are expected to be marked with VM_SOFTDIRTY (sd). */
++ addr = mmap(0, pagesize, PROT_READ | PROT_WRITE,
++ MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
++ if (!addr)
++ ksft_exit_fail_msg("mmap failed\n");
++
++ supported = check_vmflag(addr, "sd");
++ munmap(addr, pagesize);
++ return supported;
++}
++
+ /*
+ * Open an fd at /proc/$pid/maps and configure procmap_out ready for
+ * PROCMAP_QUERY query. Returns 0 on success, or an error code otherwise.
+diff --git a/tools/testing/selftests/mm/vm_util.h b/tools/testing/selftests/mm/vm_util.h
+index b55d1809debc06..f6fad71c4f2e19 100644
+--- a/tools/testing/selftests/mm/vm_util.h
++++ b/tools/testing/selftests/mm/vm_util.h
+@@ -99,6 +99,7 @@ bool find_vma_procmap(struct procmap_fd *procmap, void *address);
+ int close_procmap(struct procmap_fd *procmap);
+ int write_sysfs(const char *file_path, unsigned long val);
+ int read_sysfs(const char *file_path, unsigned long *val);
++bool softdirty_supported(void);
+
+ static inline int open_self_procmap(struct procmap_fd *procmap_out)
+ {
+diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
+index a297ee0d6d0754..d074878eb23412 100644
+--- a/tools/testing/selftests/nolibc/nolibc-test.c
++++ b/tools/testing/selftests/nolibc/nolibc-test.c
+@@ -196,8 +196,8 @@ int expect_zr(int expr, int llen)
+ }
+
+
+-#define EXPECT_NZ(cond, expr, val) \
+- do { if (!(cond)) result(llen, SKIPPED); else ret += expect_nz(expr, llen; } while (0)
++#define EXPECT_NZ(cond, expr) \
++ do { if (!(cond)) result(llen, SKIPPED); else ret += expect_nz(expr, llen); } while (0)
+
+ static __attribute__((unused))
+ int expect_nz(int expr, int llen)
+@@ -1334,6 +1334,7 @@ int run_syscall(int min, int max)
+ CASE_TEST(chroot_root); EXPECT_SYSZR(euid0, chroot("/")); break;
+ CASE_TEST(chroot_blah); EXPECT_SYSER(1, chroot("/proc/self/blah"), -1, ENOENT); break;
+ CASE_TEST(chroot_exe); EXPECT_SYSER(1, chroot(argv0), -1, ENOTDIR); break;
++ CASE_TEST(clock_nanosleep); ts.tv_nsec = -1; EXPECT_EQ(1, EINVAL, clock_nanosleep(CLOCK_REALTIME, 0, &ts, NULL)); break;
+ CASE_TEST(close_m1); EXPECT_SYSER(1, close(-1), -1, EBADF); break;
+ CASE_TEST(close_dup); EXPECT_SYSZR(1, close(dup(0))); break;
+ CASE_TEST(dup_0); tmp = dup(0); EXPECT_SYSNE(1, tmp, -1); close(tmp); break;
+diff --git a/tools/testing/selftests/vDSO/vdso_call.h b/tools/testing/selftests/vDSO/vdso_call.h
+index bb237d771051bd..e7205584cbdca5 100644
+--- a/tools/testing/selftests/vDSO/vdso_call.h
++++ b/tools/testing/selftests/vDSO/vdso_call.h
+@@ -44,7 +44,6 @@
+ register long _r6 asm ("r6"); \
+ register long _r7 asm ("r7"); \
+ register long _r8 asm ("r8"); \
+- register long _rval asm ("r3"); \
+ \
+ LOADARGS_##nr(fn, args); \
+ \
+@@ -54,13 +53,13 @@
+ " bns+ 1f\n" \
+ " neg 3, 3\n" \
+ "1:" \
+- : "+r" (_r0), "=r" (_r3), "+r" (_r4), "+r" (_r5), \
++ : "+r" (_r0), "+r" (_r3), "+r" (_r4), "+r" (_r5), \
+ "+r" (_r6), "+r" (_r7), "+r" (_r8) \
+- : "r" (_rval) \
++ : \
+ : "r9", "r10", "r11", "r12", "cr0", "cr1", "cr5", \
+ "cr6", "cr7", "xer", "lr", "ctr", "memory" \
+ ); \
+- _rval; \
++ _r3; \
+ })
+
+ #else
+diff --git a/tools/testing/selftests/vDSO/vdso_test_abi.c b/tools/testing/selftests/vDSO/vdso_test_abi.c
+index a54424e2336f45..67cbfc56e4e1b0 100644
+--- a/tools/testing/selftests/vDSO/vdso_test_abi.c
++++ b/tools/testing/selftests/vDSO/vdso_test_abi.c
+@@ -182,12 +182,11 @@ int main(int argc, char **argv)
+ unsigned long sysinfo_ehdr = getauxval(AT_SYSINFO_EHDR);
+
+ ksft_print_header();
+- ksft_set_plan(VDSO_TEST_PLAN);
+
+- if (!sysinfo_ehdr) {
+- ksft_print_msg("AT_SYSINFO_EHDR is not present!\n");
+- return KSFT_SKIP;
+- }
++ if (!sysinfo_ehdr)
++ ksft_exit_skip("AT_SYSINFO_EHDR is not present!\n");
++
++ ksft_set_plan(VDSO_TEST_PLAN);
+
+ version = versions[VDSO_VERSION];
+ name = (const char **)&names[VDSO_NAMES];
+diff --git a/tools/testing/selftests/watchdog/watchdog-test.c b/tools/testing/selftests/watchdog/watchdog-test.c
+index a1f506ba557864..4f09c5db0c7f30 100644
+--- a/tools/testing/selftests/watchdog/watchdog-test.c
++++ b/tools/testing/selftests/watchdog/watchdog-test.c
+@@ -332,6 +332,12 @@ int main(int argc, char *argv[])
+ if (oneshot)
+ goto end;
+
++ /* Check if WDIOF_KEEPALIVEPING is supported */
++ if (!(info.options & WDIOF_KEEPALIVEPING)) {
++ printf("WDIOC_KEEPALIVE not supported by this device\n");
++ goto end;
++ }
++
+ printf("Watchdog Ticking Away!\n");
+
+ /*
* [gentoo-commits] proj/linux-patches:6.17 commit in: /
@ 2025-10-15 17:51 Arisu Tachibana
0 siblings, 0 replies; 20+ messages in thread
From: Arisu Tachibana @ 2025-10-15 17:51 UTC (permalink / raw
To: gentoo-commits
commit: 247c1a7419bf47efcd7780c0a1e2bc567a29391e
Author: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
AuthorDate: Wed Oct 15 17:49:52 2025 +0000
Commit: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
CommitDate: Wed Oct 15 17:49:52 2025 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=247c1a74
Update BMQ and PDS io scheduler patch to v6.17-r1
Signed-off-by: Arisu Tachibana <alicef <AT> gentoo.org>
0000_README | 2 +-
...=> 5020_BMQ-and-PDS-io-scheduler-v6.17-r1.patch | 272 ++++++++++++---------
2 files changed, 155 insertions(+), 119 deletions(-)
diff --git a/0000_README b/0000_README
index 0aa228a9..7857b783 100644
--- a/0000_README
+++ b/0000_README
@@ -103,7 +103,7 @@ Patch: 5010_enable-cpu-optimizations-universal.patch
From: https://github.com/graysky2/kernel_compiler_patch
Desc: More ISA levels and uarches for kernel 6.16+
-Patch: 5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch
+Patch: 5020_BMQ-and-PDS-io-scheduler-v6.17-r1.patch
From: https://gitlab.com/alfredchen/projectc
Desc: BMQ(BitMap Queue) Scheduler. A new CPU scheduler developed from PDS(incld). Inspired by the scheduler in zircon.
diff --git a/5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch b/5020_BMQ-and-PDS-io-scheduler-v6.17-r1.patch
similarity index 98%
rename from 5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch
rename to 5020_BMQ-and-PDS-io-scheduler-v6.17-r1.patch
index 6b5e3269..9e1cd866 100644
--- a/5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch
+++ b/5020_BMQ-and-PDS-io-scheduler-v6.17-r1.patch
@@ -723,10 +723,10 @@ index 8ae86371ddcd..a972ef1e31a7 100644
obj-y += build_utility.o
diff --git a/kernel/sched/alt_core.c b/kernel/sched/alt_core.c
new file mode 100644
-index 000000000000..8f03f5312e4d
+index 000000000000..db9a57681f70
--- /dev/null
+++ b/kernel/sched/alt_core.c
-@@ -0,0 +1,7648 @@
+@@ -0,0 +1,7645 @@
+/*
+ * kernel/sched/alt_core.c
+ *
@@ -801,7 +801,7 @@ index 000000000000..8f03f5312e4d
+__read_mostly int sysctl_resched_latency_warn_ms = 100;
+__read_mostly int sysctl_resched_latency_warn_once = 1;
+
-+#define ALT_SCHED_VERSION "v6.17-r0"
++#define ALT_SCHED_VERSION "v6.17-r1"
+
+#define STOP_PRIO (MAX_RT_PRIO - 1)
+
@@ -842,7 +842,7 @@ index 000000000000..8f03f5312e4d
+ * the domain), this allows us to quickly tell if two cpus are in the same cache
+ * domain, see cpus_share_cache().
+ */
-+DEFINE_PER_CPU(int, sd_llc_id);
++static DEFINE_PER_CPU_READ_MOSTLY(int, sd_llc_id);
+
+DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
+
@@ -919,7 +919,7 @@ index 000000000000..8f03f5312e4d
+
+ if (prio < last_prio) {
+ if (IDLE_TASK_SCHED_PRIO == last_prio) {
-+ rq->clear_idle_mask_func(cpu, sched_idle_mask);
++ sched_clear_idle_mask(cpu);
+ last_prio -= 2;
+ }
+ CLEAR_CACHED_PREEMPT_MASK(pr, prio, last_prio, cpu);
@@ -928,7 +928,7 @@ index 000000000000..8f03f5312e4d
+ }
+ /* last_prio < prio */
+ if (IDLE_TASK_SCHED_PRIO == prio) {
-+ rq->set_idle_mask_func(cpu, sched_idle_mask);
++ sched_set_idle_mask(cpu);
+ prio -= 2;
+ }
+ SET_CACHED_PREEMPT_MASK(pr, last_prio, prio, cpu);
@@ -2741,7 +2741,7 @@ index 000000000000..8f03f5312e4d
+ return cpumask_and(preempt_mask, allow_mask, mask);
+}
+
-+__read_mostly idle_select_func_t idle_select_func ____cacheline_aligned_in_smp = cpumask_and;
++DEFINE_STATIC_CALL(sched_idle_select_func, cpumask_and);
+
+static inline int select_task_rq(struct task_struct *p)
+{
@@ -2750,7 +2750,7 @@ index 000000000000..8f03f5312e4d
+ if (unlikely(!cpumask_and(&allow_mask, p->cpus_ptr, cpu_active_mask)))
+ return select_fallback_rq(task_cpu(p), p);
+
-+ if (idle_select_func(&mask, &allow_mask, sched_idle_mask) ||
++ if (static_call(sched_idle_select_func)(&mask, &allow_mask, sched_idle_mask) ||
+ preempt_mask_check(&mask, &allow_mask, task_sched_prio(p)))
+ return best_mask_cpu(task_cpu(p), &mask);
+
@@ -5281,8 +5281,7 @@ index 000000000000..8f03f5312e4d
+
+ if (next == rq->idle) {
+ if (!take_other_rq_tasks(rq, cpu)) {
-+ if (likely(rq->balance_func && rq->online))
-+ rq->balance_func(rq, cpu);
++ sched_cpu_topology_balance(cpu, rq);
+
+ schedstat_inc(rq->sched_goidle);
+ /*printk(KERN_INFO "sched: choose_next_task(%d) idle %px\n", cpu, next);*/
@@ -7145,8 +7144,6 @@ index 000000000000..8f03f5312e4d
+ rq->online = false;
+ rq->cpu = i;
+
-+ rq->clear_idle_mask_func = cpumask_clear_cpu;
-+ rq->set_idle_mask_func = cpumask_set_cpu;
+ rq->balance_func = NULL;
+ rq->active_balance_arg.active = 0;
+
@@ -8377,10 +8374,10 @@ index 000000000000..8f03f5312e4d
+#endif /* CONFIG_SCHED_MM_CID */
diff --git a/kernel/sched/alt_core.h b/kernel/sched/alt_core.h
new file mode 100644
-index 000000000000..bb9512c76566
+index 000000000000..55497941a22b
--- /dev/null
+++ b/kernel/sched/alt_core.h
-@@ -0,0 +1,177 @@
+@@ -0,0 +1,174 @@
+#ifndef _KERNEL_SCHED_ALT_CORE_H
+#define _KERNEL_SCHED_ALT_CORE_H
+
@@ -8548,10 +8545,7 @@ index 000000000000..bb9512c76566
+
+extern struct rq *move_queued_task(struct rq *rq, struct task_struct *p, int new_cpu);
+
-+typedef bool (*idle_select_func_t)(struct cpumask *dstp, const struct cpumask *src1p,
-+ const struct cpumask *src2p);
-+
-+extern idle_select_func_t idle_select_func;
++DECLARE_STATIC_CALL(sched_idle_select_func, cpumask_and);
+
+/* balance callback */
+extern struct balance_callback *splice_balance_callbacks(struct rq *rq);
@@ -8598,10 +8592,10 @@ index 000000000000..1dbd7eb6a434
+{}
diff --git a/kernel/sched/alt_sched.h b/kernel/sched/alt_sched.h
new file mode 100644
-index 000000000000..5b9a53c669f5
+index 000000000000..6cd5cfe3a332
--- /dev/null
+++ b/kernel/sched/alt_sched.h
-@@ -0,0 +1,1018 @@
+@@ -0,0 +1,1013 @@
+#ifndef _KERNEL_SCHED_ALT_SCHED_H
+#define _KERNEL_SCHED_ALT_SCHED_H
+
@@ -8724,8 +8718,6 @@ index 000000000000..5b9a53c669f5
+};
+
+typedef void (*balance_func_t)(struct rq *rq, int cpu);
-+typedef void (*set_idle_mask_func_t)(unsigned int cpu, struct cpumask *dstp);
-+typedef void (*clear_idle_mask_func_t)(int cpu, struct cpumask *dstp);
+
+struct balance_arg {
+ struct task_struct *task;
@@ -8766,9 +8758,6 @@ index 000000000000..5b9a53c669f5
+ int membarrier_state;
+#endif
+
-+ set_idle_mask_func_t set_idle_mask_func;
-+ clear_idle_mask_func_t clear_idle_mask_func;
-+
+ int cpu; /* cpu of this runqueue */
+ bool online;
+
@@ -9622,10 +9611,10 @@ index 000000000000..5b9a53c669f5
+#endif /* _KERNEL_SCHED_ALT_SCHED_H */
diff --git a/kernel/sched/alt_topology.c b/kernel/sched/alt_topology.c
new file mode 100644
-index 000000000000..376a08a5afda
+index 000000000000..590ee3cb1b49
--- /dev/null
+++ b/kernel/sched/alt_topology.c
-@@ -0,0 +1,347 @@
+@@ -0,0 +1,287 @@
+#include "alt_core.h"
+#include "alt_topology.h"
+
@@ -9640,47 +9629,9 @@ index 000000000000..376a08a5afda
+}
+__setup("pcore_cpus=", sched_pcore_mask_setup);
+
-+/*
-+ * set/clear idle mask functions
-+ */
-+#ifdef CONFIG_SCHED_SMT
-+static void set_idle_mask_smt(unsigned int cpu, struct cpumask *dstp)
-+{
-+ cpumask_set_cpu(cpu, dstp);
-+ if (cpumask_subset(cpu_smt_mask(cpu), sched_idle_mask))
-+ cpumask_or(sched_sg_idle_mask, sched_sg_idle_mask, cpu_smt_mask(cpu));
-+}
-+
-+static void clear_idle_mask_smt(int cpu, struct cpumask *dstp)
-+{
-+ cpumask_clear_cpu(cpu, dstp);
-+ cpumask_andnot(sched_sg_idle_mask, sched_sg_idle_mask, cpu_smt_mask(cpu));
-+}
-+#endif
-+
-+static void set_idle_mask_pcore(unsigned int cpu, struct cpumask *dstp)
-+{
-+ cpumask_set_cpu(cpu, dstp);
-+ cpumask_set_cpu(cpu, sched_pcore_idle_mask);
-+}
-+
-+static void clear_idle_mask_pcore(int cpu, struct cpumask *dstp)
-+{
-+ cpumask_clear_cpu(cpu, dstp);
-+ cpumask_clear_cpu(cpu, sched_pcore_idle_mask);
-+}
-+
-+static void set_idle_mask_ecore(unsigned int cpu, struct cpumask *dstp)
-+{
-+ cpumask_set_cpu(cpu, dstp);
-+ cpumask_set_cpu(cpu, sched_ecore_idle_mask);
-+}
-+
-+static void clear_idle_mask_ecore(int cpu, struct cpumask *dstp)
-+{
-+ cpumask_clear_cpu(cpu, dstp);
-+ cpumask_clear_cpu(cpu, sched_ecore_idle_mask);
-+}
++DEFINE_PER_CPU_READ_MOSTLY(enum cpu_topo_type, sched_cpu_topo);
++DEFINE_PER_CPU_READ_MOSTLY(enum cpu_topo_balance_type, sched_cpu_topo_balance);
++DEFINE_PER_CPU(struct balance_callback, active_balance_head);
+
+/*
+ * Idle cpu/rq selection functions
@@ -9785,8 +9736,6 @@ index 000000000000..376a08a5afda
+ return 0;
+}
+
-+static DEFINE_PER_CPU(struct balance_callback, active_balance_head);
-+
+#ifdef CONFIG_SCHED_SMT
+static inline int
+smt_pcore_source_balance(struct rq *rq, cpumask_t *single_task_mask, cpumask_t *target_mask)
@@ -9807,7 +9756,7 @@ index 000000000000..376a08a5afda
+}
+
+/* smt p core balance functions */
-+static inline void smt_pcore_balance(struct rq *rq)
++void smt_pcore_balance(struct rq *rq)
+{
+ cpumask_t single_task_mask;
+
@@ -9822,14 +9771,8 @@ index 000000000000..376a08a5afda
+ return;
+}
+
-+static void smt_pcore_balance_func(struct rq *rq, const int cpu)
-+{
-+ if (cpumask_test_cpu(cpu, sched_sg_idle_mask))
-+ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), smt_pcore_balance);
-+}
-+
+/* smt balance functions */
-+static inline void smt_balance(struct rq *rq)
++void smt_balance(struct rq *rq)
+{
+ cpumask_t single_task_mask;
+
@@ -9840,32 +9783,22 @@ index 000000000000..376a08a5afda
+ return;
+}
+
-+static void smt_balance_func(struct rq *rq, const int cpu)
-+{
-+ if (cpumask_test_cpu(cpu, sched_sg_idle_mask))
-+ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), smt_balance);
-+}
-+
+/* e core balance functions */
-+static inline void ecore_balance(struct rq *rq)
++void ecore_balance(struct rq *rq)
+{
+ cpumask_t single_task_mask;
+
+ if (cpumask_andnot(&single_task_mask, cpu_active_mask, sched_idle_mask) &&
+ cpumask_andnot(&single_task_mask, &single_task_mask, &sched_rq_pending_mask) &&
++ cpumask_empty(sched_pcore_idle_mask) &&
+ /* smt occupied p core to idle e core balance */
+ smt_pcore_source_balance(rq, &single_task_mask, sched_ecore_idle_mask))
+ return;
+}
-+
-+static void ecore_balance_func(struct rq *rq, const int cpu)
-+{
-+ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), ecore_balance);
-+}
+#endif /* CONFIG_SCHED_SMT */
+
+/* p core balance functions */
-+static inline void pcore_balance(struct rq *rq)
++void pcore_balance(struct rq *rq)
+{
+ cpumask_t single_task_mask;
+
@@ -9876,34 +9809,28 @@ index 000000000000..376a08a5afda
+ return;
+}
+
-+static void pcore_balance_func(struct rq *rq, const int cpu)
-+{
-+ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), pcore_balance);
-+}
-+
+#ifdef ALT_SCHED_DEBUG
+#define SCHED_DEBUG_INFO(...) printk(KERN_INFO __VA_ARGS__)
+#else
+#define SCHED_DEBUG_INFO(...) do { } while(0)
+#endif
+
-+#define SET_IDLE_SELECT_FUNC(func) \
++#define IDLE_SELECT_FUNC_UPDATE(func) \
+{ \
-+ idle_select_func = func; \
-+ printk(KERN_INFO "sched: "#func); \
++ static_call_update(sched_idle_select_func, &func); \
++ printk(KERN_INFO "sched: idle select func -> "#func); \
+}
+
-+#define SET_RQ_BALANCE_FUNC(rq, cpu, func) \
++#define SET_SCHED_CPU_TOPOLOGY(cpu, topo) \
+{ \
-+ rq->balance_func = func; \
-+ SCHED_DEBUG_INFO("sched: cpu#%02d -> "#func, cpu); \
++ per_cpu(sched_cpu_topo, (cpu)) = topo; \
++ SCHED_DEBUG_INFO("sched: cpu#%02d -> "#topo, cpu); \
+}
+
-+#define SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_func, clear_func) \
++#define SET_SCHED_CPU_TOPOLOGY_BALANCE(cpu, balance) \
+{ \
-+ rq->set_idle_mask_func = set_func; \
-+ rq->clear_idle_mask_func = clear_func; \
-+ SCHED_DEBUG_INFO("sched: cpu#%02d -> "#set_func" "#clear_func, cpu); \
++ per_cpu(sched_cpu_topo_balance, (cpu)) = balance; \
++ SCHED_DEBUG_INFO("sched: cpu#%02d -> "#balance, cpu); \
+}
+
+void sched_init_topology(void)
@@ -9926,16 +9853,17 @@ index 000000000000..376a08a5afda
+ ecore_present = !cpumask_empty(&sched_ecore_mask);
+ }
+
-+#ifdef CONFIG_SCHED_SMT
+ /* idle select function */
++#ifdef CONFIG_SCHED_SMT
+ if (cpumask_equal(&sched_smt_mask, cpu_online_mask)) {
-+ SET_IDLE_SELECT_FUNC(p1_idle_select_func);
++ IDLE_SELECT_FUNC_UPDATE(p1_idle_select_func);
+ } else
+#endif
+ if (!cpumask_empty(&sched_pcore_mask)) {
-+ SET_IDLE_SELECT_FUNC(p1p2_idle_select_func);
++ IDLE_SELECT_FUNC_UPDATE(p1p2_idle_select_func);
+ }
+
++ /* CPU topology setup */
+ for_each_online_cpu(cpu) {
+ rq = cpu_rq(cpu);
+ /* take chance to reset time slice for idle tasks */
@@ -9943,13 +9871,13 @@ index 000000000000..376a08a5afda
+
+#ifdef CONFIG_SCHED_SMT
+ if (cpumask_weight(cpu_smt_mask(cpu)) > 1) {
-+ SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_idle_mask_smt, clear_idle_mask_smt);
++ SET_SCHED_CPU_TOPOLOGY(cpu, CPU_TOPOLOGY_SMT);
+
+ if (cpumask_test_cpu(cpu, &sched_pcore_mask) &&
+ !cpumask_intersects(&sched_ecore_mask, &sched_smt_mask)) {
-+ SET_RQ_BALANCE_FUNC(rq, cpu, smt_pcore_balance_func);
++ SET_SCHED_CPU_TOPOLOGY_BALANCE(cpu, CPU_TOPOLOGY_BALANCE_SMT_PCORE);
+ } else {
-+ SET_RQ_BALANCE_FUNC(rq, cpu, smt_balance_func);
++ SET_SCHED_CPU_TOPOLOGY_BALANCE(cpu, CPU_TOPOLOGY_BALANCE_SMT);
+ }
+
+ continue;
@@ -9957,31 +9885,139 @@ index 000000000000..376a08a5afda
+#endif
+ /* !SMT or only one cpu in sg */
+ if (cpumask_test_cpu(cpu, &sched_pcore_mask)) {
-+ SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_idle_mask_pcore, clear_idle_mask_pcore);
++ SET_SCHED_CPU_TOPOLOGY(cpu, CPU_TOPOLOGY_PCORE);
+
+ if (ecore_present)
-+ SET_RQ_BALANCE_FUNC(rq, cpu, pcore_balance_func);
++ SET_SCHED_CPU_TOPOLOGY_BALANCE(cpu, CPU_TOPOLOGY_BALANCE_PCORE);
+
+ continue;
+ }
++
+ if (cpumask_test_cpu(cpu, &sched_ecore_mask)) {
-+ SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_idle_mask_ecore, clear_idle_mask_ecore);
++ SET_SCHED_CPU_TOPOLOGY(cpu, CPU_TOPOLOGY_ECORE);
+#ifdef CONFIG_SCHED_SMT
+ if (cpumask_intersects(&sched_pcore_mask, &sched_smt_mask))
-+ SET_RQ_BALANCE_FUNC(rq, cpu, ecore_balance_func);
++ SET_SCHED_CPU_TOPOLOGY_BALANCE(cpu, CPU_TOPOLOGY_BALANCE_ECORE);
+#endif
+ }
+ }
+}
diff --git a/kernel/sched/alt_topology.h b/kernel/sched/alt_topology.h
new file mode 100644
-index 000000000000..076174cd2bc6
+index 000000000000..14591a303ea5
--- /dev/null
+++ b/kernel/sched/alt_topology.h
-@@ -0,0 +1,6 @@
+@@ -0,0 +1,113 @@
+#ifndef _KERNEL_SCHED_ALT_TOPOLOGY_H
+#define _KERNEL_SCHED_ALT_TOPOLOGY_H
+
++/*
++ * CPU topology type
++ */
++enum cpu_topo_type {
++ CPU_TOPOLOGY_DEFAULT = 0,
++ CPU_TOPOLOGY_PCORE,
++ CPU_TOPOLOGY_ECORE,
++#ifdef CONFIG_SCHED_SMT
++ CPU_TOPOLOGY_SMT,
++#endif
++};
++
++DECLARE_PER_CPU_READ_MOSTLY(enum cpu_topo_type, sched_cpu_topo);
++
++static inline void sched_set_idle_mask(const unsigned int cpu)
++{
++ cpumask_set_cpu(cpu, sched_idle_mask);
++
++ switch (per_cpu(sched_cpu_topo, cpu)) {
++ case CPU_TOPOLOGY_DEFAULT:
++ break;
++ case CPU_TOPOLOGY_PCORE:
++ cpumask_set_cpu(cpu, sched_pcore_idle_mask);
++ break;
++ case CPU_TOPOLOGY_ECORE:
++ cpumask_set_cpu(cpu, sched_ecore_idle_mask);
++ break;
++#ifdef CONFIG_SCHED_SMT
++ case CPU_TOPOLOGY_SMT:
++ if (cpumask_subset(cpu_smt_mask(cpu), sched_idle_mask))
++ cpumask_or(sched_sg_idle_mask, sched_sg_idle_mask, cpu_smt_mask(cpu));
++ break;
++#endif
++ }
++}
++
++static inline void sched_clear_idle_mask(const unsigned int cpu)
++{
++ cpumask_clear_cpu(cpu, sched_idle_mask);
++
++ switch (per_cpu(sched_cpu_topo, cpu)) {
++ case CPU_TOPOLOGY_DEFAULT:
++ break;
++ case CPU_TOPOLOGY_PCORE:
++ cpumask_clear_cpu(cpu, sched_pcore_idle_mask);
++ break;
++ case CPU_TOPOLOGY_ECORE:
++ cpumask_clear_cpu(cpu, sched_ecore_idle_mask);
++ break;
++#ifdef CONFIG_SCHED_SMT
++ case CPU_TOPOLOGY_SMT:
++ cpumask_andnot(sched_sg_idle_mask, sched_sg_idle_mask, cpu_smt_mask(cpu));
++ break;
++#endif
++ }
++}
++
++/*
++ * CPU topology balance type
++ */
++enum cpu_topo_balance_type {
++ CPU_TOPOLOGY_BALANCE_NONE = 0,
++ CPU_TOPOLOGY_BALANCE_PCORE,
++#ifdef CONFIG_SCHED_SMT
++ CPU_TOPOLOGY_BALANCE_ECORE,
++ CPU_TOPOLOGY_BALANCE_SMT,
++ CPU_TOPOLOGY_BALANCE_SMT_PCORE,
++#endif
++};
++
++DECLARE_PER_CPU_READ_MOSTLY(enum cpu_topo_balance_type, sched_cpu_topo_balance);
++DECLARE_PER_CPU(struct balance_callback, active_balance_head);
++
++extern void pcore_balance(struct rq *rq);
++#ifdef CONFIG_SCHED_SMT
++extern void ecore_balance(struct rq *rq);
++extern void smt_balance(struct rq *rq);
++extern void smt_pcore_balance(struct rq *rq);
++#endif
++
++static inline void sched_cpu_topology_balance(const unsigned int cpu, struct rq *rq)
++{
++ if (!rq->online)
++ return;
++
++ switch (per_cpu(sched_cpu_topo_balance, cpu)) {
++ case CPU_TOPOLOGY_BALANCE_NONE:
++ break;
++ case CPU_TOPOLOGY_BALANCE_PCORE:
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), pcore_balance);
++ break;
++#ifdef CONFIG_SCHED_SMT
++ case CPU_TOPOLOGY_BALANCE_ECORE:
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), ecore_balance);
++ break;
++ case CPU_TOPOLOGY_BALANCE_SMT:
++ if (cpumask_test_cpu(cpu, sched_sg_idle_mask))
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), smt_balance);
++ break;
++ case CPU_TOPOLOGY_BALANCE_SMT_PCORE:
++ if (cpumask_test_cpu(cpu, sched_sg_idle_mask))
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), smt_pcore_balance);
++ break;
++#endif
++ }
++}
++
+extern void sched_init_topology(void);
+
+#endif /* _KERNEL_SCHED_ALT_TOPOLOGY_H */
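As an illustration (not part of the patch), the alt_topology hunks above replace the per-rq set/clear idle-mask function pointers with a per-CPU topology tag that switch-based helpers consult. Below is a minimal userspace C model of that dispatch pattern; the flat bitmask type, the eight-CPU layout and the omission of the SMT case are simplifications standing in for the kernel's cpumask and per-CPU machinery, kept only to show the shape of the change.

#include <stdio.h>

#define NR_CPUS 8

/* Simplified stand-ins for the kernel's cpumask and per-CPU data. */
typedef unsigned int cpumask_t;

static cpumask_t sched_idle_mask;
static cpumask_t sched_pcore_idle_mask;
static cpumask_t sched_ecore_idle_mask;

enum cpu_topo_type {
	CPU_TOPOLOGY_DEFAULT = 0,
	CPU_TOPOLOGY_PCORE,
	CPU_TOPOLOGY_ECORE,
};

/* One topology tag per CPU, filled in once at "init" time. */
static enum cpu_topo_type sched_cpu_topo[NR_CPUS];

static void sched_set_idle_mask(unsigned int cpu)
{
	sched_idle_mask |= 1u << cpu;

	switch (sched_cpu_topo[cpu]) {
	case CPU_TOPOLOGY_DEFAULT:
		break;
	case CPU_TOPOLOGY_PCORE:
		sched_pcore_idle_mask |= 1u << cpu;
		break;
	case CPU_TOPOLOGY_ECORE:
		sched_ecore_idle_mask |= 1u << cpu;
		break;
	}
}

static void sched_clear_idle_mask(unsigned int cpu)
{
	sched_idle_mask &= ~(1u << cpu);

	switch (sched_cpu_topo[cpu]) {
	case CPU_TOPOLOGY_DEFAULT:
		break;
	case CPU_TOPOLOGY_PCORE:
		sched_pcore_idle_mask &= ~(1u << cpu);
		break;
	case CPU_TOPOLOGY_ECORE:
		sched_ecore_idle_mask &= ~(1u << cpu);
		break;
	}
}

int main(void)
{
	/* Pretend CPUs 0-3 are P cores and 4-7 are E cores. */
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		sched_cpu_topo[cpu] = cpu < 4 ? CPU_TOPOLOGY_PCORE : CPU_TOPOLOGY_ECORE;

	sched_set_idle_mask(1);
	sched_set_idle_mask(5);
	sched_clear_idle_mask(1);

	printf("idle=%#x pcore_idle=%#x ecore_idle=%#x\n",
	       sched_idle_mask, sched_pcore_idle_mask, sched_ecore_idle_mask);
	return 0;
}

Consulting a read-mostly per-CPU enum this way trades two indirect calls stored on every runqueue for a predictable switch, which appears to be the motivation for the change.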
* [gentoo-commits] proj/linux-patches:6.17 commit in: /
@ 2025-10-15 18:18 Arisu Tachibana
0 siblings, 0 replies; 20+ messages in thread
From: Arisu Tachibana @ 2025-10-15 18:18 UTC (permalink / raw
To: gentoo-commits
commit: fa1e33670f90a800dfcb535d62e93e898178f3c2
Author: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
AuthorDate: Wed Oct 15 17:49:52 2025 +0000
Commit: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
CommitDate: Wed Oct 15 18:17:39 2025 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=fa1e3367
Update BMQ and PDS io scheduler patch to v6.17-r1
Signed-off-by: Arisu Tachibana <alicef <AT> gentoo.org>
0000_README | 2 +-
...=> 5020_BMQ-and-PDS-io-scheduler-v6.17-r1.patch | 284 ++++++++++++---------
2 files changed, 163 insertions(+), 123 deletions(-)
diff --git a/0000_README b/0000_README
index 0aa228a9..7857b783 100644
--- a/0000_README
+++ b/0000_README
@@ -103,7 +103,7 @@ Patch: 5010_enable-cpu-optimizations-universal.patch
From: https://github.com/graysky2/kernel_compiler_patch
Desc: More ISA levels and uarches for kernel 6.16+
-Patch: 5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch
+Patch: 5020_BMQ-and-PDS-io-scheduler-v6.17-r1.patch
From: https://gitlab.com/alfredchen/projectc
Desc: BMQ(BitMap Queue) Scheduler. A new CPU scheduler developed from PDS(incld). Inspired by the scheduler in zircon.
diff --git a/5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch b/5020_BMQ-and-PDS-io-scheduler-v6.17-r1.patch
similarity index 98%
rename from 5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch
rename to 5020_BMQ-and-PDS-io-scheduler-v6.17-r1.patch
index 6b5e3269..7ce5d221 100644
--- a/5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch
+++ b/5020_BMQ-and-PDS-io-scheduler-v6.17-r1.patch
@@ -1,3 +1,7 @@
+
+r2 for:
+https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=661f951e371cc134ea31c84238dbdc9a898b8403
+
diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 8b49eab937d0..c5d4901a9608 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
@@ -723,10 +727,10 @@ index 8ae86371ddcd..a972ef1e31a7 100644
obj-y += build_utility.o
diff --git a/kernel/sched/alt_core.c b/kernel/sched/alt_core.c
new file mode 100644
-index 000000000000..8f03f5312e4d
+index 000000000000..db9a57681f70
--- /dev/null
+++ b/kernel/sched/alt_core.c
-@@ -0,0 +1,7648 @@
+@@ -0,0 +1,7645 @@
+/*
+ * kernel/sched/alt_core.c
+ *
@@ -801,7 +805,7 @@ index 000000000000..8f03f5312e4d
+__read_mostly int sysctl_resched_latency_warn_ms = 100;
+__read_mostly int sysctl_resched_latency_warn_once = 1;
+
-+#define ALT_SCHED_VERSION "v6.17-r0"
++#define ALT_SCHED_VERSION "v6.17-r1"
+
+#define STOP_PRIO (MAX_RT_PRIO - 1)
+
@@ -842,7 +846,7 @@ index 000000000000..8f03f5312e4d
+ * the domain), this allows us to quickly tell if two cpus are in the same cache
+ * domain, see cpus_share_cache().
+ */
-+DEFINE_PER_CPU(int, sd_llc_id);
++static DEFINE_PER_CPU_READ_MOSTLY(int, sd_llc_id);
+
+DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
+
@@ -919,7 +923,7 @@ index 000000000000..8f03f5312e4d
+
+ if (prio < last_prio) {
+ if (IDLE_TASK_SCHED_PRIO == last_prio) {
-+ rq->clear_idle_mask_func(cpu, sched_idle_mask);
++ sched_clear_idle_mask(cpu);
+ last_prio -= 2;
+ }
+ CLEAR_CACHED_PREEMPT_MASK(pr, prio, last_prio, cpu);
@@ -928,7 +932,7 @@ index 000000000000..8f03f5312e4d
+ }
+ /* last_prio < prio */
+ if (IDLE_TASK_SCHED_PRIO == prio) {
-+ rq->set_idle_mask_func(cpu, sched_idle_mask);
++ sched_set_idle_mask(cpu);
+ prio -= 2;
+ }
+ SET_CACHED_PREEMPT_MASK(pr, last_prio, prio, cpu);
@@ -2741,7 +2745,7 @@ index 000000000000..8f03f5312e4d
+ return cpumask_and(preempt_mask, allow_mask, mask);
+}
+
-+__read_mostly idle_select_func_t idle_select_func ____cacheline_aligned_in_smp = cpumask_and;
++DEFINE_STATIC_CALL(sched_idle_select_func, cpumask_and);
+
+static inline int select_task_rq(struct task_struct *p)
+{
@@ -2750,7 +2754,7 @@ index 000000000000..8f03f5312e4d
+ if (unlikely(!cpumask_and(&allow_mask, p->cpus_ptr, cpu_active_mask)))
+ return select_fallback_rq(task_cpu(p), p);
+
-+ if (idle_select_func(&mask, &allow_mask, sched_idle_mask) ||
++ if (static_call(sched_idle_select_func)(&mask, &allow_mask, sched_idle_mask) ||
+ preempt_mask_check(&mask, &allow_mask, task_sched_prio(p)))
+ return best_mask_cpu(task_cpu(p), &mask);
+
@@ -5281,8 +5285,7 @@ index 000000000000..8f03f5312e4d
+
+ if (next == rq->idle) {
+ if (!take_other_rq_tasks(rq, cpu)) {
-+ if (likely(rq->balance_func && rq->online))
-+ rq->balance_func(rq, cpu);
++ sched_cpu_topology_balance(cpu, rq);
+
+ schedstat_inc(rq->sched_goidle);
+ /*printk(KERN_INFO "sched: choose_next_task(%d) idle %px\n", cpu, next);*/
@@ -7145,8 +7148,6 @@ index 000000000000..8f03f5312e4d
+ rq->online = false;
+ rq->cpu = i;
+
-+ rq->clear_idle_mask_func = cpumask_clear_cpu;
-+ rq->set_idle_mask_func = cpumask_set_cpu;
+ rq->balance_func = NULL;
+ rq->active_balance_arg.active = 0;
+
@@ -8377,10 +8378,10 @@ index 000000000000..8f03f5312e4d
+#endif /* CONFIG_SCHED_MM_CID */
diff --git a/kernel/sched/alt_core.h b/kernel/sched/alt_core.h
new file mode 100644
-index 000000000000..bb9512c76566
+index 000000000000..55497941a22b
--- /dev/null
+++ b/kernel/sched/alt_core.h
-@@ -0,0 +1,177 @@
+@@ -0,0 +1,174 @@
+#ifndef _KERNEL_SCHED_ALT_CORE_H
+#define _KERNEL_SCHED_ALT_CORE_H
+
@@ -8548,10 +8549,7 @@ index 000000000000..bb9512c76566
+
+extern struct rq *move_queued_task(struct rq *rq, struct task_struct *p, int new_cpu);
+
-+typedef bool (*idle_select_func_t)(struct cpumask *dstp, const struct cpumask *src1p,
-+ const struct cpumask *src2p);
-+
-+extern idle_select_func_t idle_select_func;
++DECLARE_STATIC_CALL(sched_idle_select_func, cpumask_and);
+
+/* balance callback */
+extern struct balance_callback *splice_balance_callbacks(struct rq *rq);
@@ -8598,10 +8596,10 @@ index 000000000000..1dbd7eb6a434
+{}
diff --git a/kernel/sched/alt_sched.h b/kernel/sched/alt_sched.h
new file mode 100644
-index 000000000000..5b9a53c669f5
+index 000000000000..6cd5cfe3a332
--- /dev/null
+++ b/kernel/sched/alt_sched.h
-@@ -0,0 +1,1018 @@
+@@ -0,0 +1,1013 @@
+#ifndef _KERNEL_SCHED_ALT_SCHED_H
+#define _KERNEL_SCHED_ALT_SCHED_H
+
@@ -8724,8 +8722,6 @@ index 000000000000..5b9a53c669f5
+};
+
+typedef void (*balance_func_t)(struct rq *rq, int cpu);
-+typedef void (*set_idle_mask_func_t)(unsigned int cpu, struct cpumask *dstp);
-+typedef void (*clear_idle_mask_func_t)(int cpu, struct cpumask *dstp);
+
+struct balance_arg {
+ struct task_struct *task;
@@ -8766,9 +8762,6 @@ index 000000000000..5b9a53c669f5
+ int membarrier_state;
+#endif
+
-+ set_idle_mask_func_t set_idle_mask_func;
-+ clear_idle_mask_func_t clear_idle_mask_func;
-+
+ int cpu; /* cpu of this runqueue */
+ bool online;
+
@@ -9622,10 +9615,10 @@ index 000000000000..5b9a53c669f5
+#endif /* _KERNEL_SCHED_ALT_SCHED_H */
diff --git a/kernel/sched/alt_topology.c b/kernel/sched/alt_topology.c
new file mode 100644
-index 000000000000..376a08a5afda
+index 000000000000..590ee3cb1b49
--- /dev/null
+++ b/kernel/sched/alt_topology.c
-@@ -0,0 +1,347 @@
+@@ -0,0 +1,287 @@
+#include "alt_core.h"
+#include "alt_topology.h"
+
@@ -9640,47 +9633,9 @@ index 000000000000..376a08a5afda
+}
+__setup("pcore_cpus=", sched_pcore_mask_setup);
+
-+/*
-+ * set/clear idle mask functions
-+ */
-+#ifdef CONFIG_SCHED_SMT
-+static void set_idle_mask_smt(unsigned int cpu, struct cpumask *dstp)
-+{
-+ cpumask_set_cpu(cpu, dstp);
-+ if (cpumask_subset(cpu_smt_mask(cpu), sched_idle_mask))
-+ cpumask_or(sched_sg_idle_mask, sched_sg_idle_mask, cpu_smt_mask(cpu));
-+}
-+
-+static void clear_idle_mask_smt(int cpu, struct cpumask *dstp)
-+{
-+ cpumask_clear_cpu(cpu, dstp);
-+ cpumask_andnot(sched_sg_idle_mask, sched_sg_idle_mask, cpu_smt_mask(cpu));
-+}
-+#endif
-+
-+static void set_idle_mask_pcore(unsigned int cpu, struct cpumask *dstp)
-+{
-+ cpumask_set_cpu(cpu, dstp);
-+ cpumask_set_cpu(cpu, sched_pcore_idle_mask);
-+}
-+
-+static void clear_idle_mask_pcore(int cpu, struct cpumask *dstp)
-+{
-+ cpumask_clear_cpu(cpu, dstp);
-+ cpumask_clear_cpu(cpu, sched_pcore_idle_mask);
-+}
-+
-+static void set_idle_mask_ecore(unsigned int cpu, struct cpumask *dstp)
-+{
-+ cpumask_set_cpu(cpu, dstp);
-+ cpumask_set_cpu(cpu, sched_ecore_idle_mask);
-+}
-+
-+static void clear_idle_mask_ecore(int cpu, struct cpumask *dstp)
-+{
-+ cpumask_clear_cpu(cpu, dstp);
-+ cpumask_clear_cpu(cpu, sched_ecore_idle_mask);
-+}
++DEFINE_PER_CPU_READ_MOSTLY(enum cpu_topo_type, sched_cpu_topo);
++DEFINE_PER_CPU_READ_MOSTLY(enum cpu_topo_balance_type, sched_cpu_topo_balance);
++DEFINE_PER_CPU(struct balance_callback, active_balance_head);
+
+/*
+ * Idle cpu/rq selection functions
@@ -9785,8 +9740,6 @@ index 000000000000..376a08a5afda
+ return 0;
+}
+
-+static DEFINE_PER_CPU(struct balance_callback, active_balance_head);
-+
+#ifdef CONFIG_SCHED_SMT
+static inline int
+smt_pcore_source_balance(struct rq *rq, cpumask_t *single_task_mask, cpumask_t *target_mask)
@@ -9807,7 +9760,7 @@ index 000000000000..376a08a5afda
+}
+
+/* smt p core balance functions */
-+static inline void smt_pcore_balance(struct rq *rq)
++void smt_pcore_balance(struct rq *rq)
+{
+ cpumask_t single_task_mask;
+
@@ -9822,14 +9775,8 @@ index 000000000000..376a08a5afda
+ return;
+}
+
-+static void smt_pcore_balance_func(struct rq *rq, const int cpu)
-+{
-+ if (cpumask_test_cpu(cpu, sched_sg_idle_mask))
-+ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), smt_pcore_balance);
-+}
-+
+/* smt balance functions */
-+static inline void smt_balance(struct rq *rq)
++void smt_balance(struct rq *rq)
+{
+ cpumask_t single_task_mask;
+
@@ -9840,32 +9787,22 @@ index 000000000000..376a08a5afda
+ return;
+}
+
-+static void smt_balance_func(struct rq *rq, const int cpu)
-+{
-+ if (cpumask_test_cpu(cpu, sched_sg_idle_mask))
-+ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), smt_balance);
-+}
-+
+/* e core balance functions */
-+static inline void ecore_balance(struct rq *rq)
++void ecore_balance(struct rq *rq)
+{
+ cpumask_t single_task_mask;
+
+ if (cpumask_andnot(&single_task_mask, cpu_active_mask, sched_idle_mask) &&
+ cpumask_andnot(&single_task_mask, &single_task_mask, &sched_rq_pending_mask) &&
++ cpumask_empty(sched_pcore_idle_mask) &&
+ /* smt occupied p core to idle e core balance */
+ smt_pcore_source_balance(rq, &single_task_mask, sched_ecore_idle_mask))
+ return;
+}
-+
-+static void ecore_balance_func(struct rq *rq, const int cpu)
-+{
-+ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), ecore_balance);
-+}
+#endif /* CONFIG_SCHED_SMT */
+
+/* p core balance functions */
-+static inline void pcore_balance(struct rq *rq)
++void pcore_balance(struct rq *rq)
+{
+ cpumask_t single_task_mask;
+
@@ -9876,34 +9813,28 @@ index 000000000000..376a08a5afda
+ return;
+}
+
-+static void pcore_balance_func(struct rq *rq, const int cpu)
-+{
-+ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), pcore_balance);
-+}
-+
+#ifdef ALT_SCHED_DEBUG
+#define SCHED_DEBUG_INFO(...) printk(KERN_INFO __VA_ARGS__)
+#else
+#define SCHED_DEBUG_INFO(...) do { } while(0)
+#endif
+
-+#define SET_IDLE_SELECT_FUNC(func) \
++#define IDLE_SELECT_FUNC_UPDATE(func) \
+{ \
-+ idle_select_func = func; \
-+ printk(KERN_INFO "sched: "#func); \
++ static_call_update(sched_idle_select_func, &func); \
++ printk(KERN_INFO "sched: idle select func -> "#func); \
+}
+
-+#define SET_RQ_BALANCE_FUNC(rq, cpu, func) \
++#define SET_SCHED_CPU_TOPOLOGY(cpu, topo) \
+{ \
-+ rq->balance_func = func; \
-+ SCHED_DEBUG_INFO("sched: cpu#%02d -> "#func, cpu); \
++ per_cpu(sched_cpu_topo, (cpu)) = topo; \
++ SCHED_DEBUG_INFO("sched: cpu#%02d -> "#topo, cpu); \
+}
+
-+#define SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_func, clear_func) \
++#define SET_SCHED_CPU_TOPOLOGY_BALANCE(cpu, balance) \
+{ \
-+ rq->set_idle_mask_func = set_func; \
-+ rq->clear_idle_mask_func = clear_func; \
-+ SCHED_DEBUG_INFO("sched: cpu#%02d -> "#set_func" "#clear_func, cpu); \
++ per_cpu(sched_cpu_topo_balance, (cpu)) = balance; \
++ SCHED_DEBUG_INFO("sched: cpu#%02d -> "#balance, cpu); \
+}
+
+void sched_init_topology(void)
@@ -9926,16 +9857,17 @@ index 000000000000..376a08a5afda
+ ecore_present = !cpumask_empty(&sched_ecore_mask);
+ }
+
-+#ifdef CONFIG_SCHED_SMT
+ /* idle select function */
++#ifdef CONFIG_SCHED_SMT
+ if (cpumask_equal(&sched_smt_mask, cpu_online_mask)) {
-+ SET_IDLE_SELECT_FUNC(p1_idle_select_func);
++ IDLE_SELECT_FUNC_UPDATE(p1_idle_select_func);
+ } else
+#endif
+ if (!cpumask_empty(&sched_pcore_mask)) {
-+ SET_IDLE_SELECT_FUNC(p1p2_idle_select_func);
++ IDLE_SELECT_FUNC_UPDATE(p1p2_idle_select_func);
+ }
+
++ /* CPU topology setup */
+ for_each_online_cpu(cpu) {
+ rq = cpu_rq(cpu);
+ /* take chance to reset time slice for idle tasks */
@@ -9943,13 +9875,13 @@ index 000000000000..376a08a5afda
+
+#ifdef CONFIG_SCHED_SMT
+ if (cpumask_weight(cpu_smt_mask(cpu)) > 1) {
-+ SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_idle_mask_smt, clear_idle_mask_smt);
++ SET_SCHED_CPU_TOPOLOGY(cpu, CPU_TOPOLOGY_SMT);
+
+ if (cpumask_test_cpu(cpu, &sched_pcore_mask) &&
+ !cpumask_intersects(&sched_ecore_mask, &sched_smt_mask)) {
-+ SET_RQ_BALANCE_FUNC(rq, cpu, smt_pcore_balance_func);
++ SET_SCHED_CPU_TOPOLOGY_BALANCE(cpu, CPU_TOPOLOGY_BALANCE_SMT_PCORE);
+ } else {
-+ SET_RQ_BALANCE_FUNC(rq, cpu, smt_balance_func);
++ SET_SCHED_CPU_TOPOLOGY_BALANCE(cpu, CPU_TOPOLOGY_BALANCE_SMT);
+ }
+
+ continue;
@@ -9957,31 +9889,139 @@ index 000000000000..376a08a5afda
+#endif
+ /* !SMT or only one cpu in sg */
+ if (cpumask_test_cpu(cpu, &sched_pcore_mask)) {
-+ SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_idle_mask_pcore, clear_idle_mask_pcore);
++ SET_SCHED_CPU_TOPOLOGY(cpu, CPU_TOPOLOGY_PCORE);
+
+ if (ecore_present)
-+ SET_RQ_BALANCE_FUNC(rq, cpu, pcore_balance_func);
++ SET_SCHED_CPU_TOPOLOGY_BALANCE(cpu, CPU_TOPOLOGY_BALANCE_PCORE);
+
+ continue;
+ }
++
+ if (cpumask_test_cpu(cpu, &sched_ecore_mask)) {
-+ SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_idle_mask_ecore, clear_idle_mask_ecore);
++ SET_SCHED_CPU_TOPOLOGY(cpu, CPU_TOPOLOGY_ECORE);
+#ifdef CONFIG_SCHED_SMT
+ if (cpumask_intersects(&sched_pcore_mask, &sched_smt_mask))
-+ SET_RQ_BALANCE_FUNC(rq, cpu, ecore_balance_func);
++ SET_SCHED_CPU_TOPOLOGY_BALANCE(cpu, CPU_TOPOLOGY_BALANCE_ECORE);
+#endif
+ }
+ }
+}
diff --git a/kernel/sched/alt_topology.h b/kernel/sched/alt_topology.h
new file mode 100644
-index 000000000000..076174cd2bc6
+index 000000000000..14591a303ea5
--- /dev/null
+++ b/kernel/sched/alt_topology.h
-@@ -0,0 +1,6 @@
+@@ -0,0 +1,113 @@
+#ifndef _KERNEL_SCHED_ALT_TOPOLOGY_H
+#define _KERNEL_SCHED_ALT_TOPOLOGY_H
+
++/*
++ * CPU topology type
++ */
++enum cpu_topo_type {
++ CPU_TOPOLOGY_DEFAULT = 0,
++ CPU_TOPOLOGY_PCORE,
++ CPU_TOPOLOGY_ECORE,
++#ifdef CONFIG_SCHED_SMT
++ CPU_TOPOLOGY_SMT,
++#endif
++};
++
++DECLARE_PER_CPU_READ_MOSTLY(enum cpu_topo_type, sched_cpu_topo);
++
++static inline void sched_set_idle_mask(const unsigned int cpu)
++{
++ cpumask_set_cpu(cpu, sched_idle_mask);
++
++ switch (per_cpu(sched_cpu_topo, cpu)) {
++ case CPU_TOPOLOGY_DEFAULT:
++ break;
++ case CPU_TOPOLOGY_PCORE:
++ cpumask_set_cpu(cpu, sched_pcore_idle_mask);
++ break;
++ case CPU_TOPOLOGY_ECORE:
++ cpumask_set_cpu(cpu, sched_ecore_idle_mask);
++ break;
++#ifdef CONFIG_SCHED_SMT
++ case CPU_TOPOLOGY_SMT:
++ if (cpumask_subset(cpu_smt_mask(cpu), sched_idle_mask))
++ cpumask_or(sched_sg_idle_mask, sched_sg_idle_mask, cpu_smt_mask(cpu));
++ break;
++#endif
++ }
++}
++
++static inline void sched_clear_idle_mask(const unsigned int cpu)
++{
++ cpumask_clear_cpu(cpu, sched_idle_mask);
++
++ switch (per_cpu(sched_cpu_topo, cpu)) {
++ case CPU_TOPOLOGY_DEFAULT:
++ break;
++ case CPU_TOPOLOGY_PCORE:
++ cpumask_clear_cpu(cpu, sched_pcore_idle_mask);
++ break;
++ case CPU_TOPOLOGY_ECORE:
++ cpumask_clear_cpu(cpu, sched_ecore_idle_mask);
++ break;
++#ifdef CONFIG_SCHED_SMT
++ case CPU_TOPOLOGY_SMT:
++ cpumask_andnot(sched_sg_idle_mask, sched_sg_idle_mask, cpu_smt_mask(cpu));
++ break;
++#endif
++ }
++}
++
++/*
++ * CPU topology balance type
++ */
++enum cpu_topo_balance_type {
++ CPU_TOPOLOGY_BALANCE_NONE = 0,
++ CPU_TOPOLOGY_BALANCE_PCORE,
++#ifdef CONFIG_SCHED_SMT
++ CPU_TOPOLOGY_BALANCE_ECORE,
++ CPU_TOPOLOGY_BALANCE_SMT,
++ CPU_TOPOLOGY_BALANCE_SMT_PCORE,
++#endif
++};
++
++DECLARE_PER_CPU_READ_MOSTLY(enum cpu_topo_balance_type, sched_cpu_topo_balance);
++DECLARE_PER_CPU(struct balance_callback, active_balance_head);
++
++extern void pcore_balance(struct rq *rq);
++#ifdef CONFIG_SCHED_SMT
++extern void ecore_balance(struct rq *rq);
++extern void smt_balance(struct rq *rq);
++extern void smt_pcore_balance(struct rq *rq);
++#endif
++
++static inline void sched_cpu_topology_balance(const unsigned int cpu, struct rq *rq)
++{
++ if (!rq->online)
++ return;
++
++ switch (per_cpu(sched_cpu_topo_balance, cpu)) {
++ case CPU_TOPOLOGY_BALANCE_NONE:
++ break;
++ case CPU_TOPOLOGY_BALANCE_PCORE:
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), pcore_balance);
++ break;
++#ifdef CONFIG_SCHED_SMT
++ case CPU_TOPOLOGY_BALANCE_ECORE:
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), ecore_balance);
++ break;
++ case CPU_TOPOLOGY_BALANCE_SMT:
++ if (cpumask_test_cpu(cpu, sched_sg_idle_mask))
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), smt_balance);
++ break;
++ case CPU_TOPOLOGY_BALANCE_SMT_PCORE:
++ if (cpumask_test_cpu(cpu, sched_sg_idle_mask))
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), smt_pcore_balance);
++ break;
++#endif
++ }
++}
++
+extern void sched_init_topology(void);
+
+#endif /* _KERNEL_SCHED_ALT_TOPOLOGY_H */
@@ -11197,7 +11237,7 @@ index 6e2f54169e66..5a5031761477 100644
static int __init setup_relax_domain_level(char *str)
{
if (kstrtoint(str, 0, &default_relax_domain_level))
-@@ -1731,6 +1734,7 @@ sd_init(struct sched_domain_topology_level *tl,
+@@ -1723,6 +1726,7 @@ sd_init(struct sched_domain_topology_level *tl,
return sd;
}
@@ -11205,15 +11245,15 @@ index 6e2f54169e66..5a5031761477 100644
/*
* Topology list, bottom-up.
-@@ -1767,6 +1771,7 @@ void __init set_sched_topology(struct sched_domain_topology_level *tl)
+@@ -1759,6 +1763,7 @@ void __init set_sched_topology(struct sched_domain_topology_level *tl)
sched_domain_topology_saved = NULL;
}
+#ifndef CONFIG_SCHED_ALT
#ifdef CONFIG_NUMA
- static const struct cpumask *sd_numa_mask(int cpu)
-@@ -2833,3 +2838,31 @@ void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
+ static const struct cpumask *sd_numa_mask(struct sched_domain_topology_level *tl, int cpu)
+@@ -2825,3 +2830,31 @@ void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
partition_sched_domains_locked(ndoms_new, doms_new, dattr_new);
sched_domains_mutex_unlock();
}
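The other recurring change in the diff above is the idle-CPU selection hook: the writable idle_select_func pointer becomes a static call initialised to cpumask_and and retargeted once at topology init. static_call() patches the call site, so it has no direct userspace equivalent, but the selection flow it carries can be modelled with a plain function pointer. The sketch below does that; pcore_first_select is only a placeholder, since the real p1_idle_select_func/p1p2_idle_select_func bodies are not shown in this excerpt.

#include <stdbool.h>
#include <stdio.h>

typedef unsigned int cpumask_t;

static cpumask_t sched_idle_mask;       /* all idle CPUs */
static cpumask_t sched_pcore_idle_mask; /* idle P cores only */

/* Mirrors cpumask_and(): dst = a & b, returns true if the result is non-empty. */
static bool mask_and(cpumask_t *dst, const cpumask_t *a, const cpumask_t *b)
{
	*dst = *a & *b;
	return *dst != 0;
}

/* Placeholder for a topology-aware selector: try idle P cores first. */
static bool pcore_first_select(cpumask_t *dst, const cpumask_t *allow, const cpumask_t *idle)
{
	if (mask_and(dst, allow, &sched_pcore_idle_mask))
		return true;
	return mask_and(dst, allow, idle);
}

/* Default is the generic AND, switched once topology is known (the kernel
 * does this with static_call_update() instead of a pointer store). */
static bool (*idle_select)(cpumask_t *, const cpumask_t *, const cpumask_t *) = mask_and;

int main(void)
{
	cpumask_t allow = 0xf0, out;

	sched_idle_mask = 0x22;       /* CPUs 1 and 5 idle */
	sched_pcore_idle_mask = 0x02; /* CPU 1 is an idle P core */

	idle_select = pcore_first_select; /* "topology init" */

	if (idle_select(&out, &allow, &sched_idle_mask))
		printf("picked from mask %#x\n", out);
	else
		printf("no idle CPU allowed, fall back to preemption check\n");
	return 0;
}

As in select_task_rq() above, a failed selection falls back to the preemption-mask check before a CPU is chosen.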
* [gentoo-commits] proj/linux-patches:6.17 commit in: /
@ 2025-10-15 18:23 Arisu Tachibana
0 siblings, 0 replies; 20+ messages in thread
From: Arisu Tachibana @ 2025-10-15 18:23 UTC (permalink / raw
To: gentoo-commits
commit: ee61fff010c59835e0bb509828747bb1c3ef173d
Author: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
AuthorDate: Wed Oct 15 17:49:52 2025 +0000
Commit: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
CommitDate: Wed Oct 15 18:23:09 2025 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=ee61fff0
Update BMQ and PDS io scheduler patch to v6.17-r1
Signed-off-by: Arisu Tachibana <alicef <AT> gentoo.org>
0000_README | 4 +-
...=> 5020_BMQ-and-PDS-io-scheduler-v6.17-r1.patch | 284 ++++++++++++---------
2 files changed, 164 insertions(+), 124 deletions(-)
diff --git a/0000_README b/0000_README
index 0aa228a9..0b6b13ba 100644
--- a/0000_README
+++ b/0000_README
@@ -103,8 +103,8 @@ Patch: 5010_enable-cpu-optimizations-universal.patch
From: https://github.com/graysky2/kernel_compiler_patch
Desc: More ISA levels and uarches for kernel 6.16+
-Patch: 5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch
-From: https://gitlab.com/alfredchen/projectc
+Patch: 5020_BMQ-and-PDS-io-scheduler-v6.17-r1.patch
+From: https://github.com/hhoffstaette/kernel-patches/blob/6.17/6.17/sched-prjc-6.17-r2.patch
Desc: BMQ(BitMap Queue) Scheduler. A new CPU scheduler developed from PDS(incld). Inspired by the scheduler in zircon.
Patch: 5021_BMQ-and-PDS-gentoo-defaults.patch
diff --git a/5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch b/5020_BMQ-and-PDS-io-scheduler-v6.17-r1.patch
similarity index 98%
rename from 5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch
rename to 5020_BMQ-and-PDS-io-scheduler-v6.17-r1.patch
index 6b5e3269..7ce5d221 100644
--- a/5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch
+++ b/5020_BMQ-and-PDS-io-scheduler-v6.17-r1.patch
@@ -1,3 +1,7 @@
+
+r2 for:
+https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=661f951e371cc134ea31c84238dbdc9a898b8403
+
diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 8b49eab937d0..c5d4901a9608 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
@@ -723,10 +727,10 @@ index 8ae86371ddcd..a972ef1e31a7 100644
obj-y += build_utility.o
diff --git a/kernel/sched/alt_core.c b/kernel/sched/alt_core.c
new file mode 100644
-index 000000000000..8f03f5312e4d
+index 000000000000..db9a57681f70
--- /dev/null
+++ b/kernel/sched/alt_core.c
-@@ -0,0 +1,7648 @@
+@@ -0,0 +1,7645 @@
+/*
+ * kernel/sched/alt_core.c
+ *
@@ -801,7 +805,7 @@ index 000000000000..8f03f5312e4d
+__read_mostly int sysctl_resched_latency_warn_ms = 100;
+__read_mostly int sysctl_resched_latency_warn_once = 1;
+
-+#define ALT_SCHED_VERSION "v6.17-r0"
++#define ALT_SCHED_VERSION "v6.17-r1"
+
+#define STOP_PRIO (MAX_RT_PRIO - 1)
+
@@ -842,7 +846,7 @@ index 000000000000..8f03f5312e4d
+ * the domain), this allows us to quickly tell if two cpus are in the same cache
+ * domain, see cpus_share_cache().
+ */
-+DEFINE_PER_CPU(int, sd_llc_id);
++static DEFINE_PER_CPU_READ_MOSTLY(int, sd_llc_id);
+
+DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
+
@@ -919,7 +923,7 @@ index 000000000000..8f03f5312e4d
+
+ if (prio < last_prio) {
+ if (IDLE_TASK_SCHED_PRIO == last_prio) {
-+ rq->clear_idle_mask_func(cpu, sched_idle_mask);
++ sched_clear_idle_mask(cpu);
+ last_prio -= 2;
+ }
+ CLEAR_CACHED_PREEMPT_MASK(pr, prio, last_prio, cpu);
@@ -928,7 +932,7 @@ index 000000000000..8f03f5312e4d
+ }
+ /* last_prio < prio */
+ if (IDLE_TASK_SCHED_PRIO == prio) {
-+ rq->set_idle_mask_func(cpu, sched_idle_mask);
++ sched_set_idle_mask(cpu);
+ prio -= 2;
+ }
+ SET_CACHED_PREEMPT_MASK(pr, last_prio, prio, cpu);
@@ -2741,7 +2745,7 @@ index 000000000000..8f03f5312e4d
+ return cpumask_and(preempt_mask, allow_mask, mask);
+}
+
-+__read_mostly idle_select_func_t idle_select_func ____cacheline_aligned_in_smp = cpumask_and;
++DEFINE_STATIC_CALL(sched_idle_select_func, cpumask_and);
+
+static inline int select_task_rq(struct task_struct *p)
+{
@@ -2750,7 +2754,7 @@ index 000000000000..8f03f5312e4d
+ if (unlikely(!cpumask_and(&allow_mask, p->cpus_ptr, cpu_active_mask)))
+ return select_fallback_rq(task_cpu(p), p);
+
-+ if (idle_select_func(&mask, &allow_mask, sched_idle_mask) ||
++ if (static_call(sched_idle_select_func)(&mask, &allow_mask, sched_idle_mask) ||
+ preempt_mask_check(&mask, &allow_mask, task_sched_prio(p)))
+ return best_mask_cpu(task_cpu(p), &mask);
+
@@ -5281,8 +5285,7 @@ index 000000000000..8f03f5312e4d
+
+ if (next == rq->idle) {
+ if (!take_other_rq_tasks(rq, cpu)) {
-+ if (likely(rq->balance_func && rq->online))
-+ rq->balance_func(rq, cpu);
++ sched_cpu_topology_balance(cpu, rq);
+
+ schedstat_inc(rq->sched_goidle);
+ /*printk(KERN_INFO "sched: choose_next_task(%d) idle %px\n", cpu, next);*/
@@ -7145,8 +7148,6 @@ index 000000000000..8f03f5312e4d
+ rq->online = false;
+ rq->cpu = i;
+
-+ rq->clear_idle_mask_func = cpumask_clear_cpu;
-+ rq->set_idle_mask_func = cpumask_set_cpu;
+ rq->balance_func = NULL;
+ rq->active_balance_arg.active = 0;
+
@@ -8377,10 +8378,10 @@ index 000000000000..8f03f5312e4d
+#endif /* CONFIG_SCHED_MM_CID */
diff --git a/kernel/sched/alt_core.h b/kernel/sched/alt_core.h
new file mode 100644
-index 000000000000..bb9512c76566
+index 000000000000..55497941a22b
--- /dev/null
+++ b/kernel/sched/alt_core.h
-@@ -0,0 +1,177 @@
+@@ -0,0 +1,174 @@
+#ifndef _KERNEL_SCHED_ALT_CORE_H
+#define _KERNEL_SCHED_ALT_CORE_H
+
@@ -8548,10 +8549,7 @@ index 000000000000..bb9512c76566
+
+extern struct rq *move_queued_task(struct rq *rq, struct task_struct *p, int new_cpu);
+
-+typedef bool (*idle_select_func_t)(struct cpumask *dstp, const struct cpumask *src1p,
-+ const struct cpumask *src2p);
-+
-+extern idle_select_func_t idle_select_func;
++DECLARE_STATIC_CALL(sched_idle_select_func, cpumask_and);
+
+/* balance callback */
+extern struct balance_callback *splice_balance_callbacks(struct rq *rq);
@@ -8598,10 +8596,10 @@ index 000000000000..1dbd7eb6a434
+{}
diff --git a/kernel/sched/alt_sched.h b/kernel/sched/alt_sched.h
new file mode 100644
-index 000000000000..5b9a53c669f5
+index 000000000000..6cd5cfe3a332
--- /dev/null
+++ b/kernel/sched/alt_sched.h
-@@ -0,0 +1,1018 @@
+@@ -0,0 +1,1013 @@
+#ifndef _KERNEL_SCHED_ALT_SCHED_H
+#define _KERNEL_SCHED_ALT_SCHED_H
+
@@ -8724,8 +8722,6 @@ index 000000000000..5b9a53c669f5
+};
+
+typedef void (*balance_func_t)(struct rq *rq, int cpu);
-+typedef void (*set_idle_mask_func_t)(unsigned int cpu, struct cpumask *dstp);
-+typedef void (*clear_idle_mask_func_t)(int cpu, struct cpumask *dstp);
+
+struct balance_arg {
+ struct task_struct *task;
@@ -8766,9 +8762,6 @@ index 000000000000..5b9a53c669f5
+ int membarrier_state;
+#endif
+
-+ set_idle_mask_func_t set_idle_mask_func;
-+ clear_idle_mask_func_t clear_idle_mask_func;
-+
+ int cpu; /* cpu of this runqueue */
+ bool online;
+
@@ -9622,10 +9615,10 @@ index 000000000000..5b9a53c669f5
+#endif /* _KERNEL_SCHED_ALT_SCHED_H */
diff --git a/kernel/sched/alt_topology.c b/kernel/sched/alt_topology.c
new file mode 100644
-index 000000000000..376a08a5afda
+index 000000000000..590ee3cb1b49
--- /dev/null
+++ b/kernel/sched/alt_topology.c
-@@ -0,0 +1,347 @@
+@@ -0,0 +1,287 @@
+#include "alt_core.h"
+#include "alt_topology.h"
+
@@ -9640,47 +9633,9 @@ index 000000000000..376a08a5afda
+}
+__setup("pcore_cpus=", sched_pcore_mask_setup);
+
-+/*
-+ * set/clear idle mask functions
-+ */
-+#ifdef CONFIG_SCHED_SMT
-+static void set_idle_mask_smt(unsigned int cpu, struct cpumask *dstp)
-+{
-+ cpumask_set_cpu(cpu, dstp);
-+ if (cpumask_subset(cpu_smt_mask(cpu), sched_idle_mask))
-+ cpumask_or(sched_sg_idle_mask, sched_sg_idle_mask, cpu_smt_mask(cpu));
-+}
-+
-+static void clear_idle_mask_smt(int cpu, struct cpumask *dstp)
-+{
-+ cpumask_clear_cpu(cpu, dstp);
-+ cpumask_andnot(sched_sg_idle_mask, sched_sg_idle_mask, cpu_smt_mask(cpu));
-+}
-+#endif
-+
-+static void set_idle_mask_pcore(unsigned int cpu, struct cpumask *dstp)
-+{
-+ cpumask_set_cpu(cpu, dstp);
-+ cpumask_set_cpu(cpu, sched_pcore_idle_mask);
-+}
-+
-+static void clear_idle_mask_pcore(int cpu, struct cpumask *dstp)
-+{
-+ cpumask_clear_cpu(cpu, dstp);
-+ cpumask_clear_cpu(cpu, sched_pcore_idle_mask);
-+}
-+
-+static void set_idle_mask_ecore(unsigned int cpu, struct cpumask *dstp)
-+{
-+ cpumask_set_cpu(cpu, dstp);
-+ cpumask_set_cpu(cpu, sched_ecore_idle_mask);
-+}
-+
-+static void clear_idle_mask_ecore(int cpu, struct cpumask *dstp)
-+{
-+ cpumask_clear_cpu(cpu, dstp);
-+ cpumask_clear_cpu(cpu, sched_ecore_idle_mask);
-+}
++DEFINE_PER_CPU_READ_MOSTLY(enum cpu_topo_type, sched_cpu_topo);
++DEFINE_PER_CPU_READ_MOSTLY(enum cpu_topo_balance_type, sched_cpu_topo_balance);
++DEFINE_PER_CPU(struct balance_callback, active_balance_head);
+
+/*
+ * Idle cpu/rq selection functions
@@ -9785,8 +9740,6 @@ index 000000000000..376a08a5afda
+ return 0;
+}
+
-+static DEFINE_PER_CPU(struct balance_callback, active_balance_head);
-+
+#ifdef CONFIG_SCHED_SMT
+static inline int
+smt_pcore_source_balance(struct rq *rq, cpumask_t *single_task_mask, cpumask_t *target_mask)
@@ -9807,7 +9760,7 @@ index 000000000000..376a08a5afda
+}
+
+/* smt p core balance functions */
-+static inline void smt_pcore_balance(struct rq *rq)
++void smt_pcore_balance(struct rq *rq)
+{
+ cpumask_t single_task_mask;
+
@@ -9822,14 +9775,8 @@ index 000000000000..376a08a5afda
+ return;
+}
+
-+static void smt_pcore_balance_func(struct rq *rq, const int cpu)
-+{
-+ if (cpumask_test_cpu(cpu, sched_sg_idle_mask))
-+ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), smt_pcore_balance);
-+}
-+
+/* smt balance functions */
-+static inline void smt_balance(struct rq *rq)
++void smt_balance(struct rq *rq)
+{
+ cpumask_t single_task_mask;
+
@@ -9840,32 +9787,22 @@ index 000000000000..376a08a5afda
+ return;
+}
+
-+static void smt_balance_func(struct rq *rq, const int cpu)
-+{
-+ if (cpumask_test_cpu(cpu, sched_sg_idle_mask))
-+ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), smt_balance);
-+}
-+
+/* e core balance functions */
-+static inline void ecore_balance(struct rq *rq)
++void ecore_balance(struct rq *rq)
+{
+ cpumask_t single_task_mask;
+
+ if (cpumask_andnot(&single_task_mask, cpu_active_mask, sched_idle_mask) &&
+ cpumask_andnot(&single_task_mask, &single_task_mask, &sched_rq_pending_mask) &&
++ cpumask_empty(sched_pcore_idle_mask) &&
+ /* smt occupied p core to idle e core balance */
+ smt_pcore_source_balance(rq, &single_task_mask, sched_ecore_idle_mask))
+ return;
+}
-+
-+static void ecore_balance_func(struct rq *rq, const int cpu)
-+{
-+ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), ecore_balance);
-+}
+#endif /* CONFIG_SCHED_SMT */
+
+/* p core balance functions */
-+static inline void pcore_balance(struct rq *rq)
++void pcore_balance(struct rq *rq)
+{
+ cpumask_t single_task_mask;
+
@@ -9876,34 +9813,28 @@ index 000000000000..376a08a5afda
+ return;
+}
+
-+static void pcore_balance_func(struct rq *rq, const int cpu)
-+{
-+ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), pcore_balance);
-+}
-+
+#ifdef ALT_SCHED_DEBUG
+#define SCHED_DEBUG_INFO(...) printk(KERN_INFO __VA_ARGS__)
+#else
+#define SCHED_DEBUG_INFO(...) do { } while(0)
+#endif
+
-+#define SET_IDLE_SELECT_FUNC(func) \
++#define IDLE_SELECT_FUNC_UPDATE(func) \
+{ \
-+ idle_select_func = func; \
-+ printk(KERN_INFO "sched: "#func); \
++ static_call_update(sched_idle_select_func, &func); \
++ printk(KERN_INFO "sched: idle select func -> "#func); \
+}
+
-+#define SET_RQ_BALANCE_FUNC(rq, cpu, func) \
++#define SET_SCHED_CPU_TOPOLOGY(cpu, topo) \
+{ \
-+ rq->balance_func = func; \
-+ SCHED_DEBUG_INFO("sched: cpu#%02d -> "#func, cpu); \
++ per_cpu(sched_cpu_topo, (cpu)) = topo; \
++ SCHED_DEBUG_INFO("sched: cpu#%02d -> "#topo, cpu); \
+}
+
-+#define SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_func, clear_func) \
++#define SET_SCHED_CPU_TOPOLOGY_BALANCE(cpu, balance) \
+{ \
-+ rq->set_idle_mask_func = set_func; \
-+ rq->clear_idle_mask_func = clear_func; \
-+ SCHED_DEBUG_INFO("sched: cpu#%02d -> "#set_func" "#clear_func, cpu); \
++ per_cpu(sched_cpu_topo_balance, (cpu)) = balance; \
++ SCHED_DEBUG_INFO("sched: cpu#%02d -> "#balance, cpu); \
+}
+
+void sched_init_topology(void)
@@ -9926,16 +9857,17 @@ index 000000000000..376a08a5afda
+ ecore_present = !cpumask_empty(&sched_ecore_mask);
+ }
+
-+#ifdef CONFIG_SCHED_SMT
+ /* idle select function */
++#ifdef CONFIG_SCHED_SMT
+ if (cpumask_equal(&sched_smt_mask, cpu_online_mask)) {
-+ SET_IDLE_SELECT_FUNC(p1_idle_select_func);
++ IDLE_SELECT_FUNC_UPDATE(p1_idle_select_func);
+ } else
+#endif
+ if (!cpumask_empty(&sched_pcore_mask)) {
-+ SET_IDLE_SELECT_FUNC(p1p2_idle_select_func);
++ IDLE_SELECT_FUNC_UPDATE(p1p2_idle_select_func);
+ }
+
++ /* CPU topology setup */
+ for_each_online_cpu(cpu) {
+ rq = cpu_rq(cpu);
+ /* take chance to reset time slice for idle tasks */
@@ -9943,13 +9875,13 @@ index 000000000000..376a08a5afda
+
+#ifdef CONFIG_SCHED_SMT
+ if (cpumask_weight(cpu_smt_mask(cpu)) > 1) {
-+ SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_idle_mask_smt, clear_idle_mask_smt);
++ SET_SCHED_CPU_TOPOLOGY(cpu, CPU_TOPOLOGY_SMT);
+
+ if (cpumask_test_cpu(cpu, &sched_pcore_mask) &&
+ !cpumask_intersects(&sched_ecore_mask, &sched_smt_mask)) {
-+ SET_RQ_BALANCE_FUNC(rq, cpu, smt_pcore_balance_func);
++ SET_SCHED_CPU_TOPOLOGY_BALANCE(cpu, CPU_TOPOLOGY_BALANCE_SMT_PCORE);
+ } else {
-+ SET_RQ_BALANCE_FUNC(rq, cpu, smt_balance_func);
++ SET_SCHED_CPU_TOPOLOGY_BALANCE(cpu, CPU_TOPOLOGY_BALANCE_SMT);
+ }
+
+ continue;
@@ -9957,31 +9889,139 @@ index 000000000000..376a08a5afda
+#endif
+ /* !SMT or only one cpu in sg */
+ if (cpumask_test_cpu(cpu, &sched_pcore_mask)) {
-+ SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_idle_mask_pcore, clear_idle_mask_pcore);
++ SET_SCHED_CPU_TOPOLOGY(cpu, CPU_TOPOLOGY_PCORE);
+
+ if (ecore_present)
-+ SET_RQ_BALANCE_FUNC(rq, cpu, pcore_balance_func);
++ SET_SCHED_CPU_TOPOLOGY_BALANCE(cpu, CPU_TOPOLOGY_BALANCE_PCORE);
+
+ continue;
+ }
++
+ if (cpumask_test_cpu(cpu, &sched_ecore_mask)) {
-+ SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_idle_mask_ecore, clear_idle_mask_ecore);
++ SET_SCHED_CPU_TOPOLOGY(cpu, CPU_TOPOLOGY_ECORE);
+#ifdef CONFIG_SCHED_SMT
+ if (cpumask_intersects(&sched_pcore_mask, &sched_smt_mask))
-+ SET_RQ_BALANCE_FUNC(rq, cpu, ecore_balance_func);
++ SET_SCHED_CPU_TOPOLOGY_BALANCE(cpu, CPU_TOPOLOGY_BALANCE_ECORE);
+#endif
+ }
+ }
+}
diff --git a/kernel/sched/alt_topology.h b/kernel/sched/alt_topology.h
new file mode 100644
-index 000000000000..076174cd2bc6
+index 000000000000..14591a303ea5
--- /dev/null
+++ b/kernel/sched/alt_topology.h
-@@ -0,0 +1,6 @@
+@@ -0,0 +1,113 @@
+#ifndef _KERNEL_SCHED_ALT_TOPOLOGY_H
+#define _KERNEL_SCHED_ALT_TOPOLOGY_H
+
++/*
++ * CPU topology type
++ */
++enum cpu_topo_type {
++ CPU_TOPOLOGY_DEFAULT = 0,
++ CPU_TOPOLOGY_PCORE,
++ CPU_TOPOLOGY_ECORE,
++#ifdef CONFIG_SCHED_SMT
++ CPU_TOPOLOGY_SMT,
++#endif
++};
++
++DECLARE_PER_CPU_READ_MOSTLY(enum cpu_topo_type, sched_cpu_topo);
++
++static inline void sched_set_idle_mask(const unsigned int cpu)
++{
++ cpumask_set_cpu(cpu, sched_idle_mask);
++
++ switch (per_cpu(sched_cpu_topo, cpu)) {
++ case CPU_TOPOLOGY_DEFAULT:
++ break;
++ case CPU_TOPOLOGY_PCORE:
++ cpumask_set_cpu(cpu, sched_pcore_idle_mask);
++ break;
++ case CPU_TOPOLOGY_ECORE:
++ cpumask_set_cpu(cpu, sched_ecore_idle_mask);
++ break;
++#ifdef CONFIG_SCHED_SMT
++ case CPU_TOPOLOGY_SMT:
++ if (cpumask_subset(cpu_smt_mask(cpu), sched_idle_mask))
++ cpumask_or(sched_sg_idle_mask, sched_sg_idle_mask, cpu_smt_mask(cpu));
++ break;
++#endif
++ }
++}
++
++static inline void sched_clear_idle_mask(const unsigned int cpu)
++{
++ cpumask_clear_cpu(cpu, sched_idle_mask);
++
++ switch (per_cpu(sched_cpu_topo, cpu)) {
++ case CPU_TOPOLOGY_DEFAULT:
++ break;
++ case CPU_TOPOLOGY_PCORE:
++ cpumask_clear_cpu(cpu, sched_pcore_idle_mask);
++ break;
++ case CPU_TOPOLOGY_ECORE:
++ cpumask_clear_cpu(cpu, sched_ecore_idle_mask);
++ break;
++#ifdef CONFIG_SCHED_SMT
++ case CPU_TOPOLOGY_SMT:
++ cpumask_andnot(sched_sg_idle_mask, sched_sg_idle_mask, cpu_smt_mask(cpu));
++ break;
++#endif
++ }
++}
++
++/*
++ * CPU topology balance type
++ */
++enum cpu_topo_balance_type {
++ CPU_TOPOLOGY_BALANCE_NONE = 0,
++ CPU_TOPOLOGY_BALANCE_PCORE,
++#ifdef CONFIG_SCHED_SMT
++ CPU_TOPOLOGY_BALANCE_ECORE,
++ CPU_TOPOLOGY_BALANCE_SMT,
++ CPU_TOPOLOGY_BALANCE_SMT_PCORE,
++#endif
++};
++
++DECLARE_PER_CPU_READ_MOSTLY(enum cpu_topo_balance_type, sched_cpu_topo_balance);
++DECLARE_PER_CPU(struct balance_callback, active_balance_head);
++
++extern void pcore_balance(struct rq *rq);
++#ifdef CONFIG_SCHED_SMT
++extern void ecore_balance(struct rq *rq);
++extern void smt_balance(struct rq *rq);
++extern void smt_pcore_balance(struct rq *rq);
++#endif
++
++static inline void sched_cpu_topology_balance(const unsigned int cpu, struct rq *rq)
++{
++ if (!rq->online)
++ return;
++
++ switch (per_cpu(sched_cpu_topo_balance, cpu)) {
++ case CPU_TOPOLOGY_BALANCE_NONE:
++ break;
++ case CPU_TOPOLOGY_BALANCE_PCORE:
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), pcore_balance);
++ break;
++#ifdef CONFIG_SCHED_SMT
++ case CPU_TOPOLOGY_BALANCE_ECORE:
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), ecore_balance);
++ break;
++ case CPU_TOPOLOGY_BALANCE_SMT:
++ if (cpumask_test_cpu(cpu, sched_sg_idle_mask))
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), smt_balance);
++ break;
++ case CPU_TOPOLOGY_BALANCE_SMT_PCORE:
++ if (cpumask_test_cpu(cpu, sched_sg_idle_mask))
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), smt_pcore_balance);
++ break;
++#endif
++ }
++}
++
+extern void sched_init_topology(void);
+
+#endif /* _KERNEL_SCHED_ALT_TOPOLOGY_H */
@@ -11197,7 +11237,7 @@ index 6e2f54169e66..5a5031761477 100644
static int __init setup_relax_domain_level(char *str)
{
if (kstrtoint(str, 0, &default_relax_domain_level))
-@@ -1731,6 +1734,7 @@ sd_init(struct sched_domain_topology_level *tl,
+@@ -1723,6 +1726,7 @@ sd_init(struct sched_domain_topology_level *tl,
return sd;
}
@@ -11205,15 +11245,15 @@ index 6e2f54169e66..5a5031761477 100644
/*
* Topology list, bottom-up.
-@@ -1767,6 +1771,7 @@ void __init set_sched_topology(struct sched_domain_topology_level *tl)
+@@ -1759,6 +1763,7 @@ void __init set_sched_topology(struct sched_domain_topology_level *tl)
sched_domain_topology_saved = NULL;
}
+#ifndef CONFIG_SCHED_ALT
#ifdef CONFIG_NUMA
- static const struct cpumask *sd_numa_mask(int cpu)
-@@ -2833,3 +2838,31 @@ void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
+ static const struct cpumask *sd_numa_mask(struct sched_domain_topology_level *tl, int cpu)
+@@ -2825,3 +2830,31 @@ void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
partition_sched_domains_locked(ndoms_new, doms_new, dattr_new);
sched_domains_mutex_unlock();
}
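
The alt_topology.h hunks above replace the old per-runqueue set/clear idle-mask function pointers with a per-CPU topology enum (sched_cpu_topo) and a switch in sched_set_idle_mask()/sched_clear_idle_mask(). A minimal user-space sketch of that dispatch pattern, with hypothetical names and plain bitmasks standing in for cpumasks (not the kernel code itself):

/* Simplified sketch of the enum-based idle-mask dispatch that replaces
 * per-rq function pointers. Names and the bitmask representation are
 * illustrative only. */
#include <stdio.h>

#define NR_CPUS 8

enum cpu_topo_type { TOPO_DEFAULT = 0, TOPO_PCORE, TOPO_ECORE, TOPO_SMT };

static enum cpu_topo_type cpu_topo[NR_CPUS];	/* stand-in for per_cpu(sched_cpu_topo) */
static unsigned int idle_mask, pcore_idle_mask, ecore_idle_mask;

static void sched_set_idle(unsigned int cpu)
{
	idle_mask |= 1u << cpu;

	switch (cpu_topo[cpu]) {	/* one branch per topology class */
	case TOPO_PCORE:
		pcore_idle_mask |= 1u << cpu;
		break;
	case TOPO_ECORE:
		ecore_idle_mask |= 1u << cpu;
		break;
	default:			/* TOPO_DEFAULT / TOPO_SMT handling elided */
		break;
	}
}

int main(void)
{
	cpu_topo[0] = TOPO_PCORE;
	cpu_topo[4] = TOPO_ECORE;
	sched_set_idle(0);
	sched_set_idle(4);
	printf("idle=%#x pcore=%#x ecore=%#x\n",
	       idle_mask, pcore_idle_mask, ecore_idle_mask);
	return 0;
}

The kernel variant keys the same switch off per_cpu(sched_cpu_topo, cpu) and folds the SMT sibling-group handling into the CPU_TOPOLOGY_SMT case shown in the hunks above.
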
* [gentoo-commits] proj/linux-patches:6.17 commit in: /
@ 2025-10-15 18:25 Arisu Tachibana
0 siblings, 0 replies; 20+ messages in thread
From: Arisu Tachibana @ 2025-10-15 18:25 UTC (permalink / raw
To: gentoo-commits
commit: 9397fd602eba76ac1b70266db9f94d8f3fa77567
Author: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
AuthorDate: Wed Oct 15 17:49:52 2025 +0000
Commit: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
CommitDate: Wed Oct 15 18:25:13 2025 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=9397fd60
Update BMQ and PDS io scheduler patch to v6.17-r1
Signed-off-by: Arisu Tachibana <alicef <AT> gentoo.org>
0000_README | 4 +-
...=> 5020_BMQ-and-PDS-io-scheduler-v6.17-r2.patch | 284 ++++++++++++---------
2 files changed, 164 insertions(+), 124 deletions(-)
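
The refreshed patch below keeps BMQ's wakeup CPU selection order: the task's allowed CPUs are first intersected with the active mask, then with the global idle mask (now reached through a static call), and only then checked against the preemption-priority masks. A rough user-space sketch of that pick order, with made-up mask values and simplified fallbacks:

/* Rough sketch of the pick order used by select_task_rq(): prefer an
 * allowed+idle CPU, otherwise fall back to a preemptible one.
 * Mask values are made up for illustration. */
#include <stdio.h>

static int first_cpu(unsigned int mask)
{
	return mask ? __builtin_ctz(mask) : -1;
}

static int select_cpu(unsigned int allowed, unsigned int active,
		      unsigned int idle, unsigned int preemptible)
{
	unsigned int allow = allowed & active;
	unsigned int pick;

	if (!allow)
		return -1;			/* fallback path in the real code */
	pick = allow & idle;			/* idle-select step */
	if (!pick)
		pick = allow & preemptible;	/* preempt_mask_check step */
	return pick ? first_cpu(pick) : first_cpu(allow);
}

int main(void)
{
	/* CPUs 0-3 allowed, all active, CPU 2 idle, CPU 1 preemptible */
	printf("picked cpu %d\n", select_cpu(0x0f, 0xff, 0x04, 0x02));
	return 0;
}

In the real select_task_rq() the final choice goes through best_mask_cpu() and select_fallback_rq() rather than a simple first-bit scan.
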
diff --git a/0000_README b/0000_README
index 0aa228a9..5ebb2da9 100644
--- a/0000_README
+++ b/0000_README
@@ -103,8 +103,8 @@ Patch: 5010_enable-cpu-optimizations-universal.patch
From: https://github.com/graysky2/kernel_compiler_patch
Desc: More ISA levels and uarches for kernel 6.16+
-Patch: 5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch
-From: https://gitlab.com/alfredchen/projectc
+Patch: 5020_BMQ-and-PDS-io-scheduler-v6.17-r2.patch
+From: https://github.com/hhoffstaette/kernel-patches/blob/6.17/6.17/sched-prjc-6.17-r2.patch
Desc: BMQ(BitMap Queue) Scheduler. A new CPU scheduler developed from PDS(incld). Inspired by the scheduler in zircon.
Patch: 5021_BMQ-and-PDS-gentoo-defaults.patch
diff --git a/5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch b/5020_BMQ-and-PDS-io-scheduler-v6.17-r2.patch
similarity index 98%
rename from 5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch
rename to 5020_BMQ-and-PDS-io-scheduler-v6.17-r2.patch
index 6b5e3269..7ce5d221 100644
--- a/5020_BMQ-and-PDS-io-scheduler-v6.17-r0.patch
+++ b/5020_BMQ-and-PDS-io-scheduler-v6.17-r2.patch
@@ -1,3 +1,7 @@
+
+r2 for:
+https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=661f951e371cc134ea31c84238dbdc9a898b8403
+
diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 8b49eab937d0..c5d4901a9608 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
@@ -723,10 +727,10 @@ index 8ae86371ddcd..a972ef1e31a7 100644
obj-y += build_utility.o
diff --git a/kernel/sched/alt_core.c b/kernel/sched/alt_core.c
new file mode 100644
-index 000000000000..8f03f5312e4d
+index 000000000000..db9a57681f70
--- /dev/null
+++ b/kernel/sched/alt_core.c
-@@ -0,0 +1,7648 @@
+@@ -0,0 +1,7645 @@
+/*
+ * kernel/sched/alt_core.c
+ *
@@ -801,7 +805,7 @@ index 000000000000..8f03f5312e4d
+__read_mostly int sysctl_resched_latency_warn_ms = 100;
+__read_mostly int sysctl_resched_latency_warn_once = 1;
+
-+#define ALT_SCHED_VERSION "v6.17-r0"
++#define ALT_SCHED_VERSION "v6.17-r1"
+
+#define STOP_PRIO (MAX_RT_PRIO - 1)
+
@@ -842,7 +846,7 @@ index 000000000000..8f03f5312e4d
+ * the domain), this allows us to quickly tell if two cpus are in the same cache
+ * domain, see cpus_share_cache().
+ */
-+DEFINE_PER_CPU(int, sd_llc_id);
++static DEFINE_PER_CPU_READ_MOSTLY(int, sd_llc_id);
+
+DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
+
@@ -919,7 +923,7 @@ index 000000000000..8f03f5312e4d
+
+ if (prio < last_prio) {
+ if (IDLE_TASK_SCHED_PRIO == last_prio) {
-+ rq->clear_idle_mask_func(cpu, sched_idle_mask);
++ sched_clear_idle_mask(cpu);
+ last_prio -= 2;
+ }
+ CLEAR_CACHED_PREEMPT_MASK(pr, prio, last_prio, cpu);
@@ -928,7 +932,7 @@ index 000000000000..8f03f5312e4d
+ }
+ /* last_prio < prio */
+ if (IDLE_TASK_SCHED_PRIO == prio) {
-+ rq->set_idle_mask_func(cpu, sched_idle_mask);
++ sched_set_idle_mask(cpu);
+ prio -= 2;
+ }
+ SET_CACHED_PREEMPT_MASK(pr, last_prio, prio, cpu);
@@ -2741,7 +2745,7 @@ index 000000000000..8f03f5312e4d
+ return cpumask_and(preempt_mask, allow_mask, mask);
+}
+
-+__read_mostly idle_select_func_t idle_select_func ____cacheline_aligned_in_smp = cpumask_and;
++DEFINE_STATIC_CALL(sched_idle_select_func, cpumask_and);
+
+static inline int select_task_rq(struct task_struct *p)
+{
@@ -2750,7 +2754,7 @@ index 000000000000..8f03f5312e4d
+ if (unlikely(!cpumask_and(&allow_mask, p->cpus_ptr, cpu_active_mask)))
+ return select_fallback_rq(task_cpu(p), p);
+
-+ if (idle_select_func(&mask, &allow_mask, sched_idle_mask) ||
++ if (static_call(sched_idle_select_func)(&mask, &allow_mask, sched_idle_mask) ||
+ preempt_mask_check(&mask, &allow_mask, task_sched_prio(p)))
+ return best_mask_cpu(task_cpu(p), &mask);
+
@@ -5281,8 +5285,7 @@ index 000000000000..8f03f5312e4d
+
+ if (next == rq->idle) {
+ if (!take_other_rq_tasks(rq, cpu)) {
-+ if (likely(rq->balance_func && rq->online))
-+ rq->balance_func(rq, cpu);
++ sched_cpu_topology_balance(cpu, rq);
+
+ schedstat_inc(rq->sched_goidle);
+ /*printk(KERN_INFO "sched: choose_next_task(%d) idle %px\n", cpu, next);*/
@@ -7145,8 +7148,6 @@ index 000000000000..8f03f5312e4d
+ rq->online = false;
+ rq->cpu = i;
+
-+ rq->clear_idle_mask_func = cpumask_clear_cpu;
-+ rq->set_idle_mask_func = cpumask_set_cpu;
+ rq->balance_func = NULL;
+ rq->active_balance_arg.active = 0;
+
@@ -8377,10 +8378,10 @@ index 000000000000..8f03f5312e4d
+#endif /* CONFIG_SCHED_MM_CID */
diff --git a/kernel/sched/alt_core.h b/kernel/sched/alt_core.h
new file mode 100644
-index 000000000000..bb9512c76566
+index 000000000000..55497941a22b
--- /dev/null
+++ b/kernel/sched/alt_core.h
-@@ -0,0 +1,177 @@
+@@ -0,0 +1,174 @@
+#ifndef _KERNEL_SCHED_ALT_CORE_H
+#define _KERNEL_SCHED_ALT_CORE_H
+
@@ -8548,10 +8549,7 @@ index 000000000000..bb9512c76566
+
+extern struct rq *move_queued_task(struct rq *rq, struct task_struct *p, int new_cpu);
+
-+typedef bool (*idle_select_func_t)(struct cpumask *dstp, const struct cpumask *src1p,
-+ const struct cpumask *src2p);
-+
-+extern idle_select_func_t idle_select_func;
++DECLARE_STATIC_CALL(sched_idle_select_func, cpumask_and);
+
+/* balance callback */
+extern struct balance_callback *splice_balance_callbacks(struct rq *rq);
@@ -8598,10 +8596,10 @@ index 000000000000..1dbd7eb6a434
+{}
diff --git a/kernel/sched/alt_sched.h b/kernel/sched/alt_sched.h
new file mode 100644
-index 000000000000..5b9a53c669f5
+index 000000000000..6cd5cfe3a332
--- /dev/null
+++ b/kernel/sched/alt_sched.h
-@@ -0,0 +1,1018 @@
+@@ -0,0 +1,1013 @@
+#ifndef _KERNEL_SCHED_ALT_SCHED_H
+#define _KERNEL_SCHED_ALT_SCHED_H
+
@@ -8724,8 +8722,6 @@ index 000000000000..5b9a53c669f5
+};
+
+typedef void (*balance_func_t)(struct rq *rq, int cpu);
-+typedef void (*set_idle_mask_func_t)(unsigned int cpu, struct cpumask *dstp);
-+typedef void (*clear_idle_mask_func_t)(int cpu, struct cpumask *dstp);
+
+struct balance_arg {
+ struct task_struct *task;
@@ -8766,9 +8762,6 @@ index 000000000000..5b9a53c669f5
+ int membarrier_state;
+#endif
+
-+ set_idle_mask_func_t set_idle_mask_func;
-+ clear_idle_mask_func_t clear_idle_mask_func;
-+
+ int cpu; /* cpu of this runqueue */
+ bool online;
+
@@ -9622,10 +9615,10 @@ index 000000000000..5b9a53c669f5
+#endif /* _KERNEL_SCHED_ALT_SCHED_H */
diff --git a/kernel/sched/alt_topology.c b/kernel/sched/alt_topology.c
new file mode 100644
-index 000000000000..376a08a5afda
+index 000000000000..590ee3cb1b49
--- /dev/null
+++ b/kernel/sched/alt_topology.c
-@@ -0,0 +1,347 @@
+@@ -0,0 +1,287 @@
+#include "alt_core.h"
+#include "alt_topology.h"
+
@@ -9640,47 +9633,9 @@ index 000000000000..376a08a5afda
+}
+__setup("pcore_cpus=", sched_pcore_mask_setup);
+
-+/*
-+ * set/clear idle mask functions
-+ */
-+#ifdef CONFIG_SCHED_SMT
-+static void set_idle_mask_smt(unsigned int cpu, struct cpumask *dstp)
-+{
-+ cpumask_set_cpu(cpu, dstp);
-+ if (cpumask_subset(cpu_smt_mask(cpu), sched_idle_mask))
-+ cpumask_or(sched_sg_idle_mask, sched_sg_idle_mask, cpu_smt_mask(cpu));
-+}
-+
-+static void clear_idle_mask_smt(int cpu, struct cpumask *dstp)
-+{
-+ cpumask_clear_cpu(cpu, dstp);
-+ cpumask_andnot(sched_sg_idle_mask, sched_sg_idle_mask, cpu_smt_mask(cpu));
-+}
-+#endif
-+
-+static void set_idle_mask_pcore(unsigned int cpu, struct cpumask *dstp)
-+{
-+ cpumask_set_cpu(cpu, dstp);
-+ cpumask_set_cpu(cpu, sched_pcore_idle_mask);
-+}
-+
-+static void clear_idle_mask_pcore(int cpu, struct cpumask *dstp)
-+{
-+ cpumask_clear_cpu(cpu, dstp);
-+ cpumask_clear_cpu(cpu, sched_pcore_idle_mask);
-+}
-+
-+static void set_idle_mask_ecore(unsigned int cpu, struct cpumask *dstp)
-+{
-+ cpumask_set_cpu(cpu, dstp);
-+ cpumask_set_cpu(cpu, sched_ecore_idle_mask);
-+}
-+
-+static void clear_idle_mask_ecore(int cpu, struct cpumask *dstp)
-+{
-+ cpumask_clear_cpu(cpu, dstp);
-+ cpumask_clear_cpu(cpu, sched_ecore_idle_mask);
-+}
++DEFINE_PER_CPU_READ_MOSTLY(enum cpu_topo_type, sched_cpu_topo);
++DEFINE_PER_CPU_READ_MOSTLY(enum cpu_topo_balance_type, sched_cpu_topo_balance);
++DEFINE_PER_CPU(struct balance_callback, active_balance_head);
+
+/*
+ * Idle cpu/rq selection functions
@@ -9785,8 +9740,6 @@ index 000000000000..376a08a5afda
+ return 0;
+}
+
-+static DEFINE_PER_CPU(struct balance_callback, active_balance_head);
-+
+#ifdef CONFIG_SCHED_SMT
+static inline int
+smt_pcore_source_balance(struct rq *rq, cpumask_t *single_task_mask, cpumask_t *target_mask)
@@ -9807,7 +9760,7 @@ index 000000000000..376a08a5afda
+}
+
+/* smt p core balance functions */
-+static inline void smt_pcore_balance(struct rq *rq)
++void smt_pcore_balance(struct rq *rq)
+{
+ cpumask_t single_task_mask;
+
@@ -9822,14 +9775,8 @@ index 000000000000..376a08a5afda
+ return;
+}
+
-+static void smt_pcore_balance_func(struct rq *rq, const int cpu)
-+{
-+ if (cpumask_test_cpu(cpu, sched_sg_idle_mask))
-+ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), smt_pcore_balance);
-+}
-+
+/* smt balance functions */
-+static inline void smt_balance(struct rq *rq)
++void smt_balance(struct rq *rq)
+{
+ cpumask_t single_task_mask;
+
@@ -9840,32 +9787,22 @@ index 000000000000..376a08a5afda
+ return;
+}
+
-+static void smt_balance_func(struct rq *rq, const int cpu)
-+{
-+ if (cpumask_test_cpu(cpu, sched_sg_idle_mask))
-+ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), smt_balance);
-+}
-+
+/* e core balance functions */
-+static inline void ecore_balance(struct rq *rq)
++void ecore_balance(struct rq *rq)
+{
+ cpumask_t single_task_mask;
+
+ if (cpumask_andnot(&single_task_mask, cpu_active_mask, sched_idle_mask) &&
+ cpumask_andnot(&single_task_mask, &single_task_mask, &sched_rq_pending_mask) &&
++ cpumask_empty(sched_pcore_idle_mask) &&
+ /* smt occupied p core to idle e core balance */
+ smt_pcore_source_balance(rq, &single_task_mask, sched_ecore_idle_mask))
+ return;
+}
-+
-+static void ecore_balance_func(struct rq *rq, const int cpu)
-+{
-+ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), ecore_balance);
-+}
+#endif /* CONFIG_SCHED_SMT */
+
+/* p core balance functions */
-+static inline void pcore_balance(struct rq *rq)
++void pcore_balance(struct rq *rq)
+{
+ cpumask_t single_task_mask;
+
@@ -9876,34 +9813,28 @@ index 000000000000..376a08a5afda
+ return;
+}
+
-+static void pcore_balance_func(struct rq *rq, const int cpu)
-+{
-+ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), pcore_balance);
-+}
-+
+#ifdef ALT_SCHED_DEBUG
+#define SCHED_DEBUG_INFO(...) printk(KERN_INFO __VA_ARGS__)
+#else
+#define SCHED_DEBUG_INFO(...) do { } while(0)
+#endif
+
-+#define SET_IDLE_SELECT_FUNC(func) \
++#define IDLE_SELECT_FUNC_UPDATE(func) \
+{ \
-+ idle_select_func = func; \
-+ printk(KERN_INFO "sched: "#func); \
++ static_call_update(sched_idle_select_func, &func); \
++ printk(KERN_INFO "sched: idle select func -> "#func); \
+}
+
-+#define SET_RQ_BALANCE_FUNC(rq, cpu, func) \
++#define SET_SCHED_CPU_TOPOLOGY(cpu, topo) \
+{ \
-+ rq->balance_func = func; \
-+ SCHED_DEBUG_INFO("sched: cpu#%02d -> "#func, cpu); \
++ per_cpu(sched_cpu_topo, (cpu)) = topo; \
++ SCHED_DEBUG_INFO("sched: cpu#%02d -> "#topo, cpu); \
+}
+
-+#define SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_func, clear_func) \
++#define SET_SCHED_CPU_TOPOLOGY_BALANCE(cpu, balance) \
+{ \
-+ rq->set_idle_mask_func = set_func; \
-+ rq->clear_idle_mask_func = clear_func; \
-+ SCHED_DEBUG_INFO("sched: cpu#%02d -> "#set_func" "#clear_func, cpu); \
++ per_cpu(sched_cpu_topo_balance, (cpu)) = balance; \
++ SCHED_DEBUG_INFO("sched: cpu#%02d -> "#balance, cpu); \
+}
+
+void sched_init_topology(void)
@@ -9926,16 +9857,17 @@ index 000000000000..376a08a5afda
+ ecore_present = !cpumask_empty(&sched_ecore_mask);
+ }
+
-+#ifdef CONFIG_SCHED_SMT
+ /* idle select function */
++#ifdef CONFIG_SCHED_SMT
+ if (cpumask_equal(&sched_smt_mask, cpu_online_mask)) {
-+ SET_IDLE_SELECT_FUNC(p1_idle_select_func);
++ IDLE_SELECT_FUNC_UPDATE(p1_idle_select_func);
+ } else
+#endif
+ if (!cpumask_empty(&sched_pcore_mask)) {
-+ SET_IDLE_SELECT_FUNC(p1p2_idle_select_func);
++ IDLE_SELECT_FUNC_UPDATE(p1p2_idle_select_func);
+ }
+
++ /* CPU topology setup */
+ for_each_online_cpu(cpu) {
+ rq = cpu_rq(cpu);
+ /* take chance to reset time slice for idle tasks */
@@ -9943,13 +9875,13 @@ index 000000000000..376a08a5afda
+
+#ifdef CONFIG_SCHED_SMT
+ if (cpumask_weight(cpu_smt_mask(cpu)) > 1) {
-+ SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_idle_mask_smt, clear_idle_mask_smt);
++ SET_SCHED_CPU_TOPOLOGY(cpu, CPU_TOPOLOGY_SMT);
+
+ if (cpumask_test_cpu(cpu, &sched_pcore_mask) &&
+ !cpumask_intersects(&sched_ecore_mask, &sched_smt_mask)) {
-+ SET_RQ_BALANCE_FUNC(rq, cpu, smt_pcore_balance_func);
++ SET_SCHED_CPU_TOPOLOGY_BALANCE(cpu, CPU_TOPOLOGY_BALANCE_SMT_PCORE);
+ } else {
-+ SET_RQ_BALANCE_FUNC(rq, cpu, smt_balance_func);
++ SET_SCHED_CPU_TOPOLOGY_BALANCE(cpu, CPU_TOPOLOGY_BALANCE_SMT);
+ }
+
+ continue;
@@ -9957,31 +9889,139 @@ index 000000000000..376a08a5afda
+#endif
+ /* !SMT or only one cpu in sg */
+ if (cpumask_test_cpu(cpu, &sched_pcore_mask)) {
-+ SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_idle_mask_pcore, clear_idle_mask_pcore);
++ SET_SCHED_CPU_TOPOLOGY(cpu, CPU_TOPOLOGY_PCORE);
+
+ if (ecore_present)
-+ SET_RQ_BALANCE_FUNC(rq, cpu, pcore_balance_func);
++ SET_SCHED_CPU_TOPOLOGY_BALANCE(cpu, CPU_TOPOLOGY_BALANCE_PCORE);
+
+ continue;
+ }
++
+ if (cpumask_test_cpu(cpu, &sched_ecore_mask)) {
-+ SET_RQ_IDLE_MASK_FUNC(rq, cpu, set_idle_mask_ecore, clear_idle_mask_ecore);
++ SET_SCHED_CPU_TOPOLOGY(cpu, CPU_TOPOLOGY_ECORE);
+#ifdef CONFIG_SCHED_SMT
+ if (cpumask_intersects(&sched_pcore_mask, &sched_smt_mask))
-+ SET_RQ_BALANCE_FUNC(rq, cpu, ecore_balance_func);
++ SET_SCHED_CPU_TOPOLOGY_BALANCE(cpu, CPU_TOPOLOGY_BALANCE_ECORE);
+#endif
+ }
+ }
+}
diff --git a/kernel/sched/alt_topology.h b/kernel/sched/alt_topology.h
new file mode 100644
-index 000000000000..076174cd2bc6
+index 000000000000..14591a303ea5
--- /dev/null
+++ b/kernel/sched/alt_topology.h
-@@ -0,0 +1,6 @@
+@@ -0,0 +1,113 @@
+#ifndef _KERNEL_SCHED_ALT_TOPOLOGY_H
+#define _KERNEL_SCHED_ALT_TOPOLOGY_H
+
++/*
++ * CPU topology type
++ */
++enum cpu_topo_type {
++ CPU_TOPOLOGY_DEFAULT = 0,
++ CPU_TOPOLOGY_PCORE,
++ CPU_TOPOLOGY_ECORE,
++#ifdef CONFIG_SCHED_SMT
++ CPU_TOPOLOGY_SMT,
++#endif
++};
++
++DECLARE_PER_CPU_READ_MOSTLY(enum cpu_topo_type, sched_cpu_topo);
++
++static inline void sched_set_idle_mask(const unsigned int cpu)
++{
++ cpumask_set_cpu(cpu, sched_idle_mask);
++
++ switch (per_cpu(sched_cpu_topo, cpu)) {
++ case CPU_TOPOLOGY_DEFAULT:
++ break;
++ case CPU_TOPOLOGY_PCORE:
++ cpumask_set_cpu(cpu, sched_pcore_idle_mask);
++ break;
++ case CPU_TOPOLOGY_ECORE:
++ cpumask_set_cpu(cpu, sched_ecore_idle_mask);
++ break;
++#ifdef CONFIG_SCHED_SMT
++ case CPU_TOPOLOGY_SMT:
++ if (cpumask_subset(cpu_smt_mask(cpu), sched_idle_mask))
++ cpumask_or(sched_sg_idle_mask, sched_sg_idle_mask, cpu_smt_mask(cpu));
++ break;
++#endif
++ }
++}
++
++static inline void sched_clear_idle_mask(const unsigned int cpu)
++{
++ cpumask_clear_cpu(cpu, sched_idle_mask);
++
++ switch (per_cpu(sched_cpu_topo, cpu)) {
++ case CPU_TOPOLOGY_DEFAULT:
++ break;
++ case CPU_TOPOLOGY_PCORE:
++ cpumask_clear_cpu(cpu, sched_pcore_idle_mask);
++ break;
++ case CPU_TOPOLOGY_ECORE:
++ cpumask_clear_cpu(cpu, sched_ecore_idle_mask);
++ break;
++#ifdef CONFIG_SCHED_SMT
++ case CPU_TOPOLOGY_SMT:
++ cpumask_andnot(sched_sg_idle_mask, sched_sg_idle_mask, cpu_smt_mask(cpu));
++ break;
++#endif
++ }
++}
++
++/*
++ * CPU topology balance type
++ */
++enum cpu_topo_balance_type {
++ CPU_TOPOLOGY_BALANCE_NONE = 0,
++ CPU_TOPOLOGY_BALANCE_PCORE,
++#ifdef CONFIG_SCHED_SMT
++ CPU_TOPOLOGY_BALANCE_ECORE,
++ CPU_TOPOLOGY_BALANCE_SMT,
++ CPU_TOPOLOGY_BALANCE_SMT_PCORE,
++#endif
++};
++
++DECLARE_PER_CPU_READ_MOSTLY(enum cpu_topo_balance_type, sched_cpu_topo_balance);
++DECLARE_PER_CPU(struct balance_callback, active_balance_head);
++
++extern void pcore_balance(struct rq *rq);
++#ifdef CONFIG_SCHED_SMT
++extern void ecore_balance(struct rq *rq);
++extern void smt_balance(struct rq *rq);
++extern void smt_pcore_balance(struct rq *rq);
++#endif
++
++static inline void sched_cpu_topology_balance(const unsigned int cpu, struct rq *rq)
++{
++ if (!rq->online)
++ return;
++
++ switch (per_cpu(sched_cpu_topo_balance, cpu)) {
++ case CPU_TOPOLOGY_BALANCE_NONE:
++ break;
++ case CPU_TOPOLOGY_BALANCE_PCORE:
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), pcore_balance);
++ break;
++#ifdef CONFIG_SCHED_SMT
++ case CPU_TOPOLOGY_BALANCE_ECORE:
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), ecore_balance);
++ break;
++ case CPU_TOPOLOGY_BALANCE_SMT:
++ if (cpumask_test_cpu(cpu, sched_sg_idle_mask))
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), smt_balance);
++ break;
++ case CPU_TOPOLOGY_BALANCE_SMT_PCORE:
++ if (cpumask_test_cpu(cpu, sched_sg_idle_mask))
++ queue_balance_callback(rq, &per_cpu(active_balance_head, cpu), smt_pcore_balance);
++ break;
++#endif
++ }
++}
++
+extern void sched_init_topology(void);
+
+#endif /* _KERNEL_SCHED_ALT_TOPOLOGY_H */
@@ -11197,7 +11237,7 @@ index 6e2f54169e66..5a5031761477 100644
static int __init setup_relax_domain_level(char *str)
{
if (kstrtoint(str, 0, &default_relax_domain_level))
-@@ -1731,6 +1734,7 @@ sd_init(struct sched_domain_topology_level *tl,
+@@ -1723,6 +1726,7 @@ sd_init(struct sched_domain_topology_level *tl,
return sd;
}
@@ -11205,15 +11245,15 @@ index 6e2f54169e66..5a5031761477 100644
/*
* Topology list, bottom-up.
-@@ -1767,6 +1771,7 @@ void __init set_sched_topology(struct sched_domain_topology_level *tl)
+@@ -1759,6 +1763,7 @@ void __init set_sched_topology(struct sched_domain_topology_level *tl)
sched_domain_topology_saved = NULL;
}
+#ifndef CONFIG_SCHED_ALT
#ifdef CONFIG_NUMA
- static const struct cpumask *sd_numa_mask(int cpu)
-@@ -2833,3 +2838,31 @@ void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
+ static const struct cpumask *sd_numa_mask(struct sched_domain_topology_level *tl, int cpu)
+@@ -2825,3 +2830,31 @@ void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
partition_sched_domains_locked(ndoms_new, doms_new, dattr_new);
sched_domains_mutex_unlock();
}
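
Beside the topology rework, the update converts the idle_select_func pointer into a static call (DEFINE_STATIC_CALL(sched_idle_select_func, cpumask_and) plus static_call_update() in IDLE_SELECT_FUNC_UPDATE above), so the selector chosen at topology init is patched into the call site instead of being loaded through a pointer on every wakeup. Plain C has no static_call; the sketch below only approximates the idea with a function pointer selected once at init, and the selector names and the P-core preference are hypothetical:

/* User-space approximation: the kernel patch uses static_call_update()
 * so the chosen selector is patched into the call site; here a function
 * pointer picked once at init stands in. */
#include <stdbool.h>
#include <stdio.h>

typedef bool (*idle_select_t)(unsigned int *dst, unsigned int allow, unsigned int idle);

static bool plain_and(unsigned int *dst, unsigned int allow, unsigned int idle)
{
	*dst = allow & idle;		/* counterpart of cpumask_and() */
	return *dst != 0;
}

static bool pcore_first(unsigned int *dst, unsigned int allow, unsigned int idle)
{
	unsigned int pcores = 0x0f;	/* hypothetical P-core mask */

	*dst = allow & idle & pcores;	/* prefer idle P cores... */
	if (!*dst)
		*dst = allow & idle;	/* ...then any idle CPU */
	return *dst != 0;
}

static idle_select_t idle_select = plain_and;	/* default, like cpumask_and */

int main(void)
{
	unsigned int mask;

	idle_select = pcore_first;	/* "static_call_update" at topology init */
	if (idle_select(&mask, 0xf0, 0xff))
		printf("idle candidates %#x\n", mask);
	return 0;
}
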
* [gentoo-commits] proj/linux-patches:6.17 commit in: /
@ 2025-10-20 5:29 Arisu Tachibana
0 siblings, 0 replies; 20+ messages in thread
From: Arisu Tachibana @ 2025-10-20 5:29 UTC (permalink / raw
To: gentoo-commits
commit: b6ddbec9e2909c04b03937c660b048692bea8015
Author: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 20 05:29:27 2025 +0000
Commit: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
CommitDate: Mon Oct 20 05:29:27 2025 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=b6ddbec9
Linux patch 6.17.4
Signed-off-by: Arisu Tachibana <alicef <AT> gentoo.org>
0000_README | 4 +
1003_linux-6.17.4.patch | 14927 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 14931 insertions(+)
diff --git a/0000_README b/0000_README
index 5ebb2da9..44b62f97 100644
--- a/0000_README
+++ b/0000_README
@@ -55,6 +55,10 @@ Patch: 1002_linux-6.17.3.patch
From: https://www.kernel.org
Desc: Linux 6.17.3
+Patch: 1003_linux-6.17.4.patch
+From: https://www.kernel.org
+Desc: Linux 6.17.4
+
Patch: 1510_fs-enable-link-security-restrictions-by-default.patch
From: http://sources.debian.net/src/linux/3.16.7-ckt4-3/debian/patches/debian/fs-enable-link-security-restrictions-by-default.patch/
Desc: Enable link security restrictions by default.
diff --git a/1003_linux-6.17.4.patch b/1003_linux-6.17.4.patch
new file mode 100644
index 00000000..bde3e162
--- /dev/null
+++ b/1003_linux-6.17.4.patch
@@ -0,0 +1,14927 @@
+diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
+index 5a7a83c411e9c5..e92c0056e4e0a6 100644
+--- a/Documentation/admin-guide/kernel-parameters.txt
++++ b/Documentation/admin-guide/kernel-parameters.txt
+@@ -6429,6 +6429,9 @@
+
+ rootflags= [KNL] Set root filesystem mount option string
+
++ initramfs_options= [KNL]
++ Specify mount options for the initramfs mount.
++
+ rootfstype= [KNL] Set root filesystem type
+
+ rootwait [KNL] Wait (indefinitely) for root device to show up.
+diff --git a/Documentation/devicetree/bindings/phy/rockchip-inno-csi-dphy.yaml b/Documentation/devicetree/bindings/phy/rockchip-inno-csi-dphy.yaml
+index 5ac994b3c0aa15..b304bc5a08c402 100644
+--- a/Documentation/devicetree/bindings/phy/rockchip-inno-csi-dphy.yaml
++++ b/Documentation/devicetree/bindings/phy/rockchip-inno-csi-dphy.yaml
+@@ -57,11 +57,24 @@ required:
+ - clocks
+ - clock-names
+ - '#phy-cells'
+- - power-domains
+ - resets
+ - reset-names
+ - rockchip,grf
+
++allOf:
++ - if:
++ properties:
++ compatible:
++ contains:
++ enum:
++ - rockchip,px30-csi-dphy
++ - rockchip,rk1808-csi-dphy
++ - rockchip,rk3326-csi-dphy
++ - rockchip,rk3368-csi-dphy
++ then:
++ required:
++ - power-domains
++
+ additionalProperties: false
+
+ examples:
+diff --git a/Makefile b/Makefile
+index 22ee632f9104aa..4c3092dae03cf5 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 17
+-SUBLEVEL = 3
++SUBLEVEL = 4
+ EXTRAVERSION =
+ NAME = Baby Opossum Posse
+
+diff --git a/arch/arm/mach-omap2/am33xx-restart.c b/arch/arm/mach-omap2/am33xx-restart.c
+index fcf3d557aa7866..3cdf223addcc28 100644
+--- a/arch/arm/mach-omap2/am33xx-restart.c
++++ b/arch/arm/mach-omap2/am33xx-restart.c
+@@ -2,12 +2,46 @@
+ /*
+ * am33xx-restart.c - Code common to all AM33xx machines.
+ */
++#include <dt-bindings/pinctrl/am33xx.h>
++#include <linux/delay.h>
+ #include <linux/kernel.h>
+ #include <linux/reboot.h>
+
+ #include "common.h"
++#include "control.h"
+ #include "prm.h"
+
++/*
++ * Advisory 1.0.36 EMU0 and EMU1: Terminals Must be Pulled High Before
++ * ICEPick Samples
++ *
++ * If EMU0/EMU1 pins have been used as GPIO outputs and actively driving low
++ * level, the device might not reboot in normal mode. We are in a bad position
++ * to override GPIO state here, so just switch the pins into EMU input mode
++ * (that's what reset will do anyway) and wait a bit, because the state will be
++ * latched 190 ns after reset.
++ */
++static void am33xx_advisory_1_0_36(void)
++{
++ u32 emu0 = omap_ctrl_readl(AM335X_PIN_EMU0);
++ u32 emu1 = omap_ctrl_readl(AM335X_PIN_EMU1);
++
++ /* If both pins are in EMU mode, nothing to do */
++ if (!(emu0 & 7) && !(emu1 & 7))
++ return;
++
++ /* Switch GPIO3_7/GPIO3_8 into EMU0/EMU1 modes respectively */
++ omap_ctrl_writel(emu0 & ~7, AM335X_PIN_EMU0);
++ omap_ctrl_writel(emu1 & ~7, AM335X_PIN_EMU1);
++
++ /*
++ * Give pull-ups time to load the pin/PCB trace capacity.
++ * 5 ms shall be enough to load 1 uF (would be huge capacity for these
++ * pins) with TI-recommended 4k7 external pull-ups.
++ */
++ mdelay(5);
++}
++
+ /**
+ * am33xx_restart - trigger a software restart of the SoC
+ * @mode: the "reboot mode", see arch/arm/kernel/{setup,process}.c
+@@ -18,6 +52,8 @@
+ */
+ void am33xx_restart(enum reboot_mode mode, const char *cmd)
+ {
++ am33xx_advisory_1_0_36();
++
+ /* TODO: Handle cmd if necessary */
+ prm_reboot_mode = mode;
+
+diff --git a/arch/arm/mach-omap2/pm33xx-core.c b/arch/arm/mach-omap2/pm33xx-core.c
+index c907478be196ed..4abb86dc98fdac 100644
+--- a/arch/arm/mach-omap2/pm33xx-core.c
++++ b/arch/arm/mach-omap2/pm33xx-core.c
+@@ -388,12 +388,15 @@ static int __init amx3_idle_init(struct device_node *cpu_node, int cpu)
+ if (!state_node)
+ break;
+
+- if (!of_device_is_available(state_node))
++ if (!of_device_is_available(state_node)) {
++ of_node_put(state_node);
+ continue;
++ }
+
+ if (i == CPUIDLE_STATE_MAX) {
+ pr_warn("%s: cpuidle states reached max possible\n",
+ __func__);
++ of_node_put(state_node);
+ break;
+ }
+
+@@ -403,6 +406,7 @@ static int __init amx3_idle_init(struct device_node *cpu_node, int cpu)
+ states[state_count].wfi_flags |= WFI_FLAG_WAKE_M3 |
+ WFI_FLAG_FLUSH_CACHE;
+
++ of_node_put(state_node);
+ state_count++;
+ }
+
+diff --git a/arch/arm64/boot/dts/qcom/msm8916.dtsi b/arch/arm64/boot/dts/qcom/msm8916.dtsi
+index de9fdc0dfc5f9b..224540f93c9ac2 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi
+@@ -1562,6 +1562,8 @@ mdss: display-subsystem@1a00000 {
+
+ interrupts = <GIC_SPI 72 IRQ_TYPE_LEVEL_HIGH>;
+
++ resets = <&gcc GCC_MDSS_BCR>;
++
+ interrupt-controller;
+ #interrupt-cells = <1>;
+
+diff --git a/arch/arm64/boot/dts/qcom/msm8939.dtsi b/arch/arm64/boot/dts/qcom/msm8939.dtsi
+index 68b92fdb996c26..eb64ec35e7f0e1 100644
+--- a/arch/arm64/boot/dts/qcom/msm8939.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8939.dtsi
+@@ -1249,6 +1249,8 @@ mdss: display-subsystem@1a00000 {
+
+ power-domains = <&gcc MDSS_GDSC>;
+
++ resets = <&gcc GCC_MDSS_BCR>;
++
+ #address-cells = <1>;
+ #size-cells = <1>;
+ #interrupt-cells = <1>;
+diff --git a/arch/arm64/boot/dts/qcom/qcs615.dtsi b/arch/arm64/boot/dts/qcom/qcs615.dtsi
+index bfbb2103549227..e033b53f0f0f42 100644
+--- a/arch/arm64/boot/dts/qcom/qcs615.dtsi
++++ b/arch/arm64/boot/dts/qcom/qcs615.dtsi
+@@ -631,6 +631,7 @@ &mc_virt SLAVE_EBI1 QCOM_ICC_TAG_ALWAYS>,
+ interconnect-names = "qup-core",
+ "qup-config";
+ power-domains = <&rpmhpd RPMHPD_CX>;
++ operating-points-v2 = <&qup_opp_table>;
+ status = "disabled";
+ };
+
+@@ -654,6 +655,7 @@ &config_noc SLAVE_QUP_0 QCOM_ICC_TAG_ALWAYS>,
+ "qup-config",
+ "qup-memory";
+ power-domains = <&rpmhpd RPMHPD_CX>;
++ required-opps = <&rpmhpd_opp_low_svs>;
+ dmas = <&gpi_dma0 0 1 QCOM_GPI_I2C>,
+ <&gpi_dma0 1 1 QCOM_GPI_I2C>;
+ dma-names = "tx",
+@@ -681,6 +683,7 @@ &config_noc SLAVE_QUP_0 QCOM_ICC_TAG_ALWAYS>,
+ "qup-config",
+ "qup-memory";
+ power-domains = <&rpmhpd RPMHPD_CX>;
++ required-opps = <&rpmhpd_opp_low_svs>;
+ dmas = <&gpi_dma0 0 2 QCOM_GPI_I2C>,
+ <&gpi_dma0 1 2 QCOM_GPI_I2C>;
+ dma-names = "tx",
+@@ -703,6 +706,7 @@ &mc_virt SLAVE_EBI1 QCOM_ICC_TAG_ALWAYS>,
+ interconnect-names = "qup-core",
+ "qup-config";
+ power-domains = <&rpmhpd RPMHPD_CX>;
++ operating-points-v2 = <&qup_opp_table>;
+ dmas = <&gpi_dma0 0 2 QCOM_GPI_SPI>,
+ <&gpi_dma0 1 2 QCOM_GPI_SPI>;
+ dma-names = "tx",
+@@ -728,6 +732,7 @@ &mc_virt SLAVE_EBI1 QCOM_ICC_TAG_ALWAYS>,
+ interconnect-names = "qup-core",
+ "qup-config";
+ power-domains = <&rpmhpd RPMHPD_CX>;
++ operating-points-v2 = <&qup_opp_table>;
+ status = "disabled";
+ };
+
+@@ -751,6 +756,7 @@ &config_noc SLAVE_QUP_0 QCOM_ICC_TAG_ALWAYS>,
+ "qup-config",
+ "qup-memory";
+ power-domains = <&rpmhpd RPMHPD_CX>;
++ required-opps = <&rpmhpd_opp_low_svs>;
+ dmas = <&gpi_dma0 0 3 QCOM_GPI_I2C>,
+ <&gpi_dma0 1 3 QCOM_GPI_I2C>;
+ dma-names = "tx",
+diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
+index c0f466d966305a..b5cd3933b020a8 100644
+--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
++++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
+@@ -5404,11 +5404,11 @@ slimbam: dma-controller@17184000 {
+ compatible = "qcom,bam-v1.7.4", "qcom,bam-v1.7.0";
+ qcom,controlled-remotely;
+ reg = <0 0x17184000 0 0x2a000>;
+- num-channels = <31>;
++ num-channels = <23>;
+ interrupts = <GIC_SPI 164 IRQ_TYPE_LEVEL_HIGH>;
+ #dma-cells = <1>;
+ qcom,ee = <1>;
+- qcom,num-ees = <2>;
++ qcom,num-ees = <4>;
+ iommus = <&apps_smmu 0x1806 0x0>;
+ };
+
+diff --git a/arch/arm64/boot/dts/qcom/x1e80100-pmics.dtsi b/arch/arm64/boot/dts/qcom/x1e80100-pmics.dtsi
+index e3888bc143a0aa..621890ada1536d 100644
+--- a/arch/arm64/boot/dts/qcom/x1e80100-pmics.dtsi
++++ b/arch/arm64/boot/dts/qcom/x1e80100-pmics.dtsi
+@@ -475,6 +475,8 @@ pm8010: pmic@c {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
++ status = "disabled";
++
+ pm8010_temp_alarm: temp-alarm@2400 {
+ compatible = "qcom,spmi-temp-alarm";
+ reg = <0x2400>;
+diff --git a/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi b/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi
+index 44e7e459f1769e..b4b66a505db108 100644
+--- a/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi
+@@ -267,7 +267,7 @@ secure_proxy_sa3: mailbox@43600000 {
+
+ main_pmx0: pinctrl@f4000 {
+ compatible = "pinctrl-single";
+- reg = <0x00 0xf4000 0x00 0x2ac>;
++ reg = <0x00 0xf4000 0x00 0x25c>;
+ #pinctrl-cells = <1>;
+ pinctrl-single,register-width = <32>;
+ pinctrl-single,function-mask = <0xffffffff>;
+diff --git a/arch/arm64/boot/dts/ti/k3-am62p5.dtsi b/arch/arm64/boot/dts/ti/k3-am62p5.dtsi
+index 202378d9d5cfdc..8982a7b9f1a6a1 100644
+--- a/arch/arm64/boot/dts/ti/k3-am62p5.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-am62p5.dtsi
+@@ -135,7 +135,7 @@ opp-800000000 {
+
+ opp-1000000000 {
+ opp-hz = /bits/ 64 <1000000000>;
+- opp-supported-hw = <0x01 0x0006>;
++ opp-supported-hw = <0x01 0x0007>;
+ clock-latency-ns = <6000000>;
+ };
+
+diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
+index bfe3ce9df19781..ba7cf7fec5e978 100644
+--- a/arch/arm64/include/asm/ftrace.h
++++ b/arch/arm64/include/asm/ftrace.h
+@@ -153,6 +153,7 @@ ftrace_partial_regs(const struct ftrace_regs *fregs, struct pt_regs *regs)
+ regs->pc = afregs->pc;
+ regs->regs[29] = afregs->fp;
+ regs->regs[30] = afregs->lr;
++ regs->pstate = PSR_MODE_EL1h;
+ return regs;
+ }
+
+diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
+index ef269a5a37e12c..3e9d1aa37bbfbb 100644
+--- a/arch/arm64/kernel/cpufeature.c
++++ b/arch/arm64/kernel/cpufeature.c
+@@ -2408,17 +2408,21 @@ static void bti_enable(const struct arm64_cpu_capabilities *__unused)
+ #ifdef CONFIG_ARM64_MTE
+ static void cpu_enable_mte(struct arm64_cpu_capabilities const *cap)
+ {
++ static bool cleared_zero_page = false;
++
+ sysreg_clear_set(sctlr_el1, 0, SCTLR_ELx_ATA | SCTLR_EL1_ATA0);
+
+ mte_cpu_setup();
+
+ /*
+ * Clear the tags in the zero page. This needs to be done via the
+- * linear map which has the Tagged attribute.
++ * linear map which has the Tagged attribute. Since this page is
++ * always mapped as pte_special(), set_pte_at() will not attempt to
++ * clear the tags or set PG_mte_tagged.
+ */
+- if (try_page_mte_tagging(ZERO_PAGE(0))) {
++ if (!cleared_zero_page) {
++ cleared_zero_page = true;
+ mte_clear_page_tags(lm_alias(empty_zero_page));
+- set_page_mte_tagged(ZERO_PAGE(0));
+ }
+
+ kasan_init_hw_tags_cpu();
+diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
+index e5e773844889a8..63aed49ac181a0 100644
+--- a/arch/arm64/kernel/mte.c
++++ b/arch/arm64/kernel/mte.c
+@@ -460,7 +460,7 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
+ if (folio_test_hugetlb(folio))
+ WARN_ON_ONCE(!folio_test_hugetlb_mte_tagged(folio));
+ else
+- WARN_ON_ONCE(!page_mte_tagged(page));
++ WARN_ON_ONCE(!page_mte_tagged(page) && !is_zero_page(page));
+
+ /* limit access to the end of the page */
+ offset = offset_in_page(addr);
+diff --git a/arch/arm64/kernel/pi/map_kernel.c b/arch/arm64/kernel/pi/map_kernel.c
+index 0f4bd77718590c..a8d76d0354da6b 100644
+--- a/arch/arm64/kernel/pi/map_kernel.c
++++ b/arch/arm64/kernel/pi/map_kernel.c
+@@ -78,6 +78,12 @@ static void __init map_kernel(u64 kaslr_offset, u64 va_offset, int root_level)
+ twopass |= enable_scs;
+ prot = twopass ? data_prot : text_prot;
+
++ /*
++ * [_stext, _text) isn't executed after boot and contains some
++ * non-executable, unpredictable data, so map it non-executable.
++ */
++ map_segment(init_pg_dir, &pgdp, va_offset, _text, _stext, data_prot,
++ false, root_level);
+ map_segment(init_pg_dir, &pgdp, va_offset, _stext, _etext, prot,
+ !twopass, root_level);
+ map_segment(init_pg_dir, &pgdp, va_offset, __start_rodata,
+diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
+index 0c5d408afd95d4..8ab6104a4883dc 100644
+--- a/arch/arm64/kernel/probes/kprobes.c
++++ b/arch/arm64/kernel/probes/kprobes.c
+@@ -10,6 +10,7 @@
+
+ #define pr_fmt(fmt) "kprobes: " fmt
+
++#include <linux/execmem.h>
+ #include <linux/extable.h>
+ #include <linux/kasan.h>
+ #include <linux/kernel.h>
+@@ -41,6 +42,17 @@ DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
+ static void __kprobes
+ post_kprobe_handler(struct kprobe *, struct kprobe_ctlblk *, struct pt_regs *);
+
++void *alloc_insn_page(void)
++{
++ void *addr;
++
++ addr = execmem_alloc(EXECMEM_KPROBES, PAGE_SIZE);
++ if (!addr)
++ return NULL;
++ set_memory_rox((unsigned long)addr, 1);
++ return addr;
++}
++
+ static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
+ {
+ kprobe_opcode_t *addr = p->ainsn.xol_insn;
+diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
+index 77c7926a4df660..23c05dc7a8f2ac 100644
+--- a/arch/arm64/kernel/setup.c
++++ b/arch/arm64/kernel/setup.c
+@@ -214,7 +214,7 @@ static void __init request_standard_resources(void)
+ unsigned long i = 0;
+ size_t res_size;
+
+- kernel_code.start = __pa_symbol(_stext);
++ kernel_code.start = __pa_symbol(_text);
+ kernel_code.end = __pa_symbol(__init_begin - 1);
+ kernel_data.start = __pa_symbol(_sdata);
+ kernel_data.end = __pa_symbol(_end - 1);
+@@ -280,7 +280,7 @@ u64 cpu_logical_map(unsigned int cpu)
+
+ void __init __no_sanitize_address setup_arch(char **cmdline_p)
+ {
+- setup_initial_init_mm(_stext, _etext, _edata, _end);
++ setup_initial_init_mm(_text, _etext, _edata, _end);
+
+ *cmdline_p = boot_command_line;
+
+diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+index 8957734d6183e6..ddc8beb55eee6d 100644
+--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
++++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+@@ -1010,9 +1010,12 @@ static int __check_host_shared_guest(struct pkvm_hyp_vm *vm, u64 *__phys, u64 ip
+ return ret;
+ if (!kvm_pte_valid(pte))
+ return -ENOENT;
+- if (kvm_granule_size(level) != size)
++ if (size && kvm_granule_size(level) != size)
+ return -E2BIG;
+
++ if (!size)
++ size = kvm_granule_size(level);
++
+ state = guest_get_page_state(pte, ipa);
+ if (state != PKVM_PAGE_SHARED_BORROWED)
+ return -EPERM;
+@@ -1100,7 +1103,7 @@ int __pkvm_host_relax_perms_guest(u64 gfn, struct pkvm_hyp_vcpu *vcpu, enum kvm_
+ if (prot & ~KVM_PGTABLE_PROT_RWX)
+ return -EINVAL;
+
+- assert_host_shared_guest(vm, ipa, PAGE_SIZE);
++ assert_host_shared_guest(vm, ipa, 0);
+ guest_lock_component(vm);
+ ret = kvm_pgtable_stage2_relax_perms(&vm->pgt, ipa, prot, 0);
+ guest_unlock_component(vm);
+@@ -1156,7 +1159,7 @@ int __pkvm_host_mkyoung_guest(u64 gfn, struct pkvm_hyp_vcpu *vcpu)
+ if (pkvm_hyp_vm_is_protected(vm))
+ return -EPERM;
+
+- assert_host_shared_guest(vm, ipa, PAGE_SIZE);
++ assert_host_shared_guest(vm, ipa, 0);
+ guest_lock_component(vm);
+ kvm_pgtable_stage2_mkyoung(&vm->pgt, ipa, 0);
+ guest_unlock_component(vm);
+diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
+index 7363942925038e..705c06d6752d4c 100644
+--- a/arch/arm64/kvm/mmu.c
++++ b/arch/arm64/kvm/mmu.c
+@@ -1673,7 +1673,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
+ * cache maintenance.
+ */
+ if (!kvm_supports_cacheable_pfnmap())
+- return -EFAULT;
++ ret = -EFAULT;
+ } else {
+ /*
+ * If the page was identified as device early by looking at
+@@ -1696,7 +1696,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
+ }
+
+ if (exec_fault && s2_force_noncacheable)
+- return -ENOEXEC;
++ ret = -ENOEXEC;
++
++ if (ret) {
++ kvm_release_page_unused(page);
++ return ret;
++ }
+
+ /*
+ * Potentially reduce shadow S2 permissions to match the guest's own
+diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
+index ea84a61ed50848..0dd558613bd71c 100644
+--- a/arch/arm64/mm/init.c
++++ b/arch/arm64/mm/init.c
+@@ -279,7 +279,7 @@ void __init arm64_memblock_init(void)
+ * Register the kernel text, kernel data, initrd, and initial
+ * pagetables with memblock.
+ */
+- memblock_reserve(__pa_symbol(_stext), _end - _stext);
++ memblock_reserve(__pa_symbol(_text), _end - _text);
+ if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && phys_initrd_size) {
+ /* the generic initrd code expects virtual addresses */
+ initrd_start = __phys_to_virt(phys_initrd_start);
+diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
+index 1838015207404d..324b96e3632cad 100644
+--- a/arch/arm64/mm/mmu.c
++++ b/arch/arm64/mm/mmu.c
+@@ -574,8 +574,8 @@ void __init mark_linear_text_alias_ro(void)
+ /*
+ * Remove the write permissions from the linear alias of .text/.rodata
+ */
+- update_mapping_prot(__pa_symbol(_stext), (unsigned long)lm_alias(_stext),
+- (unsigned long)__init_begin - (unsigned long)_stext,
++ update_mapping_prot(__pa_symbol(_text), (unsigned long)lm_alias(_text),
++ (unsigned long)__init_begin - (unsigned long)_text,
+ PAGE_KERNEL_RO);
+ }
+
+@@ -636,7 +636,7 @@ static inline void arm64_kfence_map_pool(phys_addr_t kfence_pool, pgd_t *pgdp) {
+ static void __init map_mem(pgd_t *pgdp)
+ {
+ static const u64 direct_map_end = _PAGE_END(VA_BITS_MIN);
+- phys_addr_t kernel_start = __pa_symbol(_stext);
++ phys_addr_t kernel_start = __pa_symbol(_text);
+ phys_addr_t kernel_end = __pa_symbol(__init_begin);
+ phys_addr_t start, end;
+ phys_addr_t early_kfence_pool;
+@@ -683,7 +683,7 @@ static void __init map_mem(pgd_t *pgdp)
+ }
+
+ /*
+- * Map the linear alias of the [_stext, __init_begin) interval
++ * Map the linear alias of the [_text, __init_begin) interval
+ * as non-executable now, and remove the write permission in
+ * mark_linear_text_alias_ro() below (which will be called after
+ * alternative patching has completed). This makes the contents
+@@ -710,6 +710,10 @@ void mark_rodata_ro(void)
+ WRITE_ONCE(rodata_is_rw, false);
+ update_mapping_prot(__pa_symbol(__start_rodata), (unsigned long)__start_rodata,
+ section_size, PAGE_KERNEL_RO);
++ /* mark the range between _text and _stext as read only. */
++ update_mapping_prot(__pa_symbol(_text), (unsigned long)_text,
++ (unsigned long)_stext - (unsigned long)_text,
++ PAGE_KERNEL_RO);
+ }
+
+ static void __init declare_vma(struct vm_struct *vma,
+@@ -780,7 +784,7 @@ static void __init declare_kernel_vmas(void)
+ {
+ static struct vm_struct vmlinux_seg[KERNEL_SEGMENT_COUNT];
+
+- declare_vma(&vmlinux_seg[0], _stext, _etext, VM_NO_GUARD);
++ declare_vma(&vmlinux_seg[0], _text, _etext, VM_NO_GUARD);
+ declare_vma(&vmlinux_seg[1], __start_rodata, __inittext_begin, VM_NO_GUARD);
+ declare_vma(&vmlinux_seg[2], __inittext_begin, __inittext_end, VM_NO_GUARD);
+ declare_vma(&vmlinux_seg[3], __initdata_begin, __initdata_end, VM_NO_GUARD);
+diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
+index ae419e32f22e2f..dc5bd3f1b8d2cb 100644
+--- a/arch/loongarch/Makefile
++++ b/arch/loongarch/Makefile
+@@ -115,7 +115,7 @@ ifdef CONFIG_LTO_CLANG
+ # The annotate-tablejump option can not be passed to LLVM backend when LTO is enabled.
+ # Ensure it is aware of linker with LTO, '--loongarch-annotate-tablejump' also needs to
+ # be passed via '-mllvm' to ld.lld.
+-KBUILD_LDFLAGS += -mllvm --loongarch-annotate-tablejump
++KBUILD_LDFLAGS += $(call ld-option,-mllvm --loongarch-annotate-tablejump)
+ endif
+ endif
+
+@@ -129,7 +129,7 @@ KBUILD_RUSTFLAGS_KERNEL += -Crelocation-model=pie
+ LDFLAGS_vmlinux += -static -pie --no-dynamic-linker -z notext $(call ld-option, --apply-dynamic-relocs)
+ endif
+
+-cflags-y += $(call cc-option, -mno-check-zero-division)
++cflags-y += $(call cc-option, -mno-check-zero-division -fno-isolate-erroneous-paths-dereference)
+
+ ifndef CONFIG_KASAN
+ cflags-y += -fno-builtin-memcpy -fno-builtin-memmove -fno-builtin-memset
+diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c
+index 075b79b2c1d39e..69c17d162fff3c 100644
+--- a/arch/loongarch/kernel/setup.c
++++ b/arch/loongarch/kernel/setup.c
+@@ -355,6 +355,7 @@ void __init platform_init(void)
+
+ #ifdef CONFIG_ACPI
+ acpi_table_upgrade();
++ acpi_gbl_use_global_lock = false;
+ acpi_gbl_use_default_register_widths = false;
+ acpi_boot_table_init();
+ #endif
+diff --git a/arch/parisc/include/uapi/asm/ioctls.h b/arch/parisc/include/uapi/asm/ioctls.h
+index 82d1148c6379a5..74b4027a4e8083 100644
+--- a/arch/parisc/include/uapi/asm/ioctls.h
++++ b/arch/parisc/include/uapi/asm/ioctls.h
+@@ -10,10 +10,10 @@
+ #define TCSETS _IOW('T', 17, struct termios) /* TCSETATTR */
+ #define TCSETSW _IOW('T', 18, struct termios) /* TCSETATTRD */
+ #define TCSETSF _IOW('T', 19, struct termios) /* TCSETATTRF */
+-#define TCGETA _IOR('T', 1, struct termio)
+-#define TCSETA _IOW('T', 2, struct termio)
+-#define TCSETAW _IOW('T', 3, struct termio)
+-#define TCSETAF _IOW('T', 4, struct termio)
++#define TCGETA 0x40125401
++#define TCSETA 0x80125402
++#define TCSETAW 0x80125403
++#define TCSETAF 0x80125404
+ #define TCSBRK _IO('T', 5)
+ #define TCXONC _IO('T', 6)
+ #define TCFLSH _IO('T', 7)
+diff --git a/arch/parisc/lib/memcpy.c b/arch/parisc/lib/memcpy.c
+index 69d65ffab31263..03165c82dfdbd9 100644
+--- a/arch/parisc/lib/memcpy.c
++++ b/arch/parisc/lib/memcpy.c
+@@ -41,7 +41,6 @@ unsigned long raw_copy_from_user(void *dst, const void __user *src,
+ mtsp(get_kernel_space(), SR_TEMP2);
+
+ /* Check region is user accessible */
+- if (start)
+ while (start < end) {
+ if (!prober_user(SR_TEMP1, start)) {
+ newlen = (start - (unsigned long) src);
+diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
+index d8ccf2c9b98ad0..0166bf39ce1e52 100644
+--- a/arch/powerpc/platforms/powernv/pci-ioda.c
++++ b/arch/powerpc/platforms/powernv/pci-ioda.c
+@@ -1854,7 +1854,7 @@ static int pnv_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
+ return 0;
+
+ out:
+- irq_domain_free_irqs_parent(domain, virq, i - 1);
++ irq_domain_free_irqs_parent(domain, virq, i);
+ msi_bitmap_free_hwirqs(&phb->msi_bmp, hwirq, nr_irqs);
+ return ret;
+ }
+diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
+index ee1c8c6898a3c7..9dc294de631f1d 100644
+--- a/arch/powerpc/platforms/pseries/msi.c
++++ b/arch/powerpc/platforms/pseries/msi.c
+@@ -593,7 +593,7 @@ static int pseries_irq_domain_alloc(struct irq_domain *domain, unsigned int virq
+
+ out:
+ /* TODO: handle RTAS cleanup in ->msi_finish() ? */
+- irq_domain_free_irqs_parent(domain, virq, i - 1);
++ irq_domain_free_irqs_parent(domain, virq, i);
+ return ret;
+ }
+
+diff --git a/arch/s390/Makefile b/arch/s390/Makefile
+index 7679bc16b692bd..b4769241332bb9 100644
+--- a/arch/s390/Makefile
++++ b/arch/s390/Makefile
+@@ -25,6 +25,7 @@ endif
+ KBUILD_CFLAGS_DECOMPRESSOR := $(CLANG_FLAGS) -m64 -O2 -mpacked-stack -std=gnu11
+ KBUILD_CFLAGS_DECOMPRESSOR += -DDISABLE_BRANCH_PROFILING -D__NO_FORTIFY
+ KBUILD_CFLAGS_DECOMPRESSOR += -D__DECOMPRESSOR
++KBUILD_CFLAGS_DECOMPRESSOR += -Wno-pointer-sign
+ KBUILD_CFLAGS_DECOMPRESSOR += -fno-delete-null-pointer-checks -msoft-float -mbackchain
+ KBUILD_CFLAGS_DECOMPRESSOR += -fno-asynchronous-unwind-tables
+ KBUILD_CFLAGS_DECOMPRESSOR += -ffreestanding
+diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
+index c1a7a92f057511..b7100c6a405449 100644
+--- a/arch/s390/include/asm/pgtable.h
++++ b/arch/s390/include/asm/pgtable.h
+@@ -2055,4 +2055,26 @@ static inline unsigned long gmap_pgste_get_pgt_addr(unsigned long *pgt)
+ return res;
+ }
+
++static inline pgste_t pgste_get_lock(pte_t *ptep)
++{
++ unsigned long value = 0;
++#ifdef CONFIG_PGSTE
++ unsigned long *ptr = (unsigned long *)(ptep + PTRS_PER_PTE);
++
++ do {
++ value = __atomic64_or_barrier(PGSTE_PCL_BIT, ptr);
++ } while (value & PGSTE_PCL_BIT);
++ value |= PGSTE_PCL_BIT;
++#endif
++ return __pgste(value);
++}
++
++static inline void pgste_set_unlock(pte_t *ptep, pgste_t pgste)
++{
++#ifdef CONFIG_PGSTE
++ barrier();
++ WRITE_ONCE(*(unsigned long *)(ptep + PTRS_PER_PTE), pgste_val(pgste) & ~PGSTE_PCL_BIT);
++#endif
++}
++
+ #endif /* _S390_PAGE_H */
+diff --git a/arch/s390/kernel/vmlinux.lds.S b/arch/s390/kernel/vmlinux.lds.S
+index 1c606dfa595d8a..d74d4c52ccd05a 100644
+--- a/arch/s390/kernel/vmlinux.lds.S
++++ b/arch/s390/kernel/vmlinux.lds.S
+@@ -209,6 +209,33 @@ SECTIONS
+ . = ALIGN(PAGE_SIZE);
+ _end = . ;
+
++ /* Debugging sections. */
++ STABS_DEBUG
++ DWARF_DEBUG
++ ELF_DETAILS
++
++ /*
++ * Make sure that the .got.plt is either completely empty or it
++ * contains only the three reserved double words.
++ */
++ .got.plt : {
++ *(.got.plt)
++ }
++ ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT entries detected!")
++
++ /*
++ * Sections that should stay zero sized, which is safer to
++ * explicitly check instead of blindly discarding.
++ */
++ .plt : {
++ *(.plt) *(.plt.*) *(.iplt) *(.igot .igot.plt)
++ }
++ ASSERT(SIZEOF(.plt) == 0, "Unexpected run-time procedure linkages detected!")
++ .rela.dyn : {
++ *(.rela.*) *(.rela_*)
++ }
++ ASSERT(SIZEOF(.rela.dyn) == 0, "Unexpected run-time relocations (.rela) detected!")
++
+ /*
+ * uncompressed image info used by the decompressor
+ * it should match struct vmlinux_info
+@@ -239,33 +266,6 @@ SECTIONS
+ #endif
+ } :NONE
+
+- /* Debugging sections. */
+- STABS_DEBUG
+- DWARF_DEBUG
+- ELF_DETAILS
+-
+- /*
+- * Make sure that the .got.plt is either completely empty or it
+- * contains only the three reserved double words.
+- */
+- .got.plt : {
+- *(.got.plt)
+- }
+- ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT entries detected!")
+-
+- /*
+- * Sections that should stay zero sized, which is safer to
+- * explicitly check instead of blindly discarding.
+- */
+- .plt : {
+- *(.plt) *(.plt.*) *(.iplt) *(.igot .igot.plt)
+- }
+- ASSERT(SIZEOF(.plt) == 0, "Unexpected run-time procedure linkages detected!")
+- .rela.dyn : {
+- *(.rela.*) *(.rela_*)
+- }
+- ASSERT(SIZEOF(.rela.dyn) == 0, "Unexpected run-time relocations (.rela) detected!")
+-
+ /* Sections to be discarded */
+ DISCARDS
+ /DISCARD/ : {
+diff --git a/arch/s390/mm/gmap_helpers.c b/arch/s390/mm/gmap_helpers.c
+index b63f427e7289ad..d4c3c36855e26c 100644
+--- a/arch/s390/mm/gmap_helpers.c
++++ b/arch/s390/mm/gmap_helpers.c
+@@ -15,6 +15,7 @@
+ #include <linux/pagewalk.h>
+ #include <linux/ksm.h>
+ #include <asm/gmap_helpers.h>
++#include <asm/pgtable.h>
+
+ /**
+ * ptep_zap_swap_entry() - discard a swap entry.
+@@ -47,6 +48,7 @@ void gmap_helper_zap_one_page(struct mm_struct *mm, unsigned long vmaddr)
+ {
+ struct vm_area_struct *vma;
+ spinlock_t *ptl;
++ pgste_t pgste;
+ pte_t *ptep;
+
+ mmap_assert_locked(mm);
+@@ -60,8 +62,16 @@ void gmap_helper_zap_one_page(struct mm_struct *mm, unsigned long vmaddr)
+ ptep = get_locked_pte(mm, vmaddr, &ptl);
+ if (unlikely(!ptep))
+ return;
+- if (pte_swap(*ptep))
++ if (pte_swap(*ptep)) {
++ preempt_disable();
++ pgste = pgste_get_lock(ptep);
++
+ ptep_zap_swap_entry(mm, pte_to_swp_entry(*ptep));
++ pte_clear(mm, vmaddr, ptep);
++
++ pgste_set_unlock(ptep, pgste);
++ preempt_enable();
++ }
+ pte_unmap_unlock(ptep, ptl);
+ }
+ EXPORT_SYMBOL_GPL(gmap_helper_zap_one_page);
+diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
+index 50eb57c976bc30..0fde20bbc50bf3 100644
+--- a/arch/s390/mm/pgtable.c
++++ b/arch/s390/mm/pgtable.c
+@@ -24,6 +24,7 @@
+ #include <asm/tlbflush.h>
+ #include <asm/mmu_context.h>
+ #include <asm/page-states.h>
++#include <asm/pgtable.h>
+ #include <asm/machine.h>
+
+ pgprot_t pgprot_writecombine(pgprot_t prot)
+@@ -115,28 +116,6 @@ static inline pte_t ptep_flush_lazy(struct mm_struct *mm,
+ return old;
+ }
+
+-static inline pgste_t pgste_get_lock(pte_t *ptep)
+-{
+- unsigned long value = 0;
+-#ifdef CONFIG_PGSTE
+- unsigned long *ptr = (unsigned long *)(ptep + PTRS_PER_PTE);
+-
+- do {
+- value = __atomic64_or_barrier(PGSTE_PCL_BIT, ptr);
+- } while (value & PGSTE_PCL_BIT);
+- value |= PGSTE_PCL_BIT;
+-#endif
+- return __pgste(value);
+-}
+-
+-static inline void pgste_set_unlock(pte_t *ptep, pgste_t pgste)
+-{
+-#ifdef CONFIG_PGSTE
+- barrier();
+- WRITE_ONCE(*(unsigned long *)(ptep + PTRS_PER_PTE), pgste_val(pgste) & ~PGSTE_PCL_BIT);
+-#endif
+-}
+-
+ static inline pgste_t pgste_get(pte_t *ptep)
+ {
+ unsigned long pgste = 0;
+diff --git a/arch/sparc/kernel/of_device_32.c b/arch/sparc/kernel/of_device_32.c
+index 06012e68bdcaec..284a4cafa4324c 100644
+--- a/arch/sparc/kernel/of_device_32.c
++++ b/arch/sparc/kernel/of_device_32.c
+@@ -387,6 +387,7 @@ static struct platform_device * __init scan_one_device(struct device_node *dp,
+
+ if (of_device_register(op)) {
+ printk("%pOF: Could not register of device.\n", dp);
++ put_device(&op->dev);
+ kfree(op);
+ op = NULL;
+ }
+diff --git a/arch/sparc/kernel/of_device_64.c b/arch/sparc/kernel/of_device_64.c
+index f98c2901f3357a..f53092b07b9e7d 100644
+--- a/arch/sparc/kernel/of_device_64.c
++++ b/arch/sparc/kernel/of_device_64.c
+@@ -677,6 +677,7 @@ static struct platform_device * __init scan_one_device(struct device_node *dp,
+
+ if (of_device_register(op)) {
+ printk("%pOF: Could not register of device.\n", dp);
++ put_device(&op->dev);
+ kfree(op);
+ op = NULL;
+ }
+diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c
+index 4b9431311e059b..4652e868663bee 100644
+--- a/arch/sparc/mm/hugetlbpage.c
++++ b/arch/sparc/mm/hugetlbpage.c
+@@ -22,6 +22,26 @@
+
+ static pte_t sun4u_hugepage_shift_to_tte(pte_t entry, unsigned int shift)
+ {
++ unsigned long hugepage_size = _PAGE_SZ4MB_4U;
++
++ pte_val(entry) = pte_val(entry) & ~_PAGE_SZALL_4U;
++
++ switch (shift) {
++ case HPAGE_256MB_SHIFT:
++ hugepage_size = _PAGE_SZ256MB_4U;
++ pte_val(entry) |= _PAGE_PMD_HUGE;
++ break;
++ case HPAGE_SHIFT:
++ pte_val(entry) |= _PAGE_PMD_HUGE;
++ break;
++ case HPAGE_64K_SHIFT:
++ hugepage_size = _PAGE_SZ64K_4U;
++ break;
++ default:
++ WARN_ONCE(1, "unsupported hugepage shift=%u\n", shift);
++ }
++
++ pte_val(entry) = pte_val(entry) | hugepage_size;
+ return entry;
+ }
+
+diff --git a/arch/x86/entry/entry_64_fred.S b/arch/x86/entry/entry_64_fred.S
+index 29c5c32c16c364..907bd233c6c130 100644
+--- a/arch/x86/entry/entry_64_fred.S
++++ b/arch/x86/entry/entry_64_fred.S
+@@ -16,7 +16,7 @@
+
+ .macro FRED_ENTER
+ UNWIND_HINT_END_OF_STACK
+- ENDBR
++ ANNOTATE_NOENDBR
+ PUSH_AND_CLEAR_REGS
+ movq %rsp, %rdi /* %rdi -> pt_regs */
+ .endm
+diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
+index f19a76d3ca0ed2..a35ee44ec70ad9 100644
+--- a/arch/x86/include/asm/kvm_host.h
++++ b/arch/x86/include/asm/kvm_host.h
+@@ -2356,6 +2356,7 @@ int kvm_add_user_return_msr(u32 msr);
+ int kvm_find_user_return_msr(u32 msr);
+ int kvm_set_user_return_msr(unsigned index, u64 val, u64 mask);
+ void kvm_user_return_msr_update_cache(unsigned int index, u64 val);
++u64 kvm_get_user_return_msr(unsigned int slot);
+
+ static inline bool kvm_is_supported_user_return_msr(u32 msr)
+ {
+diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
+index b65c3ba5fa1410..20fa4a79df1378 100644
+--- a/arch/x86/include/asm/msr-index.h
++++ b/arch/x86/include/asm/msr-index.h
+@@ -733,6 +733,7 @@
+ #define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS 0xc0000300
+ #define MSR_AMD64_PERF_CNTR_GLOBAL_CTL 0xc0000301
+ #define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR 0xc0000302
++#define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_SET 0xc0000303
+
+ /* AMD Hardware Feedback Support MSRs */
+ #define MSR_AMD_WORKLOAD_CLASS_CONFIG 0xc0000500
+diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
+index 8ae750cde0c657..57379698015ed3 100644
+--- a/arch/x86/kernel/kvm.c
++++ b/arch/x86/kernel/kvm.c
+@@ -933,6 +933,19 @@ static void kvm_sev_hc_page_enc_status(unsigned long pfn, int npages, bool enc)
+
+ static void __init kvm_init_platform(void)
+ {
++ u64 tolud = PFN_PHYS(e820__end_of_low_ram_pfn());
++ /*
++ * Note, hardware requires variable MTRR ranges to be power-of-2 sized
++ * and naturally aligned. But when forcing guest MTRR state, Linux
++ * doesn't program the forced ranges into hardware. Don't bother doing
++ * the math to generate a technically-legal range.
++ */
++ struct mtrr_var_range pci_hole = {
++ .base_lo = tolud | X86_MEMTYPE_UC,
++ .mask_lo = (u32)(~(SZ_4G - tolud - 1)) | MTRR_PHYSMASK_V,
++ .mask_hi = (BIT_ULL(boot_cpu_data.x86_phys_bits) - 1) >> 32,
++ };
++
+ if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) &&
+ kvm_para_has_feature(KVM_FEATURE_MIGRATION_CONTROL)) {
+ unsigned long nr_pages;
+@@ -982,8 +995,12 @@ static void __init kvm_init_platform(void)
+ kvmclock_init();
+ x86_platform.apic_post_init = kvm_apic_init;
+
+- /* Set WB as the default cache mode for SEV-SNP and TDX */
+- guest_force_mtrr_state(NULL, 0, MTRR_TYPE_WRBACK);
++ /*
++ * Set WB as the default cache mode for SEV-SNP and TDX, with a single
++ * UC range for the legacy PCI hole, e.g. so that devices that expect
++ * to get UC/WC mappings don't get surprised with WB.
++ */
++ guest_force_mtrr_state(&pci_hole, 1, MTRR_TYPE_WRBACK);
+ }
+
+ #if defined(CONFIG_AMD_MEM_ENCRYPT)
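To make the pci_hole initializer above a little more concrete, here is a small standalone sketch of the base/mask arithmetic for a UC range covering [TOLUD, 4G). The constants and the simplified match check are assumptions for illustration only; the real code splits the values into the lo/hi MSR halves and, as its comment notes, never programs them into hardware:

#include <stdio.h>
#include <stdint.h>

#define SZ_4G		(1ULL << 32)
#define X86_MEMTYPE_UC	0x0ULL		/* assumed encoding of the UC type */
#define MTRR_PHYSMASK_V	(1ULL << 11)	/* assumed "valid" bit position */

int main(void)
{
	uint64_t tolud = 0xc0000000ULL;		/* example end of low RAM: 3 GiB */
	unsigned int phys_bits = 48;		/* example physical address width */

	uint64_t base = tolud | X86_MEMTYPE_UC;
	uint64_t mask = (~(SZ_4G - tolud - 1) & ((1ULL << phys_bits) - 1)) |
			MTRR_PHYSMASK_V;

	uint64_t in_hole = 0xe0000000ULL;	/* legacy PCI hole address */
	uint64_t in_ram = 0x80000000ULL;	/* ordinary low RAM */

	printf("base=%#llx mask=%#llx\n",
	       (unsigned long long)base, (unsigned long long)mask);
	/* simplified variable-range match: an address hits the range when its
	 * masked bits equal the masked base */
	printf("%#llx in range: %d\n", (unsigned long long)in_hole,
	       (in_hole & mask) == (base & mask));
	printf("%#llx in range: %d\n", (unsigned long long)in_ram,
	       (in_ram & mask) == (base & mask));
	return 0;
}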
+diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
+index 5a4b21389b1d98..d432f3824f0c29 100644
+--- a/arch/x86/kernel/umip.c
++++ b/arch/x86/kernel/umip.c
+@@ -156,15 +156,26 @@ static int identify_insn(struct insn *insn)
+ if (!insn->modrm.nbytes)
+ return -EINVAL;
+
+- /* All the instructions of interest start with 0x0f. */
+- if (insn->opcode.bytes[0] != 0xf)
++ /* The instructions of interest have 2-byte opcodes: 0F 00 or 0F 01. */
++ if (insn->opcode.nbytes < 2 || insn->opcode.bytes[0] != 0xf)
+ return -EINVAL;
+
+ if (insn->opcode.bytes[1] == 0x1) {
+ switch (X86_MODRM_REG(insn->modrm.value)) {
+ case 0:
++ /* The reg form of 0F 01 /0 encodes VMX instructions. */
++ if (X86_MODRM_MOD(insn->modrm.value) == 3)
++ return -EINVAL;
++
+ return UMIP_INST_SGDT;
+ case 1:
++ /*
++ * The reg form of 0F 01 /1 encodes MONITOR/MWAIT,
++ * STAC/CLAC, and ENCLS.
++ */
++ if (X86_MODRM_MOD(insn->modrm.value) == 3)
++ return -EINVAL;
++
+ return UMIP_INST_SIDT;
+ case 4:
+ return UMIP_INST_SMSW;
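The rejections added above hinge on the mod and reg fields of the ModRM byte that follows the two-byte 0F 01 opcode. A small userspace sketch of that classification, using hand-rolled field extraction rather than the kernel's insn decoder:

#include <stdio.h>
#include <stdint.h>

/* ModRM layout: mod[7:6] reg[5:3] rm[2:0] */
#define MODRM_MOD(b)	(((b) >> 6) & 0x3)
#define MODRM_REG(b)	(((b) >> 3) & 0x7)

/* Memory forms of 0F 01 /0 and /1 are SGDT/SIDT; the register forms
 * (mod == 3) encode unrelated instructions and are rejected, as in the
 * hunk above. */
static const char *classify_0f01(uint8_t modrm)
{
	switch (MODRM_REG(modrm)) {
	case 0:
		return MODRM_MOD(modrm) == 3 ? "reject (VMX group)" : "SGDT";
	case 1:
		return MODRM_MOD(modrm) == 3 ? "reject (MONITOR/MWAIT group)" : "SIDT";
	case 4:
		return "SMSW";
	default:
		return "not emulated by UMIP";
	}
}

int main(void)
{
	const uint8_t samples[] = { 0x00, 0xc1, 0x08, 0xc8, 0x20 };

	for (unsigned int i = 0; i < sizeof(samples); i++)
		printf("modrm=0x%02x -> %s\n", (unsigned int)samples[i],
		       classify_0f01(samples[i]));
	return 0;
}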
+diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
+index 75e9cfc689f895..a84fb3d28885b1 100644
+--- a/arch/x86/kvm/pmu.c
++++ b/arch/x86/kvm/pmu.c
+@@ -650,6 +650,7 @@ int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
+ msr_info->data = pmu->global_ctrl;
+ break;
+ case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR:
++ case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_SET:
+ case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
+ msr_info->data = 0;
+ break;
+@@ -711,6 +712,10 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
+ if (!msr_info->host_initiated)
+ pmu->global_status &= ~data;
+ break;
++ case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_SET:
++ if (!msr_info->host_initiated)
++ pmu->global_status |= data & ~pmu->global_status_rsvd;
++ break;
+ default:
+ kvm_pmu_mark_pmc_in_use(vcpu, msr_info->index);
+ return kvm_pmu_call(set_msr)(vcpu, msr_info);
+diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
+index 288f7f2a46f233..aa4379e46e969d 100644
+--- a/arch/x86/kvm/svm/pmu.c
++++ b/arch/x86/kvm/svm/pmu.c
+@@ -113,6 +113,7 @@ static bool amd_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
+ case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS:
+ case MSR_AMD64_PERF_CNTR_GLOBAL_CTL:
+ case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR:
++ case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_SET:
+ return pmu->version > 1;
+ default:
+ if (msr > MSR_F15H_PERF_CTR5 &&
+diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
+index 0635bd71c10e78..7b1e9424af1576 100644
+--- a/arch/x86/kvm/svm/sev.c
++++ b/arch/x86/kvm/svm/sev.c
+@@ -4618,6 +4618,16 @@ void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_are
+ hostsa->dr2_addr_mask = amd_get_dr_addr_mask(2);
+ hostsa->dr3_addr_mask = amd_get_dr_addr_mask(3);
+ }
++
++ /*
++ * TSC_AUX is always virtualized for SEV-ES guests when the feature is
++ * available, i.e. TSC_AUX is loaded on #VMEXIT from the host save area.
++ * Set the save area to the current hardware value, i.e. the current
++ * user return value, so that the correct value is restored on #VMEXIT.
++ */
++ if (cpu_feature_enabled(X86_FEATURE_V_TSC_AUX) &&
++ !WARN_ON_ONCE(tsc_aux_uret_slot < 0))
++ hostsa->tsc_aux = kvm_get_user_return_msr(tsc_aux_uret_slot);
+ }
+
+ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
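The hostsa->tsc_aux assignment above relies on kvm_get_user_return_msr(), added further down, which simply returns the per-CPU cached value for a user-return MSR slot instead of doing an RDMSR. A deliberately stripped-down model of that slot cache (single CPU, invented names, nothing KVM-specific):

#include <stdio.h>
#include <stdint.h>

#define NR_SLOTS 4

struct uret_slot {
	uint64_t curr;		/* last value written for this slot */
};

static struct uret_slot slots[NR_SLOTS];	/* one CPU's table */

static void uret_set(unsigned int slot, uint64_t val)
{
	/* the real code also programs the MSR; the cache exists so a later
	 * read-back does not need the hardware access */
	slots[slot].curr = val;
}

static uint64_t uret_get(unsigned int slot)
{
	return slots[slot].curr;
}

int main(void)
{
	unsigned int tsc_aux_slot = 2;	/* pretend TSC_AUX was registered here */

	uret_set(tsc_aux_slot, 0x7);
	printf("cached TSC_AUX = %#llx\n",
	       (unsigned long long)uret_get(tsc_aux_slot));
	return 0;
}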
+diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
+index c813d6cce69ff3..83ca0b05abc1d5 100644
+--- a/arch/x86/kvm/svm/svm.c
++++ b/arch/x86/kvm/svm/svm.c
+@@ -195,7 +195,7 @@ static DEFINE_MUTEX(vmcb_dump_mutex);
+ * RDTSCP and RDPID are not used in the kernel, specifically to allow KVM to
+ * defer the restoration of TSC_AUX until the CPU returns to userspace.
+ */
+-static int tsc_aux_uret_slot __read_mostly = -1;
++int tsc_aux_uret_slot __ro_after_init = -1;
+
+ static int get_npt_level(void)
+ {
+@@ -577,18 +577,6 @@ static int svm_enable_virtualization_cpu(void)
+
+ amd_pmu_enable_virt();
+
+- /*
+- * If TSC_AUX virtualization is supported, TSC_AUX becomes a swap type
+- * "B" field (see sev_es_prepare_switch_to_guest()) for SEV-ES guests.
+- * Since Linux does not change the value of TSC_AUX once set, prime the
+- * TSC_AUX field now to avoid a RDMSR on every vCPU run.
+- */
+- if (boot_cpu_has(X86_FEATURE_V_TSC_AUX)) {
+- u32 __maybe_unused msr_hi;
+-
+- rdmsr(MSR_TSC_AUX, sev_es_host_save_area(sd)->tsc_aux, msr_hi);
+- }
+-
+ return 0;
+ }
+
+@@ -1423,10 +1411,10 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
+ __svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);
+
+ /*
+- * TSC_AUX is always virtualized for SEV-ES guests when the feature is
+- * available. The user return MSR support is not required in this case
+- * because TSC_AUX is restored on #VMEXIT from the host save area
+- * (which has been initialized in svm_enable_virtualization_cpu()).
++ * TSC_AUX is always virtualized (context switched by hardware) for
++ * SEV-ES guests when the feature is available. For non-SEV-ES guests,
++ * context switch TSC_AUX via the user_return MSR infrastructure (not
++ * all CPUs support TSC_AUX virtualization).
+ */
+ if (likely(tsc_aux_uret_slot >= 0) &&
+ (!boot_cpu_has(X86_FEATURE_V_TSC_AUX) || !sev_es_guest(vcpu->kvm)))
+@@ -3021,8 +3009,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
+ * TSC_AUX is always virtualized for SEV-ES guests when the
+ * feature is available. The user return MSR support is not
+ * required in this case because TSC_AUX is restored on #VMEXIT
+- * from the host save area (which has been initialized in
+- * svm_enable_virtualization_cpu()).
++ * from the host save area.
+ */
+ if (boot_cpu_has(X86_FEATURE_V_TSC_AUX) && sev_es_guest(vcpu->kvm))
+ break;
+diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
+index 58b9d168e0c8ec..04371aa8c8f282 100644
+--- a/arch/x86/kvm/svm/svm.h
++++ b/arch/x86/kvm/svm/svm.h
+@@ -52,6 +52,8 @@ extern bool x2avic_enabled;
+ extern bool vnmi;
+ extern int lbrv;
+
++extern int tsc_aux_uret_slot __ro_after_init;
++
+ /*
+ * Clean bits in VMCB.
+ * VMCB_ALL_CLEAN_MASK might also need to
+diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
+index 66744f5768c8eb..d91d9d6bb26c1c 100644
+--- a/arch/x86/kvm/vmx/tdx.c
++++ b/arch/x86/kvm/vmx/tdx.c
+@@ -3457,12 +3457,11 @@ static int __init __tdx_bringup(void)
+ if (r)
+ goto tdx_bringup_err;
+
++ r = -EINVAL;
+ /* Get TDX global information for later use */
+ tdx_sysinfo = tdx_get_sysinfo();
+- if (WARN_ON_ONCE(!tdx_sysinfo)) {
+- r = -EINVAL;
++ if (WARN_ON_ONCE(!tdx_sysinfo))
+ goto get_sysinfo_err;
+- }
+
+ /* Check TDX module and KVM capabilities */
+ if (!tdx_get_supported_attrs(&tdx_sysinfo->td_conf) ||
+@@ -3505,14 +3504,11 @@ static int __init __tdx_bringup(void)
+ if (td_conf->max_vcpus_per_td < num_present_cpus()) {
+ pr_err("Disable TDX: MAX_VCPU_PER_TD (%u) smaller than number of logical CPUs (%u).\n",
+ td_conf->max_vcpus_per_td, num_present_cpus());
+- r = -EINVAL;
+ goto get_sysinfo_err;
+ }
+
+- if (misc_cg_set_capacity(MISC_CG_RES_TDX, tdx_get_nr_guest_keyids())) {
+- r = -EINVAL;
++ if (misc_cg_set_capacity(MISC_CG_RES_TDX, tdx_get_nr_guest_keyids()))
+ goto get_sysinfo_err;
+- }
+
+ /*
+ * Leave hardware virtualization enabled after TDX is enabled
+diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
+index e6ae226704cba5..0affe0ec34dc0b 100644
+--- a/arch/x86/kvm/x86.c
++++ b/arch/x86/kvm/x86.c
+@@ -367,6 +367,7 @@ static const u32 msrs_to_save_pmu[] = {
+ MSR_AMD64_PERF_CNTR_GLOBAL_CTL,
+ MSR_AMD64_PERF_CNTR_GLOBAL_STATUS,
+ MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR,
++ MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_SET,
+ };
+
+ static u32 msrs_to_save[ARRAY_SIZE(msrs_to_save_base) +
+@@ -677,6 +678,12 @@ void kvm_user_return_msr_update_cache(unsigned int slot, u64 value)
+ }
+ EXPORT_SYMBOL_GPL(kvm_user_return_msr_update_cache);
+
++u64 kvm_get_user_return_msr(unsigned int slot)
++{
++ return this_cpu_ptr(user_return_msrs)->values[slot].curr;
++}
++EXPORT_SYMBOL_GPL(kvm_get_user_return_msr);
++
+ static void drop_user_return_notifiers(void)
+ {
+ struct kvm_user_return_msrs *msrs = this_cpu_ptr(user_return_msrs);
+@@ -7353,6 +7360,7 @@ static void kvm_probe_msr_to_save(u32 msr_index)
+ case MSR_AMD64_PERF_CNTR_GLOBAL_CTL:
+ case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS:
+ case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR:
++ case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_SET:
+ if (!kvm_cpu_cap_has(X86_FEATURE_PERFMON_V2))
+ return;
+ break;
+diff --git a/arch/xtensa/platforms/iss/simdisk.c b/arch/xtensa/platforms/iss/simdisk.c
+index 6ed009318d2418..3cafc8feddeee9 100644
+--- a/arch/xtensa/platforms/iss/simdisk.c
++++ b/arch/xtensa/platforms/iss/simdisk.c
+@@ -231,10 +231,14 @@ static ssize_t proc_read_simdisk(struct file *file, char __user *buf,
+ static ssize_t proc_write_simdisk(struct file *file, const char __user *buf,
+ size_t count, loff_t *ppos)
+ {
+- char *tmp = memdup_user_nul(buf, count);
++ char *tmp;
+ struct simdisk *dev = pde_data(file_inode(file));
+ int err;
+
++ if (count == 0 || count > PAGE_SIZE)
++ return -EINVAL;
++
++ tmp = memdup_user_nul(buf, count);
+ if (IS_ERR(tmp))
+ return PTR_ERR(tmp);
+
+diff --git a/block/blk-crypto-fallback.c b/block/blk-crypto-fallback.c
+index 005c9157ffb3d4..1f9a4c33d2bdc8 100644
+--- a/block/blk-crypto-fallback.c
++++ b/block/blk-crypto-fallback.c
+@@ -18,6 +18,7 @@
+ #include <linux/module.h>
+ #include <linux/random.h>
+ #include <linux/scatterlist.h>
++#include <trace/events/block.h>
+
+ #include "blk-cgroup.h"
+ #include "blk-crypto-internal.h"
+@@ -231,7 +232,9 @@ static bool blk_crypto_fallback_split_bio_if_needed(struct bio **bio_ptr)
+ bio->bi_status = BLK_STS_RESOURCE;
+ return false;
+ }
++
+ bio_chain(split_bio, bio);
++ trace_block_split(split_bio, bio->bi_iter.bi_sector);
+ submit_bio_noacct(bio);
+ *bio_ptr = split_bio;
+ }
+diff --git a/crypto/essiv.c b/crypto/essiv.c
+index d003b78fcd855a..a47a3eab693519 100644
+--- a/crypto/essiv.c
++++ b/crypto/essiv.c
+@@ -186,9 +186,14 @@ static int essiv_aead_crypt(struct aead_request *req, bool enc)
+ const struct essiv_tfm_ctx *tctx = crypto_aead_ctx(tfm);
+ struct essiv_aead_request_ctx *rctx = aead_request_ctx(req);
+ struct aead_request *subreq = &rctx->aead_req;
++ int ivsize = crypto_aead_ivsize(tfm);
++ int ssize = req->assoclen - ivsize;
+ struct scatterlist *src = req->src;
+ int err;
+
++ if (ssize < 0)
++ return -EINVAL;
++
+ crypto_cipher_encrypt_one(tctx->essiv_cipher, req->iv, req->iv);
+
+ /*
+@@ -198,19 +203,12 @@ static int essiv_aead_crypt(struct aead_request *req, bool enc)
+ */
+ rctx->assoc = NULL;
+ if (req->src == req->dst || !enc) {
+- scatterwalk_map_and_copy(req->iv, req->dst,
+- req->assoclen - crypto_aead_ivsize(tfm),
+- crypto_aead_ivsize(tfm), 1);
++ scatterwalk_map_and_copy(req->iv, req->dst, ssize, ivsize, 1);
+ } else {
+ u8 *iv = (u8 *)aead_request_ctx(req) + tctx->ivoffset;
+- int ivsize = crypto_aead_ivsize(tfm);
+- int ssize = req->assoclen - ivsize;
+ struct scatterlist *sg;
+ int nents;
+
+- if (ssize < 0)
+- return -EINVAL;
+-
+ nents = sg_nents_for_len(req->src, ssize);
+ if (nents < 0)
+ return -EINVAL;
+diff --git a/crypto/skcipher.c b/crypto/skcipher.c
+index de5fc91bba267e..8fa5d9686d0850 100644
+--- a/crypto/skcipher.c
++++ b/crypto/skcipher.c
+@@ -294,6 +294,8 @@ static int crypto_skcipher_init_tfm(struct crypto_tfm *tfm)
+ return crypto_init_lskcipher_ops_sg(tfm);
+ }
+
++ crypto_skcipher_set_reqsize(skcipher, crypto_tfm_alg_reqsize(tfm));
++
+ if (alg->exit)
+ skcipher->base.exit = crypto_skcipher_exit_tfm;
+
+diff --git a/drivers/acpi/acpi_dbg.c b/drivers/acpi/acpi_dbg.c
+index d50261d05f3a1a..515b20d0b698a4 100644
+--- a/drivers/acpi/acpi_dbg.c
++++ b/drivers/acpi/acpi_dbg.c
+@@ -569,11 +569,11 @@ static int acpi_aml_release(struct inode *inode, struct file *file)
+ return 0;
+ }
+
+-static int acpi_aml_read_user(char __user *buf, int len)
++static ssize_t acpi_aml_read_user(char __user *buf, size_t len)
+ {
+- int ret;
+ struct circ_buf *crc = &acpi_aml_io.out_crc;
+- int n;
++ ssize_t ret;
++ size_t n;
+ char *p;
+
+ ret = acpi_aml_lock_read(crc, ACPI_AML_OUT_USER);
+@@ -582,7 +582,7 @@ static int acpi_aml_read_user(char __user *buf, int len)
+ /* sync head before removing logs */
+ smp_rmb();
+ p = &crc->buf[crc->tail];
+- n = min(len, circ_count_to_end(crc));
++ n = min_t(size_t, len, circ_count_to_end(crc));
+ if (copy_to_user(buf, p, n)) {
+ ret = -EFAULT;
+ goto out;
+@@ -599,8 +599,8 @@ static int acpi_aml_read_user(char __user *buf, int len)
+ static ssize_t acpi_aml_read(struct file *file, char __user *buf,
+ size_t count, loff_t *ppos)
+ {
+- int ret = 0;
+- int size = 0;
++ ssize_t ret = 0;
++ ssize_t size = 0;
+
+ if (!count)
+ return 0;
+@@ -639,11 +639,11 @@ static ssize_t acpi_aml_read(struct file *file, char __user *buf,
+ return size > 0 ? size : ret;
+ }
+
+-static int acpi_aml_write_user(const char __user *buf, int len)
++static ssize_t acpi_aml_write_user(const char __user *buf, size_t len)
+ {
+- int ret;
+ struct circ_buf *crc = &acpi_aml_io.in_crc;
+- int n;
++ ssize_t ret;
++ size_t n;
+ char *p;
+
+ ret = acpi_aml_lock_write(crc, ACPI_AML_IN_USER);
+@@ -652,7 +652,7 @@ static int acpi_aml_write_user(const char __user *buf, int len)
+ /* sync tail before inserting cmds */
+ smp_mb();
+ p = &crc->buf[crc->head];
+- n = min(len, circ_space_to_end(crc));
++ n = min_t(size_t, len, circ_space_to_end(crc));
+ if (copy_from_user(p, buf, n)) {
+ ret = -EFAULT;
+ goto out;
+@@ -663,14 +663,14 @@ static int acpi_aml_write_user(const char __user *buf, int len)
+ ret = n;
+ out:
+ acpi_aml_unlock_fifo(ACPI_AML_IN_USER, ret >= 0);
+- return n;
++ return ret;
+ }
+
+ static ssize_t acpi_aml_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *ppos)
+ {
+- int ret = 0;
+- int size = 0;
++ ssize_t ret = 0;
++ ssize_t size = 0;
+
+ if (!count)
+ return 0;
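Besides widening the byte counters to size_t/ssize_t, the acpi_dbg hunks above change acpi_aml_write_user() to return ret rather than n, so a -EFAULT from copy_from_user() is no longer reported as a successful write. A tiny userspace sketch of that particular slip (names invented):

#include <stdio.h>

#define EFAULT 14

/* The error path fills "ret", but returning "n" would silently drop it. */
static long copy_chunk(int fail, long n)
{
	long ret;

	if (fail) {
		ret = -EFAULT;
		goto out;
	}
	ret = n;
out:
	return ret;	/* the pre-fix code effectively returned n here */
}

int main(void)
{
	printf("ok path:    %ld\n", copy_chunk(0, 64));
	printf("error path: %ld\n", copy_chunk(1, 64));
	return 0;
}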
+diff --git a/drivers/acpi/acpi_tad.c b/drivers/acpi/acpi_tad.c
+index 91d7d90c47dacc..33418dd6768a1b 100644
+--- a/drivers/acpi/acpi_tad.c
++++ b/drivers/acpi/acpi_tad.c
+@@ -565,6 +565,9 @@ static void acpi_tad_remove(struct platform_device *pdev)
+
+ pm_runtime_get_sync(dev);
+
++ if (dd->capabilities & ACPI_TAD_RT)
++ sysfs_remove_group(&dev->kobj, &acpi_tad_time_attr_group);
++
+ if (dd->capabilities & ACPI_TAD_DC_WAKE)
+ sysfs_remove_group(&dev->kobj, &acpi_tad_dc_attr_group);
+
+diff --git a/drivers/acpi/acpica/acdebug.h b/drivers/acpi/acpica/acdebug.h
+index fe6d38b43c9a5c..91241bd6917a43 100644
+--- a/drivers/acpi/acpica/acdebug.h
++++ b/drivers/acpi/acpica/acdebug.h
+@@ -37,7 +37,7 @@ struct acpi_db_argument_info {
+ struct acpi_db_execute_walk {
+ u32 count;
+ u32 max_count;
+- char name_seg[ACPI_NAMESEG_SIZE + 1] ACPI_NONSTRING;
++ char name_seg[ACPI_NAMESEG_SIZE + 1];
+ };
+
+ #define PARAM_LIST(pl) pl
+diff --git a/drivers/acpi/acpica/evglock.c b/drivers/acpi/acpica/evglock.c
+index fa3e0d00d1ca96..df2a4ab0e0da9d 100644
+--- a/drivers/acpi/acpica/evglock.c
++++ b/drivers/acpi/acpica/evglock.c
+@@ -42,6 +42,10 @@ acpi_status acpi_ev_init_global_lock_handler(void)
+ return_ACPI_STATUS(AE_OK);
+ }
+
++ if (!acpi_gbl_use_global_lock) {
++ return_ACPI_STATUS(AE_OK);
++ }
++
+ /* Attempt installation of the global lock handler */
+
+ status = acpi_install_fixed_event_handler(ACPI_EVENT_GLOBAL,
+diff --git a/drivers/acpi/battery.c b/drivers/acpi/battery.c
+index 6905b56bf3e458..67b76492c839c4 100644
+--- a/drivers/acpi/battery.c
++++ b/drivers/acpi/battery.c
+@@ -92,7 +92,7 @@ enum {
+
+ struct acpi_battery {
+ struct mutex lock;
+- struct mutex sysfs_lock;
++ struct mutex update_lock;
+ struct power_supply *bat;
+ struct power_supply_desc bat_desc;
+ struct acpi_device *device;
+@@ -904,15 +904,12 @@ static int sysfs_add_battery(struct acpi_battery *battery)
+
+ static void sysfs_remove_battery(struct acpi_battery *battery)
+ {
+- mutex_lock(&battery->sysfs_lock);
+- if (!battery->bat) {
+- mutex_unlock(&battery->sysfs_lock);
++ if (!battery->bat)
+ return;
+- }
++
+ battery_hook_remove_battery(battery);
+ power_supply_unregister(battery->bat);
+ battery->bat = NULL;
+- mutex_unlock(&battery->sysfs_lock);
+ }
+
+ static void find_battery(const struct dmi_header *dm, void *private)
+@@ -1072,6 +1069,9 @@ static void acpi_battery_notify(acpi_handle handle, u32 event, void *data)
+
+ if (!battery)
+ return;
++
++ guard(mutex)(&battery->update_lock);
++
+ old = battery->bat;
+ /*
+ * On Acer Aspire V5-573G notifications are sometimes triggered too
+@@ -1094,21 +1094,22 @@ static void acpi_battery_notify(acpi_handle handle, u32 event, void *data)
+ }
+
+ static int battery_notify(struct notifier_block *nb,
+- unsigned long mode, void *_unused)
++ unsigned long mode, void *_unused)
+ {
+ struct acpi_battery *battery = container_of(nb, struct acpi_battery,
+ pm_nb);
+- int result;
+
+- switch (mode) {
+- case PM_POST_HIBERNATION:
+- case PM_POST_SUSPEND:
++ if (mode == PM_POST_SUSPEND || mode == PM_POST_HIBERNATION) {
++ guard(mutex)(&battery->update_lock);
++
+ if (!acpi_battery_present(battery))
+ return 0;
+
+ if (battery->bat) {
+ acpi_battery_refresh(battery);
+ } else {
++ int result;
++
+ result = acpi_battery_get_info(battery);
+ if (result)
+ return result;
+@@ -1120,7 +1121,6 @@ static int battery_notify(struct notifier_block *nb,
+
+ acpi_battery_init_alarm(battery);
+ acpi_battery_get_state(battery);
+- break;
+ }
+
+ return 0;
+@@ -1198,6 +1198,8 @@ static int acpi_battery_update_retry(struct acpi_battery *battery)
+ {
+ int retry, ret;
+
++ guard(mutex)(&battery->update_lock);
++
+ for (retry = 5; retry; retry--) {
+ ret = acpi_battery_update(battery, false);
+ if (!ret)
+@@ -1208,6 +1210,13 @@ static int acpi_battery_update_retry(struct acpi_battery *battery)
+ return ret;
+ }
+
++static void sysfs_battery_cleanup(struct acpi_battery *battery)
++{
++ guard(mutex)(&battery->update_lock);
++
++ sysfs_remove_battery(battery);
++}
++
+ static int acpi_battery_add(struct acpi_device *device)
+ {
+ int result = 0;
+@@ -1230,7 +1239,7 @@ static int acpi_battery_add(struct acpi_device *device)
+ if (result)
+ return result;
+
+- result = devm_mutex_init(&device->dev, &battery->sysfs_lock);
++ result = devm_mutex_init(&device->dev, &battery->update_lock);
+ if (result)
+ return result;
+
+@@ -1262,7 +1271,7 @@ static int acpi_battery_add(struct acpi_device *device)
+ device_init_wakeup(&device->dev, 0);
+ unregister_pm_notifier(&battery->pm_nb);
+ fail:
+- sysfs_remove_battery(battery);
++ sysfs_battery_cleanup(battery);
+
+ return result;
+ }
+@@ -1281,6 +1290,9 @@ static void acpi_battery_remove(struct acpi_device *device)
+
+ device_init_wakeup(&device->dev, 0);
+ unregister_pm_notifier(&battery->pm_nb);
++
++ guard(mutex)(&battery->update_lock);
++
+ sysfs_remove_battery(battery);
+ }
+
+@@ -1297,6 +1309,9 @@ static int acpi_battery_resume(struct device *dev)
+ return -EINVAL;
+
+ battery->update_time = 0;
++
++ guard(mutex)(&battery->update_lock);
++
+ acpi_battery_update(battery, true);
+ return 0;
+ }
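Several battery.c hunks above swap explicit mutex_lock()/mutex_unlock() pairs for guard(mutex)(&battery->update_lock), the kernel's scope-based lock guard built on <linux/cleanup.h>, which unlocks automatically when the enclosing scope is left. A loose userspace imitation of the idea using the compiler cleanup attribute and pthreads (macro and names invented for the sketch):

#include <stdio.h>
#include <pthread.h>

static void mutex_unlocker(pthread_mutex_t **m)
{
	pthread_mutex_unlock(*m);
	printf("unlocked at end of scope\n");
}

/* lock now, auto-unlock when the guard variable goes out of scope */
#define scoped_guard_mutex(name, lock)						\
	pthread_mutex_t *name __attribute__((cleanup(mutex_unlocker))) =	\
		({ pthread_mutex_lock(lock); (lock); })

static pthread_mutex_t update_lock = PTHREAD_MUTEX_INITIALIZER;

static void update(void)
{
	scoped_guard_mutex(g, &update_lock);

	/* critical section: every return path still unlocks */
	printf("holding the lock\n");
}

int main(void)
{
	update();
	return 0;
}

Build with something like cc -pthread. The kernel variant covers error returns and goto exits the same way, which is why the explicit unlock calls could be dropped from the hunks above.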
+diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
+index 436019d96027bd..c086786fe84cb4 100644
+--- a/drivers/acpi/property.c
++++ b/drivers/acpi/property.c
+@@ -83,6 +83,7 @@ static bool acpi_nondev_subnode_extract(union acpi_object *desc,
+ struct fwnode_handle *parent)
+ {
+ struct acpi_data_node *dn;
++ acpi_handle scope = NULL;
+ bool result;
+
+ if (acpi_graph_ignore_port(handle))
+@@ -98,29 +99,35 @@ static bool acpi_nondev_subnode_extract(union acpi_object *desc,
+ INIT_LIST_HEAD(&dn->data.properties);
+ INIT_LIST_HEAD(&dn->data.subnodes);
+
+- result = acpi_extract_properties(handle, desc, &dn->data);
+-
+- if (handle) {
+- acpi_handle scope;
+- acpi_status status;
++ /*
++ * The scope for the completion of relative pathname segments and
++ * subnode object lookup is the one of the namespace node (device)
++	 * subnode object lookup is that of the namespace node (device)
++ * the scope of that object's parent device.
++ */
++ if (handle)
++ acpi_get_parent(handle, &scope);
+
+- /*
+- * The scope for the subnode object lookup is the one of the
+- * namespace node (device) containing the object that has
+- * returned the package. That is, it's the scope of that
+- * object's parent.
+- */
+- status = acpi_get_parent(handle, &scope);
+- if (ACPI_SUCCESS(status)
+- && acpi_enumerate_nondev_subnodes(scope, desc, &dn->data,
+- &dn->fwnode))
+- result = true;
+- } else if (acpi_enumerate_nondev_subnodes(NULL, desc, &dn->data,
+- &dn->fwnode)) {
++ /*
++ * Extract properties from the _DSD-equivalent package pointed to by
++ * desc and use scope (if not NULL) for the completion of relative
++ * pathname segments.
++ *
++ * The extracted properties will be held in the new data node dn.
++ */
++ result = acpi_extract_properties(scope, desc, &dn->data);
++ /*
++ * Look for subnodes in the _DSD-equivalent package pointed to by desc
++ * and create child nodes of dn if there are any.
++ */
++ if (acpi_enumerate_nondev_subnodes(scope, desc, &dn->data, &dn->fwnode))
+ result = true;
+- }
+
+ if (result) {
++ /*
++ * This will be NULL if the desc package is embedded in an outer
++ * _DSD-equivalent package and its scope cannot be determined.
++ */
+ dn->handle = handle;
+ dn->data.pointer = desc;
+ list_add_tail(&dn->sibling, list);
+@@ -132,35 +139,21 @@ static bool acpi_nondev_subnode_extract(union acpi_object *desc,
+ return false;
+ }
+
+-static bool acpi_nondev_subnode_data_ok(acpi_handle handle,
+- const union acpi_object *link,
+- struct list_head *list,
+- struct fwnode_handle *parent)
+-{
+- struct acpi_buffer buf = { ACPI_ALLOCATE_BUFFER };
+- acpi_status status;
+-
+- status = acpi_evaluate_object_typed(handle, NULL, NULL, &buf,
+- ACPI_TYPE_PACKAGE);
+- if (ACPI_FAILURE(status))
+- return false;
+-
+- if (acpi_nondev_subnode_extract(buf.pointer, handle, link, list,
+- parent))
+- return true;
+-
+- ACPI_FREE(buf.pointer);
+- return false;
+-}
+-
+ static bool acpi_nondev_subnode_ok(acpi_handle scope,
+ const union acpi_object *link,
+ struct list_head *list,
+ struct fwnode_handle *parent)
+ {
++ struct acpi_buffer buf = { ACPI_ALLOCATE_BUFFER };
+ acpi_handle handle;
+ acpi_status status;
+
++ /*
++ * If the scope is unknown, the _DSD-equivalent package being parsed
++ * was embedded in an outer _DSD-equivalent package as a result of
++ * direct evaluation of an object pointed to by a reference. In that
++ * case, using a pathname as the target object pointer is invalid.
++ */
+ if (!scope)
+ return false;
+
+@@ -169,7 +162,17 @@ static bool acpi_nondev_subnode_ok(acpi_handle scope,
+ if (ACPI_FAILURE(status))
+ return false;
+
+- return acpi_nondev_subnode_data_ok(handle, link, list, parent);
++ status = acpi_evaluate_object_typed(handle, NULL, NULL, &buf,
++ ACPI_TYPE_PACKAGE);
++ if (ACPI_FAILURE(status))
++ return false;
++
++ if (acpi_nondev_subnode_extract(buf.pointer, handle, link, list,
++ parent))
++ return true;
++
++ ACPI_FREE(buf.pointer);
++ return false;
+ }
+
+ static bool acpi_add_nondev_subnodes(acpi_handle scope,
+@@ -180,9 +183,12 @@ static bool acpi_add_nondev_subnodes(acpi_handle scope,
+ bool ret = false;
+ int i;
+
++ /*
++ * Every element in the links package is expected to represent a link
++ * to a non-device node in a tree containing device-specific data.
++ */
+ for (i = 0; i < links->package.count; i++) {
+ union acpi_object *link, *desc;
+- acpi_handle handle;
+ bool result;
+
+ link = &links->package.elements[i];
+@@ -190,26 +196,53 @@ static bool acpi_add_nondev_subnodes(acpi_handle scope,
+ if (link->package.count != 2)
+ continue;
+
+- /* The first one must be a string. */
++ /* The first one (the key) must be a string. */
+ if (link->package.elements[0].type != ACPI_TYPE_STRING)
+ continue;
+
+- /* The second one may be a string, a reference or a package. */
++ /* The second one (the target) may be a string or a package. */
+ switch (link->package.elements[1].type) {
+ case ACPI_TYPE_STRING:
++ /*
++ * The string is expected to be a full pathname or a
++ * pathname segment relative to the given scope. That
++ * pathname is expected to point to an object returning
++ * a package that contains _DSD-equivalent information.
++ */
+ result = acpi_nondev_subnode_ok(scope, link, list,
+ parent);
+ break;
+- case ACPI_TYPE_LOCAL_REFERENCE:
+- handle = link->package.elements[1].reference.handle;
+- result = acpi_nondev_subnode_data_ok(handle, link, list,
+- parent);
+- break;
+ case ACPI_TYPE_PACKAGE:
++ /*
++ * This happens when a reference is used in AML to
++ * point to the target. Since the target is expected
++ * to be a named object, a reference to it will cause it
++			 * to be evaluated in place and its return package will
++ * be embedded in the links package at the location of
++ * the reference.
++ *
++ * The target package is expected to contain _DSD-
++ * equivalent information, but the scope in which it
++ * is located in the original AML is unknown. Thus
++ * it cannot contain pathname segments represented as
++ * strings because there is no way to build full
++ * pathnames out of them.
++ */
++ acpi_handle_debug(scope, "subnode %s: Unknown scope\n",
++ link->package.elements[0].string.pointer);
+ desc = &link->package.elements[1];
+ result = acpi_nondev_subnode_extract(desc, NULL, link,
+ list, parent);
+ break;
++ case ACPI_TYPE_LOCAL_REFERENCE:
++ /*
++ * It is not expected to see any local references in
++ * the links package because referencing a named object
++ * should cause it to be evaluated in place.
++ */
++ acpi_handle_info(scope, "subnode %s: Unexpected reference\n",
++ link->package.elements[0].string.pointer);
++ fallthrough;
+ default:
+ result = false;
+ break;
+@@ -369,6 +402,9 @@ static void acpi_untie_nondev_subnodes(struct acpi_device_data *data)
+ struct acpi_data_node *dn;
+
+ list_for_each_entry(dn, &data->subnodes, sibling) {
++ if (!dn->handle)
++ continue;
++
+ acpi_detach_data(dn->handle, acpi_nondev_subnode_tag);
+
+ acpi_untie_nondev_subnodes(&dn->data);
+@@ -383,6 +419,9 @@ static bool acpi_tie_nondev_subnodes(struct acpi_device_data *data)
+ acpi_status status;
+ bool ret;
+
++ if (!dn->handle)
++ continue;
++
+ status = acpi_attach_data(dn->handle, acpi_nondev_subnode_tag, dn);
+ if (ACPI_FAILURE(status) && status != AE_ALREADY_EXISTS) {
+ acpi_handle_err(dn->handle, "Can't tag data node\n");
+diff --git a/drivers/base/base.h b/drivers/base/base.h
+index 123031a757d916..86fa7fbb354891 100644
+--- a/drivers/base/base.h
++++ b/drivers/base/base.h
+@@ -248,9 +248,18 @@ void device_links_driver_cleanup(struct device *dev);
+ void device_links_no_driver(struct device *dev);
+ bool device_links_busy(struct device *dev);
+ void device_links_unbind_consumers(struct device *dev);
++bool device_link_flag_is_sync_state_only(u32 flags);
+ void fw_devlink_drivers_done(void);
+ void fw_devlink_probing_done(void);
+
++#define dev_for_each_link_to_supplier(__link, __dev) \
++ list_for_each_entry_srcu(__link, &(__dev)->links.suppliers, c_node, \
++ device_links_read_lock_held())
++
++#define dev_for_each_link_to_consumer(__link, __dev) \
++ list_for_each_entry_srcu(__link, &(__dev)->links.consumers, s_node, \
++ device_links_read_lock_held())
++
+ /* device pm support */
+ void device_pm_move_to_tail(struct device *dev);
+
+diff --git a/drivers/base/core.c b/drivers/base/core.c
+index d22d6b23e75898..a54ec6df1058fb 100644
+--- a/drivers/base/core.c
++++ b/drivers/base/core.c
+@@ -287,7 +287,7 @@ static bool device_is_ancestor(struct device *dev, struct device *target)
+ #define DL_MARKER_FLAGS (DL_FLAG_INFERRED | \
+ DL_FLAG_CYCLE | \
+ DL_FLAG_MANAGED)
+-static inline bool device_link_flag_is_sync_state_only(u32 flags)
++bool device_link_flag_is_sync_state_only(u32 flags)
+ {
+ return (flags & ~DL_MARKER_FLAGS) == DL_FLAG_SYNC_STATE_ONLY;
+ }
+diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
+index c883b01ffbddc4..e83503bdc1fdb8 100644
+--- a/drivers/base/power/main.c
++++ b/drivers/base/power/main.c
+@@ -40,10 +40,6 @@
+
+ typedef int (*pm_callback_t)(struct device *);
+
+-#define list_for_each_entry_rcu_locked(pos, head, member) \
+- list_for_each_entry_rcu(pos, head, member, \
+- device_links_read_lock_held())
+-
+ /*
+ * The entries in the dpm_list list are in a depth first order, simply
+ * because children are guaranteed to be discovered after parents, and
+@@ -281,8 +277,9 @@ static void dpm_wait_for_suppliers(struct device *dev, bool async)
+ * callbacks freeing the link objects for the links in the list we're
+ * walking.
+ */
+- list_for_each_entry_rcu_locked(link, &dev->links.suppliers, c_node)
+- if (READ_ONCE(link->status) != DL_STATE_DORMANT)
++ dev_for_each_link_to_supplier(link, dev)
++ if (READ_ONCE(link->status) != DL_STATE_DORMANT &&
++ !device_link_flag_is_sync_state_only(link->flags))
+ dpm_wait(link->supplier, async);
+
+ device_links_read_unlock(idx);
+@@ -338,8 +335,9 @@ static void dpm_wait_for_consumers(struct device *dev, bool async)
+ * continue instead of trying to continue in parallel with its
+ * unregistration).
+ */
+- list_for_each_entry_rcu_locked(link, &dev->links.consumers, s_node)
+- if (READ_ONCE(link->status) != DL_STATE_DORMANT)
++ dev_for_each_link_to_consumer(link, dev)
++ if (READ_ONCE(link->status) != DL_STATE_DORMANT &&
++ !device_link_flag_is_sync_state_only(link->flags))
+ dpm_wait(link->consumer, async);
+
+ device_links_read_unlock(idx);
+@@ -675,7 +673,7 @@ static void dpm_async_resume_subordinate(struct device *dev, async_func_t func)
+ idx = device_links_read_lock();
+
+ /* Start processing the device's "async" consumers. */
+- list_for_each_entry_rcu_locked(link, &dev->links.consumers, s_node)
++ dev_for_each_link_to_consumer(link, dev)
+ if (READ_ONCE(link->status) != DL_STATE_DORMANT)
+ dpm_async_with_cleanup(link->consumer, func);
+
+@@ -1342,7 +1340,7 @@ static void dpm_async_suspend_superior(struct device *dev, async_func_t func)
+ idx = device_links_read_lock();
+
+ /* Start processing the device's "async" suppliers. */
+- list_for_each_entry_rcu_locked(link, &dev->links.suppliers, c_node)
++ dev_for_each_link_to_supplier(link, dev)
+ if (READ_ONCE(link->status) != DL_STATE_DORMANT)
+ dpm_async_with_cleanup(link->supplier, func);
+
+@@ -1396,7 +1394,7 @@ static void dpm_superior_set_must_resume(struct device *dev)
+
+ idx = device_links_read_lock();
+
+- list_for_each_entry_rcu_locked(link, &dev->links.suppliers, c_node)
++ dev_for_each_link_to_supplier(link, dev)
+ link->supplier->power.must_resume = true;
+
+ device_links_read_unlock(idx);
+@@ -1825,7 +1823,7 @@ static void dpm_clear_superiors_direct_complete(struct device *dev)
+
+ idx = device_links_read_lock();
+
+- list_for_each_entry_rcu_locked(link, &dev->links.suppliers, c_node) {
++ dev_for_each_link_to_supplier(link, dev) {
+ spin_lock_irq(&link->supplier->power.lock);
+ link->supplier->power.direct_complete = false;
+ spin_unlock_irq(&link->supplier->power.lock);
+@@ -2077,7 +2075,7 @@ static bool device_prepare_smart_suspend(struct device *dev)
+
+ idx = device_links_read_lock();
+
+- list_for_each_entry_rcu_locked(link, &dev->links.suppliers, c_node) {
++ dev_for_each_link_to_supplier(link, dev) {
+ if (!device_link_test(link, DL_FLAG_PM_RUNTIME))
+ continue;
+
+diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
+index 3e84dc4122defe..7420b9851fe0fd 100644
+--- a/drivers/base/power/runtime.c
++++ b/drivers/base/power/runtime.c
+@@ -1903,8 +1903,7 @@ void pm_runtime_get_suppliers(struct device *dev)
+
+ idx = device_links_read_lock();
+
+- list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
+- device_links_read_lock_held())
++ dev_for_each_link_to_supplier(link, dev)
+ if (device_link_test(link, DL_FLAG_PM_RUNTIME)) {
+ link->supplier_preactivated = true;
+ pm_runtime_get_sync(link->supplier);
+diff --git a/drivers/block/loop.c b/drivers/block/loop.c
+index 053a086d547ec6..94ec7f747f3672 100644
+--- a/drivers/block/loop.c
++++ b/drivers/block/loop.c
+@@ -551,8 +551,10 @@ static int loop_change_fd(struct loop_device *lo, struct block_device *bdev,
+ return -EBADF;
+
+ error = loop_check_backing_file(file);
+- if (error)
++ if (error) {
++ fput(file);
+ return error;
++ }
+
+ /* suppress uevents while reconfiguring the device */
+ dev_set_uevent_suppress(disk_to_dev(lo->lo_disk), 1);
+@@ -993,8 +995,10 @@ static int loop_configure(struct loop_device *lo, blk_mode_t mode,
+ return -EBADF;
+
+ error = loop_check_backing_file(file);
+- if (error)
++ if (error) {
++ fput(file);
+ return error;
++ }
+
+ is_loop = is_loop_device(file);
+
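Both loop.c hunks above add the fput() that was missing when loop_check_backing_file() rejects the file, so the reference taken on the backing file is not leaked on the error return. A minimal userspace analogue of the pattern, with open()/close() standing in for fget()/fput():

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

static int check_backing_file(int fd)
{
	(void)fd;
	return -1;		/* pretend validation always fails */
}

static int configure(const char *path)
{
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;

	if (check_backing_file(fd)) {
		close(fd);	/* release what was acquired above */
		return -1;
	}

	close(fd);		/* the normal path releases it too, later */
	return 0;
}

int main(void)
{
	return configure("/dev/null") ? 1 : 0;
}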
+diff --git a/drivers/bus/mhi/ep/main.c b/drivers/bus/mhi/ep/main.c
+index b3eafcf2a2c50d..cdea24e9291959 100644
+--- a/drivers/bus/mhi/ep/main.c
++++ b/drivers/bus/mhi/ep/main.c
+@@ -403,17 +403,13 @@ static int mhi_ep_read_channel(struct mhi_ep_cntrl *mhi_cntrl,
+ {
+ struct mhi_ep_chan *mhi_chan = &mhi_cntrl->mhi_chan[ring->ch_id];
+ struct device *dev = &mhi_cntrl->mhi_dev->dev;
+- size_t tr_len, read_offset, write_offset;
++ size_t tr_len, read_offset;
+ struct mhi_ep_buf_info buf_info = {};
+ u32 len = MHI_EP_DEFAULT_MTU;
+ struct mhi_ring_element *el;
+- bool tr_done = false;
+ void *buf_addr;
+- u32 buf_left;
+ int ret;
+
+- buf_left = len;
+-
+ do {
+ /* Don't process the transfer ring if the channel is not in RUNNING state */
+ if (mhi_chan->state != MHI_CH_STATE_RUNNING) {
+@@ -426,24 +422,23 @@ static int mhi_ep_read_channel(struct mhi_ep_cntrl *mhi_cntrl,
+ /* Check if there is data pending to be read from previous read operation */
+ if (mhi_chan->tre_bytes_left) {
+ dev_dbg(dev, "TRE bytes remaining: %u\n", mhi_chan->tre_bytes_left);
+- tr_len = min(buf_left, mhi_chan->tre_bytes_left);
++ tr_len = min(len, mhi_chan->tre_bytes_left);
+ } else {
+ mhi_chan->tre_loc = MHI_TRE_DATA_GET_PTR(el);
+ mhi_chan->tre_size = MHI_TRE_DATA_GET_LEN(el);
+ mhi_chan->tre_bytes_left = mhi_chan->tre_size;
+
+- tr_len = min(buf_left, mhi_chan->tre_size);
++ tr_len = min(len, mhi_chan->tre_size);
+ }
+
+ read_offset = mhi_chan->tre_size - mhi_chan->tre_bytes_left;
+- write_offset = len - buf_left;
+
+ buf_addr = kmem_cache_zalloc(mhi_cntrl->tre_buf_cache, GFP_KERNEL);
+ if (!buf_addr)
+ return -ENOMEM;
+
+ buf_info.host_addr = mhi_chan->tre_loc + read_offset;
+- buf_info.dev_addr = buf_addr + write_offset;
++ buf_info.dev_addr = buf_addr;
+ buf_info.size = tr_len;
+ buf_info.cb = mhi_ep_read_completion;
+ buf_info.cb_buf = buf_addr;
+@@ -459,16 +454,12 @@ static int mhi_ep_read_channel(struct mhi_ep_cntrl *mhi_cntrl,
+ goto err_free_buf_addr;
+ }
+
+- buf_left -= tr_len;
+ mhi_chan->tre_bytes_left -= tr_len;
+
+- if (!mhi_chan->tre_bytes_left) {
+- if (MHI_TRE_DATA_GET_IEOT(el))
+- tr_done = true;
+-
++ if (!mhi_chan->tre_bytes_left)
+ mhi_chan->rd_offset = (mhi_chan->rd_offset + 1) % ring->ring_size;
+- }
+- } while (buf_left && !tr_done);
++	/* Keep reading until the transfer ring becomes empty */
++ } while (!mhi_ep_queue_is_empty(mhi_chan->mhi_dev, DMA_TO_DEVICE));
+
+ return 0;
+
+@@ -502,15 +493,11 @@ static int mhi_ep_process_ch_ring(struct mhi_ep_ring *ring)
+ mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
+ } else {
+ /* UL channel */
+- do {
+- ret = mhi_ep_read_channel(mhi_cntrl, ring);
+- if (ret < 0) {
+- dev_err(&mhi_chan->mhi_dev->dev, "Failed to read channel\n");
+- return ret;
+- }
+-
+- /* Read until the ring becomes empty */
+- } while (!mhi_ep_queue_is_empty(mhi_chan->mhi_dev, DMA_TO_DEVICE));
++ ret = mhi_ep_read_channel(mhi_cntrl, ring);
++ if (ret < 0) {
++ dev_err(&mhi_chan->mhi_dev->dev, "Failed to read channel\n");
++ return ret;
++ }
+ }
+
+ return 0;
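The reworked mhi_ep_read_channel() above copies each transfer ring element in chunks of at most the MTU, tracking tre_bytes_left, deriving the read offset from it, and only advancing rd_offset once the element is fully consumed. A standalone sketch of just that chunking arithmetic (MTU shrunk to 8 bytes so the steps are visible):

#include <stdio.h>
#include <string.h>

#define MTU 8			/* stand-in for MHI_EP_DEFAULT_MTU */

int main(void)
{
	const char tre_data[] = "0123456789abcdefghij";	/* one element's payload */
	size_t tre_size = sizeof(tre_data) - 1;
	size_t bytes_left = tre_size;

	while (bytes_left) {
		size_t read_offset = tre_size - bytes_left;
		size_t chunk = bytes_left < MTU ? bytes_left : MTU;
		char buf[MTU + 1];

		memcpy(buf, tre_data + read_offset, chunk);
		buf[chunk] = '\0';
		printf("chunk at offset %2zu: \"%s\"\n", read_offset, buf);

		bytes_left -= chunk;
	}
	return 0;
}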
+diff --git a/drivers/bus/mhi/host/init.c b/drivers/bus/mhi/host/init.c
+index 7f72aab38ce92f..099be8dd190078 100644
+--- a/drivers/bus/mhi/host/init.c
++++ b/drivers/bus/mhi/host/init.c
+@@ -194,7 +194,6 @@ static void mhi_deinit_free_irq(struct mhi_controller *mhi_cntrl)
+ static int mhi_init_irq_setup(struct mhi_controller *mhi_cntrl)
+ {
+ struct mhi_event *mhi_event = mhi_cntrl->mhi_event;
+- struct device *dev = &mhi_cntrl->mhi_dev->dev;
+ unsigned long irq_flags = IRQF_SHARED | IRQF_NO_SUSPEND;
+ int i, ret;
+
+@@ -221,7 +220,7 @@ static int mhi_init_irq_setup(struct mhi_controller *mhi_cntrl)
+ continue;
+
+ if (mhi_event->irq >= mhi_cntrl->nr_irqs) {
+- dev_err(dev, "irq %d not available for event ring\n",
++ dev_err(mhi_cntrl->cntrl_dev, "irq %d not available for event ring\n",
+ mhi_event->irq);
+ ret = -EINVAL;
+ goto error_request;
+@@ -232,7 +231,7 @@ static int mhi_init_irq_setup(struct mhi_controller *mhi_cntrl)
+ irq_flags,
+ "mhi", mhi_event);
+ if (ret) {
+- dev_err(dev, "Error requesting irq:%d for ev:%d\n",
++ dev_err(mhi_cntrl->cntrl_dev, "Error requesting irq:%d for ev:%d\n",
+ mhi_cntrl->irq[mhi_event->irq], i);
+ goto error_request;
+ }
+diff --git a/drivers/cdx/cdx_msi.c b/drivers/cdx/cdx_msi.c
+index 3388a5d1462c74..91b95422b2634e 100644
+--- a/drivers/cdx/cdx_msi.c
++++ b/drivers/cdx/cdx_msi.c
+@@ -174,6 +174,7 @@ struct irq_domain *cdx_msi_domain_init(struct device *dev)
+ }
+
+ parent = irq_find_matching_fwnode(of_fwnode_handle(parent_node), DOMAIN_BUS_NEXUS);
++ of_node_put(parent_node);
+ if (!parent || !msi_get_domain_info(parent)) {
+ dev_err(dev, "unable to locate ITS domain\n");
+ return NULL;
+diff --git a/drivers/char/ipmi/ipmi_kcs_sm.c b/drivers/char/ipmi/ipmi_kcs_sm.c
+index ecfcb50302f6ce..efda90dcf5b3d0 100644
+--- a/drivers/char/ipmi/ipmi_kcs_sm.c
++++ b/drivers/char/ipmi/ipmi_kcs_sm.c
+@@ -122,10 +122,10 @@ struct si_sm_data {
+ unsigned long error0_timeout;
+ };
+
+-static unsigned int init_kcs_data_with_state(struct si_sm_data *kcs,
+- struct si_sm_io *io, enum kcs_states state)
++static unsigned int init_kcs_data(struct si_sm_data *kcs,
++ struct si_sm_io *io)
+ {
+- kcs->state = state;
++ kcs->state = KCS_IDLE;
+ kcs->io = io;
+ kcs->write_pos = 0;
+ kcs->write_count = 0;
+@@ -140,12 +140,6 @@ static unsigned int init_kcs_data_with_state(struct si_sm_data *kcs,
+ return 2;
+ }
+
+-static unsigned int init_kcs_data(struct si_sm_data *kcs,
+- struct si_sm_io *io)
+-{
+- return init_kcs_data_with_state(kcs, io, KCS_IDLE);
+-}
+-
+ static inline unsigned char read_status(struct si_sm_data *kcs)
+ {
+ return kcs->io->inputb(kcs->io, 1);
+@@ -276,7 +270,7 @@ static int start_kcs_transaction(struct si_sm_data *kcs, unsigned char *data,
+ if (size > MAX_KCS_WRITE_SIZE)
+ return IPMI_REQ_LEN_EXCEEDED_ERR;
+
+- if (kcs->state != KCS_IDLE) {
++ if ((kcs->state != KCS_IDLE) && (kcs->state != KCS_HOSED)) {
+ dev_warn(kcs->io->dev, "KCS in invalid state %d\n", kcs->state);
+ return IPMI_NOT_IN_MY_STATE_ERR;
+ }
+@@ -501,7 +495,7 @@ static enum si_sm_result kcs_event(struct si_sm_data *kcs, long time)
+ }
+
+ if (kcs->state == KCS_HOSED) {
+- init_kcs_data_with_state(kcs, kcs->io, KCS_ERROR0);
++ init_kcs_data(kcs, kcs->io);
+ return SI_SM_HOSED;
+ }
+
+diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
+index 8e9050f99e9eff..fa52414eccdaad 100644
+--- a/drivers/char/ipmi/ipmi_msghandler.c
++++ b/drivers/char/ipmi/ipmi_msghandler.c
+@@ -38,7 +38,9 @@
+
+ #define IPMI_DRIVER_VERSION "39.2"
+
+-static struct ipmi_recv_msg *ipmi_alloc_recv_msg(void);
++static struct ipmi_recv_msg *ipmi_alloc_recv_msg(struct ipmi_user *user);
++static void ipmi_set_recv_msg_user(struct ipmi_recv_msg *msg,
++ struct ipmi_user *user);
+ static int ipmi_init_msghandler(void);
+ static void smi_work(struct work_struct *t);
+ static void handle_new_recv_msgs(struct ipmi_smi *intf);
+@@ -464,7 +466,7 @@ struct ipmi_smi {
+ * interface to match them up with their responses. A routine
+ * is called periodically to time the items in this list.
+ */
+- spinlock_t seq_lock;
++ struct mutex seq_lock;
+ struct seq_table seq_table[IPMI_IPMB_NUM_SEQ];
+ int curr_seq;
+
+@@ -955,7 +957,6 @@ static int deliver_response(struct ipmi_smi *intf, struct ipmi_recv_msg *msg)
+ * risk. At this moment, simply skip it in that case.
+ */
+ ipmi_free_recv_msg(msg);
+- atomic_dec(&msg->user->nr_msgs);
+ } else {
+ /*
+ * Deliver it in smi_work. The message will hold a
+@@ -1116,12 +1117,11 @@ static int intf_find_seq(struct ipmi_smi *intf,
+ struct ipmi_recv_msg **recv_msg)
+ {
+ int rv = -ENODEV;
+- unsigned long flags;
+
+ if (seq >= IPMI_IPMB_NUM_SEQ)
+ return -EINVAL;
+
+- spin_lock_irqsave(&intf->seq_lock, flags);
++ mutex_lock(&intf->seq_lock);
+ if (intf->seq_table[seq].inuse) {
+ struct ipmi_recv_msg *msg = intf->seq_table[seq].recv_msg;
+
+@@ -1134,7 +1134,7 @@ static int intf_find_seq(struct ipmi_smi *intf,
+ rv = 0;
+ }
+ }
+- spin_unlock_irqrestore(&intf->seq_lock, flags);
++ mutex_unlock(&intf->seq_lock);
+
+ return rv;
+ }
+@@ -1145,14 +1145,13 @@ static int intf_start_seq_timer(struct ipmi_smi *intf,
+ long msgid)
+ {
+ int rv = -ENODEV;
+- unsigned long flags;
+ unsigned char seq;
+ unsigned long seqid;
+
+
+ GET_SEQ_FROM_MSGID(msgid, seq, seqid);
+
+- spin_lock_irqsave(&intf->seq_lock, flags);
++ mutex_lock(&intf->seq_lock);
+ /*
+ * We do this verification because the user can be deleted
+ * while a message is outstanding.
+@@ -1163,7 +1162,7 @@ static int intf_start_seq_timer(struct ipmi_smi *intf,
+ ent->timeout = ent->orig_timeout;
+ rv = 0;
+ }
+- spin_unlock_irqrestore(&intf->seq_lock, flags);
++ mutex_unlock(&intf->seq_lock);
+
+ return rv;
+ }
+@@ -1174,7 +1173,6 @@ static int intf_err_seq(struct ipmi_smi *intf,
+ unsigned int err)
+ {
+ int rv = -ENODEV;
+- unsigned long flags;
+ unsigned char seq;
+ unsigned long seqid;
+ struct ipmi_recv_msg *msg = NULL;
+@@ -1182,7 +1180,7 @@ static int intf_err_seq(struct ipmi_smi *intf,
+
+ GET_SEQ_FROM_MSGID(msgid, seq, seqid);
+
+- spin_lock_irqsave(&intf->seq_lock, flags);
++ mutex_lock(&intf->seq_lock);
+ /*
+ * We do this verification because the user can be deleted
+ * while a message is outstanding.
+@@ -1196,7 +1194,7 @@ static int intf_err_seq(struct ipmi_smi *intf,
+ msg = ent->recv_msg;
+ rv = 0;
+ }
+- spin_unlock_irqrestore(&intf->seq_lock, flags);
++ mutex_unlock(&intf->seq_lock);
+
+ if (msg)
+ deliver_err_response(intf, msg, err);
+@@ -1209,7 +1207,6 @@ int ipmi_create_user(unsigned int if_num,
+ void *handler_data,
+ struct ipmi_user **user)
+ {
+- unsigned long flags;
+ struct ipmi_user *new_user = NULL;
+ int rv = 0;
+ struct ipmi_smi *intf;
+@@ -1277,9 +1274,9 @@ int ipmi_create_user(unsigned int if_num,
+ new_user->gets_events = false;
+
+ mutex_lock(&intf->users_mutex);
+- spin_lock_irqsave(&intf->seq_lock, flags);
++ mutex_lock(&intf->seq_lock);
+ list_add(&new_user->link, &intf->users);
+- spin_unlock_irqrestore(&intf->seq_lock, flags);
++ mutex_unlock(&intf->seq_lock);
+ mutex_unlock(&intf->users_mutex);
+
+ if (handler->ipmi_watchdog_pretimeout)
+@@ -1325,7 +1322,6 @@ static void _ipmi_destroy_user(struct ipmi_user *user)
+ {
+ struct ipmi_smi *intf = user->intf;
+ int i;
+- unsigned long flags;
+ struct cmd_rcvr *rcvr;
+ struct cmd_rcvr *rcvrs = NULL;
+ struct ipmi_recv_msg *msg, *msg2;
+@@ -1346,7 +1342,7 @@ static void _ipmi_destroy_user(struct ipmi_user *user)
+ list_del(&user->link);
+ atomic_dec(&intf->nr_users);
+
+- spin_lock_irqsave(&intf->seq_lock, flags);
++ mutex_lock(&intf->seq_lock);
+ for (i = 0; i < IPMI_IPMB_NUM_SEQ; i++) {
+ if (intf->seq_table[i].inuse
+ && (intf->seq_table[i].recv_msg->user == user)) {
+@@ -1355,7 +1351,7 @@ static void _ipmi_destroy_user(struct ipmi_user *user)
+ ipmi_free_recv_msg(intf->seq_table[i].recv_msg);
+ }
+ }
+- spin_unlock_irqrestore(&intf->seq_lock, flags);
++ mutex_unlock(&intf->seq_lock);
+
+ /*
+ * Remove the user from the command receiver's table. First
+@@ -1616,8 +1612,7 @@ int ipmi_set_gets_events(struct ipmi_user *user, bool val)
+ }
+
+ list_for_each_entry_safe(msg, msg2, &msgs, link) {
+- msg->user = user;
+- kref_get(&user->refcount);
++ ipmi_set_recv_msg_user(msg, user);
+ deliver_local_response(intf, msg);
+ }
+ }
+@@ -2026,10 +2021,7 @@ static int i_ipmi_req_ipmb(struct ipmi_smi *intf,
+ */
+ smi_msg->user_data = recv_msg;
+ } else {
+- /* It's a command, so get a sequence for it. */
+- unsigned long flags;
+-
+- spin_lock_irqsave(&intf->seq_lock, flags);
++ mutex_lock(&intf->seq_lock);
+
+ if (is_maintenance_mode_cmd(msg))
+ intf->ipmb_maintenance_mode_timeout =
+@@ -2087,7 +2079,7 @@ static int i_ipmi_req_ipmb(struct ipmi_smi *intf,
+ * to be correct.
+ */
+ out_err:
+- spin_unlock_irqrestore(&intf->seq_lock, flags);
++ mutex_unlock(&intf->seq_lock);
+ }
+
+ return rv;
+@@ -2205,10 +2197,7 @@ static int i_ipmi_req_lan(struct ipmi_smi *intf,
+ */
+ smi_msg->user_data = recv_msg;
+ } else {
+- /* It's a command, so get a sequence for it. */
+- unsigned long flags;
+-
+- spin_lock_irqsave(&intf->seq_lock, flags);
++ mutex_lock(&intf->seq_lock);
+
+ /*
+ * Create a sequence number with a 1 second
+@@ -2257,7 +2246,7 @@ static int i_ipmi_req_lan(struct ipmi_smi *intf,
+ * to be correct.
+ */
+ out_err:
+- spin_unlock_irqrestore(&intf->seq_lock, flags);
++ mutex_unlock(&intf->seq_lock);
+ }
+
+ return rv;
+@@ -2288,22 +2277,18 @@ static int i_ipmi_request(struct ipmi_user *user,
+ int run_to_completion = READ_ONCE(intf->run_to_completion);
+ int rv = 0;
+
+- if (user) {
+- if (atomic_add_return(1, &user->nr_msgs) > max_msgs_per_user) {
+- /* Decrement will happen at the end of the routine. */
+- rv = -EBUSY;
+- goto out;
+- }
+- }
+-
+- if (supplied_recv)
++ if (supplied_recv) {
+ recv_msg = supplied_recv;
+- else {
+- recv_msg = ipmi_alloc_recv_msg();
+- if (recv_msg == NULL) {
+- rv = -ENOMEM;
+- goto out;
++ recv_msg->user = user;
++ if (user) {
++ atomic_inc(&user->nr_msgs);
++ /* The put happens when the message is freed. */
++ kref_get(&user->refcount);
+ }
++ } else {
++ recv_msg = ipmi_alloc_recv_msg(user);
++ if (IS_ERR(recv_msg))
++ return PTR_ERR(recv_msg);
+ }
+ recv_msg->user_msg_data = user_msg_data;
+
+@@ -2314,8 +2299,7 @@ static int i_ipmi_request(struct ipmi_user *user,
+ if (smi_msg == NULL) {
+ if (!supplied_recv)
+ ipmi_free_recv_msg(recv_msg);
+- rv = -ENOMEM;
+- goto out;
++ return -ENOMEM;
+ }
+ }
+
+@@ -2326,10 +2310,6 @@ static int i_ipmi_request(struct ipmi_user *user,
+ goto out_err;
+ }
+
+- recv_msg->user = user;
+- if (user)
+- /* The put happens when the message is freed. */
+- kref_get(&user->refcount);
+ recv_msg->msgid = msgid;
+ /*
+ * Store the message to send in the receive message so timeout
+@@ -2358,8 +2338,10 @@ static int i_ipmi_request(struct ipmi_user *user,
+
+ if (rv) {
+ out_err:
+- ipmi_free_smi_msg(smi_msg);
+- ipmi_free_recv_msg(recv_msg);
++ if (!supplied_smi)
++ ipmi_free_smi_msg(smi_msg);
++ if (!supplied_recv)
++ ipmi_free_recv_msg(recv_msg);
+ } else {
+ dev_dbg(intf->si_dev, "Send: %*ph\n",
+ smi_msg->data_size, smi_msg->data);
+@@ -2369,9 +2351,6 @@ static int i_ipmi_request(struct ipmi_user *user,
+ if (!run_to_completion)
+ mutex_unlock(&intf->users_mutex);
+
+-out:
+- if (rv && user)
+- atomic_dec(&user->nr_msgs);
+ return rv;
+ }
+
+@@ -3575,7 +3554,7 @@ int ipmi_add_smi(struct module *owner,
+ atomic_set(&intf->nr_users, 0);
+ intf->handlers = handlers;
+ intf->send_info = send_info;
+- spin_lock_init(&intf->seq_lock);
++ mutex_init(&intf->seq_lock);
+ for (j = 0; j < IPMI_IPMB_NUM_SEQ; j++) {
+ intf->seq_table[j].inuse = 0;
+ intf->seq_table[j].seqid = 0;
+@@ -3862,7 +3841,7 @@ static int handle_ipmb_get_msg_cmd(struct ipmi_smi *intf,
+ unsigned char chan;
+ struct ipmi_user *user = NULL;
+ struct ipmi_ipmb_addr *ipmb_addr;
+- struct ipmi_recv_msg *recv_msg;
++ struct ipmi_recv_msg *recv_msg = NULL;
+
+ if (msg->rsp_size < 10) {
+ /* Message not big enough, just ignore it. */
+@@ -3883,9 +3862,8 @@ static int handle_ipmb_get_msg_cmd(struct ipmi_smi *intf,
+ rcvr = find_cmd_rcvr(intf, netfn, cmd, chan);
+ if (rcvr) {
+ user = rcvr->user;
+- kref_get(&user->refcount);
+- } else
+- user = NULL;
++ recv_msg = ipmi_alloc_recv_msg(user);
++ }
+ rcu_read_unlock();
+
+ if (user == NULL) {
+@@ -3915,47 +3893,41 @@ static int handle_ipmb_get_msg_cmd(struct ipmi_smi *intf,
+ * causes it to not be freed or queued.
+ */
+ rv = -1;
+- } else {
+- recv_msg = ipmi_alloc_recv_msg();
+- if (!recv_msg) {
+- /*
+- * We couldn't allocate memory for the
+- * message, so requeue it for handling
+- * later.
+- */
+- rv = 1;
+- kref_put(&user->refcount, free_ipmi_user);
+- } else {
+- /* Extract the source address from the data. */
+- ipmb_addr = (struct ipmi_ipmb_addr *) &recv_msg->addr;
+- ipmb_addr->addr_type = IPMI_IPMB_ADDR_TYPE;
+- ipmb_addr->slave_addr = msg->rsp[6];
+- ipmb_addr->lun = msg->rsp[7] & 3;
+- ipmb_addr->channel = msg->rsp[3] & 0xf;
++ } else if (!IS_ERR(recv_msg)) {
++ /* Extract the source address from the data. */
++ ipmb_addr = (struct ipmi_ipmb_addr *) &recv_msg->addr;
++ ipmb_addr->addr_type = IPMI_IPMB_ADDR_TYPE;
++ ipmb_addr->slave_addr = msg->rsp[6];
++ ipmb_addr->lun = msg->rsp[7] & 3;
++ ipmb_addr->channel = msg->rsp[3] & 0xf;
+
+- /*
+- * Extract the rest of the message information
+- * from the IPMB header.
+- */
+- recv_msg->user = user;
+- recv_msg->recv_type = IPMI_CMD_RECV_TYPE;
+- recv_msg->msgid = msg->rsp[7] >> 2;
+- recv_msg->msg.netfn = msg->rsp[4] >> 2;
+- recv_msg->msg.cmd = msg->rsp[8];
+- recv_msg->msg.data = recv_msg->msg_data;
++ /*
++ * Extract the rest of the message information
++ * from the IPMB header.
++ */
++ recv_msg->recv_type = IPMI_CMD_RECV_TYPE;
++ recv_msg->msgid = msg->rsp[7] >> 2;
++ recv_msg->msg.netfn = msg->rsp[4] >> 2;
++ recv_msg->msg.cmd = msg->rsp[8];
++ recv_msg->msg.data = recv_msg->msg_data;
+
+- /*
+- * We chop off 10, not 9 bytes because the checksum
+- * at the end also needs to be removed.
+- */
+- recv_msg->msg.data_len = msg->rsp_size - 10;
+- memcpy(recv_msg->msg_data, &msg->rsp[9],
+- msg->rsp_size - 10);
+- if (deliver_response(intf, recv_msg))
+- ipmi_inc_stat(intf, unhandled_commands);
+- else
+- ipmi_inc_stat(intf, handled_commands);
+- }
++ /*
++ * We chop off 10, not 9 bytes because the checksum
++ * at the end also needs to be removed.
++ */
++ recv_msg->msg.data_len = msg->rsp_size - 10;
++ memcpy(recv_msg->msg_data, &msg->rsp[9],
++ msg->rsp_size - 10);
++ if (deliver_response(intf, recv_msg))
++ ipmi_inc_stat(intf, unhandled_commands);
++ else
++ ipmi_inc_stat(intf, handled_commands);
++ } else {
++ /*
++ * We couldn't allocate memory for the message, so
++ * requeue it for handling later.
++ */
++ rv = 1;
+ }
+
+ return rv;
+@@ -3968,7 +3940,7 @@ static int handle_ipmb_direct_rcv_cmd(struct ipmi_smi *intf,
+ int rv = 0;
+ struct ipmi_user *user = NULL;
+ struct ipmi_ipmb_direct_addr *daddr;
+- struct ipmi_recv_msg *recv_msg;
++ struct ipmi_recv_msg *recv_msg = NULL;
+ unsigned char netfn = msg->rsp[0] >> 2;
+ unsigned char cmd = msg->rsp[3];
+
+@@ -3977,9 +3949,8 @@ static int handle_ipmb_direct_rcv_cmd(struct ipmi_smi *intf,
+ rcvr = find_cmd_rcvr(intf, netfn, cmd, 0);
+ if (rcvr) {
+ user = rcvr->user;
+- kref_get(&user->refcount);
+- } else
+- user = NULL;
++ recv_msg = ipmi_alloc_recv_msg(user);
++ }
+ rcu_read_unlock();
+
+ if (user == NULL) {
+@@ -4001,44 +3972,38 @@ static int handle_ipmb_direct_rcv_cmd(struct ipmi_smi *intf,
+ * causes it to not be freed or queued.
+ */
+ rv = -1;
+- } else {
+- recv_msg = ipmi_alloc_recv_msg();
+- if (!recv_msg) {
+- /*
+- * We couldn't allocate memory for the
+- * message, so requeue it for handling
+- * later.
+- */
+- rv = 1;
+- kref_put(&user->refcount, free_ipmi_user);
+- } else {
+- /* Extract the source address from the data. */
+- daddr = (struct ipmi_ipmb_direct_addr *)&recv_msg->addr;
+- daddr->addr_type = IPMI_IPMB_DIRECT_ADDR_TYPE;
+- daddr->channel = 0;
+- daddr->slave_addr = msg->rsp[1];
+- daddr->rs_lun = msg->rsp[0] & 3;
+- daddr->rq_lun = msg->rsp[2] & 3;
++ } else if (!IS_ERR(recv_msg)) {
++ /* Extract the source address from the data. */
++ daddr = (struct ipmi_ipmb_direct_addr *)&recv_msg->addr;
++ daddr->addr_type = IPMI_IPMB_DIRECT_ADDR_TYPE;
++ daddr->channel = 0;
++ daddr->slave_addr = msg->rsp[1];
++ daddr->rs_lun = msg->rsp[0] & 3;
++ daddr->rq_lun = msg->rsp[2] & 3;
+
+- /*
+- * Extract the rest of the message information
+- * from the IPMB header.
+- */
+- recv_msg->user = user;
+- recv_msg->recv_type = IPMI_CMD_RECV_TYPE;
+- recv_msg->msgid = (msg->rsp[2] >> 2);
+- recv_msg->msg.netfn = msg->rsp[0] >> 2;
+- recv_msg->msg.cmd = msg->rsp[3];
+- recv_msg->msg.data = recv_msg->msg_data;
+-
+- recv_msg->msg.data_len = msg->rsp_size - 4;
+- memcpy(recv_msg->msg_data, msg->rsp + 4,
+- msg->rsp_size - 4);
+- if (deliver_response(intf, recv_msg))
+- ipmi_inc_stat(intf, unhandled_commands);
+- else
+- ipmi_inc_stat(intf, handled_commands);
+- }
++ /*
++ * Extract the rest of the message information
++ * from the IPMB header.
++ */
++ recv_msg->recv_type = IPMI_CMD_RECV_TYPE;
++ recv_msg->msgid = (msg->rsp[2] >> 2);
++ recv_msg->msg.netfn = msg->rsp[0] >> 2;
++ recv_msg->msg.cmd = msg->rsp[3];
++ recv_msg->msg.data = recv_msg->msg_data;
++
++ recv_msg->msg.data_len = msg->rsp_size - 4;
++ memcpy(recv_msg->msg_data, msg->rsp + 4,
++ msg->rsp_size - 4);
++ if (deliver_response(intf, recv_msg))
++ ipmi_inc_stat(intf, unhandled_commands);
++ else
++ ipmi_inc_stat(intf, handled_commands);
++ } else {
++ /*
++ * We couldn't allocate memory for the message, so
++ * requeue it for handling later.
++ */
++ rv = 1;
+ }
+
+ return rv;
+@@ -4152,7 +4117,7 @@ static int handle_lan_get_msg_cmd(struct ipmi_smi *intf,
+ unsigned char chan;
+ struct ipmi_user *user = NULL;
+ struct ipmi_lan_addr *lan_addr;
+- struct ipmi_recv_msg *recv_msg;
++ struct ipmi_recv_msg *recv_msg = NULL;
+
+ if (msg->rsp_size < 12) {
+ /* Message not big enough, just ignore it. */
+@@ -4173,9 +4138,8 @@ static int handle_lan_get_msg_cmd(struct ipmi_smi *intf,
+ rcvr = find_cmd_rcvr(intf, netfn, cmd, chan);
+ if (rcvr) {
+ user = rcvr->user;
+- kref_get(&user->refcount);
+- } else
+- user = NULL;
++ recv_msg = ipmi_alloc_recv_msg(user);
++ }
+ rcu_read_unlock();
+
+ if (user == NULL) {
+@@ -4206,49 +4170,44 @@ static int handle_lan_get_msg_cmd(struct ipmi_smi *intf,
+ * causes it to not be freed or queued.
+ */
+ rv = -1;
+- } else {
+- recv_msg = ipmi_alloc_recv_msg();
+- if (!recv_msg) {
+- /*
+- * We couldn't allocate memory for the
+- * message, so requeue it for handling later.
+- */
+- rv = 1;
+- kref_put(&user->refcount, free_ipmi_user);
+- } else {
+- /* Extract the source address from the data. */
+- lan_addr = (struct ipmi_lan_addr *) &recv_msg->addr;
+- lan_addr->addr_type = IPMI_LAN_ADDR_TYPE;
+- lan_addr->session_handle = msg->rsp[4];
+- lan_addr->remote_SWID = msg->rsp[8];
+- lan_addr->local_SWID = msg->rsp[5];
+- lan_addr->lun = msg->rsp[9] & 3;
+- lan_addr->channel = msg->rsp[3] & 0xf;
+- lan_addr->privilege = msg->rsp[3] >> 4;
++ } else if (!IS_ERR(recv_msg)) {
++ /* Extract the source address from the data. */
++ lan_addr = (struct ipmi_lan_addr *) &recv_msg->addr;
++ lan_addr->addr_type = IPMI_LAN_ADDR_TYPE;
++ lan_addr->session_handle = msg->rsp[4];
++ lan_addr->remote_SWID = msg->rsp[8];
++ lan_addr->local_SWID = msg->rsp[5];
++ lan_addr->lun = msg->rsp[9] & 3;
++ lan_addr->channel = msg->rsp[3] & 0xf;
++ lan_addr->privilege = msg->rsp[3] >> 4;
+
+- /*
+- * Extract the rest of the message information
+- * from the IPMB header.
+- */
+- recv_msg->user = user;
+- recv_msg->recv_type = IPMI_CMD_RECV_TYPE;
+- recv_msg->msgid = msg->rsp[9] >> 2;
+- recv_msg->msg.netfn = msg->rsp[6] >> 2;
+- recv_msg->msg.cmd = msg->rsp[10];
+- recv_msg->msg.data = recv_msg->msg_data;
++ /*
++ * Extract the rest of the message information
++ * from the IPMB header.
++ */
++ recv_msg->recv_type = IPMI_CMD_RECV_TYPE;
++ recv_msg->msgid = msg->rsp[9] >> 2;
++ recv_msg->msg.netfn = msg->rsp[6] >> 2;
++ recv_msg->msg.cmd = msg->rsp[10];
++ recv_msg->msg.data = recv_msg->msg_data;
+
+- /*
+- * We chop off 12, not 11 bytes because the checksum
+- * at the end also needs to be removed.
+- */
+- recv_msg->msg.data_len = msg->rsp_size - 12;
+- memcpy(recv_msg->msg_data, &msg->rsp[11],
+- msg->rsp_size - 12);
+- if (deliver_response(intf, recv_msg))
+- ipmi_inc_stat(intf, unhandled_commands);
+- else
+- ipmi_inc_stat(intf, handled_commands);
+- }
++ /*
++ * We chop off 12, not 11 bytes because the checksum
++ * at the end also needs to be removed.
++ */
++ recv_msg->msg.data_len = msg->rsp_size - 12;
++ memcpy(recv_msg->msg_data, &msg->rsp[11],
++ msg->rsp_size - 12);
++ if (deliver_response(intf, recv_msg))
++ ipmi_inc_stat(intf, unhandled_commands);
++ else
++ ipmi_inc_stat(intf, handled_commands);
++ } else {
++ /*
++ * We couldn't allocate memory for the message, so
++ * requeue it for handling later.
++ */
++ rv = 1;
+ }
+
+ return rv;
+@@ -4270,7 +4229,7 @@ static int handle_oem_get_msg_cmd(struct ipmi_smi *intf,
+ unsigned char chan;
+ struct ipmi_user *user = NULL;
+ struct ipmi_system_interface_addr *smi_addr;
+- struct ipmi_recv_msg *recv_msg;
++ struct ipmi_recv_msg *recv_msg = NULL;
+
+ /*
+ * We expect the OEM SW to perform error checking
+@@ -4299,9 +4258,8 @@ static int handle_oem_get_msg_cmd(struct ipmi_smi *intf,
+ rcvr = find_cmd_rcvr(intf, netfn, cmd, chan);
+ if (rcvr) {
+ user = rcvr->user;
+- kref_get(&user->refcount);
+- } else
+- user = NULL;
++ recv_msg = ipmi_alloc_recv_msg(user);
++ }
+ rcu_read_unlock();
+
+ if (user == NULL) {
+@@ -4314,48 +4272,42 @@ static int handle_oem_get_msg_cmd(struct ipmi_smi *intf,
+ */
+
+ rv = 0;
+- } else {
+- recv_msg = ipmi_alloc_recv_msg();
+- if (!recv_msg) {
+- /*
+- * We couldn't allocate memory for the
+- * message, so requeue it for handling
+- * later.
+- */
+- rv = 1;
+- kref_put(&user->refcount, free_ipmi_user);
+- } else {
+- /*
+- * OEM Messages are expected to be delivered via
+- * the system interface to SMS software. We might
+- * need to visit this again depending on OEM
+- * requirements
+- */
+- smi_addr = ((struct ipmi_system_interface_addr *)
+- &recv_msg->addr);
+- smi_addr->addr_type = IPMI_SYSTEM_INTERFACE_ADDR_TYPE;
+- smi_addr->channel = IPMI_BMC_CHANNEL;
+- smi_addr->lun = msg->rsp[0] & 3;
+-
+- recv_msg->user = user;
+- recv_msg->user_msg_data = NULL;
+- recv_msg->recv_type = IPMI_OEM_RECV_TYPE;
+- recv_msg->msg.netfn = msg->rsp[0] >> 2;
+- recv_msg->msg.cmd = msg->rsp[1];
+- recv_msg->msg.data = recv_msg->msg_data;
++ } else if (!IS_ERR(recv_msg)) {
++ /*
++ * OEM Messages are expected to be delivered via
++ * the system interface to SMS software. We might
++ * need to visit this again depending on OEM
++ * requirements
++ */
++ smi_addr = ((struct ipmi_system_interface_addr *)
++ &recv_msg->addr);
++ smi_addr->addr_type = IPMI_SYSTEM_INTERFACE_ADDR_TYPE;
++ smi_addr->channel = IPMI_BMC_CHANNEL;
++ smi_addr->lun = msg->rsp[0] & 3;
++
++ recv_msg->user_msg_data = NULL;
++ recv_msg->recv_type = IPMI_OEM_RECV_TYPE;
++ recv_msg->msg.netfn = msg->rsp[0] >> 2;
++ recv_msg->msg.cmd = msg->rsp[1];
++ recv_msg->msg.data = recv_msg->msg_data;
+
+- /*
+- * The message starts at byte 4 which follows the
+- * Channel Byte in the "GET MESSAGE" command
+- */
+- recv_msg->msg.data_len = msg->rsp_size - 4;
+- memcpy(recv_msg->msg_data, &msg->rsp[4],
+- msg->rsp_size - 4);
+- if (deliver_response(intf, recv_msg))
+- ipmi_inc_stat(intf, unhandled_commands);
+- else
+- ipmi_inc_stat(intf, handled_commands);
+- }
++ /*
++ * The message starts at byte 4 which follows the
++ * Channel Byte in the "GET MESSAGE" command
++ */
++ recv_msg->msg.data_len = msg->rsp_size - 4;
++ memcpy(recv_msg->msg_data, &msg->rsp[4],
++ msg->rsp_size - 4);
++ if (deliver_response(intf, recv_msg))
++ ipmi_inc_stat(intf, unhandled_commands);
++ else
++ ipmi_inc_stat(intf, handled_commands);
++ } else {
++ /*
++ * We couldn't allocate memory for the message, so
++ * requeue it for handling later.
++ */
++ rv = 1;
+ }
+
+ return rv;
+@@ -4413,8 +4365,8 @@ static int handle_read_event_rsp(struct ipmi_smi *intf,
+ if (!user->gets_events)
+ continue;
+
+- recv_msg = ipmi_alloc_recv_msg();
+- if (!recv_msg) {
++ recv_msg = ipmi_alloc_recv_msg(user);
++ if (IS_ERR(recv_msg)) {
+ mutex_unlock(&intf->users_mutex);
+ list_for_each_entry_safe(recv_msg, recv_msg2, &msgs,
+ link) {
+@@ -4435,8 +4387,6 @@ static int handle_read_event_rsp(struct ipmi_smi *intf,
+ deliver_count++;
+
+ copy_event_into_recv_msg(recv_msg, msg);
+- recv_msg->user = user;
+- kref_get(&user->refcount);
+ list_add_tail(&recv_msg->link, &msgs);
+ }
+ mutex_unlock(&intf->users_mutex);
+@@ -4452,8 +4402,8 @@ static int handle_read_event_rsp(struct ipmi_smi *intf,
+ * No one to receive the message, put it in queue if there's
+ * not already too many things in the queue.
+ */
+- recv_msg = ipmi_alloc_recv_msg();
+- if (!recv_msg) {
++ recv_msg = ipmi_alloc_recv_msg(NULL);
++ if (IS_ERR(recv_msg)) {
+ /*
+ * We couldn't allocate memory for the
+ * message, so requeue it for handling
+@@ -4529,9 +4479,10 @@ static int handle_one_recv_msg(struct ipmi_smi *intf,
+
+ if (msg->rsp_size < 2) {
+ /* Message is too small to be correct. */
+- dev_warn(intf->si_dev,
+- "BMC returned too small a message for netfn %x cmd %x, got %d bytes\n",
+- (msg->data[0] >> 2) | 1, msg->data[1], msg->rsp_size);
++ dev_warn_ratelimited(intf->si_dev,
++ "BMC returned too small a message for netfn %x cmd %x, got %d bytes\n",
++ (msg->data[0] >> 2) | 1,
++ msg->data[1], msg->rsp_size);
+
+ return_unspecified:
+ /* Generate an error response for the message. */
+@@ -4868,12 +4819,10 @@ static void smi_work(struct work_struct *t)
+
+ list_del(&msg->link);
+
+- if (refcount_read(&user->destroyed) == 0) {
++ if (refcount_read(&user->destroyed) == 0)
+ ipmi_free_recv_msg(msg);
+- } else {
+- atomic_dec(&user->nr_msgs);
++ else
+ user->handler->ipmi_recv_hndl(msg, user->handler_data);
+- }
+ }
+ mutex_unlock(&intf->user_msgs_mutex);
+
+@@ -4951,8 +4900,7 @@ smi_from_recv_msg(struct ipmi_smi *intf, struct ipmi_recv_msg *recv_msg,
+ static void check_msg_timeout(struct ipmi_smi *intf, struct seq_table *ent,
+ struct list_head *timeouts,
+ unsigned long timeout_period,
+- int slot, unsigned long *flags,
+- bool *need_timer)
++ int slot, bool *need_timer)
+ {
+ struct ipmi_recv_msg *msg;
+
+@@ -5004,7 +4952,7 @@ static void check_msg_timeout(struct ipmi_smi *intf, struct seq_table *ent,
+ return;
+ }
+
+- spin_unlock_irqrestore(&intf->seq_lock, *flags);
++ mutex_unlock(&intf->seq_lock);
+
+ /*
+ * Send the new message. We send with a zero
+@@ -5025,7 +4973,7 @@ static void check_msg_timeout(struct ipmi_smi *intf, struct seq_table *ent,
+ } else
+ ipmi_free_smi_msg(smi_msg);
+
+- spin_lock_irqsave(&intf->seq_lock, *flags);
++ mutex_lock(&intf->seq_lock);
+ }
+ }
+
+@@ -5052,7 +5000,7 @@ static bool ipmi_timeout_handler(struct ipmi_smi *intf,
+ * list.
+ */
+ INIT_LIST_HEAD(&timeouts);
+- spin_lock_irqsave(&intf->seq_lock, flags);
++ mutex_lock(&intf->seq_lock);
+ if (intf->ipmb_maintenance_mode_timeout) {
+ if (intf->ipmb_maintenance_mode_timeout <= timeout_period)
+ intf->ipmb_maintenance_mode_timeout = 0;
+@@ -5062,8 +5010,8 @@ static bool ipmi_timeout_handler(struct ipmi_smi *intf,
+ for (i = 0; i < IPMI_IPMB_NUM_SEQ; i++)
+ check_msg_timeout(intf, &intf->seq_table[i],
+ &timeouts, timeout_period, i,
+- &flags, &need_timer);
+- spin_unlock_irqrestore(&intf->seq_lock, flags);
++ &need_timer);
++ mutex_unlock(&intf->seq_lock);
+
+ list_for_each_entry_safe(msg, msg2, &timeouts, link)
+ deliver_err_response(intf, msg, IPMI_TIMEOUT_COMPLETION_CODE);
+@@ -5190,27 +5138,51 @@ static void free_recv_msg(struct ipmi_recv_msg *msg)
+ kfree(msg);
+ }
+
+-static struct ipmi_recv_msg *ipmi_alloc_recv_msg(void)
++static struct ipmi_recv_msg *ipmi_alloc_recv_msg(struct ipmi_user *user)
+ {
+ struct ipmi_recv_msg *rv;
+
++ if (user) {
++ if (atomic_add_return(1, &user->nr_msgs) > max_msgs_per_user) {
++ atomic_dec(&user->nr_msgs);
++ return ERR_PTR(-EBUSY);
++ }
++ }
++
+ rv = kmalloc(sizeof(struct ipmi_recv_msg), GFP_ATOMIC);
+- if (rv) {
+- rv->user = NULL;
+- rv->done = free_recv_msg;
+- atomic_inc(&recv_msg_inuse_count);
++ if (!rv) {
++ if (user)
++ atomic_dec(&user->nr_msgs);
++ return ERR_PTR(-ENOMEM);
+ }
++
++ rv->user = user;
++ rv->done = free_recv_msg;
++ if (user)
++ kref_get(&user->refcount);
++ atomic_inc(&recv_msg_inuse_count);
+ return rv;
+ }
+
+ void ipmi_free_recv_msg(struct ipmi_recv_msg *msg)
+ {
+- if (msg->user && !oops_in_progress)
++ if (msg->user && !oops_in_progress) {
++ atomic_dec(&msg->user->nr_msgs);
+ kref_put(&msg->user->refcount, free_ipmi_user);
++ }
+ msg->done(msg);
+ }
+ EXPORT_SYMBOL(ipmi_free_recv_msg);
+
++static void ipmi_set_recv_msg_user(struct ipmi_recv_msg *msg,
++ struct ipmi_user *user)
++{
++ WARN_ON_ONCE(msg->user); /* User should not be set. */
++ msg->user = user;
++ atomic_inc(&user->nr_msgs);
++ kref_get(&user->refcount);
++}
++
+ static atomic_t panic_done_count = ATOMIC_INIT(0);
+
+ static void dummy_smi_done_handler(struct ipmi_smi_msg *msg)
+diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
+index 4b12c4b9da8bef..8954a8660ffc5a 100644
+--- a/drivers/char/tpm/tpm_tis_core.c
++++ b/drivers/char/tpm/tpm_tis_core.c
+@@ -978,8 +978,8 @@ static int tpm_tis_probe_irq_single(struct tpm_chip *chip, u32 intmask,
+ * will call disable_irq which undoes all of the above.
+ */
+ if (!(chip->flags & TPM_CHIP_FLAG_IRQ)) {
+- tpm_tis_write8(priv, original_int_vec,
+- TPM_INT_VECTOR(priv->locality));
++ tpm_tis_write8(priv, TPM_INT_VECTOR(priv->locality),
++ original_int_vec);
+ rc = -1;
+ }
+
+diff --git a/drivers/clk/Kconfig b/drivers/clk/Kconfig
+index 4d56475f94fc1e..b1425aed659387 100644
+--- a/drivers/clk/Kconfig
++++ b/drivers/clk/Kconfig
+@@ -364,6 +364,7 @@ config COMMON_CLK_LOCHNAGAR
+ config COMMON_CLK_NPCM8XX
+ tristate "Clock driver for the NPCM8XX SoC Family"
+ depends on ARCH_NPCM || COMPILE_TEST
++ select AUXILIARY_BUS
+ help
+ This driver supports the clocks on the Nuvoton BMC NPCM8XX SoC Family,
+ all the clocks are initialized by the bootloader, so this driver
+diff --git a/drivers/clk/at91/clk-peripheral.c b/drivers/clk/at91/clk-peripheral.c
+index c173a44c800aa8..629f050a855aae 100644
+--- a/drivers/clk/at91/clk-peripheral.c
++++ b/drivers/clk/at91/clk-peripheral.c
+@@ -279,8 +279,11 @@ static int clk_sam9x5_peripheral_determine_rate(struct clk_hw *hw,
+ long best_diff = LONG_MIN;
+ u32 shift;
+
+- if (periph->id < PERIPHERAL_ID_MIN || !periph->range.max)
+- return parent_rate;
++ if (periph->id < PERIPHERAL_ID_MIN || !periph->range.max) {
++ req->rate = parent_rate;
++
++ return 0;
++ }
+
+ /* Fist step: check the available dividers. */
+ for (shift = 0; shift <= PERIPHERAL_MAX_SHIFT; shift++) {
+diff --git a/drivers/clk/mediatek/clk-mt8195-infra_ao.c b/drivers/clk/mediatek/clk-mt8195-infra_ao.c
+index bb648a88e43afd..ad47fdb2346075 100644
+--- a/drivers/clk/mediatek/clk-mt8195-infra_ao.c
++++ b/drivers/clk/mediatek/clk-mt8195-infra_ao.c
+@@ -103,7 +103,7 @@ static const struct mtk_gate infra_ao_clks[] = {
+ GATE_INFRA_AO0(CLK_INFRA_AO_CQ_DMA_FPC, "infra_ao_cq_dma_fpc", "fpc", 28),
+ GATE_INFRA_AO0(CLK_INFRA_AO_UART5, "infra_ao_uart5", "top_uart", 29),
+ /* INFRA_AO1 */
+- GATE_INFRA_AO1(CLK_INFRA_AO_HDMI_26M, "infra_ao_hdmi_26m", "clk26m", 0),
++ GATE_INFRA_AO1(CLK_INFRA_AO_HDMI_26M, "infra_ao_hdmi_26m", "top_hdmi_xtal", 0),
+ GATE_INFRA_AO1(CLK_INFRA_AO_SPI0, "infra_ao_spi0", "top_spi", 1),
+ GATE_INFRA_AO1(CLK_INFRA_AO_MSDC0, "infra_ao_msdc0", "top_msdc50_0_hclk", 2),
+ GATE_INFRA_AO1(CLK_INFRA_AO_MSDC1, "infra_ao_msdc1", "top_axi", 4),
+diff --git a/drivers/clk/mediatek/clk-mux.c b/drivers/clk/mediatek/clk-mux.c
+index 60990296450bbb..9a12e58230bed8 100644
+--- a/drivers/clk/mediatek/clk-mux.c
++++ b/drivers/clk/mediatek/clk-mux.c
+@@ -146,9 +146,7 @@ static int mtk_clk_mux_set_parent_setclr_lock(struct clk_hw *hw, u8 index)
+ static int mtk_clk_mux_determine_rate(struct clk_hw *hw,
+ struct clk_rate_request *req)
+ {
+- struct mtk_clk_mux *mux = to_mtk_clk_mux(hw);
+-
+- return clk_mux_determine_rate_flags(hw, req, mux->data->flags);
++ return clk_mux_determine_rate_flags(hw, req, 0);
+ }
+
+ const struct clk_ops mtk_mux_clr_set_upd_ops = {
+diff --git a/drivers/clk/nxp/clk-lpc18xx-cgu.c b/drivers/clk/nxp/clk-lpc18xx-cgu.c
+index 81efa885069b2a..b9e204d63a9722 100644
+--- a/drivers/clk/nxp/clk-lpc18xx-cgu.c
++++ b/drivers/clk/nxp/clk-lpc18xx-cgu.c
+@@ -370,23 +370,25 @@ static unsigned long lpc18xx_pll0_recalc_rate(struct clk_hw *hw,
+ return 0;
+ }
+
+-static long lpc18xx_pll0_round_rate(struct clk_hw *hw, unsigned long rate,
+- unsigned long *prate)
++static int lpc18xx_pll0_determine_rate(struct clk_hw *hw,
++ struct clk_rate_request *req)
+ {
+ unsigned long m;
+
+- if (*prate < rate) {
++ if (req->best_parent_rate < req->rate) {
+ pr_warn("%s: pll dividers not supported\n", __func__);
+ return -EINVAL;
+ }
+
+- m = DIV_ROUND_UP_ULL(*prate, rate * 2);
+- if (m <= 0 && m > LPC18XX_PLL0_MSEL_MAX) {
+- pr_warn("%s: unable to support rate %lu\n", __func__, rate);
++ m = DIV_ROUND_UP_ULL(req->best_parent_rate, req->rate * 2);
++ if (m == 0 || m > LPC18XX_PLL0_MSEL_MAX) {
++ pr_warn("%s: unable to support rate %lu\n", __func__, req->rate);
+ return -EINVAL;
+ }
+
+- return 2 * *prate * m;
++ req->rate = 2 * req->best_parent_rate * m;
++
++ return 0;
+ }
+
+ static int lpc18xx_pll0_set_rate(struct clk_hw *hw, unsigned long rate,
+@@ -402,7 +404,7 @@ static int lpc18xx_pll0_set_rate(struct clk_hw *hw, unsigned long rate,
+ }
+
+ m = DIV_ROUND_UP_ULL(parent_rate, rate * 2);
+- if (m <= 0 && m > LPC18XX_PLL0_MSEL_MAX) {
++ if (m == 0 || m > LPC18XX_PLL0_MSEL_MAX) {
+ pr_warn("%s: unable to support rate %lu\n", __func__, rate);
+ return -EINVAL;
+ }
+@@ -443,7 +445,7 @@ static int lpc18xx_pll0_set_rate(struct clk_hw *hw, unsigned long rate,
+
+ static const struct clk_ops lpc18xx_pll0_ops = {
+ .recalc_rate = lpc18xx_pll0_recalc_rate,
+- .round_rate = lpc18xx_pll0_round_rate,
++ .determine_rate = lpc18xx_pll0_determine_rate,
+ .set_rate = lpc18xx_pll0_set_rate,
+ };
+
+diff --git a/drivers/clk/qcom/Kconfig b/drivers/clk/qcom/Kconfig
+index 6cb6cd3e1778ad..e721b23234ddd3 100644
+--- a/drivers/clk/qcom/Kconfig
++++ b/drivers/clk/qcom/Kconfig
+@@ -495,7 +495,7 @@ config QCM_DISPCC_2290
+
+ config QCS_DISPCC_615
+ tristate "QCS615 Display Clock Controller"
+- select QCM_GCC_615
++ select QCS_GCC_615
+ help
+ Support for the display clock controller on Qualcomm Technologies, Inc
+ QCS615 devices.
+diff --git a/drivers/clk/qcom/common.c b/drivers/clk/qcom/common.c
+index 37c3008e6c1be1..12159188677418 100644
+--- a/drivers/clk/qcom/common.c
++++ b/drivers/clk/qcom/common.c
+@@ -277,8 +277,8 @@ static int qcom_cc_icc_register(struct device *dev,
+ icd[i].slave_id = desc->icc_hws[i].slave_id;
+ hws = &desc->clks[desc->icc_hws[i].clk_id]->hw;
+ icd[i].clk = devm_clk_hw_get_clk(dev, hws, "icc");
+- if (!icd[i].clk)
+- return dev_err_probe(dev, -ENOENT,
++ if (IS_ERR(icd[i].clk))
++ return dev_err_probe(dev, PTR_ERR(icd[i].clk),
+ "(%d) clock entry is null\n", i);
+ icd[i].name = clk_hw_get_name(hws);
+ }
+diff --git a/drivers/clk/qcom/tcsrcc-x1e80100.c b/drivers/clk/qcom/tcsrcc-x1e80100.c
+index ff61769a08077e..a367e1f55622d9 100644
+--- a/drivers/clk/qcom/tcsrcc-x1e80100.c
++++ b/drivers/clk/qcom/tcsrcc-x1e80100.c
+@@ -29,6 +29,10 @@ static struct clk_branch tcsr_edp_clkref_en = {
+ .enable_mask = BIT(0),
+ .hw.init = &(const struct clk_init_data) {
+ .name = "tcsr_edp_clkref_en",
++ .parent_data = &(const struct clk_parent_data){
++ .index = DT_BI_TCXO_PAD,
++ },
++ .num_parents = 1,
+ .ops = &clk_branch2_ops,
+ },
+ },
+diff --git a/drivers/clk/renesas/r9a08g045-cpg.c b/drivers/clk/renesas/r9a08g045-cpg.c
+index ed0661997928b0..3b28edfabc34e4 100644
+--- a/drivers/clk/renesas/r9a08g045-cpg.c
++++ b/drivers/clk/renesas/r9a08g045-cpg.c
+@@ -284,7 +284,8 @@ static const struct rzg2l_mod_clk r9a08g045_mod_clks[] = {
+ MSTOP(BUS_MCPU2, BIT(5))),
+ DEF_MOD("scif5_clk_pck", R9A08G045_SCIF5_CLK_PCK, R9A08G045_CLK_P0, 0x584, 5,
+ MSTOP(BUS_MCPU3, BIT(4))),
+- DEF_MOD("gpio_hclk", R9A08G045_GPIO_HCLK, R9A08G045_OSCCLK, 0x598, 0, 0),
++ DEF_MOD("gpio_hclk", R9A08G045_GPIO_HCLK, R9A08G045_OSCCLK, 0x598, 0,
++ MSTOP(BUS_PERI_CPU, BIT(6))),
+ DEF_MOD("adc_adclk", R9A08G045_ADC_ADCLK, R9A08G045_CLK_TSU, 0x5a8, 0,
+ MSTOP(BUS_MCPU2, BIT(14))),
+ DEF_MOD("adc_pclk", R9A08G045_ADC_PCLK, R9A08G045_CLK_TSU, 0x5a8, 1,
+diff --git a/drivers/clk/renesas/renesas-cpg-mssr.c b/drivers/clk/renesas/renesas-cpg-mssr.c
+index 5ff6ee1f7d4b7d..de1cf7ba45b78b 100644
+--- a/drivers/clk/renesas/renesas-cpg-mssr.c
++++ b/drivers/clk/renesas/renesas-cpg-mssr.c
+@@ -1082,6 +1082,7 @@ static int __init cpg_mssr_reserved_init(struct cpg_mssr_priv *priv,
+
+ of_for_each_phandle(&it, rc, node, "clocks", "#clock-cells", -1) {
+ int idx;
++ unsigned int *new_ids;
+
+ if (it.node != priv->np)
+ continue;
+@@ -1092,11 +1093,13 @@ static int __init cpg_mssr_reserved_init(struct cpg_mssr_priv *priv,
+ if (args[0] != CPG_MOD)
+ continue;
+
+- ids = krealloc_array(ids, (num + 1), sizeof(*ids), GFP_KERNEL);
+- if (!ids) {
++ new_ids = krealloc_array(ids, (num + 1), sizeof(*ids), GFP_KERNEL);
++ if (!new_ids) {
+ of_node_put(it.node);
++ kfree(ids);
+ return -ENOMEM;
+ }
++ ids = new_ids;
+
+ if (priv->reg_layout == CLK_REG_LAYOUT_RZ_A)
+ idx = MOD_CLK_PACK_10(args[1]); /* for DEF_MOD_STB() */
+diff --git a/drivers/clk/samsung/clk-exynos990.c b/drivers/clk/samsung/clk-exynos990.c
+index 8d3f193d2b4d4c..8571c225d09074 100644
+--- a/drivers/clk/samsung/clk-exynos990.c
++++ b/drivers/clk/samsung/clk-exynos990.c
+@@ -239,12 +239,19 @@ static const unsigned long top_clk_regs[] __initconst = {
+ PLL_LOCKTIME_PLL_SHARED2,
+ PLL_LOCKTIME_PLL_SHARED3,
+ PLL_LOCKTIME_PLL_SHARED4,
++ PLL_CON0_PLL_G3D,
+ PLL_CON3_PLL_G3D,
++ PLL_CON0_PLL_MMC,
+ PLL_CON3_PLL_MMC,
++ PLL_CON0_PLL_SHARED0,
+ PLL_CON3_PLL_SHARED0,
++ PLL_CON0_PLL_SHARED1,
+ PLL_CON3_PLL_SHARED1,
++ PLL_CON0_PLL_SHARED2,
+ PLL_CON3_PLL_SHARED2,
++ PLL_CON0_PLL_SHARED3,
+ PLL_CON3_PLL_SHARED3,
++ PLL_CON0_PLL_SHARED4,
+ PLL_CON3_PLL_SHARED4,
+ CLK_CON_MUX_MUX_CLKCMU_APM_BUS,
+ CLK_CON_MUX_MUX_CLKCMU_AUD_CPU,
+@@ -689,13 +696,13 @@ PNAME(mout_cmu_vra_bus_p) = { "dout_cmu_shared0_div3",
+
+ static const struct samsung_mux_clock top_mux_clks[] __initconst = {
+ MUX(CLK_MOUT_PLL_SHARED0, "mout_pll_shared0", mout_pll_shared0_p,
+- PLL_CON3_PLL_SHARED0, 4, 1),
++ PLL_CON0_PLL_SHARED0, 4, 1),
+ MUX(CLK_MOUT_PLL_SHARED1, "mout_pll_shared1", mout_pll_shared1_p,
+- PLL_CON3_PLL_SHARED1, 4, 1),
++ PLL_CON0_PLL_SHARED1, 4, 1),
+ MUX(CLK_MOUT_PLL_SHARED2, "mout_pll_shared2", mout_pll_shared2_p,
+- PLL_CON3_PLL_SHARED2, 4, 1),
++ PLL_CON0_PLL_SHARED2, 4, 1),
+ MUX(CLK_MOUT_PLL_SHARED3, "mout_pll_shared3", mout_pll_shared3_p,
+- PLL_CON3_PLL_SHARED3, 4, 1),
++ PLL_CON0_PLL_SHARED3, 4, 1),
+ MUX(CLK_MOUT_PLL_SHARED4, "mout_pll_shared4", mout_pll_shared4_p,
+ PLL_CON0_PLL_SHARED4, 4, 1),
+ MUX(CLK_MOUT_PLL_MMC, "mout_pll_mmc", mout_pll_mmc_p,
+@@ -759,11 +766,11 @@ static const struct samsung_mux_clock top_mux_clks[] __initconst = {
+ MUX(CLK_MOUT_CMU_DPU_ALT, "mout_cmu_dpu_alt",
+ mout_cmu_dpu_alt_p, CLK_CON_MUX_MUX_CLKCMU_DPU_ALT, 0, 2),
+ MUX(CLK_MOUT_CMU_DSP_BUS, "mout_cmu_dsp_bus",
+- mout_cmu_dsp_bus_p, CLK_CON_MUX_MUX_CLKCMU_DSP_BUS, 0, 2),
++ mout_cmu_dsp_bus_p, CLK_CON_MUX_MUX_CLKCMU_DSP_BUS, 0, 3),
+ MUX(CLK_MOUT_CMU_G2D_G2D, "mout_cmu_g2d_g2d",
+ mout_cmu_g2d_g2d_p, CLK_CON_MUX_MUX_CLKCMU_G2D_G2D, 0, 2),
+ MUX(CLK_MOUT_CMU_G2D_MSCL, "mout_cmu_g2d_mscl",
+- mout_cmu_g2d_mscl_p, CLK_CON_MUX_MUX_CLKCMU_G2D_MSCL, 0, 1),
++ mout_cmu_g2d_mscl_p, CLK_CON_MUX_MUX_CLKCMU_G2D_MSCL, 0, 2),
+ MUX(CLK_MOUT_CMU_HPM, "mout_cmu_hpm",
+ mout_cmu_hpm_p, CLK_CON_MUX_MUX_CLKCMU_HPM, 0, 2),
+ MUX(CLK_MOUT_CMU_HSI0_BUS, "mout_cmu_hsi0_bus",
+@@ -775,7 +782,7 @@ static const struct samsung_mux_clock top_mux_clks[] __initconst = {
+ 0, 2),
+ MUX(CLK_MOUT_CMU_HSI0_USBDP_DEBUG, "mout_cmu_hsi0_usbdp_debug",
+ mout_cmu_hsi0_usbdp_debug_p,
+- CLK_CON_MUX_MUX_CLKCMU_HSI0_USBDP_DEBUG, 0, 2),
++ CLK_CON_MUX_MUX_CLKCMU_HSI0_USBDP_DEBUG, 0, 1),
+ MUX(CLK_MOUT_CMU_HSI1_BUS, "mout_cmu_hsi1_bus",
+ mout_cmu_hsi1_bus_p, CLK_CON_MUX_MUX_CLKCMU_HSI1_BUS, 0, 3),
+ MUX(CLK_MOUT_CMU_HSI1_MMC_CARD, "mout_cmu_hsi1_mmc_card",
+@@ -788,7 +795,7 @@ static const struct samsung_mux_clock top_mux_clks[] __initconst = {
+ 0, 2),
+ MUX(CLK_MOUT_CMU_HSI1_UFS_EMBD, "mout_cmu_hsi1_ufs_embd",
+ mout_cmu_hsi1_ufs_embd_p, CLK_CON_MUX_MUX_CLKCMU_HSI1_UFS_EMBD,
+- 0, 1),
++ 0, 2),
+ MUX(CLK_MOUT_CMU_HSI2_BUS, "mout_cmu_hsi2_bus",
+ mout_cmu_hsi2_bus_p, CLK_CON_MUX_MUX_CLKCMU_HSI2_BUS, 0, 1),
+ MUX(CLK_MOUT_CMU_HSI2_PCIE, "mout_cmu_hsi2_pcie",
+@@ -862,7 +869,7 @@ static const struct samsung_div_clock top_div_clks[] __initconst = {
+ CLK_CON_DIV_PLL_SHARED4_DIV4, 0, 1),
+
+ DIV(CLK_DOUT_CMU_APM_BUS, "dout_cmu_apm_bus", "gout_cmu_apm_bus",
+- CLK_CON_DIV_CLKCMU_APM_BUS, 0, 3),
++ CLK_CON_DIV_CLKCMU_APM_BUS, 0, 2),
+ DIV(CLK_DOUT_CMU_AUD_CPU, "dout_cmu_aud_cpu", "gout_cmu_aud_cpu",
+ CLK_CON_DIV_CLKCMU_AUD_CPU, 0, 3),
+ DIV(CLK_DOUT_CMU_BUS0_BUS, "dout_cmu_bus0_bus", "gout_cmu_bus0_bus",
+@@ -887,9 +894,9 @@ static const struct samsung_div_clock top_div_clks[] __initconst = {
+ CLK_CON_DIV_CLKCMU_CMU_BOOST, 0, 2),
+ DIV(CLK_DOUT_CMU_CORE_BUS, "dout_cmu_core_bus", "gout_cmu_core_bus",
+ CLK_CON_DIV_CLKCMU_CORE_BUS, 0, 4),
+- DIV(CLK_DOUT_CMU_CPUCL0_DBG_BUS, "dout_cmu_cpucl0_debug",
++ DIV(CLK_DOUT_CMU_CPUCL0_DBG_BUS, "dout_cmu_cpucl0_dbg_bus",
+ "gout_cmu_cpucl0_dbg_bus", CLK_CON_DIV_CLKCMU_CPUCL0_DBG_BUS,
+- 0, 3),
++ 0, 4),
+ DIV(CLK_DOUT_CMU_CPUCL0_SWITCH, "dout_cmu_cpucl0_switch",
+ "gout_cmu_cpucl0_switch", CLK_CON_DIV_CLKCMU_CPUCL0_SWITCH, 0, 3),
+ DIV(CLK_DOUT_CMU_CPUCL1_SWITCH, "dout_cmu_cpucl1_switch",
+@@ -924,16 +931,11 @@ static const struct samsung_div_clock top_div_clks[] __initconst = {
+ CLK_CON_DIV_CLKCMU_HSI0_DPGTC, 0, 3),
+ DIV(CLK_DOUT_CMU_HSI0_USB31DRD, "dout_cmu_hsi0_usb31drd",
+ "gout_cmu_hsi0_usb31drd", CLK_CON_DIV_CLKCMU_HSI0_USB31DRD, 0, 4),
+- DIV(CLK_DOUT_CMU_HSI0_USBDP_DEBUG, "dout_cmu_hsi0_usbdp_debug",
+- "gout_cmu_hsi0_usbdp_debug", CLK_CON_DIV_CLKCMU_HSI0_USBDP_DEBUG,
+- 0, 4),
+ DIV(CLK_DOUT_CMU_HSI1_BUS, "dout_cmu_hsi1_bus", "gout_cmu_hsi1_bus",
+ CLK_CON_DIV_CLKCMU_HSI1_BUS, 0, 3),
+ DIV(CLK_DOUT_CMU_HSI1_MMC_CARD, "dout_cmu_hsi1_mmc_card",
+ "gout_cmu_hsi1_mmc_card", CLK_CON_DIV_CLKCMU_HSI1_MMC_CARD,
+ 0, 9),
+- DIV(CLK_DOUT_CMU_HSI1_PCIE, "dout_cmu_hsi1_pcie", "gout_cmu_hsi1_pcie",
+- CLK_CON_DIV_CLKCMU_HSI1_PCIE, 0, 7),
+ DIV(CLK_DOUT_CMU_HSI1_UFS_CARD, "dout_cmu_hsi1_ufs_card",
+ "gout_cmu_hsi1_ufs_card", CLK_CON_DIV_CLKCMU_HSI1_UFS_CARD,
+ 0, 3),
+@@ -942,8 +944,6 @@ static const struct samsung_div_clock top_div_clks[] __initconst = {
+ 0, 3),
+ DIV(CLK_DOUT_CMU_HSI2_BUS, "dout_cmu_hsi2_bus", "gout_cmu_hsi2_bus",
+ CLK_CON_DIV_CLKCMU_HSI2_BUS, 0, 4),
+- DIV(CLK_DOUT_CMU_HSI2_PCIE, "dout_cmu_hsi2_pcie", "gout_cmu_hsi2_pcie",
+- CLK_CON_DIV_CLKCMU_HSI2_PCIE, 0, 7),
+ DIV(CLK_DOUT_CMU_IPP_BUS, "dout_cmu_ipp_bus", "gout_cmu_ipp_bus",
+ CLK_CON_DIV_CLKCMU_IPP_BUS, 0, 4),
+ DIV(CLK_DOUT_CMU_ITP_BUS, "dout_cmu_itp_bus", "gout_cmu_itp_bus",
+@@ -979,8 +979,18 @@ static const struct samsung_div_clock top_div_clks[] __initconst = {
+ CLK_CON_DIV_CLKCMU_TNR_BUS, 0, 4),
+ DIV(CLK_DOUT_CMU_VRA_BUS, "dout_cmu_vra_bus", "gout_cmu_vra_bus",
+ CLK_CON_DIV_CLKCMU_VRA_BUS, 0, 4),
+- DIV(CLK_DOUT_CMU_DPU, "dout_cmu_clkcmu_dpu", "gout_cmu_dpu",
+- CLK_CON_DIV_DIV_CLKCMU_DPU, 0, 4),
++ DIV(CLK_DOUT_CMU_DPU, "dout_cmu_dpu", "gout_cmu_dpu",
++ CLK_CON_DIV_DIV_CLKCMU_DPU, 0, 3),
++};
++
++static const struct samsung_fixed_factor_clock cmu_top_ffactor[] __initconst = {
++ FFACTOR(CLK_DOUT_CMU_HSI1_PCIE, "dout_cmu_hsi1_pcie",
++ "gout_cmu_hsi1_pcie", 1, 8, 0),
++ FFACTOR(CLK_DOUT_CMU_OTP, "dout_cmu_otp", "oscclk", 1, 8, 0),
++ FFACTOR(CLK_DOUT_CMU_HSI0_USBDP_DEBUG, "dout_cmu_hsi0_usbdp_debug",
++ "gout_cmu_hsi0_usbdp_debug", 1, 8, 0),
++ FFACTOR(CLK_DOUT_CMU_HSI2_PCIE, "dout_cmu_hsi2_pcie",
++ "gout_cmu_hsi2_pcie", 1, 8, 0),
+ };
+
+ static const struct samsung_gate_clock top_gate_clks[] __initconst = {
+@@ -1126,6 +1136,8 @@ static const struct samsung_cmu_info top_cmu_info __initconst = {
+ .nr_mux_clks = ARRAY_SIZE(top_mux_clks),
+ .div_clks = top_div_clks,
+ .nr_div_clks = ARRAY_SIZE(top_div_clks),
++ .fixed_factor_clks = cmu_top_ffactor,
++ .nr_fixed_factor_clks = ARRAY_SIZE(cmu_top_ffactor),
+ .gate_clks = top_gate_clks,
+ .nr_gate_clks = ARRAY_SIZE(top_gate_clks),
+ .nr_clk_ids = CLKS_NR_TOP,
+diff --git a/drivers/clk/tegra/clk-bpmp.c b/drivers/clk/tegra/clk-bpmp.c
+index b2323cb8eddcce..77a2586dbe000e 100644
+--- a/drivers/clk/tegra/clk-bpmp.c
++++ b/drivers/clk/tegra/clk-bpmp.c
+@@ -635,7 +635,7 @@ static int tegra_bpmp_register_clocks(struct tegra_bpmp *bpmp,
+
+ bpmp->num_clocks = count;
+
+- bpmp->clocks = devm_kcalloc(bpmp->dev, count, sizeof(struct tegra_bpmp_clk), GFP_KERNEL);
++ bpmp->clocks = devm_kcalloc(bpmp->dev, count, sizeof(*bpmp->clocks), GFP_KERNEL);
+ if (!bpmp->clocks)
+ return -ENOMEM;
+
+diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c
+index cf1bba58f641e9..ec52726fbea954 100644
+--- a/drivers/clk/thead/clk-th1520-ap.c
++++ b/drivers/clk/thead/clk-th1520-ap.c
+@@ -48,8 +48,9 @@ struct ccu_mux {
+ };
+
+ struct ccu_gate {
+- u32 enable;
+- struct ccu_common common;
++ int clkid;
++ u32 reg;
++ struct clk_gate gate;
+ };
+
+ struct ccu_div {
+@@ -87,12 +88,12 @@ struct ccu_pll {
+ 0), \
+ }
+
+-#define CCU_GATE(_clkid, _struct, _name, _parent, _reg, _gate, _flags) \
++#define CCU_GATE(_clkid, _struct, _name, _parent, _reg, _bit, _flags) \
+ struct ccu_gate _struct = { \
+- .enable = _gate, \
+- .common = { \
+- .clkid = _clkid, \
+- .cfg0 = _reg, \
++ .clkid = _clkid, \
++ .reg = _reg, \
++ .gate = { \
++ .bit_idx = _bit, \
+ .hw.init = CLK_HW_INIT_PARENTS_DATA( \
+ _name, \
+ _parent, \
+@@ -120,13 +121,6 @@ static inline struct ccu_div *hw_to_ccu_div(struct clk_hw *hw)
+ return container_of(common, struct ccu_div, common);
+ }
+
+-static inline struct ccu_gate *hw_to_ccu_gate(struct clk_hw *hw)
+-{
+- struct ccu_common *common = hw_to_ccu_common(hw);
+-
+- return container_of(common, struct ccu_gate, common);
+-}
+-
+ static u8 ccu_get_parent_helper(struct ccu_common *common,
+ struct ccu_internal *mux)
+ {
+@@ -767,6 +761,10 @@ static struct ccu_div dpu0_clk = {
+ },
+ };
+
++static const struct clk_parent_data dpu0_clk_pd[] = {
++ { .hw = &dpu0_clk.common.hw }
++};
++
+ static struct ccu_div dpu1_clk = {
+ .div = TH_CCU_DIV_FLAGS(0, 8, CLK_DIVIDER_ONE_BASED),
+ .common = {
+@@ -779,6 +777,10 @@ static struct ccu_div dpu1_clk = {
+ },
+ };
+
++static const struct clk_parent_data dpu1_clk_pd[] = {
++ { .hw = &dpu1_clk.common.hw }
++};
++
+ static CLK_FIXED_FACTOR_HW(emmc_sdio_ref_clk, "emmc-sdio-ref",
+ &video_pll_clk.common.hw, 4, 1, 0);
+
+@@ -786,128 +788,132 @@ static const struct clk_parent_data emmc_sdio_ref_clk_pd[] = {
+ { .hw = &emmc_sdio_ref_clk.hw },
+ };
+
+-static CCU_GATE(CLK_BROM, brom_clk, "brom", ahb2_cpusys_hclk_pd, 0x100, BIT(4), 0);
+-static CCU_GATE(CLK_BMU, bmu_clk, "bmu", axi4_cpusys2_aclk_pd, 0x100, BIT(5), 0);
++static CCU_GATE(CLK_BROM, brom_clk, "brom", ahb2_cpusys_hclk_pd, 0x100, 4, 0);
++static CCU_GATE(CLK_BMU, bmu_clk, "bmu", axi4_cpusys2_aclk_pd, 0x100, 5, 0);
+ static CCU_GATE(CLK_AON2CPU_A2X, aon2cpu_a2x_clk, "aon2cpu-a2x", axi4_cpusys2_aclk_pd,
+- 0x134, BIT(8), 0);
++ 0x134, 8, 0);
+ static CCU_GATE(CLK_X2X_CPUSYS, x2x_cpusys_clk, "x2x-cpusys", axi4_cpusys2_aclk_pd,
+- 0x134, BIT(7), 0);
++ 0x134, 7, 0);
+ static CCU_GATE(CLK_CPU2AON_X2H, cpu2aon_x2h_clk, "cpu2aon-x2h", axi_aclk_pd,
+- 0x138, BIT(8), CLK_IGNORE_UNUSED);
++ 0x138, 8, CLK_IGNORE_UNUSED);
+ static CCU_GATE(CLK_CPU2PERI_X2H, cpu2peri_x2h_clk, "cpu2peri-x2h", axi4_cpusys2_aclk_pd,
+- 0x140, BIT(9), CLK_IGNORE_UNUSED);
++ 0x140, 9, CLK_IGNORE_UNUSED);
+ static CCU_GATE(CLK_PERISYS_APB1_HCLK, perisys_apb1_hclk, "perisys-apb1-hclk", perisys_ahb_hclk_pd,
+- 0x150, BIT(9), CLK_IGNORE_UNUSED);
++ 0x150, 9, CLK_IGNORE_UNUSED);
+ static CCU_GATE(CLK_PERISYS_APB2_HCLK, perisys_apb2_hclk, "perisys-apb2-hclk", perisys_ahb_hclk_pd,
+- 0x150, BIT(10), CLK_IGNORE_UNUSED);
++ 0x150, 10, CLK_IGNORE_UNUSED);
+ static CCU_GATE(CLK_PERISYS_APB3_HCLK, perisys_apb3_hclk, "perisys-apb3-hclk", perisys_ahb_hclk_pd,
+- 0x150, BIT(11), CLK_IGNORE_UNUSED);
++ 0x150, 11, CLK_IGNORE_UNUSED);
+ static CCU_GATE(CLK_PERISYS_APB4_HCLK, perisys_apb4_hclk, "perisys-apb4-hclk", perisys_ahb_hclk_pd,
+- 0x150, BIT(12), 0);
+-static CCU_GATE(CLK_NPU_AXI, npu_axi_clk, "npu-axi", axi_aclk_pd, 0x1c8, BIT(5), 0);
+-static CCU_GATE(CLK_CPU2VP, cpu2vp_clk, "cpu2vp", axi_aclk_pd, 0x1e0, BIT(13), 0);
+-static CCU_GATE(CLK_EMMC_SDIO, emmc_sdio_clk, "emmc-sdio", emmc_sdio_ref_clk_pd, 0x204, BIT(30), 0);
+-static CCU_GATE(CLK_GMAC1, gmac1_clk, "gmac1", gmac_pll_clk_pd, 0x204, BIT(26), 0);
+-static CCU_GATE(CLK_PADCTRL1, padctrl1_clk, "padctrl1", perisys_apb_pclk_pd, 0x204, BIT(24), 0);
+-static CCU_GATE(CLK_DSMART, dsmart_clk, "dsmart", perisys_apb_pclk_pd, 0x204, BIT(23), 0);
+-static CCU_GATE(CLK_PADCTRL0, padctrl0_clk, "padctrl0", perisys_apb_pclk_pd, 0x204, BIT(22), 0);
+-static CCU_GATE(CLK_GMAC_AXI, gmac_axi_clk, "gmac-axi", axi4_cpusys2_aclk_pd, 0x204, BIT(21), 0);
+-static CCU_GATE(CLK_GPIO3, gpio3_clk, "gpio3-clk", peri2sys_apb_pclk_pd, 0x204, BIT(20), 0);
+-static CCU_GATE(CLK_GMAC0, gmac0_clk, "gmac0", gmac_pll_clk_pd, 0x204, BIT(19), 0);
+-static CCU_GATE(CLK_PWM, pwm_clk, "pwm", perisys_apb_pclk_pd, 0x204, BIT(18), 0);
+-static CCU_GATE(CLK_QSPI0, qspi0_clk, "qspi0", video_pll_clk_pd, 0x204, BIT(17), 0);
+-static CCU_GATE(CLK_QSPI1, qspi1_clk, "qspi1", video_pll_clk_pd, 0x204, BIT(16), 0);
+-static CCU_GATE(CLK_SPI, spi_clk, "spi", video_pll_clk_pd, 0x204, BIT(15), 0);
+-static CCU_GATE(CLK_UART0_PCLK, uart0_pclk, "uart0-pclk", perisys_apb_pclk_pd, 0x204, BIT(14), 0);
+-static CCU_GATE(CLK_UART1_PCLK, uart1_pclk, "uart1-pclk", perisys_apb_pclk_pd, 0x204, BIT(13), 0);
+-static CCU_GATE(CLK_UART2_PCLK, uart2_pclk, "uart2-pclk", perisys_apb_pclk_pd, 0x204, BIT(12), 0);
+-static CCU_GATE(CLK_UART3_PCLK, uart3_pclk, "uart3-pclk", perisys_apb_pclk_pd, 0x204, BIT(11), 0);
+-static CCU_GATE(CLK_UART4_PCLK, uart4_pclk, "uart4-pclk", perisys_apb_pclk_pd, 0x204, BIT(10), 0);
+-static CCU_GATE(CLK_UART5_PCLK, uart5_pclk, "uart5-pclk", perisys_apb_pclk_pd, 0x204, BIT(9), 0);
+-static CCU_GATE(CLK_GPIO0, gpio0_clk, "gpio0-clk", perisys_apb_pclk_pd, 0x204, BIT(8), 0);
+-static CCU_GATE(CLK_GPIO1, gpio1_clk, "gpio1-clk", perisys_apb_pclk_pd, 0x204, BIT(7), 0);
+-static CCU_GATE(CLK_GPIO2, gpio2_clk, "gpio2-clk", peri2sys_apb_pclk_pd, 0x204, BIT(6), 0);
+-static CCU_GATE(CLK_I2C0, i2c0_clk, "i2c0", perisys_apb_pclk_pd, 0x204, BIT(5), 0);
+-static CCU_GATE(CLK_I2C1, i2c1_clk, "i2c1", perisys_apb_pclk_pd, 0x204, BIT(4), 0);
+-static CCU_GATE(CLK_I2C2, i2c2_clk, "i2c2", perisys_apb_pclk_pd, 0x204, BIT(3), 0);
+-static CCU_GATE(CLK_I2C3, i2c3_clk, "i2c3", perisys_apb_pclk_pd, 0x204, BIT(2), 0);
+-static CCU_GATE(CLK_I2C4, i2c4_clk, "i2c4", perisys_apb_pclk_pd, 0x204, BIT(1), 0);
+-static CCU_GATE(CLK_I2C5, i2c5_clk, "i2c5", perisys_apb_pclk_pd, 0x204, BIT(0), 0);
+-static CCU_GATE(CLK_SPINLOCK, spinlock_clk, "spinlock", ahb2_cpusys_hclk_pd, 0x208, BIT(10), 0);
+-static CCU_GATE(CLK_DMA, dma_clk, "dma", axi4_cpusys2_aclk_pd, 0x208, BIT(8), 0);
+-static CCU_GATE(CLK_MBOX0, mbox0_clk, "mbox0", apb3_cpusys_pclk_pd, 0x208, BIT(7), 0);
+-static CCU_GATE(CLK_MBOX1, mbox1_clk, "mbox1", apb3_cpusys_pclk_pd, 0x208, BIT(6), 0);
+-static CCU_GATE(CLK_MBOX2, mbox2_clk, "mbox2", apb3_cpusys_pclk_pd, 0x208, BIT(5), 0);
+-static CCU_GATE(CLK_MBOX3, mbox3_clk, "mbox3", apb3_cpusys_pclk_pd, 0x208, BIT(4), 0);
+-static CCU_GATE(CLK_WDT0, wdt0_clk, "wdt0", apb3_cpusys_pclk_pd, 0x208, BIT(3), 0);
+-static CCU_GATE(CLK_WDT1, wdt1_clk, "wdt1", apb3_cpusys_pclk_pd, 0x208, BIT(2), 0);
+-static CCU_GATE(CLK_TIMER0, timer0_clk, "timer0", apb3_cpusys_pclk_pd, 0x208, BIT(1), 0);
+-static CCU_GATE(CLK_TIMER1, timer1_clk, "timer1", apb3_cpusys_pclk_pd, 0x208, BIT(0), 0);
+-static CCU_GATE(CLK_SRAM0, sram0_clk, "sram0", axi_aclk_pd, 0x20c, BIT(4), 0);
+-static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0);
+-static CCU_GATE(CLK_SRAM2, sram2_clk, "sram2", axi_aclk_pd, 0x20c, BIT(2), 0);
+-static CCU_GATE(CLK_SRAM3, sram3_clk, "sram3", axi_aclk_pd, 0x20c, BIT(1), 0);
++ 0x150, 12, 0);
++static const struct clk_parent_data perisys_apb4_hclk_pd[] = {
++ { .hw = &perisys_apb4_hclk.gate.hw },
++};
++
++static CCU_GATE(CLK_NPU_AXI, npu_axi_clk, "npu-axi", axi_aclk_pd, 0x1c8, 5, 0);
++static CCU_GATE(CLK_CPU2VP, cpu2vp_clk, "cpu2vp", axi_aclk_pd, 0x1e0, 13, 0);
++static CCU_GATE(CLK_EMMC_SDIO, emmc_sdio_clk, "emmc-sdio", emmc_sdio_ref_clk_pd, 0x204, 30, 0);
++static CCU_GATE(CLK_GMAC1, gmac1_clk, "gmac1", gmac_pll_clk_pd, 0x204, 26, 0);
++static CCU_GATE(CLK_PADCTRL1, padctrl1_clk, "padctrl1", perisys_apb_pclk_pd, 0x204, 24, 0);
++static CCU_GATE(CLK_DSMART, dsmart_clk, "dsmart", perisys_apb_pclk_pd, 0x204, 23, 0);
++static CCU_GATE(CLK_PADCTRL0, padctrl0_clk, "padctrl0", perisys_apb4_hclk_pd, 0x204, 22, 0);
++static CCU_GATE(CLK_GMAC_AXI, gmac_axi_clk, "gmac-axi", axi4_cpusys2_aclk_pd, 0x204, 21, 0);
++static CCU_GATE(CLK_GPIO3, gpio3_clk, "gpio3-clk", peri2sys_apb_pclk_pd, 0x204, 20, 0);
++static CCU_GATE(CLK_GMAC0, gmac0_clk, "gmac0", gmac_pll_clk_pd, 0x204, 19, 0);
++static CCU_GATE(CLK_PWM, pwm_clk, "pwm", perisys_apb_pclk_pd, 0x204, 18, 0);
++static CCU_GATE(CLK_QSPI0, qspi0_clk, "qspi0", video_pll_clk_pd, 0x204, 17, 0);
++static CCU_GATE(CLK_QSPI1, qspi1_clk, "qspi1", video_pll_clk_pd, 0x204, 16, 0);
++static CCU_GATE(CLK_SPI, spi_clk, "spi", video_pll_clk_pd, 0x204, 15, 0);
++static CCU_GATE(CLK_UART0_PCLK, uart0_pclk, "uart0-pclk", perisys_apb_pclk_pd, 0x204, 14, 0);
++static CCU_GATE(CLK_UART1_PCLK, uart1_pclk, "uart1-pclk", perisys_apb_pclk_pd, 0x204, 13, 0);
++static CCU_GATE(CLK_UART2_PCLK, uart2_pclk, "uart2-pclk", perisys_apb_pclk_pd, 0x204, 12, 0);
++static CCU_GATE(CLK_UART3_PCLK, uart3_pclk, "uart3-pclk", perisys_apb_pclk_pd, 0x204, 11, 0);
++static CCU_GATE(CLK_UART4_PCLK, uart4_pclk, "uart4-pclk", perisys_apb_pclk_pd, 0x204, 10, 0);
++static CCU_GATE(CLK_UART5_PCLK, uart5_pclk, "uart5-pclk", perisys_apb_pclk_pd, 0x204, 9, 0);
++static CCU_GATE(CLK_GPIO0, gpio0_clk, "gpio0-clk", perisys_apb_pclk_pd, 0x204, 8, 0);
++static CCU_GATE(CLK_GPIO1, gpio1_clk, "gpio1-clk", perisys_apb_pclk_pd, 0x204, 7, 0);
++static CCU_GATE(CLK_GPIO2, gpio2_clk, "gpio2-clk", peri2sys_apb_pclk_pd, 0x204, 6, 0);
++static CCU_GATE(CLK_I2C0, i2c0_clk, "i2c0", perisys_apb_pclk_pd, 0x204, 5, 0);
++static CCU_GATE(CLK_I2C1, i2c1_clk, "i2c1", perisys_apb_pclk_pd, 0x204, 4, 0);
++static CCU_GATE(CLK_I2C2, i2c2_clk, "i2c2", perisys_apb_pclk_pd, 0x204, 3, 0);
++static CCU_GATE(CLK_I2C3, i2c3_clk, "i2c3", perisys_apb_pclk_pd, 0x204, 2, 0);
++static CCU_GATE(CLK_I2C4, i2c4_clk, "i2c4", perisys_apb_pclk_pd, 0x204, 1, 0);
++static CCU_GATE(CLK_I2C5, i2c5_clk, "i2c5", perisys_apb_pclk_pd, 0x204, 0, 0);
++static CCU_GATE(CLK_SPINLOCK, spinlock_clk, "spinlock", ahb2_cpusys_hclk_pd, 0x208, 10, 0);
++static CCU_GATE(CLK_DMA, dma_clk, "dma", axi4_cpusys2_aclk_pd, 0x208, 8, 0);
++static CCU_GATE(CLK_MBOX0, mbox0_clk, "mbox0", apb3_cpusys_pclk_pd, 0x208, 7, 0);
++static CCU_GATE(CLK_MBOX1, mbox1_clk, "mbox1", apb3_cpusys_pclk_pd, 0x208, 6, 0);
++static CCU_GATE(CLK_MBOX2, mbox2_clk, "mbox2", apb3_cpusys_pclk_pd, 0x208, 5, 0);
++static CCU_GATE(CLK_MBOX3, mbox3_clk, "mbox3", apb3_cpusys_pclk_pd, 0x208, 4, 0);
++static CCU_GATE(CLK_WDT0, wdt0_clk, "wdt0", apb3_cpusys_pclk_pd, 0x208, 3, 0);
++static CCU_GATE(CLK_WDT1, wdt1_clk, "wdt1", apb3_cpusys_pclk_pd, 0x208, 2, 0);
++static CCU_GATE(CLK_TIMER0, timer0_clk, "timer0", apb3_cpusys_pclk_pd, 0x208, 1, 0);
++static CCU_GATE(CLK_TIMER1, timer1_clk, "timer1", apb3_cpusys_pclk_pd, 0x208, 0, 0);
++static CCU_GATE(CLK_SRAM0, sram0_clk, "sram0", axi_aclk_pd, 0x20c, 4, 0);
++static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, 3, 0);
++static CCU_GATE(CLK_SRAM2, sram2_clk, "sram2", axi_aclk_pd, 0x20c, 2, 0);
++static CCU_GATE(CLK_SRAM3, sram3_clk, "sram3", axi_aclk_pd, 0x20c, 1, 0);
+
+ static CCU_GATE(CLK_AXI4_VO_ACLK, axi4_vo_aclk, "axi4-vo-aclk",
+- video_pll_clk_pd, 0x0, BIT(0), 0);
++ video_pll_clk_pd, 0x0, 0, 0);
+ static CCU_GATE(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk", video_pll_clk_pd,
+- 0x0, BIT(3), 0);
++ 0x0, 3, 0);
+ static CCU_GATE(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk",
+- video_pll_clk_pd, 0x0, BIT(4), 0);
++ video_pll_clk_pd, 0x0, 4, 0);
+ static CCU_GATE(CLK_DPU_PIXELCLK0, dpu0_pixelclk, "dpu0-pixelclk",
+- video_pll_clk_pd, 0x0, BIT(5), 0);
++ dpu0_clk_pd, 0x0, 5, 0);
+ static CCU_GATE(CLK_DPU_PIXELCLK1, dpu1_pixelclk, "dpu1-pixelclk",
+- video_pll_clk_pd, 0x0, BIT(6), 0);
++ dpu1_clk_pd, 0x0, 6, 0);
+ static CCU_GATE(CLK_DPU_HCLK, dpu_hclk, "dpu-hclk", video_pll_clk_pd, 0x0,
+- BIT(7), 0);
++ 7, 0);
+ static CCU_GATE(CLK_DPU_ACLK, dpu_aclk, "dpu-aclk", video_pll_clk_pd, 0x0,
+- BIT(8), 0);
++ 8, 0);
+ static CCU_GATE(CLK_DPU_CCLK, dpu_cclk, "dpu-cclk", video_pll_clk_pd, 0x0,
+- BIT(9), 0);
++ 9, 0);
+ static CCU_GATE(CLK_HDMI_SFR, hdmi_sfr_clk, "hdmi-sfr-clk", video_pll_clk_pd,
+- 0x0, BIT(10), 0);
++ 0x0, 10, 0);
+ static CCU_GATE(CLK_HDMI_PCLK, hdmi_pclk, "hdmi-pclk", video_pll_clk_pd, 0x0,
+- BIT(11), 0);
++ 11, 0);
+ static CCU_GATE(CLK_HDMI_CEC, hdmi_cec_clk, "hdmi-cec-clk", video_pll_clk_pd,
+- 0x0, BIT(12), 0);
++ 0x0, 12, 0);
+ static CCU_GATE(CLK_MIPI_DSI0_PCLK, mipi_dsi0_pclk, "mipi-dsi0-pclk",
+- video_pll_clk_pd, 0x0, BIT(13), 0);
++ video_pll_clk_pd, 0x0, 13, 0);
+ static CCU_GATE(CLK_MIPI_DSI1_PCLK, mipi_dsi1_pclk, "mipi-dsi1-pclk",
+- video_pll_clk_pd, 0x0, BIT(14), 0);
++ video_pll_clk_pd, 0x0, 14, 0);
+ static CCU_GATE(CLK_MIPI_DSI0_CFG, mipi_dsi0_cfg_clk, "mipi-dsi0-cfg-clk",
+- video_pll_clk_pd, 0x0, BIT(15), 0);
++ video_pll_clk_pd, 0x0, 15, 0);
+ static CCU_GATE(CLK_MIPI_DSI1_CFG, mipi_dsi1_cfg_clk, "mipi-dsi1-cfg-clk",
+- video_pll_clk_pd, 0x0, BIT(16), 0);
++ video_pll_clk_pd, 0x0, 16, 0);
+ static CCU_GATE(CLK_MIPI_DSI0_REFCLK, mipi_dsi0_refclk, "mipi-dsi0-refclk",
+- video_pll_clk_pd, 0x0, BIT(17), 0);
++ video_pll_clk_pd, 0x0, 17, 0);
+ static CCU_GATE(CLK_MIPI_DSI1_REFCLK, mipi_dsi1_refclk, "mipi-dsi1-refclk",
+- video_pll_clk_pd, 0x0, BIT(18), 0);
++ video_pll_clk_pd, 0x0, 18, 0);
+ static CCU_GATE(CLK_HDMI_I2S, hdmi_i2s_clk, "hdmi-i2s-clk", video_pll_clk_pd,
+- 0x0, BIT(19), 0);
++ 0x0, 19, 0);
+ static CCU_GATE(CLK_X2H_DPU1_ACLK, x2h_dpu1_aclk, "x2h-dpu1-aclk",
+- video_pll_clk_pd, 0x0, BIT(20), 0);
++ video_pll_clk_pd, 0x0, 20, 0);
+ static CCU_GATE(CLK_X2H_DPU_ACLK, x2h_dpu_aclk, "x2h-dpu-aclk",
+- video_pll_clk_pd, 0x0, BIT(21), 0);
++ video_pll_clk_pd, 0x0, 21, 0);
+ static CCU_GATE(CLK_AXI4_VO_PCLK, axi4_vo_pclk, "axi4-vo-pclk",
+- video_pll_clk_pd, 0x0, BIT(22), 0);
++ video_pll_clk_pd, 0x0, 22, 0);
+ static CCU_GATE(CLK_IOPMP_VOSYS_DPU_PCLK, iopmp_vosys_dpu_pclk,
+- "iopmp-vosys-dpu-pclk", video_pll_clk_pd, 0x0, BIT(23), 0);
++ "iopmp-vosys-dpu-pclk", video_pll_clk_pd, 0x0, 23, 0);
+ static CCU_GATE(CLK_IOPMP_VOSYS_DPU1_PCLK, iopmp_vosys_dpu1_pclk,
+- "iopmp-vosys-dpu1-pclk", video_pll_clk_pd, 0x0, BIT(24), 0);
++ "iopmp-vosys-dpu1-pclk", video_pll_clk_pd, 0x0, 24, 0);
+ static CCU_GATE(CLK_IOPMP_VOSYS_GPU_PCLK, iopmp_vosys_gpu_pclk,
+- "iopmp-vosys-gpu-pclk", video_pll_clk_pd, 0x0, BIT(25), 0);
++ "iopmp-vosys-gpu-pclk", video_pll_clk_pd, 0x0, 25, 0);
+ static CCU_GATE(CLK_IOPMP_DPU1_ACLK, iopmp_dpu1_aclk, "iopmp-dpu1-aclk",
+- video_pll_clk_pd, 0x0, BIT(27), 0);
++ video_pll_clk_pd, 0x0, 27, 0);
+ static CCU_GATE(CLK_IOPMP_DPU_ACLK, iopmp_dpu_aclk, "iopmp-dpu-aclk",
+- video_pll_clk_pd, 0x0, BIT(28), 0);
++ video_pll_clk_pd, 0x0, 28, 0);
+ static CCU_GATE(CLK_IOPMP_GPU_ACLK, iopmp_gpu_aclk, "iopmp-gpu-aclk",
+- video_pll_clk_pd, 0x0, BIT(29), 0);
++ video_pll_clk_pd, 0x0, 29, 0);
+ static CCU_GATE(CLK_MIPIDSI0_PIXCLK, mipi_dsi0_pixclk, "mipi-dsi0-pixclk",
+- video_pll_clk_pd, 0x0, BIT(30), 0);
++ video_pll_clk_pd, 0x0, 30, 0);
+ static CCU_GATE(CLK_MIPIDSI1_PIXCLK, mipi_dsi1_pixclk, "mipi-dsi1-pixclk",
+- video_pll_clk_pd, 0x0, BIT(31), 0);
++ video_pll_clk_pd, 0x0, 31, 0);
+ static CCU_GATE(CLK_HDMI_PIXCLK, hdmi_pixclk, "hdmi-pixclk", video_pll_clk_pd,
+- 0x4, BIT(0), 0);
++ 0x4, 0, 0);
+
+ static CLK_FIXED_FACTOR_HW(gmac_pll_clk_100m, "gmac-pll-clk-100m",
+ &gmac_pll_clk.common.hw, 10, 1, 0);
+@@ -963,93 +969,93 @@ static struct ccu_mux *th1520_mux_clks[] = {
+ &uart_sclk,
+ };
+
+-static struct ccu_common *th1520_gate_clks[] = {
+- &emmc_sdio_clk.common,
+- &aon2cpu_a2x_clk.common,
+- &x2x_cpusys_clk.common,
+- &brom_clk.common,
+- &bmu_clk.common,
+- &cpu2aon_x2h_clk.common,
+- &cpu2peri_x2h_clk.common,
+- &cpu2vp_clk.common,
+- &perisys_apb1_hclk.common,
+- &perisys_apb2_hclk.common,
+- &perisys_apb3_hclk.common,
+- &perisys_apb4_hclk.common,
+- &npu_axi_clk.common,
+- &gmac1_clk.common,
+- &padctrl1_clk.common,
+- &dsmart_clk.common,
+- &padctrl0_clk.common,
+- &gmac_axi_clk.common,
+- &gpio3_clk.common,
+- &gmac0_clk.common,
+- &pwm_clk.common,
+- &qspi0_clk.common,
+- &qspi1_clk.common,
+- &spi_clk.common,
+- &uart0_pclk.common,
+- &uart1_pclk.common,
+- &uart2_pclk.common,
+- &uart3_pclk.common,
+- &uart4_pclk.common,
+- &uart5_pclk.common,
+- &gpio0_clk.common,
+- &gpio1_clk.common,
+- &gpio2_clk.common,
+- &i2c0_clk.common,
+- &i2c1_clk.common,
+- &i2c2_clk.common,
+- &i2c3_clk.common,
+- &i2c4_clk.common,
+- &i2c5_clk.common,
+- &spinlock_clk.common,
+- &dma_clk.common,
+- &mbox0_clk.common,
+- &mbox1_clk.common,
+- &mbox2_clk.common,
+- &mbox3_clk.common,
+- &wdt0_clk.common,
+- &wdt1_clk.common,
+- &timer0_clk.common,
+- &timer1_clk.common,
+- &sram0_clk.common,
+- &sram1_clk.common,
+- &sram2_clk.common,
+- &sram3_clk.common,
+-};
+-
+-static struct ccu_common *th1520_vo_gate_clks[] = {
+- &axi4_vo_aclk.common,
+- &gpu_core_clk.common,
+- &gpu_cfg_aclk.common,
+- &dpu0_pixelclk.common,
+- &dpu1_pixelclk.common,
+- &dpu_hclk.common,
+- &dpu_aclk.common,
+- &dpu_cclk.common,
+- &hdmi_sfr_clk.common,
+- &hdmi_pclk.common,
+- &hdmi_cec_clk.common,
+- &mipi_dsi0_pclk.common,
+- &mipi_dsi1_pclk.common,
+- &mipi_dsi0_cfg_clk.common,
+- &mipi_dsi1_cfg_clk.common,
+- &mipi_dsi0_refclk.common,
+- &mipi_dsi1_refclk.common,
+- &hdmi_i2s_clk.common,
+- &x2h_dpu1_aclk.common,
+- &x2h_dpu_aclk.common,
+- &axi4_vo_pclk.common,
+- &iopmp_vosys_dpu_pclk.common,
+- &iopmp_vosys_dpu1_pclk.common,
+- &iopmp_vosys_gpu_pclk.common,
+- &iopmp_dpu1_aclk.common,
+- &iopmp_dpu_aclk.common,
+- &iopmp_gpu_aclk.common,
+- &mipi_dsi0_pixclk.common,
+- &mipi_dsi1_pixclk.common,
+- &hdmi_pixclk.common
++static struct ccu_gate *th1520_gate_clks[] = {
++ &emmc_sdio_clk,
++ &aon2cpu_a2x_clk,
++ &x2x_cpusys_clk,
++ &brom_clk,
++ &bmu_clk,
++ &cpu2aon_x2h_clk,
++ &cpu2peri_x2h_clk,
++ &cpu2vp_clk,
++ &perisys_apb1_hclk,
++ &perisys_apb2_hclk,
++ &perisys_apb3_hclk,
++ &perisys_apb4_hclk,
++ &npu_axi_clk,
++ &gmac1_clk,
++ &padctrl1_clk,
++ &dsmart_clk,
++ &padctrl0_clk,
++ &gmac_axi_clk,
++ &gpio3_clk,
++ &gmac0_clk,
++ &pwm_clk,
++ &qspi0_clk,
++ &qspi1_clk,
++ &spi_clk,
++ &uart0_pclk,
++ &uart1_pclk,
++ &uart2_pclk,
++ &uart3_pclk,
++ &uart4_pclk,
++ &uart5_pclk,
++ &gpio0_clk,
++ &gpio1_clk,
++ &gpio2_clk,
++ &i2c0_clk,
++ &i2c1_clk,
++ &i2c2_clk,
++ &i2c3_clk,
++ &i2c4_clk,
++ &i2c5_clk,
++ &spinlock_clk,
++ &dma_clk,
++ &mbox0_clk,
++ &mbox1_clk,
++ &mbox2_clk,
++ &mbox3_clk,
++ &wdt0_clk,
++ &wdt1_clk,
++ &timer0_clk,
++ &timer1_clk,
++ &sram0_clk,
++ &sram1_clk,
++ &sram2_clk,
++ &sram3_clk,
++};
++
++static struct ccu_gate *th1520_vo_gate_clks[] = {
++ &axi4_vo_aclk,
++ &gpu_core_clk,
++ &gpu_cfg_aclk,
++ &dpu0_pixelclk,
++ &dpu1_pixelclk,
++ &dpu_hclk,
++ &dpu_aclk,
++ &dpu_cclk,
++ &hdmi_sfr_clk,
++ &hdmi_pclk,
++ &hdmi_cec_clk,
++ &mipi_dsi0_pclk,
++ &mipi_dsi1_pclk,
++ &mipi_dsi0_cfg_clk,
++ &mipi_dsi1_cfg_clk,
++ &mipi_dsi0_refclk,
++ &mipi_dsi1_refclk,
++ &hdmi_i2s_clk,
++ &x2h_dpu1_aclk,
++ &x2h_dpu_aclk,
++ &axi4_vo_pclk,
++ &iopmp_vosys_dpu_pclk,
++ &iopmp_vosys_dpu1_pclk,
++ &iopmp_vosys_gpu_pclk,
++ &iopmp_dpu1_aclk,
++ &iopmp_dpu_aclk,
++ &iopmp_gpu_aclk,
++ &mipi_dsi0_pixclk,
++ &mipi_dsi1_pixclk,
++ &hdmi_pixclk
+ };
+
+ static const struct regmap_config th1520_clk_regmap_config = {
+@@ -1063,7 +1069,7 @@ struct th1520_plat_data {
+ struct ccu_common **th1520_pll_clks;
+ struct ccu_common **th1520_div_clks;
+ struct ccu_mux **th1520_mux_clks;
+- struct ccu_common **th1520_gate_clks;
++ struct ccu_gate **th1520_gate_clks;
+
+ int nr_clks;
+ int nr_pll_clks;
+@@ -1102,7 +1108,6 @@ static int th1520_clk_probe(struct platform_device *pdev)
+
+ struct regmap *map;
+ void __iomem *base;
+- struct clk_hw *hw;
+ int ret, i;
+
+ plat_data = device_get_match_data(&pdev->dev);
+@@ -1161,20 +1166,15 @@ static int th1520_clk_probe(struct platform_device *pdev)
+ }
+
+ for (i = 0; i < plat_data->nr_gate_clks; i++) {
+- struct ccu_gate *cg = hw_to_ccu_gate(&plat_data->th1520_gate_clks[i]->hw);
++ struct ccu_gate *cg = plat_data->th1520_gate_clks[i];
+
+- plat_data->th1520_gate_clks[i]->map = map;
++ cg->gate.reg = base + cg->reg;
+
+- hw = devm_clk_hw_register_gate_parent_data(dev,
+- cg->common.hw.init->name,
+- cg->common.hw.init->parent_data,
+- cg->common.hw.init->flags,
+- base + cg->common.cfg0,
+- ffs(cg->enable) - 1, 0, NULL);
+- if (IS_ERR(hw))
+- return PTR_ERR(hw);
++ ret = devm_clk_hw_register(dev, &cg->gate.hw);
++ if (ret)
++ return ret;
+
+- priv->hws[cg->common.clkid] = hw;
++ priv->hws[cg->clkid] = &cg->gate.hw;
+ }
+
+ if (plat_data == &th1520_ap_platdata) {
+diff --git a/drivers/clocksource/clps711x-timer.c b/drivers/clocksource/clps711x-timer.c
+index e95fdc49c2269c..bbceb0289d457a 100644
+--- a/drivers/clocksource/clps711x-timer.c
++++ b/drivers/clocksource/clps711x-timer.c
+@@ -78,24 +78,33 @@ static int __init clps711x_timer_init(struct device_node *np)
+ unsigned int irq = irq_of_parse_and_map(np, 0);
+ struct clk *clock = of_clk_get(np, 0);
+ void __iomem *base = of_iomap(np, 0);
++ int ret = 0;
+
+ if (!base)
+ return -ENOMEM;
+- if (!irq)
+- return -EINVAL;
+- if (IS_ERR(clock))
+- return PTR_ERR(clock);
++ if (!irq) {
++ ret = -EINVAL;
++ goto unmap_io;
++ }
++ if (IS_ERR(clock)) {
++ ret = PTR_ERR(clock);
++ goto unmap_io;
++ }
+
+ switch (of_alias_get_id(np, "timer")) {
+ case CLPS711X_CLKSRC_CLOCKSOURCE:
+ clps711x_clksrc_init(clock, base);
+ break;
+ case CLPS711X_CLKSRC_CLOCKEVENT:
+- return _clps711x_clkevt_init(clock, base, irq);
++ ret = _clps711x_clkevt_init(clock, base, irq);
++ break;
+ default:
+- return -EINVAL;
++ ret = -EINVAL;
++ break;
+ }
+
+- return 0;
++unmap_io:
++ iounmap(base);
++ return ret;
+ }
+ TIMER_OF_DECLARE(clps711x, "cirrus,ep7209-timer", clps711x_timer_init);
+diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
+index 4a17162a392da7..dd8efe4fb967f4 100644
+--- a/drivers/cpufreq/cppc_cpufreq.c
++++ b/drivers/cpufreq/cppc_cpufreq.c
+@@ -310,6 +310,16 @@ static int cppc_verify_policy(struct cpufreq_policy_data *policy)
+ return 0;
+ }
+
++static unsigned int __cppc_cpufreq_get_transition_delay_us(unsigned int cpu)
++{
++ unsigned int transition_latency_ns = cppc_get_transition_latency(cpu);
++
++ if (transition_latency_ns == CPUFREQ_ETERNAL)
++ return CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS / NSEC_PER_USEC;
++
++ return transition_latency_ns / NSEC_PER_USEC;
++}
++
+ /*
+ * The PCC subspace describes the rate at which platform can accept commands
+ * on the shared PCC channel (including READs which do not count towards freq
+@@ -332,12 +342,12 @@ static unsigned int cppc_cpufreq_get_transition_delay_us(unsigned int cpu)
+ return 10000;
+ }
+ }
+- return cppc_get_transition_latency(cpu) / NSEC_PER_USEC;
++ return __cppc_cpufreq_get_transition_delay_us(cpu);
+ }
+ #else
+ static unsigned int cppc_cpufreq_get_transition_delay_us(unsigned int cpu)
+ {
+- return cppc_get_transition_latency(cpu) / NSEC_PER_USEC;
++ return __cppc_cpufreq_get_transition_delay_us(cpu);
+ }
+ #endif
+
+diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
+index 506437489b4db2..7d5079fd168825 100644
+--- a/drivers/cpufreq/cpufreq-dt.c
++++ b/drivers/cpufreq/cpufreq-dt.c
+@@ -104,7 +104,7 @@ static int cpufreq_init(struct cpufreq_policy *policy)
+
+ transition_latency = dev_pm_opp_get_max_transition_latency(cpu_dev);
+ if (!transition_latency)
+- transition_latency = CPUFREQ_ETERNAL;
++ transition_latency = CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
+
+ cpumask_copy(policy->cpus, priv->cpus);
+ policy->driver_data = priv;
+diff --git a/drivers/cpufreq/imx6q-cpufreq.c b/drivers/cpufreq/imx6q-cpufreq.c
+index db1c88e9d3f9cd..e93697d3edfd9b 100644
+--- a/drivers/cpufreq/imx6q-cpufreq.c
++++ b/drivers/cpufreq/imx6q-cpufreq.c
+@@ -442,7 +442,7 @@ static int imx6q_cpufreq_probe(struct platform_device *pdev)
+ }
+
+ if (of_property_read_u32(np, "clock-latency", &transition_latency))
+- transition_latency = CPUFREQ_ETERNAL;
++ transition_latency = CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
+
+ /*
+ * Calculate the ramp time for max voltage change in the
+diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
+index 0d5d283a5429b3..fc02a3542f6569 100644
+--- a/drivers/cpufreq/intel_pstate.c
++++ b/drivers/cpufreq/intel_pstate.c
+@@ -1710,10 +1710,10 @@ static void update_qos_request(enum freq_qos_req_type type)
+ continue;
+
+ req = policy->driver_data;
+- cpufreq_cpu_put(policy);
+-
+- if (!req)
++ if (!req) {
++ cpufreq_cpu_put(policy);
+ continue;
++ }
+
+ if (hwp_active)
+ intel_pstate_get_hwp_cap(cpu);
+@@ -1729,6 +1729,8 @@ static void update_qos_request(enum freq_qos_req_type type)
+
+ if (freq_qos_update_request(req, freq) < 0)
+ pr_warn("Failed to update freq constraint: CPU%d\n", i);
++
++ cpufreq_cpu_put(policy);
+ }
+ }
+
+diff --git a/drivers/cpufreq/mediatek-cpufreq-hw.c b/drivers/cpufreq/mediatek-cpufreq-hw.c
+index 74f1b4c796e4cc..d0374340ef29a0 100644
+--- a/drivers/cpufreq/mediatek-cpufreq-hw.c
++++ b/drivers/cpufreq/mediatek-cpufreq-hw.c
+@@ -238,7 +238,7 @@ static int mtk_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
+
+ latency = readl_relaxed(data->reg_bases[REG_FREQ_LATENCY]) * 1000;
+ if (!latency)
+- latency = CPUFREQ_ETERNAL;
++ latency = CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
+
+ policy->cpuinfo.transition_latency = latency;
+ policy->fast_switch_possible = true;
+diff --git a/drivers/cpufreq/rcpufreq_dt.rs b/drivers/cpufreq/rcpufreq_dt.rs
+index 7e1fbf9a091f74..3909022e1c7448 100644
+--- a/drivers/cpufreq/rcpufreq_dt.rs
++++ b/drivers/cpufreq/rcpufreq_dt.rs
+@@ -123,7 +123,7 @@ fn init(policy: &mut cpufreq::Policy) -> Result<Self::PData> {
+
+ let mut transition_latency = opp_table.max_transition_latency_ns() as u32;
+ if transition_latency == 0 {
+- transition_latency = cpufreq::ETERNAL_LATENCY_NS;
++ transition_latency = cpufreq::DEFAULT_TRANSITION_LATENCY_NS;
+ }
+
+ policy
+diff --git a/drivers/cpufreq/scmi-cpufreq.c b/drivers/cpufreq/scmi-cpufreq.c
+index 38c165d526d144..d2a110079f5fd5 100644
+--- a/drivers/cpufreq/scmi-cpufreq.c
++++ b/drivers/cpufreq/scmi-cpufreq.c
+@@ -294,7 +294,7 @@ static int scmi_cpufreq_init(struct cpufreq_policy *policy)
+
+ latency = perf_ops->transition_latency_get(ph, domain);
+ if (!latency)
+- latency = CPUFREQ_ETERNAL;
++ latency = CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
+
+ policy->cpuinfo.transition_latency = latency;
+
+diff --git a/drivers/cpufreq/scpi-cpufreq.c b/drivers/cpufreq/scpi-cpufreq.c
+index dcbb0ae7dd476c..e530345baddf6a 100644
+--- a/drivers/cpufreq/scpi-cpufreq.c
++++ b/drivers/cpufreq/scpi-cpufreq.c
+@@ -157,7 +157,7 @@ static int scpi_cpufreq_init(struct cpufreq_policy *policy)
+
+ latency = scpi_ops->get_transition_latency(cpu_dev);
+ if (!latency)
+- latency = CPUFREQ_ETERNAL;
++ latency = CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
+
+ policy->cpuinfo.transition_latency = latency;
+
+diff --git a/drivers/cpufreq/spear-cpufreq.c b/drivers/cpufreq/spear-cpufreq.c
+index 707c71090cc322..2a1550e1aa21fc 100644
+--- a/drivers/cpufreq/spear-cpufreq.c
++++ b/drivers/cpufreq/spear-cpufreq.c
+@@ -182,7 +182,7 @@ static int spear_cpufreq_probe(struct platform_device *pdev)
+
+ if (of_property_read_u32(np, "clock-latency",
+ &spear_cpufreq.transition_latency))
+- spear_cpufreq.transition_latency = CPUFREQ_ETERNAL;
++ spear_cpufreq.transition_latency = CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
+
+ cnt = of_property_count_u32_elems(np, "cpufreq_tbl");
+ if (cnt <= 0) {
+diff --git a/drivers/cpufreq/tegra186-cpufreq.c b/drivers/cpufreq/tegra186-cpufreq.c
+index cbabb726c6645d..6c394b429b6182 100644
+--- a/drivers/cpufreq/tegra186-cpufreq.c
++++ b/drivers/cpufreq/tegra186-cpufreq.c
+@@ -93,10 +93,14 @@ static int tegra186_cpufreq_set_target(struct cpufreq_policy *policy,
+ {
+ struct tegra186_cpufreq_data *data = cpufreq_get_driver_data();
+ struct cpufreq_frequency_table *tbl = policy->freq_table + index;
+- unsigned int edvd_offset = data->cpus[policy->cpu].edvd_offset;
++ unsigned int edvd_offset;
+ u32 edvd_val = tbl->driver_data;
++ u32 cpu;
+
+- writel(edvd_val, data->regs + edvd_offset);
++ for_each_cpu(cpu, policy->cpus) {
++ edvd_offset = data->cpus[cpu].edvd_offset;
++ writel(edvd_val, data->regs + edvd_offset);
++ }
+
+ return 0;
+ }
+diff --git a/drivers/crypto/aspeed/aspeed-hace-crypto.c b/drivers/crypto/aspeed/aspeed-hace-crypto.c
+index a72dfebc53ffc2..fa201dae1f81b4 100644
+--- a/drivers/crypto/aspeed/aspeed-hace-crypto.c
++++ b/drivers/crypto/aspeed/aspeed-hace-crypto.c
+@@ -346,7 +346,7 @@ static int aspeed_sk_start_sg(struct aspeed_hace_dev *hace_dev)
+
+ } else {
+ dma_unmap_sg(hace_dev->dev, req->dst, rctx->dst_nents,
+- DMA_TO_DEVICE);
++ DMA_FROM_DEVICE);
+ dma_unmap_sg(hace_dev->dev, req->src, rctx->src_nents,
+ DMA_TO_DEVICE);
+ }
+diff --git a/drivers/crypto/atmel-tdes.c b/drivers/crypto/atmel-tdes.c
+index 098f5532f38986..3b2a92029b16f9 100644
+--- a/drivers/crypto/atmel-tdes.c
++++ b/drivers/crypto/atmel-tdes.c
+@@ -512,7 +512,7 @@ static int atmel_tdes_crypt_start(struct atmel_tdes_dev *dd)
+
+ if (err && (dd->flags & TDES_FLAGS_FAST)) {
+ dma_unmap_sg(dd->dev, dd->in_sg, 1, DMA_TO_DEVICE);
+- dma_unmap_sg(dd->dev, dd->out_sg, 1, DMA_TO_DEVICE);
++ dma_unmap_sg(dd->dev, dd->out_sg, 1, DMA_FROM_DEVICE);
+ }
+
+ return err;
+diff --git a/drivers/crypto/rockchip/rk3288_crypto_ahash.c b/drivers/crypto/rockchip/rk3288_crypto_ahash.c
+index d6928ebe9526d4..b9f5a8b42e6618 100644
+--- a/drivers/crypto/rockchip/rk3288_crypto_ahash.c
++++ b/drivers/crypto/rockchip/rk3288_crypto_ahash.c
+@@ -254,7 +254,7 @@ static void rk_hash_unprepare(struct crypto_engine *engine, void *breq)
+ struct rk_ahash_rctx *rctx = ahash_request_ctx(areq);
+ struct rk_crypto_info *rkc = rctx->dev;
+
+- dma_unmap_sg(rkc->dev, areq->src, rctx->nrsg, DMA_TO_DEVICE);
++ dma_unmap_sg(rkc->dev, areq->src, sg_nents(areq->src), DMA_TO_DEVICE);
+ }
+
+ static int rk_hash_run(struct crypto_engine *engine, void *breq)
+diff --git a/drivers/firmware/arm_scmi/quirks.c b/drivers/firmware/arm_scmi/quirks.c
+index 03960aca361001..03848283c2a07b 100644
+--- a/drivers/firmware/arm_scmi/quirks.c
++++ b/drivers/firmware/arm_scmi/quirks.c
+@@ -71,6 +71,7 @@
+ */
+
+ #include <linux/ctype.h>
++#include <linux/cleanup.h>
+ #include <linux/device.h>
+ #include <linux/export.h>
+ #include <linux/hashtable.h>
+@@ -89,9 +90,9 @@
+ struct scmi_quirk {
+ bool enabled;
+ const char *name;
+- char *vendor;
+- char *sub_vendor_id;
+- char *impl_ver_range;
++ const char *vendor;
++ const char *sub_vendor_id;
++ const char *impl_ver_range;
+ u32 start_range;
+ u32 end_range;
+ struct static_key_false *key;
+@@ -217,7 +218,7 @@ static unsigned int scmi_quirk_signature(const char *vend, const char *sub_vend)
+
+ static int scmi_quirk_range_parse(struct scmi_quirk *quirk)
+ {
+- const char *last, *first = quirk->impl_ver_range;
++ const char *last, *first __free(kfree) = NULL;
+ size_t len;
+ char *sep;
+ int ret;
+@@ -228,8 +229,12 @@ static int scmi_quirk_range_parse(struct scmi_quirk *quirk)
+ if (!len)
+ return 0;
+
++ first = kmemdup(quirk->impl_ver_range, len + 1, GFP_KERNEL);
++ if (!first)
++ return -ENOMEM;
++
+ last = first + len - 1;
+- sep = strchr(quirk->impl_ver_range, '-');
++ sep = strchr(first, '-');
+ if (sep)
+ *sep = '\0';
+
+diff --git a/drivers/firmware/meson/meson_sm.c b/drivers/firmware/meson/meson_sm.c
+index f25a9746249b60..3ab67aaa9e5da9 100644
+--- a/drivers/firmware/meson/meson_sm.c
++++ b/drivers/firmware/meson/meson_sm.c
+@@ -232,11 +232,16 @@ EXPORT_SYMBOL(meson_sm_call_write);
+ struct meson_sm_firmware *meson_sm_get(struct device_node *sm_node)
+ {
+ struct platform_device *pdev = of_find_device_by_node(sm_node);
++ struct meson_sm_firmware *fw;
+
+ if (!pdev)
+ return NULL;
+
+- return platform_get_drvdata(pdev);
++ fw = platform_get_drvdata(pdev);
++
++ put_device(&pdev->dev);
++
++ return fw;
+ }
+ EXPORT_SYMBOL_GPL(meson_sm_get);
+
+diff --git a/drivers/firmware/samsung/exynos-acpm-pmic.c b/drivers/firmware/samsung/exynos-acpm-pmic.c
+index 39b33a356ebd24..961d7599e4224e 100644
+--- a/drivers/firmware/samsung/exynos-acpm-pmic.c
++++ b/drivers/firmware/samsung/exynos-acpm-pmic.c
+@@ -4,7 +4,9 @@
+ * Copyright 2020 Google LLC.
+ * Copyright 2024 Linaro Ltd.
+ */
++#include <linux/array_size.h>
+ #include <linux/bitfield.h>
++#include <linux/errno.h>
+ #include <linux/firmware/samsung/exynos-acpm-protocol.h>
+ #include <linux/ktime.h>
+ #include <linux/types.h>
+@@ -33,6 +35,19 @@ enum exynos_acpm_pmic_func {
+ ACPM_PMIC_BULK_WRITE,
+ };
+
++static const int acpm_pmic_linux_errmap[] = {
++ [0] = 0, /* ACPM_PMIC_SUCCESS */
++ [1] = -EACCES, /* Read register can't be accessed or issues to access it. */
++ [2] = -EACCES, /* Write register can't be accessed or issues to access it. */
++};
++
++static int acpm_pmic_to_linux_err(int err)
++{
++ if (err >= 0 && err < ARRAY_SIZE(acpm_pmic_linux_errmap))
++ return acpm_pmic_linux_errmap[err];
++ return -EIO;
++}
++
+ static inline u32 acpm_pmic_set_bulk(u32 data, unsigned int i)
+ {
+ return (data & ACPM_PMIC_BULK_MASK) << (ACPM_PMIC_BULK_SHIFT * i);
+@@ -79,7 +94,7 @@ int acpm_pmic_read_reg(const struct acpm_handle *handle,
+
+ *buf = FIELD_GET(ACPM_PMIC_VALUE, xfer.rxd[1]);
+
+- return FIELD_GET(ACPM_PMIC_RETURN, xfer.rxd[1]);
++ return acpm_pmic_to_linux_err(FIELD_GET(ACPM_PMIC_RETURN, xfer.rxd[1]));
+ }
+
+ static void acpm_pmic_init_bulk_read_cmd(u32 cmd[4], u8 type, u8 reg, u8 chan,
+@@ -110,7 +125,7 @@ int acpm_pmic_bulk_read(const struct acpm_handle *handle,
+ if (ret)
+ return ret;
+
+- ret = FIELD_GET(ACPM_PMIC_RETURN, xfer.rxd[1]);
++ ret = acpm_pmic_to_linux_err(FIELD_GET(ACPM_PMIC_RETURN, xfer.rxd[1]));
+ if (ret)
+ return ret;
+
+@@ -150,7 +165,7 @@ int acpm_pmic_write_reg(const struct acpm_handle *handle,
+ if (ret)
+ return ret;
+
+- return FIELD_GET(ACPM_PMIC_RETURN, xfer.rxd[1]);
++ return acpm_pmic_to_linux_err(FIELD_GET(ACPM_PMIC_RETURN, xfer.rxd[1]));
+ }
+
+ static void acpm_pmic_init_bulk_write_cmd(u32 cmd[4], u8 type, u8 reg, u8 chan,
+@@ -190,7 +205,7 @@ int acpm_pmic_bulk_write(const struct acpm_handle *handle,
+ if (ret)
+ return ret;
+
+- return FIELD_GET(ACPM_PMIC_RETURN, xfer.rxd[1]);
++ return acpm_pmic_to_linux_err(FIELD_GET(ACPM_PMIC_RETURN, xfer.rxd[1]));
+ }
+
+ static void acpm_pmic_init_update_cmd(u32 cmd[4], u8 type, u8 reg, u8 chan,
+@@ -220,5 +235,5 @@ int acpm_pmic_update_reg(const struct acpm_handle *handle,
+ if (ret)
+ return ret;
+
+- return FIELD_GET(ACPM_PMIC_RETURN, xfer.rxd[1]);
++ return acpm_pmic_to_linux_err(FIELD_GET(ACPM_PMIC_RETURN, xfer.rxd[1]));
+ }
+diff --git a/drivers/gpio/gpio-mpfs.c b/drivers/gpio/gpio-mpfs.c
+index 82d557a7e5d8d5..9468795b96348a 100644
+--- a/drivers/gpio/gpio-mpfs.c
++++ b/drivers/gpio/gpio-mpfs.c
+@@ -69,7 +69,7 @@ static int mpfs_gpio_direction_output(struct gpio_chip *gc, unsigned int gpio_in
+ struct mpfs_gpio_chip *mpfs_gpio = gpiochip_get_data(gc);
+
+ regmap_update_bits(mpfs_gpio->regs, MPFS_GPIO_CTRL(gpio_index),
+- MPFS_GPIO_DIR_MASK, MPFS_GPIO_EN_IN);
++ MPFS_GPIO_DIR_MASK, MPFS_GPIO_EN_OUT | MPFS_GPIO_EN_OUT_BUF);
+ regmap_update_bits(mpfs_gpio->regs, mpfs_gpio->offsets->outp, BIT(gpio_index),
+ value << gpio_index);
+
+diff --git a/drivers/gpio/gpio-wcd934x.c b/drivers/gpio/gpio-wcd934x.c
+index 4af504c23e6ff5..572b85e773700e 100644
+--- a/drivers/gpio/gpio-wcd934x.c
++++ b/drivers/gpio/gpio-wcd934x.c
+@@ -103,7 +103,7 @@ static int wcd_gpio_probe(struct platform_device *pdev)
+ chip->base = -1;
+ chip->ngpio = WCD934X_NPINS;
+ chip->label = dev_name(dev);
+- chip->can_sleep = false;
++ chip->can_sleep = true;
+
+ return devm_gpiochip_add_data(dev, chip, data);
+ }
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+index b16cce7c22c373..d5f9d48bf8842d 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+@@ -2583,12 +2583,17 @@ static int update_invalid_user_pages(struct amdkfd_process_info *process_info,
+ * from the KFD, trigger a segmentation fault in VM debug mode.
+ */
+ if (amdgpu_ttm_adev(bo->tbo.bdev)->debug_vm_userptr) {
++ struct kfd_process *p;
++
+ pr_err("Pid %d unmapped memory before destroying userptr at GPU addr 0x%llx\n",
+ pid_nr(process_info->pid), mem->va);
+
+ // Send GPU VM fault to user space
+- kfd_signal_vm_fault_event_with_userptr(kfd_lookup_process_by_pid(process_info->pid),
+- mem->va);
++ p = kfd_lookup_process_by_pid(process_info->pid);
++ if (p) {
++ kfd_signal_vm_fault_event_with_userptr(p, mem->va);
++ kfd_unref_process(p);
++ }
+ }
+
+ ret = 0;
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+index ef026143dc1ca9..58c4e57abc9e0f 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+@@ -1956,6 +1956,10 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
+
+ init_data.flags.disable_ips_in_vpb = 0;
+
++ /* DCN35 and above supports dynamic DTBCLK switch */
++ if (amdgpu_ip_version(adev, DCE_HWIP, 0) >= IP_VERSION(3, 5, 0))
++ init_data.flags.allow_0_dtb_clk = true;
++
+ /* Enable DWB for tested platforms only */
+ if (amdgpu_ip_version(adev, DCE_HWIP, 0) >= IP_VERSION(3, 0, 0))
+ init_data.num_virtual_links = 1;
+diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c
+index 2b1673d69ea83f..1ab5ae9b5ea515 100644
+--- a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c
++++ b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c
+@@ -154,10 +154,13 @@ static bool dce60_setup_scaling_configuration(
+ REG_SET(SCL_BYPASS_CONTROL, 0, SCL_BYPASS_MODE, 0);
+
+ if (data->taps.h_taps + data->taps.v_taps <= 2) {
+- /* Set bypass */
+-
+- /* DCE6 has no SCL_MODE register, skip scale mode programming */
++ /* Disable scaler functionality */
++ REG_WRITE(SCL_SCALER_ENABLE, 0);
+
++ /* Clear registers that can cause glitches even when the scaler is off */
++ REG_WRITE(SCL_TAP_CONTROL, 0);
++ REG_WRITE(SCL_AUTOMATIC_MODE_CONTROL, 0);
++ REG_WRITE(SCL_F_SHARP_CONTROL, 0);
+ return false;
+ }
+
+@@ -165,7 +168,7 @@ static bool dce60_setup_scaling_configuration(
+ SCL_H_NUM_OF_TAPS, data->taps.h_taps - 1,
+ SCL_V_NUM_OF_TAPS, data->taps.v_taps - 1);
+
+- /* DCE6 has no SCL_MODE register, skip scale mode programming */
++ REG_WRITE(SCL_SCALER_ENABLE, 1);
+
+ /* DCE6 has no SCL_BOUNDARY_MODE bit, skip replace out of bound pixels */
+
+@@ -502,6 +505,8 @@ static void dce60_transform_set_scaler(
+ REG_SET(DC_LB_MEM_SIZE, 0,
+ DC_LB_MEM_SIZE, xfm_dce->lb_memory_size);
+
++ REG_WRITE(SCL_UPDATE, 0x00010000);
++
+ /* Clear SCL_F_SHARP_CONTROL value to 0 */
+ REG_WRITE(SCL_F_SHARP_CONTROL, 0);
+
+@@ -527,8 +532,7 @@ static void dce60_transform_set_scaler(
+ if (coeffs_v != xfm_dce->filter_v || coeffs_h != xfm_dce->filter_h) {
+ /* 4. Program vertical filters */
+ if (xfm_dce->filter_v == NULL)
+- REG_SET(SCL_VERT_FILTER_CONTROL, 0,
+- SCL_V_2TAP_HARDCODE_COEF_EN, 0);
++ REG_WRITE(SCL_VERT_FILTER_CONTROL, 0);
+ program_multi_taps_filter(
+ xfm_dce,
+ data->taps.v_taps,
+@@ -542,8 +546,7 @@ static void dce60_transform_set_scaler(
+
+ /* 5. Program horizontal filters */
+ if (xfm_dce->filter_h == NULL)
+- REG_SET(SCL_HORZ_FILTER_CONTROL, 0,
+- SCL_H_2TAP_HARDCODE_COEF_EN, 0);
++ REG_WRITE(SCL_HORZ_FILTER_CONTROL, 0);
+ program_multi_taps_filter(
+ xfm_dce,
+ data->taps.h_taps,
+@@ -566,6 +569,8 @@ static void dce60_transform_set_scaler(
+ /* DCE6 has no SCL_COEF_UPDATE_COMPLETE bit to flip to new coefficient memory */
+
+ /* DCE6 DATA_FORMAT register does not support ALPHA_EN */
++
++ REG_WRITE(SCL_UPDATE, 0);
+ }
+ #endif
+
+diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.h b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.h
+index cbce194ec7b82b..eb716e8337e236 100644
+--- a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.h
++++ b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.h
+@@ -155,6 +155,9 @@
+ SRI(SCL_COEF_RAM_TAP_DATA, SCL, id), \
+ SRI(VIEWPORT_START, SCL, id), \
+ SRI(VIEWPORT_SIZE, SCL, id), \
++ SRI(SCL_SCALER_ENABLE, SCL, id), \
++ SRI(SCL_HORZ_FILTER_INIT_RGB_LUMA, SCL, id), \
++ SRI(SCL_HORZ_FILTER_INIT_CHROMA, SCL, id), \
+ SRI(SCL_HORZ_FILTER_SCALE_RATIO, SCL, id), \
+ SRI(SCL_VERT_FILTER_SCALE_RATIO, SCL, id), \
+ SRI(SCL_VERT_FILTER_INIT, SCL, id), \
+@@ -590,6 +593,7 @@ struct dce_transform_registers {
+ uint32_t SCL_VERT_FILTER_SCALE_RATIO;
+ uint32_t SCL_HORZ_FILTER_INIT;
+ #if defined(CONFIG_DRM_AMD_DC_SI)
++ uint32_t SCL_SCALER_ENABLE;
+ uint32_t SCL_HORZ_FILTER_INIT_RGB_LUMA;
+ uint32_t SCL_HORZ_FILTER_INIT_CHROMA;
+ #endif
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
+index 17a21bcbde1722..1a28061bb9ff77 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
++++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
+@@ -808,6 +808,8 @@ void dcn316_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_param
+
+ int dcn_get_max_non_odm_pix_rate_100hz(struct _vcs_dpi_soc_bounding_box_st *soc)
+ {
++ dc_assert_fp_enabled();
++
+ return soc->clock_limits[0].dispclk_mhz * 10000.0 / (1.0 + soc->dcn_downspread_percent / 100.0);
+ }
+
+@@ -815,6 +817,8 @@ int dcn_get_approx_det_segs_required_for_pstate(
+ struct _vcs_dpi_soc_bounding_box_st *soc,
+ int pix_clk_100hz, int bpp, int seg_size_kb)
+ {
++ dc_assert_fp_enabled();
++
+ /* Roughly calculate required crb to hide latency. In practice there is slightly
+ * more buffer available for latency hiding
+ */
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c
+index 5d73efa2f0c909..15a1d77dfe3624 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c
++++ b/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c
+@@ -445,6 +445,8 @@ int dcn35_populate_dml_pipes_from_context_fpu(struct dc *dc,
+ bool upscaled = false;
+ const unsigned int max_allowed_vblank_nom = 1023;
+
++ dc_assert_fp_enabled();
++
+ dcn31_populate_dml_pipes_from_context(dc, context, pipes,
+ validate_mode);
+
+@@ -498,9 +500,7 @@ int dcn35_populate_dml_pipes_from_context_fpu(struct dc *dc,
+
+ pipes[pipe_cnt].pipe.src.unbounded_req_mode = false;
+
+- DC_FP_START();
+ dcn31_zero_pipe_dcc_fraction(pipes, pipe_cnt);
+- DC_FP_END();
+
+ pipes[pipe_cnt].pipe.dest.vfront_porch = timing->v_front_porch;
+ pipes[pipe_cnt].pipe.src.dcc_rate = 3;
+@@ -581,6 +581,8 @@ void dcn35_decide_zstate_support(struct dc *dc, struct dc_state *context)
+ unsigned int i, plane_count = 0;
+ DC_LOGGER_INIT(dc->ctx->logger);
+
++ dc_assert_fp_enabled();
++
+ for (i = 0; i < dc->res_pool->pipe_count; i++) {
+ if (context->res_ctx.pipe_ctx[i].plane_state)
+ plane_count++;
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn351/dcn351_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn351/dcn351_fpu.c
+index 6f516af8295644..e5cfe73f640afe 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/dcn351/dcn351_fpu.c
++++ b/drivers/gpu/drm/amd/display/dc/dml/dcn351/dcn351_fpu.c
+@@ -478,6 +478,8 @@ int dcn351_populate_dml_pipes_from_context_fpu(struct dc *dc,
+ bool upscaled = false;
+ const unsigned int max_allowed_vblank_nom = 1023;
+
++ dc_assert_fp_enabled();
++
+ dcn31_populate_dml_pipes_from_context(dc, context, pipes,
+ validate_mode);
+
+@@ -531,9 +533,7 @@ int dcn351_populate_dml_pipes_from_context_fpu(struct dc *dc,
+
+ pipes[pipe_cnt].pipe.src.unbounded_req_mode = false;
+
+- DC_FP_START();
+ dcn31_zero_pipe_dcc_fraction(pipes, pipe_cnt);
+- DC_FP_END();
+
+ pipes[pipe_cnt].pipe.dest.vfront_porch = timing->v_front_porch;
+ pipes[pipe_cnt].pipe.src.dcc_rate = 3;
+diff --git a/drivers/gpu/drm/amd/display/dc/resource/dce60/dce60_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dce60/dce60_resource.c
+index 53b60044653f80..f887d59da7c6f9 100644
+--- a/drivers/gpu/drm/amd/display/dc/resource/dce60/dce60_resource.c
++++ b/drivers/gpu/drm/amd/display/dc/resource/dce60/dce60_resource.c
+@@ -403,13 +403,13 @@ static const struct dc_plane_cap plane_cap = {
+ },
+
+ .max_upscale_factor = {
+- .argb8888 = 16000,
++ .argb8888 = 1,
+ .nv12 = 1,
+ .fp16 = 1
+ },
+
+ .max_downscale_factor = {
+- .argb8888 = 250,
++ .argb8888 = 1,
+ .nv12 = 1,
+ .fp16 = 1
+ }
+diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c
+index 8475c6eec547b5..32678b66c410bd 100644
+--- a/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c
++++ b/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c
+@@ -1760,6 +1760,20 @@ enum dc_status dcn35_patch_unknown_plane_state(struct dc_plane_state *plane_stat
+ }
+
+
++static int populate_dml_pipes_from_context_fpu(struct dc *dc,
++ struct dc_state *context,
++ display_e2e_pipe_params_st *pipes,
++ enum dc_validate_mode validate_mode)
++{
++ int ret;
++
++ DC_FP_START();
++ ret = dcn35_populate_dml_pipes_from_context_fpu(dc, context, pipes, validate_mode);
++ DC_FP_END();
++
++ return ret;
++}
++
+ static struct resource_funcs dcn35_res_pool_funcs = {
+ .destroy = dcn35_destroy_resource_pool,
+ .link_enc_create = dcn35_link_encoder_create,
+@@ -1770,7 +1784,7 @@ static struct resource_funcs dcn35_res_pool_funcs = {
+ .validate_bandwidth = dcn35_validate_bandwidth,
+ .calculate_wm_and_dlg = NULL,
+ .update_soc_for_wm_a = dcn31_update_soc_for_wm_a,
+- .populate_dml_pipes = dcn35_populate_dml_pipes_from_context_fpu,
++ .populate_dml_pipes = populate_dml_pipes_from_context_fpu,
+ .acquire_free_pipe_as_secondary_dpp_pipe = dcn20_acquire_free_pipe_for_layer,
+ .release_pipe = dcn20_release_pipe,
+ .add_stream_to_ctx = dcn30_add_stream_to_ctx,
+diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c
+index 0971c0f7418655..677cee27589c2e 100644
+--- a/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c
++++ b/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c
+@@ -1732,6 +1732,21 @@ static enum dc_status dcn351_validate_bandwidth(struct dc *dc,
+ return out ? DC_OK : DC_FAIL_BANDWIDTH_VALIDATE;
+ }
+
++static int populate_dml_pipes_from_context_fpu(struct dc *dc,
++ struct dc_state *context,
++ display_e2e_pipe_params_st *pipes,
++ enum dc_validate_mode validate_mode)
++{
++ int ret;
++
++ DC_FP_START();
++ ret = dcn351_populate_dml_pipes_from_context_fpu(dc, context, pipes, validate_mode);
++ DC_FP_END();
++
++ return ret;
++
++}
++
+ static struct resource_funcs dcn351_res_pool_funcs = {
+ .destroy = dcn351_destroy_resource_pool,
+ .link_enc_create = dcn35_link_encoder_create,
+@@ -1742,7 +1757,7 @@ static struct resource_funcs dcn351_res_pool_funcs = {
+ .validate_bandwidth = dcn351_validate_bandwidth,
+ .calculate_wm_and_dlg = NULL,
+ .update_soc_for_wm_a = dcn31_update_soc_for_wm_a,
+- .populate_dml_pipes = dcn351_populate_dml_pipes_from_context_fpu,
++ .populate_dml_pipes = populate_dml_pipes_from_context_fpu,
+ .acquire_free_pipe_as_secondary_dpp_pipe = dcn20_acquire_free_pipe_for_layer,
+ .release_pipe = dcn20_release_pipe,
+ .add_stream_to_ctx = dcn30_add_stream_to_ctx,
+diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn36/dcn36_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn36/dcn36_resource.c
+index 8bae7fcedc22d3..d81540515e5ce4 100644
+--- a/drivers/gpu/drm/amd/display/dc/resource/dcn36/dcn36_resource.c
++++ b/drivers/gpu/drm/amd/display/dc/resource/dcn36/dcn36_resource.c
+@@ -1734,6 +1734,20 @@ static enum dc_status dcn35_validate_bandwidth(struct dc *dc,
+ }
+
+
++static int populate_dml_pipes_from_context_fpu(struct dc *dc,
++ struct dc_state *context,
++ display_e2e_pipe_params_st *pipes,
++ enum dc_validate_mode validate_mode)
++{
++ int ret;
++
++ DC_FP_START();
++ ret = dcn35_populate_dml_pipes_from_context_fpu(dc, context, pipes, validate_mode);
++ DC_FP_END();
++
++ return ret;
++}
++
+ static struct resource_funcs dcn36_res_pool_funcs = {
+ .destroy = dcn36_destroy_resource_pool,
+ .link_enc_create = dcn35_link_encoder_create,
+@@ -1744,7 +1758,7 @@ static struct resource_funcs dcn36_res_pool_funcs = {
+ .validate_bandwidth = dcn35_validate_bandwidth,
+ .calculate_wm_and_dlg = NULL,
+ .update_soc_for_wm_a = dcn31_update_soc_for_wm_a,
+- .populate_dml_pipes = dcn35_populate_dml_pipes_from_context_fpu,
++ .populate_dml_pipes = populate_dml_pipes_from_context_fpu,
+ .acquire_free_pipe_as_secondary_dpp_pipe = dcn20_acquire_free_pipe_for_layer,
+ .release_pipe = dcn20_release_pipe,
+ .add_stream_to_ctx = dcn30_add_stream_to_ctx,
+diff --git a/drivers/gpu/drm/amd/display/dc/sspl/dc_spl.c b/drivers/gpu/drm/amd/display/dc/sspl/dc_spl.c
+index 55b929ca798298..b1fb0f8a253a5f 100644
+--- a/drivers/gpu/drm/amd/display/dc/sspl/dc_spl.c
++++ b/drivers/gpu/drm/amd/display/dc/sspl/dc_spl.c
+@@ -641,16 +641,16 @@ static void spl_calculate_inits_and_viewports(struct spl_in *spl_in,
+ /* this gives the direction of the cositing (negative will move
+ * left, right otherwise)
+ */
+- int sign = 1;
++ int h_sign = flip_horz_scan_dir ? -1 : 1;
++ int v_sign = flip_vert_scan_dir ? -1 : 1;
+
+ switch (spl_in->basic_in.cositing) {
+-
+ case CHROMA_COSITING_TOPLEFT:
+- init_adj_h = spl_fixpt_from_fraction(sign, 4);
+- init_adj_v = spl_fixpt_from_fraction(sign, 4);
++ init_adj_h = spl_fixpt_from_fraction(h_sign, 4);
++ init_adj_v = spl_fixpt_from_fraction(v_sign, 4);
+ break;
+ case CHROMA_COSITING_LEFT:
+- init_adj_h = spl_fixpt_from_fraction(sign, 4);
++ init_adj_h = spl_fixpt_from_fraction(h_sign, 4);
+ init_adj_v = spl_fixpt_zero;
+ break;
+ case CHROMA_COSITING_NONE:
+diff --git a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_6_0_d.h b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_6_0_d.h
+index 9de01ae574c035..067eddd9c62d80 100644
+--- a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_6_0_d.h
++++ b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_6_0_d.h
+@@ -4115,6 +4115,7 @@
+ #define mmSCL0_SCL_COEF_RAM_CONFLICT_STATUS 0x1B55
+ #define mmSCL0_SCL_COEF_RAM_SELECT 0x1B40
+ #define mmSCL0_SCL_COEF_RAM_TAP_DATA 0x1B41
++#define mmSCL0_SCL_SCALER_ENABLE 0x1B42
+ #define mmSCL0_SCL_CONTROL 0x1B44
+ #define mmSCL0_SCL_DEBUG 0x1B6A
+ #define mmSCL0_SCL_DEBUG2 0x1B69
+@@ -4144,6 +4145,7 @@
+ #define mmSCL1_SCL_COEF_RAM_CONFLICT_STATUS 0x1E55
+ #define mmSCL1_SCL_COEF_RAM_SELECT 0x1E40
+ #define mmSCL1_SCL_COEF_RAM_TAP_DATA 0x1E41
++#define mmSCL1_SCL_SCALER_ENABLE 0x1E42
+ #define mmSCL1_SCL_CONTROL 0x1E44
+ #define mmSCL1_SCL_DEBUG 0x1E6A
+ #define mmSCL1_SCL_DEBUG2 0x1E69
+@@ -4173,6 +4175,7 @@
+ #define mmSCL2_SCL_COEF_RAM_CONFLICT_STATUS 0x4155
+ #define mmSCL2_SCL_COEF_RAM_SELECT 0x4140
+ #define mmSCL2_SCL_COEF_RAM_TAP_DATA 0x4141
++#define mmSCL2_SCL_SCALER_ENABLE 0x4142
+ #define mmSCL2_SCL_CONTROL 0x4144
+ #define mmSCL2_SCL_DEBUG 0x416A
+ #define mmSCL2_SCL_DEBUG2 0x4169
+@@ -4202,6 +4205,7 @@
+ #define mmSCL3_SCL_COEF_RAM_CONFLICT_STATUS 0x4455
+ #define mmSCL3_SCL_COEF_RAM_SELECT 0x4440
+ #define mmSCL3_SCL_COEF_RAM_TAP_DATA 0x4441
++#define mmSCL3_SCL_SCALER_ENABLE 0x4442
+ #define mmSCL3_SCL_CONTROL 0x4444
+ #define mmSCL3_SCL_DEBUG 0x446A
+ #define mmSCL3_SCL_DEBUG2 0x4469
+@@ -4231,6 +4235,7 @@
+ #define mmSCL4_SCL_COEF_RAM_CONFLICT_STATUS 0x4755
+ #define mmSCL4_SCL_COEF_RAM_SELECT 0x4740
+ #define mmSCL4_SCL_COEF_RAM_TAP_DATA 0x4741
++#define mmSCL4_SCL_SCALER_ENABLE 0x4742
+ #define mmSCL4_SCL_CONTROL 0x4744
+ #define mmSCL4_SCL_DEBUG 0x476A
+ #define mmSCL4_SCL_DEBUG2 0x4769
+@@ -4260,6 +4265,7 @@
+ #define mmSCL5_SCL_COEF_RAM_CONFLICT_STATUS 0x4A55
+ #define mmSCL5_SCL_COEF_RAM_SELECT 0x4A40
+ #define mmSCL5_SCL_COEF_RAM_TAP_DATA 0x4A41
++#define mmSCL5_SCL_SCALER_ENABLE 0x4A42
+ #define mmSCL5_SCL_CONTROL 0x4A44
+ #define mmSCL5_SCL_DEBUG 0x4A6A
+ #define mmSCL5_SCL_DEBUG2 0x4A69
+@@ -4287,6 +4293,7 @@
+ #define mmSCL_COEF_RAM_CONFLICT_STATUS 0x1B55
+ #define mmSCL_COEF_RAM_SELECT 0x1B40
+ #define mmSCL_COEF_RAM_TAP_DATA 0x1B41
++#define mmSCL_SCALER_ENABLE 0x1B42
+ #define mmSCL_CONTROL 0x1B44
+ #define mmSCL_DEBUG 0x1B6A
+ #define mmSCL_DEBUG2 0x1B69
+diff --git a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_6_0_sh_mask.h b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_6_0_sh_mask.h
+index 2d6a598a6c25cd..9317a7afa6211f 100644
+--- a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_6_0_sh_mask.h
++++ b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_6_0_sh_mask.h
+@@ -8650,6 +8650,8 @@
+ #define REGAMMA_LUT_INDEX__REGAMMA_LUT_INDEX__SHIFT 0x00000000
+ #define REGAMMA_LUT_WRITE_EN_MASK__REGAMMA_LUT_WRITE_EN_MASK_MASK 0x00000007L
+ #define REGAMMA_LUT_WRITE_EN_MASK__REGAMMA_LUT_WRITE_EN_MASK__SHIFT 0x00000000
++#define SCL_SCALER_ENABLE__SCL_SCALE_EN_MASK 0x00000001L
++#define SCL_SCALER_ENABLE__SCL_SCALE_EN__SHIFT 0x00000000
+ #define SCL_ALU_CONTROL__SCL_ALU_DISABLE_MASK 0x00000001L
+ #define SCL_ALU_CONTROL__SCL_ALU_DISABLE__SHIFT 0x00000000
+ #define SCL_BYPASS_CONTROL__SCL_BYPASS_MODE_MASK 0x00000003L
+diff --git a/drivers/gpu/drm/exynos/exynos7_drm_decon.c b/drivers/gpu/drm/exynos/exynos7_drm_decon.c
+index 805aa28c172300..b8d9b72513199e 100644
+--- a/drivers/gpu/drm/exynos/exynos7_drm_decon.c
++++ b/drivers/gpu/drm/exynos/exynos7_drm_decon.c
+@@ -69,7 +69,6 @@ struct decon_context {
+ void __iomem *regs;
+ unsigned long irq_flags;
+ bool i80_if;
+- bool suspended;
+ wait_queue_head_t wait_vsync_queue;
+ atomic_t wait_vsync_event;
+
+@@ -132,9 +131,6 @@ static void decon_shadow_protect_win(struct decon_context *ctx,
+
+ static void decon_wait_for_vblank(struct decon_context *ctx)
+ {
+- if (ctx->suspended)
+- return;
+-
+ atomic_set(&ctx->wait_vsync_event, 1);
+
+ /*
+@@ -210,9 +206,6 @@ static void decon_commit(struct exynos_drm_crtc *crtc)
+ struct drm_display_mode *mode = &crtc->base.state->adjusted_mode;
+ u32 val, clkdiv;
+
+- if (ctx->suspended)
+- return;
+-
+ /* nothing to do if we haven't set the mode yet */
+ if (mode->htotal == 0 || mode->vtotal == 0)
+ return;
+@@ -274,9 +267,6 @@ static int decon_enable_vblank(struct exynos_drm_crtc *crtc)
+ struct decon_context *ctx = crtc->ctx;
+ u32 val;
+
+- if (ctx->suspended)
+- return -EPERM;
+-
+ if (!test_and_set_bit(0, &ctx->irq_flags)) {
+ val = readl(ctx->regs + VIDINTCON0);
+
+@@ -299,9 +289,6 @@ static void decon_disable_vblank(struct exynos_drm_crtc *crtc)
+ struct decon_context *ctx = crtc->ctx;
+ u32 val;
+
+- if (ctx->suspended)
+- return;
+-
+ if (test_and_clear_bit(0, &ctx->irq_flags)) {
+ val = readl(ctx->regs + VIDINTCON0);
+
+@@ -404,9 +391,6 @@ static void decon_atomic_begin(struct exynos_drm_crtc *crtc)
+ struct decon_context *ctx = crtc->ctx;
+ int i;
+
+- if (ctx->suspended)
+- return;
+-
+ for (i = 0; i < WINDOWS_NR; i++)
+ decon_shadow_protect_win(ctx, i, true);
+ }
+@@ -427,9 +411,6 @@ static void decon_update_plane(struct exynos_drm_crtc *crtc,
+ unsigned int pitch = fb->pitches[0];
+ unsigned int vidw_addr0_base = ctx->data->vidw_buf_start_base;
+
+- if (ctx->suspended)
+- return;
+-
+ /*
+ * SHADOWCON/PRTCON register is used for enabling timing.
+ *
+@@ -517,9 +498,6 @@ static void decon_disable_plane(struct exynos_drm_crtc *crtc,
+ unsigned int win = plane->index;
+ u32 val;
+
+- if (ctx->suspended)
+- return;
+-
+ /* protect windows */
+ decon_shadow_protect_win(ctx, win, true);
+
+@@ -538,9 +516,6 @@ static void decon_atomic_flush(struct exynos_drm_crtc *crtc)
+ struct decon_context *ctx = crtc->ctx;
+ int i;
+
+- if (ctx->suspended)
+- return;
+-
+ for (i = 0; i < WINDOWS_NR; i++)
+ decon_shadow_protect_win(ctx, i, false);
+ exynos_crtc_handle_event(crtc);
+@@ -568,9 +543,6 @@ static void decon_atomic_enable(struct exynos_drm_crtc *crtc)
+ struct decon_context *ctx = crtc->ctx;
+ int ret;
+
+- if (!ctx->suspended)
+- return;
+-
+ ret = pm_runtime_resume_and_get(ctx->dev);
+ if (ret < 0) {
+ DRM_DEV_ERROR(ctx->dev, "failed to enable DECON device.\n");
+@@ -584,8 +556,6 @@ static void decon_atomic_enable(struct exynos_drm_crtc *crtc)
+ decon_enable_vblank(ctx->crtc);
+
+ decon_commit(ctx->crtc);
+-
+- ctx->suspended = false;
+ }
+
+ static void decon_atomic_disable(struct exynos_drm_crtc *crtc)
+@@ -593,9 +563,6 @@ static void decon_atomic_disable(struct exynos_drm_crtc *crtc)
+ struct decon_context *ctx = crtc->ctx;
+ int i;
+
+- if (ctx->suspended)
+- return;
+-
+ /*
+ * We need to make sure that all windows are disabled before we
+ * suspend that connector. Otherwise we might try to scan from
+@@ -605,8 +572,6 @@ static void decon_atomic_disable(struct exynos_drm_crtc *crtc)
+ decon_disable_plane(crtc, &ctx->planes[i]);
+
+ pm_runtime_put_sync(ctx->dev);
+-
+- ctx->suspended = true;
+ }
+
+ static const struct exynos_drm_crtc_ops decon_crtc_ops = {
+@@ -727,7 +692,6 @@ static int decon_probe(struct platform_device *pdev)
+ return -ENOMEM;
+
+ ctx->dev = dev;
+- ctx->suspended = true;
+ ctx->data = of_device_get_match_data(dev);
+
+ i80_if_timings = of_get_child_by_name(dev->of_node, "i80-if-timings");
+diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+index 28e6705c6da682..3369a03978d533 100644
+--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
++++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+@@ -272,6 +272,8 @@ static int a6xx_gmu_start(struct a6xx_gmu *gmu)
+ if (ret)
+ DRM_DEV_ERROR(gmu->dev, "GMU firmware initialization timed out\n");
+
++ set_bit(GMU_STATUS_FW_START, &gmu->status);
++
+ return ret;
+ }
+
+@@ -518,6 +520,9 @@ static int a6xx_rpmh_start(struct a6xx_gmu *gmu)
+ int ret;
+ u32 val;
+
++ if (!test_and_clear_bit(GMU_STATUS_PDC_SLEEP, &gmu->status))
++ return 0;
++
+ gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, BIT(1));
+
+ ret = gmu_poll_timeout(gmu, REG_A6XX_GMU_RSCC_CONTROL_ACK, val,
+@@ -545,6 +550,9 @@ static void a6xx_rpmh_stop(struct a6xx_gmu *gmu)
+ int ret;
+ u32 val;
+
++ if (test_and_clear_bit(GMU_STATUS_FW_START, &gmu->status))
++ return;
++
+ gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, 1);
+
+ ret = gmu_poll_timeout_rscc(gmu, REG_A6XX_GPU_RSCC_RSC_STATUS0_DRV0,
+@@ -553,6 +561,8 @@ static void a6xx_rpmh_stop(struct a6xx_gmu *gmu)
+ DRM_DEV_ERROR(gmu->dev, "Unable to power off the GPU RSC\n");
+
+ gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, 0);
++
++ set_bit(GMU_STATUS_PDC_SLEEP, &gmu->status);
+ }
+
+ static inline void pdc_write(void __iomem *ptr, u32 offset, u32 value)
+@@ -681,8 +691,6 @@ static void a6xx_gmu_rpmh_init(struct a6xx_gmu *gmu)
+ /* ensure no writes happen before the uCode is fully written */
+ wmb();
+
+- a6xx_rpmh_stop(gmu);
+-
+ err:
+ if (!IS_ERR_OR_NULL(pdcptr))
+ iounmap(pdcptr);
+@@ -842,19 +850,15 @@ static int a6xx_gmu_fw_start(struct a6xx_gmu *gmu, unsigned int state)
+ else
+ gmu_write(gmu, REG_A6XX_GMU_GENERAL_7, 1);
+
+- if (state == GMU_WARM_BOOT) {
+- ret = a6xx_rpmh_start(gmu);
+- if (ret)
+- return ret;
+- } else {
++ ret = a6xx_rpmh_start(gmu);
++ if (ret)
++ return ret;
++
++ if (state == GMU_COLD_BOOT) {
+ if (WARN(!adreno_gpu->fw[ADRENO_FW_GMU],
+ "GMU firmware is not loaded\n"))
+ return -ENOENT;
+
+- ret = a6xx_rpmh_start(gmu);
+- if (ret)
+- return ret;
+-
+ ret = a6xx_gmu_fw_load(gmu);
+ if (ret)
+ return ret;
+@@ -1023,6 +1027,8 @@ static void a6xx_gmu_force_off(struct a6xx_gmu *gmu)
+
+ /* Reset GPU core blocks */
+ a6xx_gpu_sw_reset(gpu, true);
++
++ a6xx_rpmh_stop(gmu);
+ }
+
+ static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct a6xx_gmu *gmu)
+diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+index d1ce11131ba674..069a8c9474e8be 100644
+--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
++++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+@@ -117,6 +117,12 @@ struct a6xx_gmu {
+
+ struct qmp *qmp;
+ struct a6xx_hfi_msg_bw_table *bw_table;
++
++/* To check if we can trigger sleep seq at PDC. Cleared in a6xx_rpmh_stop() */
++#define GMU_STATUS_FW_START 0
++/* To track if PDC sleep seq was done */
++#define GMU_STATUS_PDC_SLEEP 1
++ unsigned long status;
+ };
+
+ static inline u32 gmu_read(struct a6xx_gmu *gmu, u32 offset)
+diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
+index b96f0555ca1453..f26562eafffc86 100644
+--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
++++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
+@@ -929,7 +929,7 @@ nouveau_bo_move_prep(struct nouveau_drm *drm, struct ttm_buffer_object *bo,
+ nvif_vmm_put(vmm, &old_mem->vma[1]);
+ nvif_vmm_put(vmm, &old_mem->vma[0]);
+ }
+- return 0;
++ return ret;
+ }
+
+ static int
+diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
+index 4d8e9b34702a76..23390cbb6ae322 100644
+--- a/drivers/gpu/drm/panthor/panthor_drv.c
++++ b/drivers/gpu/drm/panthor/panthor_drv.c
+@@ -1103,14 +1103,15 @@ static int panthor_ioctl_group_create(struct drm_device *ddev, void *data,
+
+ ret = group_priority_permit(file, args->priority);
+ if (ret)
+- return ret;
++ goto out;
+
+ ret = panthor_group_create(pfile, args, queue_args);
+- if (ret >= 0) {
+- args->group_handle = ret;
+- ret = 0;
+- }
++ if (ret < 0)
++ goto out;
++ args->group_handle = ret;
++ ret = 0;
+
++out:
+ kvfree(queue_args);
+ return ret;
+ }
+diff --git a/drivers/gpu/drm/renesas/rcar-du/rcar_mipi_dsi.c b/drivers/gpu/drm/renesas/rcar-du/rcar_mipi_dsi.c
+index 1af4c73f7a8877..952c3efb74da9b 100644
+--- a/drivers/gpu/drm/renesas/rcar-du/rcar_mipi_dsi.c
++++ b/drivers/gpu/drm/renesas/rcar-du/rcar_mipi_dsi.c
+@@ -576,7 +576,10 @@ static int rcar_mipi_dsi_startup(struct rcar_mipi_dsi *dsi,
+ udelay(10);
+ rcar_mipi_dsi_clr(dsi, CLOCKSET1, CLOCKSET1_UPDATEPLL);
+
+- ppisetr = PPISETR_DLEN_3 | PPISETR_CLEN;
++ rcar_mipi_dsi_clr(dsi, TXSETR, TXSETR_LANECNT_MASK);
++ rcar_mipi_dsi_set(dsi, TXSETR, dsi->lanes - 1);
++
++ ppisetr = ((BIT(dsi->lanes) - 1) & PPISETR_DLEN_MASK) | PPISETR_CLEN;
+ rcar_mipi_dsi_write(dsi, PPISETR, ppisetr);
+
+ rcar_mipi_dsi_set(dsi, PHYSETUP, PHYSETUP_SHUTDOWNZ);
+diff --git a/drivers/gpu/drm/renesas/rcar-du/rcar_mipi_dsi_regs.h b/drivers/gpu/drm/renesas/rcar-du/rcar_mipi_dsi_regs.h
+index a6b276f1d6ee15..a54c7eb4113b93 100644
+--- a/drivers/gpu/drm/renesas/rcar-du/rcar_mipi_dsi_regs.h
++++ b/drivers/gpu/drm/renesas/rcar-du/rcar_mipi_dsi_regs.h
+@@ -12,6 +12,9 @@
+ #define LINKSR_LPBUSY (1 << 1)
+ #define LINKSR_HSBUSY (1 << 0)
+
++#define TXSETR 0x100
++#define TXSETR_LANECNT_MASK (0x3 << 0)
++
+ /*
+ * Video Mode Register
+ */
+@@ -80,10 +83,7 @@
+ * PHY-Protocol Interface (PPI) Registers
+ */
+ #define PPISETR 0x700
+-#define PPISETR_DLEN_0 (0x1 << 0)
+-#define PPISETR_DLEN_1 (0x3 << 0)
+-#define PPISETR_DLEN_2 (0x7 << 0)
+-#define PPISETR_DLEN_3 (0xf << 0)
++#define PPISETR_DLEN_MASK (0xf << 0)
+ #define PPISETR_CLEN (1 << 8)
+
+ #define PPICLCR 0x710
+diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
+index 819704ac675d08..d539f25b5fbe0a 100644
+--- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
++++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
+@@ -1497,6 +1497,7 @@ static int vmw_cmd_dma(struct vmw_private *dev_priv,
+ SVGA3dCmdHeader *header)
+ {
+ struct vmw_bo *vmw_bo = NULL;
++ struct vmw_resource *res;
+ struct vmw_surface *srf = NULL;
+ VMW_DECLARE_CMD_VAR(*cmd, SVGA3dCmdSurfaceDMA);
+ int ret;
+@@ -1532,18 +1533,24 @@ static int vmw_cmd_dma(struct vmw_private *dev_priv,
+
+ dirty = (cmd->body.transfer == SVGA3D_WRITE_HOST_VRAM) ?
+ VMW_RES_DIRTY_SET : 0;
+- ret = vmw_cmd_res_check(dev_priv, sw_context, vmw_res_surface,
+- dirty, user_surface_converter,
+- &cmd->body.host.sid, NULL);
++ ret = vmw_cmd_res_check(dev_priv, sw_context, vmw_res_surface, dirty,
++ user_surface_converter, &cmd->body.host.sid,
++ NULL);
+ if (unlikely(ret != 0)) {
+ if (unlikely(ret != -ERESTARTSYS))
+ VMW_DEBUG_USER("could not find surface for DMA.\n");
+ return ret;
+ }
+
+- srf = vmw_res_to_srf(sw_context->res_cache[vmw_res_surface].res);
++ res = sw_context->res_cache[vmw_res_surface].res;
++ if (!res) {
++ VMW_DEBUG_USER("Invalid DMA surface.\n");
++ return -EINVAL;
++ }
+
+- vmw_kms_cursor_snoop(srf, sw_context->fp->tfile, &vmw_bo->tbo, header);
++ srf = vmw_res_to_srf(res);
++ vmw_kms_cursor_snoop(srf, sw_context->fp->tfile, &vmw_bo->tbo,
++ header);
+
+ return 0;
+ }
+diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
+index 7ee93e7191c7fa..35dc94c3db3998 100644
+--- a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
++++ b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
+@@ -308,8 +308,10 @@ int vmw_validation_add_resource(struct vmw_validation_context *ctx,
+ hash_add_rcu(ctx->sw_context->res_ht, &node->hash.head, node->hash.key);
+ }
+ node->res = vmw_resource_reference_unless_doomed(res);
+- if (!node->res)
++ if (!node->res) {
++ hash_del_rcu(&node->hash.head);
+ return -ESRCH;
++ }
+
+ node->first_usage = 1;
+ if (!res->dev_priv->has_mob) {
+@@ -636,7 +638,7 @@ void vmw_validation_drop_ht(struct vmw_validation_context *ctx)
+ hash_del_rcu(&val->hash.head);
+
+ list_for_each_entry(val, &ctx->resource_ctx_list, head)
+- hash_del_rcu(&entry->hash.head);
++ hash_del_rcu(&val->hash.head);
+
+ ctx->sw_context = NULL;
+ }
+diff --git a/drivers/gpu/drm/xe/xe_hw_engine_group.c b/drivers/gpu/drm/xe/xe_hw_engine_group.c
+index c926f840c87b0f..cb1d7ed54f4295 100644
+--- a/drivers/gpu/drm/xe/xe_hw_engine_group.c
++++ b/drivers/gpu/drm/xe/xe_hw_engine_group.c
+@@ -213,17 +213,13 @@ static int xe_hw_engine_group_suspend_faulting_lr_jobs(struct xe_hw_engine_group
+
+ err = q->ops->suspend_wait(q);
+ if (err)
+- goto err_suspend;
++ return err;
+ }
+
+ if (need_resume)
+ xe_hw_engine_group_resume_faulting_lr_jobs(group);
+
+ return 0;
+-
+-err_suspend:
+- up_write(&group->mode_sem);
+- return err;
+ }
+
+ /**
+diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
+index bb9b6ecad2afcd..3e301e42b2f199 100644
+--- a/drivers/gpu/drm/xe/xe_pm.c
++++ b/drivers/gpu/drm/xe/xe_pm.c
+@@ -194,7 +194,7 @@ int xe_pm_resume(struct xe_device *xe)
+ if (err)
+ goto err;
+
+- xe_i2c_pm_resume(xe, xe->d3cold.allowed);
++ xe_i2c_pm_resume(xe, true);
+
+ xe_irq_resume(xe);
+
+diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
+index d517ec9ddcbf59..83fe77ce62f761 100644
+--- a/drivers/gpu/drm/xe/xe_query.c
++++ b/drivers/gpu/drm/xe/xe_query.c
+@@ -274,8 +274,7 @@ static int query_mem_regions(struct xe_device *xe,
+ mem_regions->mem_regions[0].instance = 0;
+ mem_regions->mem_regions[0].min_page_size = PAGE_SIZE;
+ mem_regions->mem_regions[0].total_size = man->size << PAGE_SHIFT;
+- if (perfmon_capable())
+- mem_regions->mem_regions[0].used = ttm_resource_manager_usage(man);
++ mem_regions->mem_regions[0].used = ttm_resource_manager_usage(man);
+ mem_regions->num_mem_regions = 1;
+
+ for (i = XE_PL_VRAM0; i <= XE_PL_VRAM1; ++i) {
+@@ -291,13 +290,11 @@ static int query_mem_regions(struct xe_device *xe,
+ mem_regions->mem_regions[mem_regions->num_mem_regions].total_size =
+ man->size;
+
+- if (perfmon_capable()) {
+- xe_ttm_vram_get_used(man,
+- &mem_regions->mem_regions
+- [mem_regions->num_mem_regions].used,
+- &mem_regions->mem_regions
+- [mem_regions->num_mem_regions].cpu_visible_used);
+- }
++ xe_ttm_vram_get_used(man,
++ &mem_regions->mem_regions
++ [mem_regions->num_mem_regions].used,
++ &mem_regions->mem_regions
++ [mem_regions->num_mem_regions].cpu_visible_used);
+
+ mem_regions->mem_regions[mem_regions->num_mem_regions].cpu_visible_size =
+ xe_ttm_vram_get_cpu_visible_size(man);
+diff --git a/drivers/hv/mshv_common.c b/drivers/hv/mshv_common.c
+index 6f227a8a5af719..eb3df3e296bbee 100644
+--- a/drivers/hv/mshv_common.c
++++ b/drivers/hv/mshv_common.c
+@@ -151,7 +151,7 @@ int mshv_do_pre_guest_mode_work(ulong th_flags)
+ if (th_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
+ return -EINTR;
+
+- if (th_flags & _TIF_NEED_RESCHED)
++ if (th_flags & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY))
+ schedule();
+
+ if (th_flags & _TIF_NOTIFY_RESUME)
+diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
+index 72df774e410aba..cad09ff5f94dce 100644
+--- a/drivers/hv/mshv_root_main.c
++++ b/drivers/hv/mshv_root_main.c
+@@ -490,7 +490,8 @@ mshv_vp_wait_for_hv_kick(struct mshv_vp *vp)
+ static int mshv_pre_guest_mode_work(struct mshv_vp *vp)
+ {
+ const ulong work_flags = _TIF_NOTIFY_SIGNAL | _TIF_SIGPENDING |
+- _TIF_NEED_RESCHED | _TIF_NOTIFY_RESUME;
++ _TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY |
++ _TIF_NOTIFY_RESUME;
+ ulong th_flags;
+
+ th_flags = read_thread_flags();
+diff --git a/drivers/i3c/master.c b/drivers/i3c/master.c
+index 2ef898a8fd8065..67a18e437f831e 100644
+--- a/drivers/i3c/master.c
++++ b/drivers/i3c/master.c
+@@ -2492,7 +2492,7 @@ static int i3c_master_i2c_adapter_init(struct i3c_master_controller *master)
+ strscpy(adap->name, dev_name(master->dev.parent), sizeof(adap->name));
+
+ /* FIXME: Should we allow i3c masters to override these values? */
+- adap->timeout = 1000;
++ adap->timeout = HZ;
+ adap->retries = 3;
+
+ id = of_alias_get_id(master->dev.of_node, "i2c");
+diff --git a/drivers/iio/adc/pac1934.c b/drivers/iio/adc/pac1934.c
+index 09fe88eb3fb045..2e442e46f67973 100644
+--- a/drivers/iio/adc/pac1934.c
++++ b/drivers/iio/adc/pac1934.c
+@@ -88,6 +88,7 @@
+ #define PAC1934_VPOWER_3_ADDR 0x19
+ #define PAC1934_VPOWER_4_ADDR 0x1A
+ #define PAC1934_REFRESH_V_REG_ADDR 0x1F
++#define PAC1934_SLOW_REG_ADDR 0x20
+ #define PAC1934_CTRL_STAT_REGS_ADDR 0x1C
+ #define PAC1934_PID_REG_ADDR 0xFD
+ #define PAC1934_MID_REG_ADDR 0xFE
+@@ -1265,8 +1266,23 @@ static int pac1934_chip_configure(struct pac1934_chip_info *info)
+ /* no SLOW triggered REFRESH, clear POR */
+ regs[PAC1934_SLOW_REG_OFF] = 0;
+
+- ret = i2c_smbus_write_block_data(client, PAC1934_CTRL_STAT_REGS_ADDR,
+- ARRAY_SIZE(regs), (u8 *)regs);
++ /*
++ * Write the three bytes sequentially, as the device does not support
++ * block write.
++ */
++ ret = i2c_smbus_write_byte_data(client, PAC1934_CTRL_STAT_REGS_ADDR,
++ regs[PAC1934_CHANNEL_DIS_REG_OFF]);
++ if (ret)
++ return ret;
++
++ ret = i2c_smbus_write_byte_data(client,
++ PAC1934_CTRL_STAT_REGS_ADDR + PAC1934_NEG_PWR_REG_OFF,
++ regs[PAC1934_NEG_PWR_REG_OFF]);
++ if (ret)
++ return ret;
++
++ ret = i2c_smbus_write_byte_data(client, PAC1934_SLOW_REG_ADDR,
++ regs[PAC1934_SLOW_REG_OFF]);
+ if (ret)
+ return ret;
+
+diff --git a/drivers/iio/adc/xilinx-ams.c b/drivers/iio/adc/xilinx-ams.c
+index 76dd0343f5f76a..124470c9252978 100644
+--- a/drivers/iio/adc/xilinx-ams.c
++++ b/drivers/iio/adc/xilinx-ams.c
+@@ -118,7 +118,7 @@
+ #define AMS_ALARM_THRESHOLD_OFF_10 0x10
+ #define AMS_ALARM_THRESHOLD_OFF_20 0x20
+
+-#define AMS_ALARM_THR_DIRECT_MASK BIT(1)
++#define AMS_ALARM_THR_DIRECT_MASK BIT(0)
+ #define AMS_ALARM_THR_MIN 0x0000
+ #define AMS_ALARM_THR_MAX (BIT(16) - 1)
+
+@@ -389,6 +389,29 @@ static void ams_update_pl_alarm(struct ams *ams, unsigned long alarm_mask)
+ ams_pl_update_reg(ams, AMS_REG_CONFIG3, AMS_REGCFG3_ALARM_MASK, cfg);
+ }
+
++static void ams_unmask(struct ams *ams)
++{
++ unsigned int status, unmask;
++
++ status = readl(ams->base + AMS_ISR_0);
++
++ /* Clear those bits which are not active anymore */
++ unmask = (ams->current_masked_alarm ^ status) & ams->current_masked_alarm;
++
++ /* Clear status of disabled alarm */
++ unmask |= ams->intr_mask;
++
++ ams->current_masked_alarm &= status;
++
++ /* Also clear those which are masked out anyway */
++ ams->current_masked_alarm &= ~ams->intr_mask;
++
++ /* Clear the interrupts before we unmask them */
++ writel(unmask, ams->base + AMS_ISR_0);
++
++ ams_update_intrmask(ams, ~AMS_ALARM_MASK, ~AMS_ALARM_MASK);
++}
++
+ static void ams_update_alarm(struct ams *ams, unsigned long alarm_mask)
+ {
+ unsigned long flags;
+@@ -401,6 +424,7 @@ static void ams_update_alarm(struct ams *ams, unsigned long alarm_mask)
+
+ spin_lock_irqsave(&ams->intr_lock, flags);
+ ams_update_intrmask(ams, AMS_ISR0_ALARM_MASK, ~alarm_mask);
++ ams_unmask(ams);
+ spin_unlock_irqrestore(&ams->intr_lock, flags);
+ }
+
+@@ -1035,28 +1059,9 @@ static void ams_handle_events(struct iio_dev *indio_dev, unsigned long events)
+ static void ams_unmask_worker(struct work_struct *work)
+ {
+ struct ams *ams = container_of(work, struct ams, ams_unmask_work.work);
+- unsigned int status, unmask;
+
+ spin_lock_irq(&ams->intr_lock);
+-
+- status = readl(ams->base + AMS_ISR_0);
+-
+- /* Clear those bits which are not active anymore */
+- unmask = (ams->current_masked_alarm ^ status) & ams->current_masked_alarm;
+-
+- /* Clear status of disabled alarm */
+- unmask |= ams->intr_mask;
+-
+- ams->current_masked_alarm &= status;
+-
+- /* Also clear those which are masked out anyway */
+- ams->current_masked_alarm &= ~ams->intr_mask;
+-
+- /* Clear the interrupts before we unmask them */
+- writel(unmask, ams->base + AMS_ISR_0);
+-
+- ams_update_intrmask(ams, ~AMS_ALARM_MASK, ~AMS_ALARM_MASK);
+-
++ ams_unmask(ams);
+ spin_unlock_irq(&ams->intr_lock);
+
+ /* If still pending some alarm re-trigger the timer */
+diff --git a/drivers/iio/dac/ad5360.c b/drivers/iio/dac/ad5360.c
+index a57b0a093112bc..8271849b1c83c0 100644
+--- a/drivers/iio/dac/ad5360.c
++++ b/drivers/iio/dac/ad5360.c
+@@ -262,7 +262,7 @@ static int ad5360_update_ctrl(struct iio_dev *indio_dev, unsigned int set,
+ unsigned int clr)
+ {
+ struct ad5360_state *st = iio_priv(indio_dev);
+- unsigned int ret;
++ int ret;
+
+ mutex_lock(&st->lock);
+
+diff --git a/drivers/iio/dac/ad5421.c b/drivers/iio/dac/ad5421.c
+index 1462ee640b1686..d9d7031c443250 100644
+--- a/drivers/iio/dac/ad5421.c
++++ b/drivers/iio/dac/ad5421.c
+@@ -186,7 +186,7 @@ static int ad5421_update_ctrl(struct iio_dev *indio_dev, unsigned int set,
+ unsigned int clr)
+ {
+ struct ad5421_state *st = iio_priv(indio_dev);
+- unsigned int ret;
++ int ret;
+
+ mutex_lock(&st->lock);
+
+diff --git a/drivers/iio/frequency/adf4350.c b/drivers/iio/frequency/adf4350.c
+index 47f1c7e9efa9f4..475a7a653bfb52 100644
+--- a/drivers/iio/frequency/adf4350.c
++++ b/drivers/iio/frequency/adf4350.c
+@@ -149,6 +149,19 @@ static int adf4350_set_freq(struct adf4350_state *st, unsigned long long freq)
+ if (freq > ADF4350_MAX_OUT_FREQ || freq < st->min_out_freq)
+ return -EINVAL;
+
++ st->r4_rf_div_sel = 0;
++
++ /*
++ * !\TODO: The below computation is making sure we get a power of 2
++ * shift (st->r4_rf_div_sel) so that freq becomes higher or equal to
++ * ADF4350_MIN_VCO_FREQ. This might be simplified with fls()/fls_long()
++ * and friends.
++ */
++ while (freq < ADF4350_MIN_VCO_FREQ) {
++ freq <<= 1;
++ st->r4_rf_div_sel++;
++ }
++
+ if (freq > ADF4350_MAX_FREQ_45_PRESC) {
+ prescaler = ADF4350_REG1_PRESCALER;
+ mdiv = 75;
+@@ -157,13 +170,6 @@ static int adf4350_set_freq(struct adf4350_state *st, unsigned long long freq)
+ mdiv = 23;
+ }
+
+- st->r4_rf_div_sel = 0;
+-
+- while (freq < ADF4350_MIN_VCO_FREQ) {
+- freq <<= 1;
+- st->r4_rf_div_sel++;
+- }
+-
+ /*
+ * Allow a predefined reference division factor
+ * if not set, compute our own
+diff --git a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
+index a4d42e7e21807f..ee780f530dc861 100644
+--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
++++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
+@@ -711,20 +711,12 @@ static void inv_icm42600_disable_vdd_reg(void *_data)
+ static void inv_icm42600_disable_vddio_reg(void *_data)
+ {
+ struct inv_icm42600_state *st = _data;
+- const struct device *dev = regmap_get_device(st->map);
+- int ret;
+-
+- ret = regulator_disable(st->vddio_supply);
+- if (ret)
+- dev_err(dev, "failed to disable vddio error %d\n", ret);
+-}
++ struct device *dev = regmap_get_device(st->map);
+
+-static void inv_icm42600_disable_pm(void *_data)
+-{
+- struct device *dev = _data;
++ if (pm_runtime_status_suspended(dev))
++ return;
+
+- pm_runtime_put_sync(dev);
+- pm_runtime_disable(dev);
++ regulator_disable(st->vddio_supply);
+ }
+
+ int inv_icm42600_core_probe(struct regmap *regmap, int chip,
+@@ -824,16 +816,14 @@ int inv_icm42600_core_probe(struct regmap *regmap, int chip,
+ return ret;
+
+ /* setup runtime power management */
+- ret = pm_runtime_set_active(dev);
++ ret = devm_pm_runtime_set_active_enabled(dev);
+ if (ret)
+ return ret;
+- pm_runtime_get_noresume(dev);
+- pm_runtime_enable(dev);
++
+ pm_runtime_set_autosuspend_delay(dev, INV_ICM42600_SUSPEND_DELAY_MS);
+ pm_runtime_use_autosuspend(dev);
+- pm_runtime_put(dev);
+
+- return devm_add_action_or_reset(dev, inv_icm42600_disable_pm, dev);
++ return ret;
+ }
+ EXPORT_SYMBOL_NS_GPL(inv_icm42600_core_probe, "IIO_ICM42600");
+
+@@ -847,17 +837,15 @@ static int inv_icm42600_suspend(struct device *dev)
+ struct device *accel_dev;
+ bool wakeup;
+ int accel_conf;
+- int ret;
++ int ret = 0;
+
+ mutex_lock(&st->lock);
+
+ st->suspended.gyro = st->conf.gyro.mode;
+ st->suspended.accel = st->conf.accel.mode;
+ st->suspended.temp = st->conf.temp_en;
+- if (pm_runtime_suspended(dev)) {
+- ret = 0;
++ if (pm_runtime_suspended(dev))
+ goto out_unlock;
+- }
+
+ /* disable FIFO data streaming */
+ if (st->fifo.on) {
+@@ -910,10 +898,13 @@ static int inv_icm42600_resume(struct device *dev)
+ struct inv_icm42600_sensor_state *accel_st = iio_priv(st->indio_accel);
+ struct device *accel_dev;
+ bool wakeup;
+- int ret;
++ int ret = 0;
+
+ mutex_lock(&st->lock);
+
++ if (pm_runtime_suspended(dev))
++ goto out_unlock;
++
+ /* check wakeup capability */
+ accel_dev = &st->indio_accel->dev;
+ wakeup = st->apex.on && device_may_wakeup(accel_dev);
+@@ -927,10 +918,6 @@ static int inv_icm42600_resume(struct device *dev)
+ goto out_unlock;
+ }
+
+- pm_runtime_disable(dev);
+- pm_runtime_set_active(dev);
+- pm_runtime_enable(dev);
+-
+ /* restore sensors state */
+ ret = inv_icm42600_set_pwr_mgmt0(st, st->suspended.gyro,
+ st->suspended.accel,
+diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
+index dff2d895b8abd7..e236c7ec221f4b 100644
+--- a/drivers/iommu/intel/iommu.c
++++ b/drivers/iommu/intel/iommu.c
+@@ -3817,7 +3817,7 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
+ }
+
+ if (info->ats_supported && ecap_prs(iommu->ecap) &&
+- pci_pri_supported(pdev))
++ ecap_pds(iommu->ecap) && pci_pri_supported(pdev))
+ info->pri_supported = 1;
+ }
+ }
+diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
+index bf69a4802b71e7..9c4af7d5884631 100644
+--- a/drivers/irqchip/irq-sifive-plic.c
++++ b/drivers/irqchip/irq-sifive-plic.c
+@@ -252,7 +252,8 @@ static int plic_irq_suspend(void)
+
+ priv = per_cpu_ptr(&plic_handlers, smp_processor_id())->priv;
+
+- for (i = 0; i < priv->nr_irqs; i++) {
++ /* irq ID 0 is reserved */
++ for (i = 1; i < priv->nr_irqs; i++) {
+ __assign_bit(i, priv->prio_save,
+ readl(priv->regs + PRIORITY_BASE + i * PRIORITY_PER_ID));
+ }
+@@ -283,7 +284,8 @@ static void plic_irq_resume(void)
+
+ priv = per_cpu_ptr(&plic_handlers, smp_processor_id())->priv;
+
+- for (i = 0; i < priv->nr_irqs; i++) {
++ /* irq ID 0 is reserved */
++ for (i = 1; i < priv->nr_irqs; i++) {
+ index = BIT_WORD(i);
+ writel((priv->prio_save[index] & BIT_MASK(i)) ? 1 : 0,
+ priv->regs + PRIORITY_BASE + i * PRIORITY_PER_ID);
+diff --git a/drivers/mailbox/mtk-cmdq-mailbox.c b/drivers/mailbox/mtk-cmdq-mailbox.c
+index 532929916e9988..654a60f63756a4 100644
+--- a/drivers/mailbox/mtk-cmdq-mailbox.c
++++ b/drivers/mailbox/mtk-cmdq-mailbox.c
+@@ -379,20 +379,13 @@ static int cmdq_mbox_send_data(struct mbox_chan *chan, void *data)
+ struct cmdq *cmdq = dev_get_drvdata(chan->mbox->dev);
+ struct cmdq_task *task;
+ unsigned long curr_pa, end_pa;
+- int ret;
+
+ /* Client should not flush new tasks if suspended. */
+ WARN_ON(cmdq->suspended);
+
+- ret = pm_runtime_get_sync(cmdq->mbox.dev);
+- if (ret < 0)
+- return ret;
+-
+ task = kzalloc(sizeof(*task), GFP_ATOMIC);
+- if (!task) {
+- pm_runtime_put_autosuspend(cmdq->mbox.dev);
++ if (!task)
+ return -ENOMEM;
+- }
+
+ task->cmdq = cmdq;
+ INIT_LIST_HEAD(&task->list_entry);
+@@ -439,9 +432,6 @@ static int cmdq_mbox_send_data(struct mbox_chan *chan, void *data)
+ }
+ list_move_tail(&task->list_entry, &thread->task_busy_list);
+
+- pm_runtime_mark_last_busy(cmdq->mbox.dev);
+- pm_runtime_put_autosuspend(cmdq->mbox.dev);
+-
+ return 0;
+ }
+
+diff --git a/drivers/mailbox/zynqmp-ipi-mailbox.c b/drivers/mailbox/zynqmp-ipi-mailbox.c
+index 0c143beaafda60..967967b2b8a967 100644
+--- a/drivers/mailbox/zynqmp-ipi-mailbox.c
++++ b/drivers/mailbox/zynqmp-ipi-mailbox.c
+@@ -62,7 +62,8 @@
+ #define DST_BIT_POS 9U
+ #define SRC_BITMASK GENMASK(11, 8)
+
+-#define MAX_SGI 16
++/* Macro to represent SGI type for IPI IRQs */
++#define IPI_IRQ_TYPE_SGI 2
+
+ /*
+ * Module parameters
+@@ -121,6 +122,7 @@ struct zynqmp_ipi_mbox {
+ * @dev: device pointer corresponding to the Xilinx ZynqMP
+ * IPI agent
+ * @irq: IPI agent interrupt ID
++ * @irq_type: IPI SGI or SPI IRQ type
+ * @method: IPI SMC or HVC is going to be used
+ * @local_id: local IPI agent ID
+ * @virq_sgi: IRQ number mapped to SGI
+@@ -130,6 +132,7 @@ struct zynqmp_ipi_mbox {
+ struct zynqmp_ipi_pdata {
+ struct device *dev;
+ int irq;
++ unsigned int irq_type;
+ unsigned int method;
+ u32 local_id;
+ int virq_sgi;
+@@ -887,17 +890,14 @@ static void zynqmp_ipi_free_mboxes(struct zynqmp_ipi_pdata *pdata)
+ struct zynqmp_ipi_mbox *ipi_mbox;
+ int i;
+
+- if (pdata->irq < MAX_SGI)
++ if (pdata->irq_type == IPI_IRQ_TYPE_SGI)
+ xlnx_mbox_cleanup_sgi(pdata);
+
+- i = pdata->num_mboxes;
++ i = pdata->num_mboxes - 1;
+ for (; i >= 0; i--) {
+ ipi_mbox = &pdata->ipi_mboxes[i];
+- if (ipi_mbox->dev.parent) {
+- mbox_controller_unregister(&ipi_mbox->mbox);
+- if (device_is_registered(&ipi_mbox->dev))
+- device_unregister(&ipi_mbox->dev);
+- }
++ if (device_is_registered(&ipi_mbox->dev))
++ device_unregister(&ipi_mbox->dev);
+ }
+ }
+
+@@ -959,14 +959,16 @@ static int zynqmp_ipi_probe(struct platform_device *pdev)
+ dev_err(dev, "failed to parse interrupts\n");
+ goto free_mbox_dev;
+ }
+- ret = out_irq.args[1];
++
++ /* Use interrupt type to distinguish SGI and SPI interrupts */
++ pdata->irq_type = out_irq.args[0];
+
+ /*
+ * If Interrupt number is in SGI range, then request SGI else request
+ * IPI system IRQ.
+ */
+- if (ret < MAX_SGI) {
+- pdata->irq = ret;
++ if (pdata->irq_type == IPI_IRQ_TYPE_SGI) {
++ pdata->irq = out_irq.args[1];
+ ret = xlnx_mbox_init_sgi(pdev, pdata->irq, pdata);
+ if (ret)
+ goto free_mbox_dev;
+diff --git a/drivers/md/md-linear.c b/drivers/md/md-linear.c
+index 3e1f165c2d20f6..2379ddb69ac4f5 100644
+--- a/drivers/md/md-linear.c
++++ b/drivers/md/md-linear.c
+@@ -267,6 +267,7 @@ static bool linear_make_request(struct mddev *mddev, struct bio *bio)
+ }
+
+ bio_chain(split, bio);
++ trace_block_split(split, bio->bi_iter.bi_sector);
+ submit_bio_noacct(bio);
+ bio = split;
+ }
+diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
+index 419139ad7663cc..c4d5ddee57fa82 100644
+--- a/drivers/md/raid0.c
++++ b/drivers/md/raid0.c
+@@ -473,7 +473,9 @@ static void raid0_handle_discard(struct mddev *mddev, struct bio *bio)
+ bio_endio(bio);
+ return;
+ }
++
+ bio_chain(split, bio);
++ trace_block_split(split, bio->bi_iter.bi_sector);
+ submit_bio_noacct(bio);
+ bio = split;
+ end = zone->zone_end;
+@@ -621,7 +623,9 @@ static bool raid0_make_request(struct mddev *mddev, struct bio *bio)
+ bio_endio(bio);
+ return true;
+ }
++
+ bio_chain(split, bio);
++ trace_block_split(split, bio->bi_iter.bi_sector);
+ raid0_map_submit_bio(mddev, bio);
+ bio = split;
+ }
+diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
+index d30b82beeb92fb..20facad2c271e1 100644
+--- a/drivers/md/raid1.c
++++ b/drivers/md/raid1.c
+@@ -1383,7 +1383,9 @@ static void raid1_read_request(struct mddev *mddev, struct bio *bio,
+ error = PTR_ERR(split);
+ goto err_handle;
+ }
++
+ bio_chain(split, bio);
++ trace_block_split(split, bio->bi_iter.bi_sector);
+ submit_bio_noacct(bio);
+ bio = split;
+ r1_bio->master_bio = bio;
+@@ -1591,7 +1593,9 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
+ error = PTR_ERR(split);
+ goto err_handle;
+ }
++
+ bio_chain(split, bio);
++ trace_block_split(split, bio->bi_iter.bi_sector);
+ submit_bio_noacct(bio);
+ bio = split;
+ r1_bio->master_bio = bio;
+diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
+index 9832eefb2f157b..1baa1851a0dbb8 100644
+--- a/drivers/md/raid10.c
++++ b/drivers/md/raid10.c
+@@ -1209,7 +1209,9 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
+ error = PTR_ERR(split);
+ goto err_handle;
+ }
++
+ bio_chain(split, bio);
++ trace_block_split(split, bio->bi_iter.bi_sector);
+ allow_barrier(conf);
+ submit_bio_noacct(bio);
+ wait_barrier(conf, false);
+@@ -1495,7 +1497,9 @@ static void raid10_write_request(struct mddev *mddev, struct bio *bio,
+ error = PTR_ERR(split);
+ goto err_handle;
+ }
++
+ bio_chain(split, bio);
++ trace_block_split(split, bio->bi_iter.bi_sector);
+ allow_barrier(conf);
+ submit_bio_noacct(bio);
+ wait_barrier(conf, false);
+@@ -1679,7 +1683,9 @@ static int raid10_handle_discard(struct mddev *mddev, struct bio *bio)
+ bio_endio(bio);
+ return 0;
+ }
++
+ bio_chain(split, bio);
++ trace_block_split(split, bio->bi_iter.bi_sector);
+ allow_barrier(conf);
+ /* Resend the fist split part */
+ submit_bio_noacct(split);
+@@ -1694,7 +1700,9 @@ static int raid10_handle_discard(struct mddev *mddev, struct bio *bio)
+ bio_endio(bio);
+ return 0;
+ }
++
+ bio_chain(split, bio);
++ trace_block_split(split, bio->bi_iter.bi_sector);
+ allow_barrier(conf);
+ /* Resend the second split part */
+ submit_bio_noacct(bio);
+diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
+index e385ef1355e8b3..771ac1cbab995e 100644
+--- a/drivers/md/raid5.c
++++ b/drivers/md/raid5.c
+@@ -5475,8 +5475,10 @@ static struct bio *chunk_aligned_read(struct mddev *mddev, struct bio *raid_bio)
+
+ if (sectors < bio_sectors(raid_bio)) {
+ struct r5conf *conf = mddev->private;
++
+ split = bio_split(raid_bio, sectors, GFP_NOIO, &conf->bio_split);
+ bio_chain(split, raid_bio);
++ trace_block_split(split, raid_bio->bi_iter.bi_sector);
+ submit_bio_noacct(raid_bio);
+ raid_bio = split;
+ }
+diff --git a/drivers/media/cec/usb/extron-da-hd-4k-plus/Makefile b/drivers/media/cec/usb/extron-da-hd-4k-plus/Makefile
+index 2e8f7f60263f1c..08d58524419f7d 100644
+--- a/drivers/media/cec/usb/extron-da-hd-4k-plus/Makefile
++++ b/drivers/media/cec/usb/extron-da-hd-4k-plus/Makefile
+@@ -1,8 +1,2 @@
+ extron-da-hd-4k-plus-cec-objs := extron-da-hd-4k-plus.o cec-splitter.o
+ obj-$(CONFIG_USB_EXTRON_DA_HD_4K_PLUS_CEC) := extron-da-hd-4k-plus-cec.o
+-
+-all:
+- $(MAKE) -C $(KDIR) M=$(shell pwd) modules
+-
+-install:
+- $(MAKE) -C $(KDIR) M=$(shell pwd) modules_install
+diff --git a/drivers/media/i2c/mt9p031.c b/drivers/media/i2c/mt9p031.c
+index 4ef5fb06131d5d..f444dd26ecaa60 100644
+--- a/drivers/media/i2c/mt9p031.c
++++ b/drivers/media/i2c/mt9p031.c
+@@ -1092,6 +1092,7 @@ static int mt9p031_parse_properties(struct mt9p031 *mt9p031, struct device *dev)
+ static int mt9p031_probe(struct i2c_client *client)
+ {
+ struct i2c_adapter *adapter = client->adapter;
++ const struct mt9p031_model_info *info;
+ struct mt9p031 *mt9p031;
+ unsigned int i;
+ int ret;
+@@ -1112,7 +1113,8 @@ static int mt9p031_probe(struct i2c_client *client)
+
+ mt9p031->output_control = MT9P031_OUTPUT_CONTROL_DEF;
+ mt9p031->mode2 = MT9P031_READ_MODE_2_ROW_BLC;
+- mt9p031->code = (uintptr_t)device_get_match_data(&client->dev);
++ info = device_get_match_data(&client->dev);
++ mt9p031->code = info->code;
+
+ mt9p031->regulators[0].supply = "vdd";
+ mt9p031->regulators[1].supply = "vdd_io";
+diff --git a/drivers/media/i2c/mt9v111.c b/drivers/media/i2c/mt9v111.c
+index 723fe138e7bcc0..8bf06a763a2519 100644
+--- a/drivers/media/i2c/mt9v111.c
++++ b/drivers/media/i2c/mt9v111.c
+@@ -532,8 +532,8 @@ static int mt9v111_calc_frame_rate(struct mt9v111_dev *mt9v111,
+ static int mt9v111_hw_config(struct mt9v111_dev *mt9v111)
+ {
+ struct i2c_client *c = mt9v111->client;
+- unsigned int ret;
+ u16 outfmtctrl2;
++ int ret;
+
+ /* Force device reset. */
+ ret = __mt9v111_hw_reset(mt9v111);
+diff --git a/drivers/media/mc/mc-devnode.c b/drivers/media/mc/mc-devnode.c
+index 56444edaf13651..6daa7aa9944226 100644
+--- a/drivers/media/mc/mc-devnode.c
++++ b/drivers/media/mc/mc-devnode.c
+@@ -50,11 +50,6 @@ static void media_devnode_release(struct device *cd)
+ {
+ struct media_devnode *devnode = to_media_devnode(cd);
+
+- mutex_lock(&media_devnode_lock);
+- /* Mark device node number as free */
+- clear_bit(devnode->minor, media_devnode_nums);
+- mutex_unlock(&media_devnode_lock);
+-
+ /* Release media_devnode and perform other cleanups as needed. */
+ if (devnode->release)
+ devnode->release(devnode);
+@@ -281,6 +276,7 @@ void media_devnode_unregister(struct media_devnode *devnode)
+ /* Delete the cdev on this minor as well */
+ cdev_device_del(&devnode->cdev, &devnode->dev);
+ devnode->media_dev = NULL;
++ clear_bit(devnode->minor, media_devnode_nums);
+ mutex_unlock(&media_devnode_lock);
+
+ put_device(&devnode->dev);
+diff --git a/drivers/media/mc/mc-entity.c b/drivers/media/mc/mc-entity.c
+index 04559090558205..307920c8b35492 100644
+--- a/drivers/media/mc/mc-entity.c
++++ b/drivers/media/mc/mc-entity.c
+@@ -691,7 +691,7 @@ static int media_pipeline_explore_next_link(struct media_pipeline *pipe,
+ * (already discovered through iterating over links) and pads
+ * not internally connected.
+ */
+- if (origin == local || !local->num_links ||
++ if (origin == local || local->num_links ||
+ !media_entity_has_pad_interdep(origin->entity, origin->index,
+ local->index))
+ continue;
+diff --git a/drivers/media/pci/cx18/cx18-queue.c b/drivers/media/pci/cx18/cx18-queue.c
+index 013694bfcb1c1b..7cbb2d5869320b 100644
+--- a/drivers/media/pci/cx18/cx18-queue.c
++++ b/drivers/media/pci/cx18/cx18-queue.c
+@@ -379,15 +379,22 @@ int cx18_stream_alloc(struct cx18_stream *s)
+ break;
+ }
+
++ buf->dma_handle = dma_map_single(&s->cx->pci_dev->dev,
++ buf->buf, s->buf_size,
++ s->dma);
++ if (dma_mapping_error(&s->cx->pci_dev->dev, buf->dma_handle)) {
++ kfree(buf->buf);
++ kfree(mdl);
++ kfree(buf);
++ break;
++ }
++
+ INIT_LIST_HEAD(&mdl->list);
+ INIT_LIST_HEAD(&mdl->buf_list);
+ mdl->id = s->mdl_base_idx; /* a somewhat safe value */
+ cx18_enqueue(s, mdl, &s->q_idle);
+
+ INIT_LIST_HEAD(&buf->list);
+- buf->dma_handle = dma_map_single(&s->cx->pci_dev->dev,
+- buf->buf, s->buf_size,
+- s->dma);
+ cx18_buf_sync_for_cpu(s, buf);
+ list_add_tail(&buf->list, &s->buf_pool);
+ }
+diff --git a/drivers/media/pci/ivtv/ivtv-irq.c b/drivers/media/pci/ivtv/ivtv-irq.c
+index 748c14e879632a..4d63daa01eed26 100644
+--- a/drivers/media/pci/ivtv/ivtv-irq.c
++++ b/drivers/media/pci/ivtv/ivtv-irq.c
+@@ -351,7 +351,7 @@ void ivtv_dma_stream_dec_prepare(struct ivtv_stream *s, u32 offset, int lock)
+
+ /* Insert buffer block for YUV if needed */
+ if (s->type == IVTV_DEC_STREAM_TYPE_YUV && f->offset_y) {
+- if (yi->blanking_dmaptr) {
++ if (yi->blanking_ptr) {
+ s->sg_pending[idx].src = yi->blanking_dmaptr;
+ s->sg_pending[idx].dst = offset;
+ s->sg_pending[idx].size = 720 * 16;
+diff --git a/drivers/media/pci/ivtv/ivtv-yuv.c b/drivers/media/pci/ivtv/ivtv-yuv.c
+index 2d9274537725af..71f0401066471a 100644
+--- a/drivers/media/pci/ivtv/ivtv-yuv.c
++++ b/drivers/media/pci/ivtv/ivtv-yuv.c
+@@ -125,7 +125,7 @@ static int ivtv_yuv_prep_user_dma(struct ivtv *itv, struct ivtv_user_dma *dma,
+ ivtv_udma_fill_sg_array(dma, y_buffer_offset, uv_buffer_offset, y_size);
+
+ /* If we've offset the y plane, ensure top area is blanked */
+- if (f->offset_y && yi->blanking_dmaptr) {
++ if (f->offset_y && yi->blanking_ptr) {
+ dma->SGarray[dma->SG_length].size = cpu_to_le32(720*16);
+ dma->SGarray[dma->SG_length].src = cpu_to_le32(yi->blanking_dmaptr);
+ dma->SGarray[dma->SG_length].dst = cpu_to_le32(IVTV_DECODER_OFFSET + yuv_offset[frame]);
+@@ -929,6 +929,12 @@ static void ivtv_yuv_init(struct ivtv *itv)
+ yi->blanking_dmaptr = dma_map_single(&itv->pdev->dev,
+ yi->blanking_ptr,
+ 720 * 16, DMA_TO_DEVICE);
++ if (dma_mapping_error(&itv->pdev->dev, yi->blanking_dmaptr)) {
++ kfree(yi->blanking_ptr);
++ yi->blanking_ptr = NULL;
++ yi->blanking_dmaptr = 0;
++ IVTV_DEBUG_WARN("Failed to dma_map yuv blanking buffer\n");
++ }
+ } else {
+ yi->blanking_dmaptr = 0;
+ IVTV_DEBUG_WARN("Failed to allocate yuv blanking buffer\n");
+diff --git a/drivers/media/pci/mgb4/mgb4_trigger.c b/drivers/media/pci/mgb4/mgb4_trigger.c
+index 923650d53d4c82..d7dddc5c8728e8 100644
+--- a/drivers/media/pci/mgb4/mgb4_trigger.c
++++ b/drivers/media/pci/mgb4/mgb4_trigger.c
+@@ -91,7 +91,7 @@ static irqreturn_t trigger_handler(int irq, void *p)
+ struct {
+ u32 data;
+ s64 ts __aligned(8);
+- } scan;
++ } scan = { };
+
+ scan.data = mgb4_read_reg(&st->mgbdev->video, 0xA0);
+ mgb4_write_reg(&st->mgbdev->video, 0xA0, scan.data);
+diff --git a/drivers/media/platform/mediatek/mdp3/mtk-mdp3-comp.c b/drivers/media/platform/mediatek/mdp3/mtk-mdp3-comp.c
+index 683c066ed97586..7fcb2fbdd64eea 100644
+--- a/drivers/media/platform/mediatek/mdp3/mtk-mdp3-comp.c
++++ b/drivers/media/platform/mediatek/mdp3/mtk-mdp3-comp.c
+@@ -1530,6 +1530,9 @@ static const struct of_device_id mdp_comp_dt_ids[] __maybe_unused = {
+ }, {
+ .compatible = "mediatek,mt8195-mdp3-tcc",
+ .data = (void *)MDP_COMP_TYPE_TCC,
++ }, {
++ .compatible = "mediatek,mt8188-mdp3-rdma",
++ .data = (void *)MDP_COMP_TYPE_RDMA,
+ },
+ {}
+ };
+diff --git a/drivers/media/platform/qcom/iris/iris_buffer.c b/drivers/media/platform/qcom/iris/iris_buffer.c
+index 9f664c24114936..38548ee4749ea7 100644
+--- a/drivers/media/platform/qcom/iris/iris_buffer.c
++++ b/drivers/media/platform/qcom/iris/iris_buffer.c
+@@ -334,6 +334,29 @@ int iris_queue_buffer(struct iris_inst *inst, struct iris_buffer *buf)
+ return 0;
+ }
+
++int iris_queue_internal_deferred_buffers(struct iris_inst *inst, enum iris_buffer_type buffer_type)
++{
++ struct iris_buffer *buffer, *next;
++ struct iris_buffers *buffers;
++ int ret = 0;
++
++ buffers = &inst->buffers[buffer_type];
++ list_for_each_entry_safe(buffer, next, &buffers->list, list) {
++ if (buffer->attr & BUF_ATTR_PENDING_RELEASE)
++ continue;
++ if (buffer->attr & BUF_ATTR_QUEUED)
++ continue;
++
++ if (buffer->attr & BUF_ATTR_DEFERRED) {
++ ret = iris_queue_buffer(inst, buffer);
++ if (ret)
++ return ret;
++ }
++ }
++
++ return ret;
++}
++
+ int iris_queue_internal_buffers(struct iris_inst *inst, u32 plane)
+ {
+ const struct iris_platform_data *platform_data = inst->core->iris_platform_data;
+@@ -358,6 +381,10 @@ int iris_queue_internal_buffers(struct iris_inst *inst, u32 plane)
+ continue;
+ if (buffer->attr & BUF_ATTR_QUEUED)
+ continue;
++ if (buffer->type == BUF_DPB && inst->state != IRIS_INST_STREAMING) {
++ buffer->attr |= BUF_ATTR_DEFERRED;
++ continue;
++ }
+ ret = iris_queue_buffer(inst, buffer);
+ if (ret)
+ return ret;
+@@ -624,6 +651,8 @@ int iris_vb2_buffer_done(struct iris_inst *inst, struct iris_buffer *buf)
+
+ vb2 = &vbuf->vb2_buf;
+
++ vbuf->flags |= buf->flags;
++
+ if (buf->flags & V4L2_BUF_FLAG_ERROR) {
+ state = VB2_BUF_STATE_ERROR;
+ vb2_set_plane_payload(vb2, 0, 0);
+@@ -632,8 +661,6 @@ int iris_vb2_buffer_done(struct iris_inst *inst, struct iris_buffer *buf)
+ return 0;
+ }
+
+- vbuf->flags |= buf->flags;
+-
+ if (V4L2_TYPE_IS_CAPTURE(type)) {
+ vb2_set_plane_payload(vb2, 0, buf->data_size);
+ vbuf->sequence = inst->sequence_cap++;
+diff --git a/drivers/media/platform/qcom/iris/iris_buffer.h b/drivers/media/platform/qcom/iris/iris_buffer.h
+index 00825ad2dc3a4b..b9b011faa13ae7 100644
+--- a/drivers/media/platform/qcom/iris/iris_buffer.h
++++ b/drivers/media/platform/qcom/iris/iris_buffer.h
+@@ -105,6 +105,7 @@ int iris_get_buffer_size(struct iris_inst *inst, enum iris_buffer_type buffer_ty
+ void iris_get_internal_buffers(struct iris_inst *inst, u32 plane);
+ int iris_create_internal_buffers(struct iris_inst *inst, u32 plane);
+ int iris_queue_internal_buffers(struct iris_inst *inst, u32 plane);
++int iris_queue_internal_deferred_buffers(struct iris_inst *inst, enum iris_buffer_type buffer_type);
+ int iris_destroy_internal_buffer(struct iris_inst *inst, struct iris_buffer *buffer);
+ int iris_destroy_all_internal_buffers(struct iris_inst *inst, u32 plane);
+ int iris_destroy_dequeued_internal_buffers(struct iris_inst *inst, u32 plane);
+diff --git a/drivers/media/platform/qcom/iris/iris_core.c b/drivers/media/platform/qcom/iris/iris_core.c
+index 0fa0a3b549a238..8406c48d635b6e 100644
+--- a/drivers/media/platform/qcom/iris/iris_core.c
++++ b/drivers/media/platform/qcom/iris/iris_core.c
+@@ -15,10 +15,12 @@ void iris_core_deinit(struct iris_core *core)
+ pm_runtime_resume_and_get(core->dev);
+
+ mutex_lock(&core->lock);
+- iris_fw_unload(core);
+- iris_vpu_power_off(core);
+- iris_hfi_queues_deinit(core);
+- core->state = IRIS_CORE_DEINIT;
++ if (core->state != IRIS_CORE_DEINIT) {
++ iris_fw_unload(core);
++ iris_vpu_power_off(core);
++ iris_hfi_queues_deinit(core);
++ core->state = IRIS_CORE_DEINIT;
++ }
+ mutex_unlock(&core->lock);
+
+ pm_runtime_put_sync(core->dev);
+diff --git a/drivers/media/platform/qcom/iris/iris_firmware.c b/drivers/media/platform/qcom/iris/iris_firmware.c
+index f1b5cd56db3225..9ab499fad94644 100644
+--- a/drivers/media/platform/qcom/iris/iris_firmware.c
++++ b/drivers/media/platform/qcom/iris/iris_firmware.c
+@@ -60,16 +60,7 @@ static int iris_load_fw_to_memory(struct iris_core *core, const char *fw_name)
+
+ ret = qcom_mdt_load(dev, firmware, fw_name,
+ pas_id, mem_virt, mem_phys, res_size, NULL);
+- if (ret)
+- goto err_mem_unmap;
+-
+- ret = qcom_scm_pas_auth_and_reset(pas_id);
+- if (ret)
+- goto err_mem_unmap;
+-
+- return ret;
+
+-err_mem_unmap:
+ memunmap(mem_virt);
+ err_release_fw:
+ release_firmware(firmware);
+@@ -94,6 +85,12 @@ int iris_fw_load(struct iris_core *core)
+ return -ENOMEM;
+ }
+
++ ret = qcom_scm_pas_auth_and_reset(core->iris_platform_data->pas_id);
++ if (ret) {
++ dev_err(core->dev, "auth and reset failed: %d\n", ret);
++ return ret;
++ }
++
+ ret = qcom_scm_mem_protect_video_var(cp_config->cp_start,
+ cp_config->cp_size,
+ cp_config->cp_nonpixel_start,
+diff --git a/drivers/media/platform/qcom/iris/iris_hfi_gen1_command.c b/drivers/media/platform/qcom/iris/iris_hfi_gen1_command.c
+index 5fc30d54af4dc3..3f4f93b779ced5 100644
+--- a/drivers/media/platform/qcom/iris/iris_hfi_gen1_command.c
++++ b/drivers/media/platform/qcom/iris/iris_hfi_gen1_command.c
+@@ -184,11 +184,25 @@ static int iris_hfi_gen1_session_stop(struct iris_inst *inst, u32 plane)
+ u32 flush_type = 0;
+ int ret = 0;
+
+- if ((V4L2_TYPE_IS_OUTPUT(plane) &&
+- inst->state == IRIS_INST_INPUT_STREAMING) ||
+- (V4L2_TYPE_IS_CAPTURE(plane) &&
+- inst->state == IRIS_INST_OUTPUT_STREAMING) ||
+- inst->state == IRIS_INST_ERROR) {
++ if (inst->state == IRIS_INST_STREAMING) {
++ if (V4L2_TYPE_IS_OUTPUT(plane))
++ flush_type = HFI_FLUSH_ALL;
++ else if (V4L2_TYPE_IS_CAPTURE(plane))
++ flush_type = HFI_FLUSH_OUTPUT;
++
++ reinit_completion(&inst->flush_completion);
++
++ flush_pkt.shdr.hdr.size = sizeof(struct hfi_session_flush_pkt);
++ flush_pkt.shdr.hdr.pkt_type = HFI_CMD_SESSION_FLUSH;
++ flush_pkt.shdr.session_id = inst->session_id;
++ flush_pkt.flush_type = flush_type;
++
++ ret = iris_hfi_queue_cmd_write(core, &flush_pkt, flush_pkt.shdr.hdr.size);
++ if (!ret) {
++ inst->flush_responses_pending++;
++ ret = iris_wait_for_session_response(inst, true);
++ }
++ } else if (inst->sub_state & IRIS_INST_SUB_LOAD_RESOURCES) {
+ reinit_completion(&inst->completion);
+ iris_hfi_gen1_packet_session_cmd(inst, &pkt, HFI_CMD_SESSION_STOP);
+ ret = iris_hfi_queue_cmd_write(core, &pkt, pkt.shdr.hdr.size);
+@@ -207,24 +221,6 @@ static int iris_hfi_gen1_session_stop(struct iris_inst *inst, u32 plane)
+ VB2_BUF_STATE_ERROR);
+ iris_helper_buffers_done(inst, V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE,
+ VB2_BUF_STATE_ERROR);
+- } else if (inst->state == IRIS_INST_STREAMING) {
+- if (V4L2_TYPE_IS_OUTPUT(plane))
+- flush_type = HFI_FLUSH_ALL;
+- else if (V4L2_TYPE_IS_CAPTURE(plane))
+- flush_type = HFI_FLUSH_OUTPUT;
+-
+- reinit_completion(&inst->flush_completion);
+-
+- flush_pkt.shdr.hdr.size = sizeof(struct hfi_session_flush_pkt);
+- flush_pkt.shdr.hdr.pkt_type = HFI_CMD_SESSION_FLUSH;
+- flush_pkt.shdr.session_id = inst->session_id;
+- flush_pkt.flush_type = flush_type;
+-
+- ret = iris_hfi_queue_cmd_write(core, &flush_pkt, flush_pkt.shdr.hdr.size);
+- if (!ret) {
+- inst->flush_responses_pending++;
+- ret = iris_wait_for_session_response(inst, true);
+- }
+ }
+
+ return ret;
+@@ -401,8 +397,7 @@ static int iris_hfi_gen1_session_drain(struct iris_inst *inst, u32 plane)
+ ip_pkt.shdr.hdr.pkt_type = HFI_CMD_SESSION_EMPTY_BUFFER;
+ ip_pkt.shdr.session_id = inst->session_id;
+ ip_pkt.flags = HFI_BUFFERFLAG_EOS;
+- if (inst->codec == V4L2_PIX_FMT_VP9)
+- ip_pkt.packet_buffer = 0xdeadb000;
++ ip_pkt.packet_buffer = 0xdeadb000;
+
+ return iris_hfi_queue_cmd_write(inst->core, &ip_pkt, ip_pkt.shdr.hdr.size);
+ }
+diff --git a/drivers/media/platform/qcom/iris/iris_hfi_gen1_response.c b/drivers/media/platform/qcom/iris/iris_hfi_gen1_response.c
+index 8d1ce8a19a45eb..2a964588338354 100644
+--- a/drivers/media/platform/qcom/iris/iris_hfi_gen1_response.c
++++ b/drivers/media/platform/qcom/iris/iris_hfi_gen1_response.c
+@@ -416,8 +416,6 @@ static void iris_hfi_gen1_session_ftb_done(struct iris_inst *inst, void *packet)
+ inst->flush_responses_pending++;
+
+ iris_inst_sub_state_change_drain_last(inst);
+-
+- return;
+ }
+
+ if (iris_split_mode_enabled(inst) && pkt->stream_id == 0) {
+@@ -462,7 +460,7 @@ static void iris_hfi_gen1_session_ftb_done(struct iris_inst *inst, void *packet)
+ timestamp_us = (timestamp_us << 32) | timestamp_lo;
+ } else {
+ if (pkt->stream_id == 1 && !inst->last_buffer_dequeued) {
+- if (iris_drc_pending(inst)) {
++ if (iris_drc_pending(inst) || iris_drain_pending(inst)) {
+ flags |= V4L2_BUF_FLAG_LAST;
+ inst->last_buffer_dequeued = true;
+ }
+diff --git a/drivers/media/platform/qcom/iris/iris_hfi_gen2_response.c b/drivers/media/platform/qcom/iris/iris_hfi_gen2_response.c
+index a8c30fc5c0d066..dda775d463e916 100644
+--- a/drivers/media/platform/qcom/iris/iris_hfi_gen2_response.c
++++ b/drivers/media/platform/qcom/iris/iris_hfi_gen2_response.c
+@@ -424,7 +424,6 @@ static int iris_hfi_gen2_handle_release_internal_buffer(struct iris_inst *inst,
+ struct iris_buffers *buffers = &inst->buffers[buf_type];
+ struct iris_buffer *buf, *iter;
+ bool found = false;
+- int ret = 0;
+
+ list_for_each_entry(iter, &buffers->list, list) {
+ if (iter->device_addr == buffer->base_address) {
+@@ -437,10 +436,8 @@ static int iris_hfi_gen2_handle_release_internal_buffer(struct iris_inst *inst,
+ return -EINVAL;
+
+ buf->attr &= ~BUF_ATTR_QUEUED;
+- if (buf->attr & BUF_ATTR_PENDING_RELEASE)
+- ret = iris_destroy_internal_buffer(inst, buf);
+
+- return ret;
++ return iris_destroy_internal_buffer(inst, buf);
+ }
+
+ static int iris_hfi_gen2_handle_session_stop(struct iris_inst *inst,
+diff --git a/drivers/media/platform/qcom/iris/iris_state.c b/drivers/media/platform/qcom/iris/iris_state.c
+index 104e1687ad39da..d1dc1a863da0b0 100644
+--- a/drivers/media/platform/qcom/iris/iris_state.c
++++ b/drivers/media/platform/qcom/iris/iris_state.c
+@@ -122,7 +122,8 @@ static bool iris_inst_allow_sub_state(struct iris_inst *inst, enum iris_inst_sub
+ return false;
+ case IRIS_INST_OUTPUT_STREAMING:
+ if (sub_state & (IRIS_INST_SUB_DRC_LAST |
+- IRIS_INST_SUB_DRAIN_LAST | IRIS_INST_SUB_OUTPUT_PAUSE))
++ IRIS_INST_SUB_DRAIN_LAST | IRIS_INST_SUB_OUTPUT_PAUSE |
++ IRIS_INST_SUB_LOAD_RESOURCES))
+ return true;
+ return false;
+ case IRIS_INST_STREAMING:
+@@ -251,7 +252,7 @@ bool iris_drc_pending(struct iris_inst *inst)
+ inst->sub_state & IRIS_INST_SUB_DRC_LAST;
+ }
+
+-static inline bool iris_drain_pending(struct iris_inst *inst)
++bool iris_drain_pending(struct iris_inst *inst)
+ {
+ return inst->sub_state & IRIS_INST_SUB_DRAIN &&
+ inst->sub_state & IRIS_INST_SUB_DRAIN_LAST;
+diff --git a/drivers/media/platform/qcom/iris/iris_state.h b/drivers/media/platform/qcom/iris/iris_state.h
+index e718386dbe0402..b09fa54cf17eee 100644
+--- a/drivers/media/platform/qcom/iris/iris_state.h
++++ b/drivers/media/platform/qcom/iris/iris_state.h
+@@ -141,5 +141,6 @@ int iris_inst_sub_state_change_drc_last(struct iris_inst *inst);
+ int iris_inst_sub_state_change_pause(struct iris_inst *inst, u32 plane);
+ bool iris_allow_cmd(struct iris_inst *inst, u32 cmd);
+ bool iris_drc_pending(struct iris_inst *inst);
++bool iris_drain_pending(struct iris_inst *inst);
+
+ #endif
+diff --git a/drivers/media/platform/qcom/iris/iris_vb2.c b/drivers/media/platform/qcom/iris/iris_vb2.c
+index 8b17c7c3948798..e62ed7a57df2de 100644
+--- a/drivers/media/platform/qcom/iris/iris_vb2.c
++++ b/drivers/media/platform/qcom/iris/iris_vb2.c
+@@ -173,9 +173,6 @@ int iris_vb2_start_streaming(struct vb2_queue *q, unsigned int count)
+
+ inst = vb2_get_drv_priv(q);
+
+- if (V4L2_TYPE_IS_CAPTURE(q->type) && inst->state == IRIS_INST_INIT)
+- return 0;
+-
+ mutex_lock(&inst->lock);
+ if (inst->state == IRIS_INST_ERROR) {
+ ret = -EBUSY;
+@@ -203,7 +200,10 @@ int iris_vb2_start_streaming(struct vb2_queue *q, unsigned int count)
+
+ buf_type = iris_v4l2_type_to_driver(q->type);
+
+- ret = iris_queue_deferred_buffers(inst, buf_type);
++ if (inst->state == IRIS_INST_STREAMING)
++ ret = iris_queue_internal_deferred_buffers(inst, BUF_DPB);
++ if (!ret)
++ ret = iris_queue_deferred_buffers(inst, buf_type);
+ if (ret)
+ goto error;
+
+diff --git a/drivers/media/platform/qcom/iris/iris_vdec.c b/drivers/media/platform/qcom/iris/iris_vdec.c
+index d670b51c5839d1..0f5adaac829f22 100644
+--- a/drivers/media/platform/qcom/iris/iris_vdec.c
++++ b/drivers/media/platform/qcom/iris/iris_vdec.c
+@@ -158,7 +158,7 @@ int iris_vdec_try_fmt(struct iris_inst *inst, struct v4l2_format *f)
+ }
+ break;
+ case V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE:
+- if (!fmt) {
++ if (f->fmt.pix_mp.pixelformat != V4L2_PIX_FMT_NV12) {
+ f_inst = inst->fmt_dst;
+ f->fmt.pix_mp.pixelformat = f_inst->fmt.pix_mp.pixelformat;
+ f->fmt.pix_mp.width = f_inst->fmt.pix_mp.width;
+diff --git a/drivers/media/platform/qcom/iris/iris_vidc.c b/drivers/media/platform/qcom/iris/iris_vidc.c
+index c417e8c31f806e..8285bdaf9466d4 100644
+--- a/drivers/media/platform/qcom/iris/iris_vidc.c
++++ b/drivers/media/platform/qcom/iris/iris_vidc.c
+@@ -240,6 +240,7 @@ static void iris_check_num_queued_internal_buffers(struct iris_inst *inst, u32 p
+
+ for (i = 0; i < internal_buffer_count; i++) {
+ buffers = &inst->buffers[internal_buf_type[i]];
++ count = 0;
+ list_for_each_entry_safe(buf, next, &buffers->list, list)
+ count++;
+ if (count)
+diff --git a/drivers/media/platform/qcom/iris/iris_vpu3x.c b/drivers/media/platform/qcom/iris/iris_vpu3x.c
+index 9b7c9a1495ee2f..bfc52eb04ed0e1 100644
+--- a/drivers/media/platform/qcom/iris/iris_vpu3x.c
++++ b/drivers/media/platform/qcom/iris/iris_vpu3x.c
+@@ -19,6 +19,9 @@
+ #define WRAPPER_IRIS_CPU_NOC_LPI_CONTROL (WRAPPER_BASE_OFFS + 0x5C)
+ #define REQ_POWER_DOWN_PREP BIT(0)
+ #define WRAPPER_IRIS_CPU_NOC_LPI_STATUS (WRAPPER_BASE_OFFS + 0x60)
++#define NOC_LPI_STATUS_DONE BIT(0) /* Indicates the NOC handshake is complete */
++#define NOC_LPI_STATUS_DENY BIT(1) /* Indicates the NOC handshake is denied */
++#define NOC_LPI_STATUS_ACTIVE BIT(2) /* Indicates the NOC is active */
+ #define WRAPPER_CORE_CLOCK_CONFIG (WRAPPER_BASE_OFFS + 0x88)
+ #define CORE_CLK_RUN 0x0
+
+@@ -109,7 +112,9 @@ static void iris_vpu3_power_off_hardware(struct iris_core *core)
+
+ static void iris_vpu33_power_off_hardware(struct iris_core *core)
+ {
++ bool handshake_done = false, handshake_busy = false;
+ u32 reg_val = 0, value, i;
++ u32 count = 0;
+ int ret;
+
+ if (iris_vpu3x_hw_power_collapsed(core))
+@@ -128,13 +133,36 @@ static void iris_vpu33_power_off_hardware(struct iris_core *core)
+ goto disable_power;
+ }
+
++ /* Retry up to 1000 times as recommended by hardware documentation */
++ do {
++ /* set MNoC to low power */
++ writel(REQ_POWER_DOWN_PREP, core->reg_base + AON_WRAPPER_MVP_NOC_LPI_CONTROL);
++
++ udelay(15);
++
++ value = readl(core->reg_base + AON_WRAPPER_MVP_NOC_LPI_STATUS);
++
++ handshake_done = value & NOC_LPI_STATUS_DONE;
++ handshake_busy = value & (NOC_LPI_STATUS_DENY | NOC_LPI_STATUS_ACTIVE);
++
++ if (handshake_done || !handshake_busy)
++ break;
++
++ writel(0, core->reg_base + AON_WRAPPER_MVP_NOC_LPI_CONTROL);
++
++ udelay(15);
++
++ } while (++count < 1000);
++
++ if (!handshake_done && handshake_busy)
++ dev_err(core->dev, "LPI handshake timeout\n");
++
+ ret = readl_poll_timeout(core->reg_base + AON_WRAPPER_MVP_NOC_LPI_STATUS,
+ reg_val, reg_val & BIT(0), 200, 2000);
+ if (ret)
+ goto disable_power;
+
+- /* set MNoC to low power, set PD_NOC_QREQ (bit 0) */
+- writel(BIT(0), core->reg_base + AON_WRAPPER_MVP_NOC_LPI_CONTROL);
++ writel(0, core->reg_base + AON_WRAPPER_MVP_NOC_LPI_CONTROL);
+
+ writel(CORE_BRIDGE_SW_RESET | CORE_BRIDGE_HW_RESET_DISABLE,
+ core->reg_base + CPU_CS_AHB_BRIDGE_SYNC_RESET);
+diff --git a/drivers/media/platform/qcom/iris/iris_vpu_common.c b/drivers/media/platform/qcom/iris/iris_vpu_common.c
+index 268e45acaa7c0e..42a7c53ce48eb5 100644
+--- a/drivers/media/platform/qcom/iris/iris_vpu_common.c
++++ b/drivers/media/platform/qcom/iris/iris_vpu_common.c
+@@ -359,7 +359,7 @@ int iris_vpu_power_on(struct iris_core *core)
+ return 0;
+
+ err_power_off_ctrl:
+- iris_vpu_power_off_controller(core);
++ core->iris_platform_data->vpu_ops->power_off_controller(core);
+ err_unvote_icc:
+ iris_unset_icc_bw(core);
+ err:
+diff --git a/drivers/media/platform/qcom/venus/firmware.c b/drivers/media/platform/qcom/venus/firmware.c
+index 66a18830e66dac..4e2636b0536693 100644
+--- a/drivers/media/platform/qcom/venus/firmware.c
++++ b/drivers/media/platform/qcom/venus/firmware.c
+@@ -30,7 +30,7 @@ static void venus_reset_cpu(struct venus_core *core)
+ u32 fw_size = core->fw.mapped_mem_size;
+ void __iomem *wrapper_base;
+
+- if (IS_IRIS2_1(core))
++ if (IS_IRIS2(core) || IS_IRIS2_1(core))
+ wrapper_base = core->wrapper_tz_base;
+ else
+ wrapper_base = core->wrapper_base;
+@@ -42,7 +42,7 @@ static void venus_reset_cpu(struct venus_core *core)
+ writel(fw_size, wrapper_base + WRAPPER_NONPIX_START_ADDR);
+ writel(fw_size, wrapper_base + WRAPPER_NONPIX_END_ADDR);
+
+- if (IS_IRIS2_1(core)) {
++ if (IS_IRIS2(core) || IS_IRIS2_1(core)) {
+ /* Bring XTSS out of reset */
+ writel(0, wrapper_base + WRAPPER_TZ_XTSS_SW_RESET);
+ } else {
+@@ -68,7 +68,7 @@ int venus_set_hw_state(struct venus_core *core, bool resume)
+ if (resume) {
+ venus_reset_cpu(core);
+ } else {
+- if (IS_IRIS2_1(core))
++ if (IS_IRIS2(core) || IS_IRIS2_1(core))
+ writel(WRAPPER_XTSS_SW_RESET_BIT,
+ core->wrapper_tz_base + WRAPPER_TZ_XTSS_SW_RESET);
+ else
+@@ -181,7 +181,7 @@ static int venus_shutdown_no_tz(struct venus_core *core)
+ void __iomem *wrapper_base = core->wrapper_base;
+ void __iomem *wrapper_tz_base = core->wrapper_tz_base;
+
+- if (IS_IRIS2_1(core)) {
++ if (IS_IRIS2(core) || IS_IRIS2_1(core)) {
+ /* Assert the reset to XTSS */
+ reg = readl(wrapper_tz_base + WRAPPER_TZ_XTSS_SW_RESET);
+ reg |= WRAPPER_XTSS_SW_RESET_BIT;
+diff --git a/drivers/media/platform/qcom/venus/pm_helpers.c b/drivers/media/platform/qcom/venus/pm_helpers.c
+index e32f8862a9f90c..99f45169beb9bd 100644
+--- a/drivers/media/platform/qcom/venus/pm_helpers.c
++++ b/drivers/media/platform/qcom/venus/pm_helpers.c
+@@ -40,6 +40,8 @@ static int core_clks_get(struct venus_core *core)
+
+ static int core_clks_enable(struct venus_core *core)
+ {
++ const struct freq_tbl *freq_tbl = core->res->freq_tbl;
++ unsigned int freq_tbl_size = core->res->freq_tbl_size;
+ const struct venus_resources *res = core->res;
+ struct device *dev = core->dev;
+ unsigned long freq = 0;
+@@ -48,8 +50,13 @@ static int core_clks_enable(struct venus_core *core)
+ int ret;
+
+ opp = dev_pm_opp_find_freq_ceil(dev, &freq);
+- if (!IS_ERR(opp))
++ if (IS_ERR(opp)) {
++ if (!freq_tbl)
++ return -ENODEV;
++ freq = freq_tbl[freq_tbl_size - 1].freq;
++ } else {
+ dev_pm_opp_put(opp);
++ }
+
+ for (i = 0; i < res->clks_num; i++) {
+ if (IS_V6(core)) {
+diff --git a/drivers/media/platform/renesas/vsp1/vsp1_vspx.c b/drivers/media/platform/renesas/vsp1/vsp1_vspx.c
+index a754b92232bd57..1673479be0ffef 100644
+--- a/drivers/media/platform/renesas/vsp1/vsp1_vspx.c
++++ b/drivers/media/platform/renesas/vsp1/vsp1_vspx.c
+@@ -286,6 +286,7 @@ void vsp1_isp_free_buffer(struct device *dev,
+ dma_free_coherent(bus_master, buffer_desc->size, buffer_desc->cpu_addr,
+ buffer_desc->dma_addr);
+ }
++EXPORT_SYMBOL_GPL(vsp1_isp_free_buffer);
+
+ /**
+ * vsp1_isp_start_streaming - Start processing VSPX jobs
+diff --git a/drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c b/drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c
+index 47bc3014b5d8b8..f7c682fca64595 100644
+--- a/drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c
++++ b/drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c
+@@ -14,8 +14,7 @@
+ #include "s5p_mfc_opr.h"
+ #include "s5p_mfc_cmd_v6.h"
+
+-static int s5p_mfc_cmd_host2risc_v6(struct s5p_mfc_dev *dev, int cmd,
+- const struct s5p_mfc_cmd_args *args)
++static int s5p_mfc_cmd_host2risc_v6(struct s5p_mfc_dev *dev, int cmd)
+ {
+ mfc_debug(2, "Issue the command: %d\n", cmd);
+
+@@ -31,7 +30,6 @@ static int s5p_mfc_cmd_host2risc_v6(struct s5p_mfc_dev *dev, int cmd,
+
+ static int s5p_mfc_sys_init_cmd_v6(struct s5p_mfc_dev *dev)
+ {
+- struct s5p_mfc_cmd_args h2r_args;
+ const struct s5p_mfc_buf_size_v6 *buf_size = dev->variant->buf_size->priv;
+ int ret;
+
+@@ -41,33 +39,23 @@ static int s5p_mfc_sys_init_cmd_v6(struct s5p_mfc_dev *dev)
+
+ mfc_write(dev, dev->ctx_buf.dma, S5P_FIMV_CONTEXT_MEM_ADDR_V6);
+ mfc_write(dev, buf_size->dev_ctx, S5P_FIMV_CONTEXT_MEM_SIZE_V6);
+- return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_SYS_INIT_V6,
+- &h2r_args);
++ return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_SYS_INIT_V6);
+ }
+
+ static int s5p_mfc_sleep_cmd_v6(struct s5p_mfc_dev *dev)
+ {
+- struct s5p_mfc_cmd_args h2r_args;
+-
+- memset(&h2r_args, 0, sizeof(struct s5p_mfc_cmd_args));
+- return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_SLEEP_V6,
+- &h2r_args);
++ return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_SLEEP_V6);
+ }
+
+ static int s5p_mfc_wakeup_cmd_v6(struct s5p_mfc_dev *dev)
+ {
+- struct s5p_mfc_cmd_args h2r_args;
+-
+- memset(&h2r_args, 0, sizeof(struct s5p_mfc_cmd_args));
+- return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_WAKEUP_V6,
+- &h2r_args);
++ return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_WAKEUP_V6);
+ }
+
+ /* Open a new instance and get its number */
+ static int s5p_mfc_open_inst_cmd_v6(struct s5p_mfc_ctx *ctx)
+ {
+ struct s5p_mfc_dev *dev = ctx->dev;
+- struct s5p_mfc_cmd_args h2r_args;
+ int codec_type;
+
+ mfc_debug(2, "Requested codec mode: %d\n", ctx->codec_mode);
+@@ -129,23 +117,20 @@ static int s5p_mfc_open_inst_cmd_v6(struct s5p_mfc_ctx *ctx)
+ mfc_write(dev, ctx->ctx.size, S5P_FIMV_CONTEXT_MEM_SIZE_V6);
+ mfc_write(dev, 0, S5P_FIMV_D_CRC_CTRL_V6); /* no crc */
+
+- return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_OPEN_INSTANCE_V6,
+- &h2r_args);
++ return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_OPEN_INSTANCE_V6);
+ }
+
+ /* Close instance */
+ static int s5p_mfc_close_inst_cmd_v6(struct s5p_mfc_ctx *ctx)
+ {
+ struct s5p_mfc_dev *dev = ctx->dev;
+- struct s5p_mfc_cmd_args h2r_args;
+ int ret = 0;
+
+ dev->curr_ctx = ctx->num;
+ if (ctx->state != MFCINST_FREE) {
+ mfc_write(dev, ctx->inst_no, S5P_FIMV_INSTANCE_ID_V6);
+ ret = s5p_mfc_cmd_host2risc_v6(dev,
+- S5P_FIMV_H2R_CMD_CLOSE_INSTANCE_V6,
+- &h2r_args);
++ S5P_FIMV_H2R_CMD_CLOSE_INSTANCE_V6);
+ } else {
+ ret = -EINVAL;
+ }
+@@ -153,9 +138,15 @@ static int s5p_mfc_close_inst_cmd_v6(struct s5p_mfc_ctx *ctx)
+ return ret;
+ }
+
++static int s5p_mfc_cmd_host2risc_v6_args(struct s5p_mfc_dev *dev, int cmd,
++ const struct s5p_mfc_cmd_args *ignored)
++{
++ return s5p_mfc_cmd_host2risc_v6(dev, cmd);
++}
++
+ /* Initialize cmd function pointers for MFC v6 */
+ static const struct s5p_mfc_hw_cmds s5p_mfc_cmds_v6 = {
+- .cmd_host2risc = s5p_mfc_cmd_host2risc_v6,
++ .cmd_host2risc = s5p_mfc_cmd_host2risc_v6_args,
+ .sys_init_cmd = s5p_mfc_sys_init_cmd_v6,
+ .sleep_cmd = s5p_mfc_sleep_cmd_v6,
+ .wakeup_cmd = s5p_mfc_wakeup_cmd_v6,
+diff --git a/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c b/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c
+index b628d6e081dbcb..3c7a4bedb25721 100644
+--- a/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c
++++ b/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c
+@@ -52,6 +52,8 @@
+ #define DRAIN_TIMEOUT_MS 50
+ #define DRAIN_BUFFER_SIZE SZ_32K
+
++#define CSI2RX_BRIDGE_SOURCE_PAD 1
++
+ struct ti_csi2rx_fmt {
+ u32 fourcc; /* Four character code. */
+ u32 code; /* Mbus code. */
+@@ -426,8 +428,9 @@ static int csi_async_notifier_complete(struct v4l2_async_notifier *notifier)
+ if (ret)
+ return ret;
+
+- ret = v4l2_create_fwnode_links_to_pad(csi->source, &csi->pad,
+- MEDIA_LNK_FL_IMMUTABLE | MEDIA_LNK_FL_ENABLED);
++ ret = media_create_pad_link(&csi->source->entity, CSI2RX_BRIDGE_SOURCE_PAD,
++ &vdev->entity, csi->pad.index,
++ MEDIA_LNK_FL_IMMUTABLE | MEDIA_LNK_FL_ENABLED);
+
+ if (ret) {
+ video_unregister_device(vdev);
+@@ -1120,7 +1123,7 @@ static int ti_csi2rx_probe(struct platform_device *pdev)
+ if (ret)
+ goto err_vb2q;
+
+- ret = of_platform_populate(csi->dev->of_node, NULL, NULL, csi->dev);
++ ret = devm_of_platform_populate(csi->dev);
+ if (ret) {
+ dev_err(csi->dev, "Failed to create children: %d\n", ret);
+ goto err_subdev;
+diff --git a/drivers/media/rc/lirc_dev.c b/drivers/media/rc/lirc_dev.c
+index a2257dc2f25d6b..7d4942925993a3 100644
+--- a/drivers/media/rc/lirc_dev.c
++++ b/drivers/media/rc/lirc_dev.c
+@@ -736,11 +736,11 @@ int lirc_register(struct rc_dev *dev)
+
+ cdev_init(&dev->lirc_cdev, &lirc_fops);
+
++ get_device(&dev->dev);
++
+ err = cdev_device_add(&dev->lirc_cdev, &dev->lirc_dev);
+ if (err)
+- goto out_ida;
+-
+- get_device(&dev->dev);
++ goto out_put_device;
+
+ switch (dev->driver_type) {
+ case RC_DRIVER_SCANCODE:
+@@ -764,7 +764,8 @@ int lirc_register(struct rc_dev *dev)
+
+ return 0;
+
+-out_ida:
++out_put_device:
++ put_device(&dev->lirc_dev);
+ ida_free(&lirc_ida, minor);
+ return err;
+ }
+diff --git a/drivers/media/test-drivers/vivid/vivid-cec.c b/drivers/media/test-drivers/vivid/vivid-cec.c
+index 356a988dd6a135..2d15fdd5d999e0 100644
+--- a/drivers/media/test-drivers/vivid/vivid-cec.c
++++ b/drivers/media/test-drivers/vivid/vivid-cec.c
+@@ -327,7 +327,7 @@ static int vivid_received(struct cec_adapter *adap, struct cec_msg *msg)
+ char osd[14];
+
+ if (!cec_is_sink(adap))
+- return -ENOMSG;
++ break;
+ cec_ops_set_osd_string(msg, &disp_ctl, osd);
+ switch (disp_ctl) {
+ case CEC_OP_DISP_CTL_DEFAULT:
+@@ -348,7 +348,7 @@ static int vivid_received(struct cec_adapter *adap, struct cec_msg *msg)
+ cec_transmit_msg(adap, &reply, false);
+ break;
+ }
+- break;
++ return 0;
+ }
+ case CEC_MSG_VENDOR_COMMAND_WITH_ID: {
+ u32 vendor_id;
+@@ -379,7 +379,7 @@ static int vivid_received(struct cec_adapter *adap, struct cec_msg *msg)
+ if (size == 1) {
+ // Ignore even op values
+ if (!(vendor_cmd[0] & 1))
+- break;
++ return 0;
+ reply.len = msg->len;
+ memcpy(reply.msg + 1, msg->msg + 1, msg->len - 1);
+ reply.msg[msg->len - 1]++;
+@@ -388,12 +388,10 @@ static int vivid_received(struct cec_adapter *adap, struct cec_msg *msg)
+ CEC_OP_ABORT_INVALID_OP);
+ }
+ cec_transmit_msg(adap, &reply, false);
+- break;
++ return 0;
+ }
+- default:
+- return -ENOMSG;
+ }
+- return 0;
++ return -ENOMSG;
+ }
+
+ static const struct cec_adap_ops vivid_cec_adap_ops = {
+diff --git a/drivers/media/usb/uvc/uvc_ctrl.c b/drivers/media/usb/uvc/uvc_ctrl.c
+index efe609d7087752..55bbbef399d45e 100644
+--- a/drivers/media/usb/uvc/uvc_ctrl.c
++++ b/drivers/media/usb/uvc/uvc_ctrl.c
+@@ -3307,7 +3307,6 @@ int uvc_ctrl_init_device(struct uvc_device *dev)
+ void uvc_ctrl_cleanup_fh(struct uvc_fh *handle)
+ {
+ struct uvc_entity *entity;
+- int i;
+
+ guard(mutex)(&handle->chain->ctrl_mutex);
+
+@@ -3325,7 +3324,7 @@ void uvc_ctrl_cleanup_fh(struct uvc_fh *handle)
+ if (!WARN_ON(handle->pending_async_ctrls))
+ return;
+
+- for (i = 0; i < handle->pending_async_ctrls; i++)
++ for (unsigned int i = 0; i < handle->pending_async_ctrls; i++)
+ uvc_pm_put(handle->stream->dev);
+ }
+
+diff --git a/drivers/memory/samsung/exynos-srom.c b/drivers/memory/samsung/exynos-srom.c
+index e73dd330af477d..d913fb901973f0 100644
+--- a/drivers/memory/samsung/exynos-srom.c
++++ b/drivers/memory/samsung/exynos-srom.c
+@@ -121,20 +121,18 @@ static int exynos_srom_probe(struct platform_device *pdev)
+ return -ENOMEM;
+
+ srom->dev = dev;
+- srom->reg_base = of_iomap(np, 0);
+- if (!srom->reg_base) {
++ srom->reg_base = devm_platform_ioremap_resource(pdev, 0);
++ if (IS_ERR(srom->reg_base)) {
+ dev_err(&pdev->dev, "iomap of exynos srom controller failed\n");
+- return -ENOMEM;
++ return PTR_ERR(srom->reg_base);
+ }
+
+ platform_set_drvdata(pdev, srom);
+
+ srom->reg_offset = exynos_srom_alloc_reg_dump(exynos_srom_offsets,
+ ARRAY_SIZE(exynos_srom_offsets));
+- if (!srom->reg_offset) {
+- iounmap(srom->reg_base);
++ if (!srom->reg_offset)
+ return -ENOMEM;
+- }
+
+ for_each_child_of_node(np, child) {
+ if (exynos_srom_configure_bank(srom, child)) {
+diff --git a/drivers/memory/stm32_omm.c b/drivers/memory/stm32_omm.c
+index bee2ecc8c2b963..5d06623f3f6899 100644
+--- a/drivers/memory/stm32_omm.c
++++ b/drivers/memory/stm32_omm.c
+@@ -238,7 +238,7 @@ static int stm32_omm_configure(struct device *dev)
+ if (mux & CR_MUXEN) {
+ ret = of_property_read_u32(dev->of_node, "st,omm-req2ack-ns",
+ &req2ack);
+- if (!ret && !req2ack) {
++ if (!ret && req2ack) {
+ req2ack = DIV_ROUND_UP(req2ack, NSEC_PER_SEC / clk_rate_max) - 1;
+
+ if (req2ack > 256)
+diff --git a/drivers/mmc/core/sdio.c b/drivers/mmc/core/sdio.c
+index 0f753367aec1c1..83085e76486aa8 100644
+--- a/drivers/mmc/core/sdio.c
++++ b/drivers/mmc/core/sdio.c
+@@ -945,7 +945,11 @@ static void mmc_sdio_remove(struct mmc_host *host)
+ */
+ static int mmc_sdio_alive(struct mmc_host *host)
+ {
+- return mmc_select_card(host->card);
++ if (!mmc_host_is_spi(host))
++ return mmc_select_card(host->card);
++ else
++ return mmc_io_rw_direct(host->card, 0, 0, SDIO_CCCR_CCCR, 0,
++ NULL);
+ }
+
+ /*
+diff --git a/drivers/mmc/host/mmc_spi.c b/drivers/mmc/host/mmc_spi.c
+index 35b0ad273b4ff6..95a32ff29ee166 100644
+--- a/drivers/mmc/host/mmc_spi.c
++++ b/drivers/mmc/host/mmc_spi.c
+@@ -563,7 +563,7 @@ mmc_spi_setup_data_message(struct mmc_spi_host *host, bool multiple, bool write)
+ * the next token (next data block, or STOP_TRAN). We can try to
+ * minimize I/O ops by using a single read to collect end-of-busy.
+ */
+- if (multiple || write) {
++ if (write) {
+ t = &host->early_status;
+ memset(t, 0, sizeof(*t));
+ t->len = write ? sizeof(scratch->status) : 1;
+diff --git a/drivers/mtd/nand/raw/fsmc_nand.c b/drivers/mtd/nand/raw/fsmc_nand.c
+index df61db8ce46659..b13b2b0c3f300c 100644
+--- a/drivers/mtd/nand/raw/fsmc_nand.c
++++ b/drivers/mtd/nand/raw/fsmc_nand.c
+@@ -876,10 +876,14 @@ static int fsmc_nand_probe_config_dt(struct platform_device *pdev,
+ if (!of_property_read_u32(np, "bank-width", &val)) {
+ if (val == 2) {
+ nand->options |= NAND_BUSWIDTH_16;
+- } else if (val != 1) {
++ } else if (val == 1) {
++ nand->options |= NAND_BUSWIDTH_AUTO;
++ } else {
+ dev_err(&pdev->dev, "invalid bank-width %u\n", val);
+ return -EINVAL;
+ }
++ } else {
++ nand->options |= NAND_BUSWIDTH_AUTO;
+ }
+
+ if (of_property_read_bool(np, "nand-skip-bbtscan"))
+diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+index f4e68008ea0303..a750f5839e3424 100644
+--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
++++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+@@ -145,6 +145,9 @@ static int __gpmi_enable_clk(struct gpmi_nand_data *this, bool v)
+ return ret;
+ }
+
++#define gpmi_enable_clk(x) __gpmi_enable_clk(x, true)
++#define gpmi_disable_clk(x) __gpmi_enable_clk(x, false)
++
+ static int gpmi_init(struct gpmi_nand_data *this)
+ {
+ struct resources *r = &this->resources;
+@@ -2765,6 +2768,11 @@ static int gpmi_nand_probe(struct platform_device *pdev)
+ pm_runtime_enable(&pdev->dev);
+ pm_runtime_set_autosuspend_delay(&pdev->dev, 500);
+ pm_runtime_use_autosuspend(&pdev->dev);
++#ifndef CONFIG_PM
++ ret = gpmi_enable_clk(this);
++ if (ret)
++ goto exit_acquire_resources;
++#endif
+
+ ret = gpmi_init(this);
+ if (ret)
+@@ -2800,6 +2808,9 @@ static void gpmi_nand_remove(struct platform_device *pdev)
+ release_resources(this);
+ pm_runtime_dont_use_autosuspend(&pdev->dev);
+ pm_runtime_disable(&pdev->dev);
++#ifndef CONFIG_PM
++ gpmi_disable_clk(this);
++#endif
+ }
+
+ static int gpmi_pm_suspend(struct device *dev)
+@@ -2846,9 +2857,6 @@ static int gpmi_pm_resume(struct device *dev)
+ return 0;
+ }
+
+-#define gpmi_enable_clk(x) __gpmi_enable_clk(x, true)
+-#define gpmi_disable_clk(x) __gpmi_enable_clk(x, false)
+-
+ static int gpmi_runtime_suspend(struct device *dev)
+ {
+ struct gpmi_nand_data *this = dev_get_drvdata(dev);
+diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
+index e6b802e3d84493..6d23c5c049b9a7 100644
+--- a/drivers/net/ethernet/airoha/airoha_eth.c
++++ b/drivers/net/ethernet/airoha/airoha_eth.c
+@@ -1709,7 +1709,9 @@ static void airhoha_set_gdm2_loopback(struct airoha_gdm_port *port)
+ airoha_fe_wr(eth, REG_GDM_RXCHN_EN(2), 0xffff);
+ airoha_fe_rmw(eth, REG_GDM_LPBK_CFG(2),
+ LPBK_CHAN_MASK | LPBK_MODE_MASK | LPBK_EN_MASK,
+- FIELD_PREP(LPBK_CHAN_MASK, chan) | LPBK_EN_MASK);
++ FIELD_PREP(LPBK_CHAN_MASK, chan) |
++ LBK_GAP_MODE_MASK | LBK_LEN_MODE_MASK |
++ LBK_CHAN_MODE_MASK | LPBK_EN_MASK);
+ airoha_fe_rmw(eth, REG_GDM_LEN_CFG(2),
+ GDM_SHORT_LEN_MASK | GDM_LONG_LEN_MASK,
+ FIELD_PREP(GDM_SHORT_LEN_MASK, 60) |
+diff --git a/drivers/net/ethernet/airoha/airoha_regs.h b/drivers/net/ethernet/airoha/airoha_regs.h
+index 150c85995cc1a7..0c8f61081699cb 100644
+--- a/drivers/net/ethernet/airoha/airoha_regs.h
++++ b/drivers/net/ethernet/airoha/airoha_regs.h
+@@ -151,6 +151,9 @@
+ #define LPBK_LEN_MASK GENMASK(23, 10)
+ #define LPBK_CHAN_MASK GENMASK(8, 4)
+ #define LPBK_MODE_MASK GENMASK(3, 1)
++#define LBK_GAP_MODE_MASK BIT(3)
++#define LBK_LEN_MODE_MASK BIT(2)
++#define LBK_CHAN_MODE_MASK BIT(1)
+ #define LPBK_EN_MASK BIT(0)
+
+ #define REG_GDM_TXCHN_EN(_n) (GDM_BASE(_n) + 0x24)
+diff --git a/drivers/net/ethernet/freescale/fsl_pq_mdio.c b/drivers/net/ethernet/freescale/fsl_pq_mdio.c
+index 577f9b1780ad6e..de88776dd2a20f 100644
+--- a/drivers/net/ethernet/freescale/fsl_pq_mdio.c
++++ b/drivers/net/ethernet/freescale/fsl_pq_mdio.c
+@@ -479,10 +479,12 @@ static int fsl_pq_mdio_probe(struct platform_device *pdev)
+ "missing 'reg' property in node %pOF\n",
+ tbi);
+ err = -EBUSY;
++ of_node_put(tbi);
+ goto error;
+ }
+ set_tbipa(*prop, pdev,
+ data->get_tbipa, priv->map, &res);
++ of_node_put(tbi);
+ }
+ }
+
+diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
+index b53561c347082f..0a8a48cd4bce6f 100644
+--- a/drivers/net/ethernet/intel/ice/ice_adapter.c
++++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
+@@ -99,19 +99,21 @@ struct ice_adapter *ice_adapter_get(struct pci_dev *pdev)
+
+ index = ice_adapter_xa_index(pdev);
+ scoped_guard(mutex, &ice_adapters_mutex) {
+- err = xa_insert(&ice_adapters, index, NULL, GFP_KERNEL);
+- if (err == -EBUSY) {
+- adapter = xa_load(&ice_adapters, index);
++ adapter = xa_load(&ice_adapters, index);
++ if (adapter) {
+ refcount_inc(&adapter->refcount);
+ WARN_ON_ONCE(adapter->index != ice_adapter_index(pdev));
+ return adapter;
+ }
++ err = xa_reserve(&ice_adapters, index, GFP_KERNEL);
+ if (err)
+ return ERR_PTR(err);
+
+ adapter = ice_adapter_new(pdev);
+- if (!adapter)
++ if (!adapter) {
++ xa_release(&ice_adapters, index);
+ return ERR_PTR(-ENOMEM);
++ }
+ xa_store(&ice_adapters, index, adapter, GFP_KERNEL);
+ }
+ return adapter;
+diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+index d2071aff7b8f3b..308b4458e0d445 100644
+--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
++++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+@@ -1180,9 +1180,9 @@ static void mlx4_en_do_uc_filter(struct mlx4_en_priv *priv,
+ mlx4_unregister_mac(mdev->dev, priv->port, mac);
+
+ hlist_del_rcu(&entry->hlist);
+- kfree_rcu(entry, rcu);
+ en_dbg(DRV, priv, "Removed MAC %pM on port:%d\n",
+ entry->mac, priv->port);
++ kfree_rcu(entry, rcu);
+ ++removed;
+ }
+ }
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
+index 00e77c71e201f8..0a4fb8c922684d 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
+@@ -772,6 +772,7 @@ static int mlx5e_xfrm_add_state(struct net_device *dev,
+ struct netlink_ext_ack *extack)
+ {
+ struct mlx5e_ipsec_sa_entry *sa_entry = NULL;
++ bool allow_tunnel_mode = false;
+ struct mlx5e_ipsec *ipsec;
+ struct mlx5e_priv *priv;
+ gfp_t gfp;
+@@ -803,6 +804,20 @@ static int mlx5e_xfrm_add_state(struct net_device *dev,
+ goto err_xfrm;
+ }
+
++ if (mlx5_eswitch_block_mode(priv->mdev))
++ goto unblock_ipsec;
++
++ if (x->props.mode == XFRM_MODE_TUNNEL &&
++ x->xso.type == XFRM_DEV_OFFLOAD_PACKET) {
++ allow_tunnel_mode = mlx5e_ipsec_fs_tunnel_allowed(sa_entry);
++ if (!allow_tunnel_mode) {
++ NL_SET_ERR_MSG_MOD(extack,
++ "Packet offload tunnel mode is disabled due to encap settings");
++ err = -EINVAL;
++ goto unblock_mode;
++ }
++ }
++
+ /* check esn */
+ if (x->props.flags & XFRM_STATE_ESN)
+ mlx5e_ipsec_update_esn_state(sa_entry);
+@@ -817,7 +832,7 @@ static int mlx5e_xfrm_add_state(struct net_device *dev,
+
+ err = mlx5_ipsec_create_work(sa_entry);
+ if (err)
+- goto unblock_ipsec;
++ goto unblock_encap;
+
+ err = mlx5e_ipsec_create_dwork(sa_entry);
+ if (err)
+@@ -832,14 +847,6 @@ static int mlx5e_xfrm_add_state(struct net_device *dev,
+ if (err)
+ goto err_hw_ctx;
+
+- if (x->props.mode == XFRM_MODE_TUNNEL &&
+- x->xso.type == XFRM_DEV_OFFLOAD_PACKET &&
+- !mlx5e_ipsec_fs_tunnel_enabled(sa_entry)) {
+- NL_SET_ERR_MSG_MOD(extack, "Packet offload tunnel mode is disabled due to encap settings");
+- err = -EINVAL;
+- goto err_add_rule;
+- }
+-
+ /* We use *_bh() variant because xfrm_timer_handler(), which runs
+ * in softirq context, can reach our state delete logic and we need
+ * xa_erase_bh() there.
+@@ -855,8 +862,7 @@ static int mlx5e_xfrm_add_state(struct net_device *dev,
+ queue_delayed_work(ipsec->wq, &sa_entry->dwork->dwork,
+ MLX5_IPSEC_RESCHED);
+
+- if (x->xso.type == XFRM_DEV_OFFLOAD_PACKET &&
+- x->props.mode == XFRM_MODE_TUNNEL) {
++ if (allow_tunnel_mode) {
+ xa_lock_bh(&ipsec->sadb);
+ __xa_set_mark(&ipsec->sadb, sa_entry->ipsec_obj_id,
+ MLX5E_IPSEC_TUNNEL_SA);
+@@ -865,6 +871,11 @@ static int mlx5e_xfrm_add_state(struct net_device *dev,
+
+ out:
+ x->xso.offload_handle = (unsigned long)sa_entry;
++ if (allow_tunnel_mode)
++ mlx5_eswitch_unblock_encap(priv->mdev);
++
++ mlx5_eswitch_unblock_mode(priv->mdev);
++
+ return 0;
+
+ err_add_rule:
+@@ -877,6 +888,11 @@ static int mlx5e_xfrm_add_state(struct net_device *dev,
+ if (sa_entry->work)
+ kfree(sa_entry->work->data);
+ kfree(sa_entry->work);
++unblock_encap:
++ if (allow_tunnel_mode)
++ mlx5_eswitch_unblock_encap(priv->mdev);
++unblock_mode:
++ mlx5_eswitch_unblock_mode(priv->mdev);
+ unblock_ipsec:
+ mlx5_eswitch_unblock_ipsec(priv->mdev);
+ err_xfrm:
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h
+index 23703f28386ad9..5d7c15abfcaf6c 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h
+@@ -319,7 +319,7 @@ void mlx5e_accel_ipsec_fs_del_rule(struct mlx5e_ipsec_sa_entry *sa_entry);
+ int mlx5e_accel_ipsec_fs_add_pol(struct mlx5e_ipsec_pol_entry *pol_entry);
+ void mlx5e_accel_ipsec_fs_del_pol(struct mlx5e_ipsec_pol_entry *pol_entry);
+ void mlx5e_accel_ipsec_fs_modify(struct mlx5e_ipsec_sa_entry *sa_entry);
+-bool mlx5e_ipsec_fs_tunnel_enabled(struct mlx5e_ipsec_sa_entry *sa_entry);
++bool mlx5e_ipsec_fs_tunnel_allowed(struct mlx5e_ipsec_sa_entry *sa_entry);
+
+ int mlx5_ipsec_create_sa_ctx(struct mlx5e_ipsec_sa_entry *sa_entry);
+ void mlx5_ipsec_free_sa_ctx(struct mlx5e_ipsec_sa_entry *sa_entry);
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c
+index 65dc3529283b69..9e236525356383 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c
+@@ -1045,7 +1045,9 @@ static int rx_create(struct mlx5_core_dev *mdev, struct mlx5e_ipsec *ipsec,
+
+ /* Create FT */
+ if (mlx5_ipsec_device_caps(mdev) & MLX5_IPSEC_CAP_TUNNEL)
+- rx->allow_tunnel_mode = mlx5_eswitch_block_encap(mdev);
++ rx->allow_tunnel_mode =
++ mlx5_eswitch_block_encap(mdev, rx == ipsec->rx_esw);
++
+ if (rx->allow_tunnel_mode)
+ flags = MLX5_FLOW_TABLE_TUNNEL_EN_REFORMAT;
+ ft = ipsec_ft_create(attr.ns, attr.sa_level, attr.prio, 1, 2, flags);
+@@ -1286,7 +1288,9 @@ static int tx_create(struct mlx5e_ipsec *ipsec, struct mlx5e_ipsec_tx *tx,
+ goto err_status_rule;
+
+ if (mlx5_ipsec_device_caps(mdev) & MLX5_IPSEC_CAP_TUNNEL)
+- tx->allow_tunnel_mode = mlx5_eswitch_block_encap(mdev);
++ tx->allow_tunnel_mode =
++ mlx5_eswitch_block_encap(mdev, tx == ipsec->tx_esw);
++
+ if (tx->allow_tunnel_mode)
+ flags = MLX5_FLOW_TABLE_TUNNEL_EN_REFORMAT;
+ ft = ipsec_ft_create(tx->ns, attr.sa_level, attr.prio, 1, 4, flags);
+@@ -2822,18 +2826,24 @@ void mlx5e_accel_ipsec_fs_modify(struct mlx5e_ipsec_sa_entry *sa_entry)
+ memcpy(sa_entry, &sa_entry_shadow, sizeof(*sa_entry));
+ }
+
+-bool mlx5e_ipsec_fs_tunnel_enabled(struct mlx5e_ipsec_sa_entry *sa_entry)
++bool mlx5e_ipsec_fs_tunnel_allowed(struct mlx5e_ipsec_sa_entry *sa_entry)
+ {
+- struct mlx5_accel_esp_xfrm_attrs *attrs = &sa_entry->attrs;
+- struct mlx5e_ipsec_rx *rx;
+- struct mlx5e_ipsec_tx *tx;
++ struct mlx5e_ipsec *ipsec = sa_entry->ipsec;
++ struct xfrm_state *x = sa_entry->x;
++ bool from_fdb;
+
+- rx = ipsec_rx(sa_entry->ipsec, attrs->addrs.family, attrs->type);
+- tx = ipsec_tx(sa_entry->ipsec, attrs->type);
+- if (sa_entry->attrs.dir == XFRM_DEV_OFFLOAD_OUT)
+- return tx->allow_tunnel_mode;
++ if (x->xso.dir == XFRM_DEV_OFFLOAD_OUT) {
++ struct mlx5e_ipsec_tx *tx = ipsec_tx(ipsec, x->xso.type);
++
++ from_fdb = (tx == ipsec->tx_esw);
++ } else {
++ struct mlx5e_ipsec_rx *rx = ipsec_rx(ipsec, x->props.family,
++ x->xso.type);
++
++ from_fdb = (rx == ipsec->rx_esw);
++ }
+
+- return rx->allow_tunnel_mode;
++ return mlx5_eswitch_block_encap(ipsec->mdev, from_fdb);
+ }
+
+ void mlx5e_ipsec_handle_mpv_event(int event, struct mlx5e_priv *slave_priv,
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+index 45506ad568470d..53d7e33d6c0b13 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
++++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+@@ -851,7 +851,7 @@ void mlx5_eswitch_offloads_single_fdb_del_one(struct mlx5_eswitch *master_esw,
+ struct mlx5_eswitch *slave_esw);
+ int mlx5_eswitch_reload_ib_reps(struct mlx5_eswitch *esw);
+
+-bool mlx5_eswitch_block_encap(struct mlx5_core_dev *dev);
++bool mlx5_eswitch_block_encap(struct mlx5_core_dev *dev, bool from_fdb);
+ void mlx5_eswitch_unblock_encap(struct mlx5_core_dev *dev);
+
+ int mlx5_eswitch_block_mode(struct mlx5_core_dev *dev);
+@@ -943,7 +943,8 @@ mlx5_eswitch_reload_ib_reps(struct mlx5_eswitch *esw)
+ return 0;
+ }
+
+-static inline bool mlx5_eswitch_block_encap(struct mlx5_core_dev *dev)
++static inline bool
++mlx5_eswitch_block_encap(struct mlx5_core_dev *dev, bool from_fdb)
+ {
+ return true;
+ }
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+index bee906661282aa..f358e8fe432cfb 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+@@ -3938,23 +3938,25 @@ int mlx5_devlink_eswitch_inline_mode_get(struct devlink *devlink, u8 *mode)
+ return esw_inline_mode_to_devlink(esw->offloads.inline_mode, mode);
+ }
+
+-bool mlx5_eswitch_block_encap(struct mlx5_core_dev *dev)
++bool mlx5_eswitch_block_encap(struct mlx5_core_dev *dev, bool from_fdb)
+ {
+ struct mlx5_eswitch *esw = dev->priv.eswitch;
++ enum devlink_eswitch_encap_mode encap;
++ bool allow_tunnel = false;
+
+ if (!mlx5_esw_allowed(esw))
+ return true;
+
+ down_write(&esw->mode_lock);
+- if (esw->mode != MLX5_ESWITCH_LEGACY &&
+- esw->offloads.encap != DEVLINK_ESWITCH_ENCAP_MODE_NONE) {
+- up_write(&esw->mode_lock);
+- return false;
++ encap = esw->offloads.encap;
++ if (esw->mode == MLX5_ESWITCH_LEGACY ||
++ (encap == DEVLINK_ESWITCH_ENCAP_MODE_NONE && !from_fdb)) {
++ allow_tunnel = true;
++ esw->offloads.num_block_encap++;
+ }
+-
+- esw->offloads.num_block_encap++;
+ up_write(&esw->mode_lock);
+- return true;
++
++ return allow_tunnel;
+ }
+
+ void mlx5_eswitch_unblock_encap(struct mlx5_core_dev *dev)
+diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_main.c b/drivers/net/ethernet/microchip/sparx5/sparx5_main.c
+index 74ad1d73b4652e..40b1bfc600a791 100644
+--- a/drivers/net/ethernet/microchip/sparx5/sparx5_main.c
++++ b/drivers/net/ethernet/microchip/sparx5/sparx5_main.c
+@@ -708,6 +708,11 @@ static int sparx5_start(struct sparx5 *sparx5)
+ /* Init masks */
+ sparx5_update_fwd(sparx5);
+
++ /* Init flood masks */
++ for (int pgid = sparx5_get_pgid(sparx5, PGID_UC_FLOOD);
++ pgid <= sparx5_get_pgid(sparx5, PGID_BCAST); pgid++)
++ sparx5_pgid_clear(sparx5, pgid);
++
+ /* CPU copy CPU pgids */
+ spx5_wr(ANA_AC_PGID_MISC_CFG_PGID_CPU_COPY_ENA_SET(1), sparx5,
+ ANA_AC_PGID_MISC_CFG(sparx5_get_pgid(sparx5, PGID_CPU)));
+diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c b/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
+index bc9ecb9392cd35..0a71abbd3da58c 100644
+--- a/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
++++ b/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
+@@ -176,6 +176,7 @@ static int sparx5_port_bridge_join(struct sparx5_port *port,
+ struct net_device *bridge,
+ struct netlink_ext_ack *extack)
+ {
++ struct switchdev_brport_flags flags = {0};
+ struct sparx5 *sparx5 = port->sparx5;
+ struct net_device *ndev = port->ndev;
+ int err;
+@@ -205,6 +206,11 @@ static int sparx5_port_bridge_join(struct sparx5_port *port,
+ */
+ __dev_mc_unsync(ndev, sparx5_mc_unsync);
+
++ /* Enable uc/mc/bc flooding */
++ flags.mask = BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
++ flags.val = flags.mask;
++ sparx5_port_attr_bridge_flags(port, flags);
++
+ return 0;
+
+ err_switchdev_offload:
+@@ -215,6 +221,7 @@ static int sparx5_port_bridge_join(struct sparx5_port *port,
+ static void sparx5_port_bridge_leave(struct sparx5_port *port,
+ struct net_device *bridge)
+ {
++ struct switchdev_brport_flags flags = {0};
+ struct sparx5 *sparx5 = port->sparx5;
+
+ switchdev_bridge_port_unoffload(port->ndev, NULL, NULL, NULL);
+@@ -234,6 +241,11 @@ static void sparx5_port_bridge_leave(struct sparx5_port *port,
+
+ /* Port enters in host more therefore restore mc list */
+ __dev_mc_sync(port->ndev, sparx5_mc_sync, sparx5_mc_unsync);
++
++ /* Disable uc/mc/bc flooding */
++ flags.mask = BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
++ flags.val = 0;
++ sparx5_port_attr_bridge_flags(port, flags);
+ }
+
+ static int sparx5_port_changeupper(struct net_device *dev,
+diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_vlan.c b/drivers/net/ethernet/microchip/sparx5/sparx5_vlan.c
+index d42097aa60a0e4..4947828719038b 100644
+--- a/drivers/net/ethernet/microchip/sparx5/sparx5_vlan.c
++++ b/drivers/net/ethernet/microchip/sparx5/sparx5_vlan.c
+@@ -167,16 +167,6 @@ void sparx5_update_fwd(struct sparx5 *sparx5)
+ /* Divide up fwd mask in 32 bit words */
+ bitmap_to_arr32(mask, sparx5->bridge_fwd_mask, SPX5_PORTS);
+
+- /* Update flood masks */
+- for (port = sparx5_get_pgid(sparx5, PGID_UC_FLOOD);
+- port <= sparx5_get_pgid(sparx5, PGID_BCAST); port++) {
+- spx5_wr(mask[0], sparx5, ANA_AC_PGID_CFG(port));
+- if (is_sparx5(sparx5)) {
+- spx5_wr(mask[1], sparx5, ANA_AC_PGID_CFG1(port));
+- spx5_wr(mask[2], sparx5, ANA_AC_PGID_CFG2(port));
+- }
+- }
+-
+ /* Update SRC masks */
+ for (port = 0; port < sparx5->data->consts->n_ports; port++) {
+ if (test_bit(port, sparx5->bridge_fwd_mask)) {
+diff --git a/drivers/net/ethernet/mscc/ocelot_stats.c b/drivers/net/ethernet/mscc/ocelot_stats.c
+index 545710dadcf544..d2be1be3771658 100644
+--- a/drivers/net/ethernet/mscc/ocelot_stats.c
++++ b/drivers/net/ethernet/mscc/ocelot_stats.c
+@@ -1021,6 +1021,6 @@ int ocelot_stats_init(struct ocelot *ocelot)
+
+ void ocelot_stats_deinit(struct ocelot *ocelot)
+ {
+- cancel_delayed_work(&ocelot->stats_work);
++ disable_delayed_work_sync(&ocelot->stats_work);
+ destroy_workqueue(ocelot->stats_queue);
+ }
+diff --git a/drivers/net/mdio/mdio-i2c.c b/drivers/net/mdio/mdio-i2c.c
+index 53e96bfab54229..ed20352a589a3d 100644
+--- a/drivers/net/mdio/mdio-i2c.c
++++ b/drivers/net/mdio/mdio-i2c.c
+@@ -116,17 +116,23 @@ static int smbus_byte_mii_read_default_c22(struct mii_bus *bus, int phy_id,
+ if (!i2c_mii_valid_phy_id(phy_id))
+ return 0;
+
+- ret = i2c_smbus_xfer(i2c, i2c_mii_phy_addr(phy_id), 0,
+- I2C_SMBUS_READ, reg,
+- I2C_SMBUS_BYTE_DATA, &smbus_data);
++ i2c_lock_bus(i2c, I2C_LOCK_SEGMENT);
++
++ ret = __i2c_smbus_xfer(i2c, i2c_mii_phy_addr(phy_id), 0,
++ I2C_SMBUS_READ, reg,
++ I2C_SMBUS_BYTE_DATA, &smbus_data);
+ if (ret < 0)
+- return ret;
++ goto unlock;
+
+ val = (smbus_data.byte & 0xff) << 8;
+
+- ret = i2c_smbus_xfer(i2c, i2c_mii_phy_addr(phy_id), 0,
+- I2C_SMBUS_READ, reg,
+- I2C_SMBUS_BYTE_DATA, &smbus_data);
++ ret = __i2c_smbus_xfer(i2c, i2c_mii_phy_addr(phy_id), 0,
++ I2C_SMBUS_READ, reg,
++ I2C_SMBUS_BYTE_DATA, &smbus_data);
++
++unlock:
++ i2c_unlock_bus(i2c, I2C_LOCK_SEGMENT);
++
+ if (ret < 0)
+ return ret;
+
+@@ -147,17 +153,22 @@ static int smbus_byte_mii_write_default_c22(struct mii_bus *bus, int phy_id,
+
+ smbus_data.byte = (val & 0xff00) >> 8;
+
+- ret = i2c_smbus_xfer(i2c, i2c_mii_phy_addr(phy_id), 0,
+- I2C_SMBUS_WRITE, reg,
+- I2C_SMBUS_BYTE_DATA, &smbus_data);
++ i2c_lock_bus(i2c, I2C_LOCK_SEGMENT);
++
++ ret = __i2c_smbus_xfer(i2c, i2c_mii_phy_addr(phy_id), 0,
++ I2C_SMBUS_WRITE, reg,
++ I2C_SMBUS_BYTE_DATA, &smbus_data);
+ if (ret < 0)
+- return ret;
++ goto unlock;
+
+ smbus_data.byte = val & 0xff;
+
+- ret = i2c_smbus_xfer(i2c, i2c_mii_phy_addr(phy_id), 0,
+- I2C_SMBUS_WRITE, reg,
+- I2C_SMBUS_BYTE_DATA, &smbus_data);
++ ret = __i2c_smbus_xfer(i2c, i2c_mii_phy_addr(phy_id), 0,
++ I2C_SMBUS_WRITE, reg,
++ I2C_SMBUS_BYTE_DATA, &smbus_data);
++
++unlock:
++ i2c_unlock_bus(i2c, I2C_LOCK_SEGMENT);
+
+ return ret < 0 ? ret : 0;
+ }
+diff --git a/drivers/net/pse-pd/tps23881.c b/drivers/net/pse-pd/tps23881.c
+index 63f8f43062bce6..b724b222ab44c9 100644
+--- a/drivers/net/pse-pd/tps23881.c
++++ b/drivers/net/pse-pd/tps23881.c
+@@ -62,7 +62,7 @@
+ #define TPS23881_REG_SRAM_DATA 0x61
+
+ #define TPS23881_UV_STEP 3662
+-#define TPS23881_NA_STEP 70190
++#define TPS23881_NA_STEP 89500
+ #define TPS23881_MW_STEP 500
+ #define TPS23881_MIN_PI_PW_LIMIT_MW 2000
+
+diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
+index 1ff25f57329a81..d75502ebbc0d92 100644
+--- a/drivers/net/usb/lan78xx.c
++++ b/drivers/net/usb/lan78xx.c
+@@ -1079,10 +1079,13 @@ static int lan78xx_read_raw_eeprom(struct lan78xx_net *dev, u32 offset,
+ }
+
+ read_raw_eeprom_done:
+- if (dev->chipid == ID_REV_CHIP_ID_7800_)
+- return lan78xx_write_reg(dev, HW_CFG, saved);
+-
+- return 0;
++ if (dev->chipid == ID_REV_CHIP_ID_7800_) {
++ int rc = lan78xx_write_reg(dev, HW_CFG, saved);
++ /* If USB fails, there is nothing to do */
++ if (rc < 0)
++ return rc;
++ }
++ return ret;
+ }
+
+ static int lan78xx_read_eeprom(struct lan78xx_net *dev, u32 offset,
+diff --git a/drivers/net/wireless/ath/ath11k/core.c b/drivers/net/wireless/ath/ath11k/core.c
+index d49353b6b2e765..2810752260f2f7 100644
+--- a/drivers/net/wireless/ath/ath11k/core.c
++++ b/drivers/net/wireless/ath/ath11k/core.c
+@@ -2215,14 +2215,10 @@ static int ath11k_core_reconfigure_on_crash(struct ath11k_base *ab)
+ mutex_unlock(&ab->core_lock);
+
+ ath11k_dp_free(ab);
+- ath11k_hal_srng_deinit(ab);
++ ath11k_hal_srng_clear(ab);
+
+ ab->free_vdev_map = (1LL << (ab->num_radios * TARGET_NUM_VDEVS(ab))) - 1;
+
+- ret = ath11k_hal_srng_init(ab);
+- if (ret)
+- return ret;
+-
+ clear_bit(ATH11K_FLAG_CRASH_FLUSH, &ab->dev_flags);
+
+ ret = ath11k_core_qmi_firmware_ready(ab);
+diff --git a/drivers/net/wireless/ath/ath11k/hal.c b/drivers/net/wireless/ath/ath11k/hal.c
+index 0c3ce7509ab83d..0c797b8d0a276a 100644
+--- a/drivers/net/wireless/ath/ath11k/hal.c
++++ b/drivers/net/wireless/ath/ath11k/hal.c
+@@ -1386,6 +1386,22 @@ void ath11k_hal_srng_deinit(struct ath11k_base *ab)
+ }
+ EXPORT_SYMBOL(ath11k_hal_srng_deinit);
+
++void ath11k_hal_srng_clear(struct ath11k_base *ab)
++{
++ /* No need to memset rdp and wrp memory since each individual
++ * segment would get cleared in ath11k_hal_srng_src_hw_init()
++ * and ath11k_hal_srng_dst_hw_init().
++ */
++ memset(ab->hal.srng_list, 0,
++ sizeof(ab->hal.srng_list));
++ memset(ab->hal.shadow_reg_addr, 0,
++ sizeof(ab->hal.shadow_reg_addr));
++ ab->hal.avail_blk_resource = 0;
++ ab->hal.current_blk_index = 0;
++ ab->hal.num_shadow_reg_configured = 0;
++}
++EXPORT_SYMBOL(ath11k_hal_srng_clear);
++
+ void ath11k_hal_dump_srng_stats(struct ath11k_base *ab)
+ {
+ struct hal_srng *srng;
+diff --git a/drivers/net/wireless/ath/ath11k/hal.h b/drivers/net/wireless/ath/ath11k/hal.h
+index 601542410c7529..839095af9267e5 100644
+--- a/drivers/net/wireless/ath/ath11k/hal.h
++++ b/drivers/net/wireless/ath/ath11k/hal.h
+@@ -965,6 +965,7 @@ int ath11k_hal_srng_setup(struct ath11k_base *ab, enum hal_ring_type type,
+ struct hal_srng_params *params);
+ int ath11k_hal_srng_init(struct ath11k_base *ath11k);
+ void ath11k_hal_srng_deinit(struct ath11k_base *ath11k);
++void ath11k_hal_srng_clear(struct ath11k_base *ab);
+ void ath11k_hal_dump_srng_stats(struct ath11k_base *ab);
+ void ath11k_hal_srng_get_shadow_config(struct ath11k_base *ab,
+ u32 **cfg, u32 *len);
+diff --git a/drivers/net/wireless/intel/iwlwifi/mld/debugfs.c b/drivers/net/wireless/intel/iwlwifi/mld/debugfs.c
+index cc052b0aa53ff3..372204bf845224 100644
+--- a/drivers/net/wireless/intel/iwlwifi/mld/debugfs.c
++++ b/drivers/net/wireless/intel/iwlwifi/mld/debugfs.c
+@@ -1001,8 +1001,12 @@ void iwl_mld_add_link_debugfs(struct ieee80211_hw *hw,
+ * If not, this is a per-link dir of a MLO vif, add in it the iwlmld
+ * dir.
+ */
+- if (!mld_link_dir)
++ if (!mld_link_dir) {
+ mld_link_dir = debugfs_create_dir("iwlmld", dir);
++ } else {
++ /* Release the reference from debugfs_lookup */
++ dput(mld_link_dir);
++ }
+ }
+
+ static ssize_t _iwl_dbgfs_fixed_rate_write(struct iwl_mld *mld, char *buf,
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/usb.c b/drivers/net/wireless/mediatek/mt76/mt7921/usb.c
+index fe9751851ff747..100bdba32ba59e 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7921/usb.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7921/usb.c
+@@ -21,6 +21,9 @@ static const struct usb_device_id mt7921u_device_table[] = {
+ /* Netgear, Inc. [A8000,AXE3000] */
+ { USB_DEVICE_AND_INTERFACE_INFO(0x0846, 0x9060, 0xff, 0xff, 0xff),
+ .driver_info = (kernel_ulong_t)MT7921_FIRMWARE_WM },
++ /* Netgear, Inc. A7500 */
++ { USB_DEVICE_AND_INTERFACE_INFO(0x0846, 0x9065, 0xff, 0xff, 0xff),
++ .driver_info = (kernel_ulong_t)MT7921_FIRMWARE_WM },
+ /* TP-Link TXE50UH */
+ { USB_DEVICE_AND_INTERFACE_INFO(0x35bc, 0x0107, 0xff, 0xff, 0xff),
+ .driver_info = (kernel_ulong_t)MT7921_FIRMWARE_WM },
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/usb.c b/drivers/net/wireless/mediatek/mt76/mt7925/usb.c
+index 4dfbc1b6cfddb4..bf040f34e4b9f7 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7925/usb.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7925/usb.c
+@@ -12,6 +12,9 @@
+ static const struct usb_device_id mt7925u_device_table[] = {
+ { USB_DEVICE_AND_INTERFACE_INFO(0x0e8d, 0x7925, 0xff, 0xff, 0xff),
+ .driver_info = (kernel_ulong_t)MT7925_FIRMWARE_WM },
++ /* Netgear, Inc. A9000 */
++ { USB_DEVICE_AND_INTERFACE_INFO(0x0846, 0x9072, 0xff, 0xff, 0xff),
++ .driver_info = (kernel_ulong_t)MT7925_FIRMWARE_WM },
+ { },
+ };
+
+diff --git a/drivers/net/wireless/realtek/rtw89/core.c b/drivers/net/wireless/realtek/rtw89/core.c
+index 1837f17239ab60..5dd05b296e71cc 100644
+--- a/drivers/net/wireless/realtek/rtw89/core.c
++++ b/drivers/net/wireless/realtek/rtw89/core.c
+@@ -1091,25 +1091,14 @@ void rtw89_core_tx_kick_off(struct rtw89_dev *rtwdev, u8 qsel)
+ }
+
+ int rtw89_core_tx_kick_off_and_wait(struct rtw89_dev *rtwdev, struct sk_buff *skb,
+- int qsel, unsigned int timeout)
++ struct rtw89_tx_wait_info *wait, int qsel,
++ unsigned int timeout)
+ {
+- struct rtw89_tx_skb_data *skb_data = RTW89_TX_SKB_CB(skb);
+- struct rtw89_tx_wait_info *wait;
+ unsigned long time_left;
+ int ret = 0;
+
+ lockdep_assert_wiphy(rtwdev->hw->wiphy);
+
+- wait = kzalloc(sizeof(*wait), GFP_KERNEL);
+- if (!wait) {
+- rtw89_core_tx_kick_off(rtwdev, qsel);
+- return 0;
+- }
+-
+- init_completion(&wait->completion);
+- wait->skb = skb;
+- rcu_assign_pointer(skb_data->wait, wait);
+-
+ rtw89_core_tx_kick_off(rtwdev, qsel);
+ time_left = wait_for_completion_timeout(&wait->completion,
+ msecs_to_jiffies(timeout));
+@@ -1172,10 +1161,12 @@ int rtw89_h2c_tx(struct rtw89_dev *rtwdev,
+ static int rtw89_core_tx_write_link(struct rtw89_dev *rtwdev,
+ struct rtw89_vif_link *rtwvif_link,
+ struct rtw89_sta_link *rtwsta_link,
+- struct sk_buff *skb, int *qsel, bool sw_mld)
++ struct sk_buff *skb, int *qsel, bool sw_mld,
++ struct rtw89_tx_wait_info *wait)
+ {
+ struct ieee80211_sta *sta = rtwsta_link_to_sta_safe(rtwsta_link);
+ struct ieee80211_vif *vif = rtwvif_link_to_vif(rtwvif_link);
++ struct rtw89_tx_skb_data *skb_data = RTW89_TX_SKB_CB(skb);
+ struct rtw89_vif *rtwvif = rtwvif_link->rtwvif;
+ struct rtw89_core_tx_request tx_req = {};
+ int ret;
+@@ -1192,6 +1183,8 @@ static int rtw89_core_tx_write_link(struct rtw89_dev *rtwdev,
+ rtw89_core_tx_update_desc_info(rtwdev, &tx_req);
+ rtw89_core_tx_wake(rtwdev, &tx_req);
+
++ rcu_assign_pointer(skb_data->wait, wait);
++
+ ret = rtw89_hci_tx_write(rtwdev, &tx_req);
+ if (ret) {
+ rtw89_err(rtwdev, "failed to transmit skb to HCI\n");
+@@ -1228,7 +1221,8 @@ int rtw89_core_tx_write(struct rtw89_dev *rtwdev, struct ieee80211_vif *vif,
+ }
+ }
+
+- return rtw89_core_tx_write_link(rtwdev, rtwvif_link, rtwsta_link, skb, qsel, false);
++ return rtw89_core_tx_write_link(rtwdev, rtwvif_link, rtwsta_link, skb, qsel, false,
++ NULL);
+ }
+
+ static __le32 rtw89_build_txwd_body0(struct rtw89_tx_desc_info *desc_info)
+@@ -3426,6 +3420,7 @@ int rtw89_core_send_nullfunc(struct rtw89_dev *rtwdev, struct rtw89_vif_link *rt
+ struct ieee80211_vif *vif = rtwvif_link_to_vif(rtwvif_link);
+ int link_id = ieee80211_vif_is_mld(vif) ? rtwvif_link->link_id : -1;
+ struct rtw89_sta_link *rtwsta_link;
++ struct rtw89_tx_wait_info *wait;
+ struct ieee80211_sta *sta;
+ struct ieee80211_hdr *hdr;
+ struct rtw89_sta *rtwsta;
+@@ -3435,6 +3430,12 @@ int rtw89_core_send_nullfunc(struct rtw89_dev *rtwdev, struct rtw89_vif_link *rt
+ if (vif->type != NL80211_IFTYPE_STATION || !vif->cfg.assoc)
+ return 0;
+
++ wait = kzalloc(sizeof(*wait), GFP_KERNEL);
++ if (!wait)
++ return -ENOMEM;
++
++ init_completion(&wait->completion);
++
+ rcu_read_lock();
+ sta = ieee80211_find_sta(vif, vif->cfg.ap_addr);
+ if (!sta) {
+@@ -3449,6 +3450,8 @@ int rtw89_core_send_nullfunc(struct rtw89_dev *rtwdev, struct rtw89_vif_link *rt
+ goto out;
+ }
+
++ wait->skb = skb;
++
+ hdr = (struct ieee80211_hdr *)skb->data;
+ if (ps)
+ hdr->frame_control |= cpu_to_le16(IEEE80211_FCTL_PM);
+@@ -3460,7 +3463,8 @@ int rtw89_core_send_nullfunc(struct rtw89_dev *rtwdev, struct rtw89_vif_link *rt
+ goto out;
+ }
+
+- ret = rtw89_core_tx_write_link(rtwdev, rtwvif_link, rtwsta_link, skb, &qsel, true);
++ ret = rtw89_core_tx_write_link(rtwdev, rtwvif_link, rtwsta_link, skb, &qsel, true,
++ wait);
+ if (ret) {
+ rtw89_warn(rtwdev, "nullfunc transmit failed: %d\n", ret);
+ dev_kfree_skb_any(skb);
+@@ -3469,10 +3473,11 @@ int rtw89_core_send_nullfunc(struct rtw89_dev *rtwdev, struct rtw89_vif_link *rt
+
+ rcu_read_unlock();
+
+- return rtw89_core_tx_kick_off_and_wait(rtwdev, skb, qsel,
++ return rtw89_core_tx_kick_off_and_wait(rtwdev, skb, wait, qsel,
+ timeout);
+ out:
+ rcu_read_unlock();
++ kfree(wait);
+
+ return ret;
+ }
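
The rtw89 rework moves allocation of the completion wait object into the caller, threads it through the TX path, and frees it on the early-exit path so nothing leaks if the frame never reaches the hardware. A simplified user-space sketch of that ownership rule (names invented, and the success path is collapsed for brevity):

    #include <stdio.h>
    #include <stdlib.h>

    struct tx_wait {
        int done;    /* stands in for a completion */
    };

    /* Hypothetical low-level send; on failure the wait object is still owned
     * by the caller and must be freed there. */
    static int tx_write(struct tx_wait *wait)
    {
        (void)wait;
        return -1;    /* pretend the hardware rejected the frame */
    }

    static int send_and_wait(void)
    {
        struct tx_wait *wait;
        int ret;

        wait = calloc(1, sizeof(*wait));
        if (!wait)
            return -1;

        ret = tx_write(wait);
        if (ret) {
            free(wait);    /* early exit: caller still owns the object */
            return ret;
        }

        /* ... wait for completion ... */
        free(wait);        /* simplified: success path also frees here */
        return 0;
    }

    int main(void)
    {
        printf("send_and_wait: %d\n", send_and_wait());
        return 0;
    }
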
+diff --git a/drivers/net/wireless/realtek/rtw89/core.h b/drivers/net/wireless/realtek/rtw89/core.h
+index 337971c744e60f..2de9505c48ffcf 100644
+--- a/drivers/net/wireless/realtek/rtw89/core.h
++++ b/drivers/net/wireless/realtek/rtw89/core.h
+@@ -7389,7 +7389,8 @@ int rtw89_h2c_tx(struct rtw89_dev *rtwdev,
+ struct sk_buff *skb, bool fwdl);
+ void rtw89_core_tx_kick_off(struct rtw89_dev *rtwdev, u8 qsel);
+ int rtw89_core_tx_kick_off_and_wait(struct rtw89_dev *rtwdev, struct sk_buff *skb,
+- int qsel, unsigned int timeout);
++ struct rtw89_tx_wait_info *wait, int qsel,
++ unsigned int timeout);
+ void rtw89_core_fill_txdesc(struct rtw89_dev *rtwdev,
+ struct rtw89_tx_desc_info *desc_info,
+ void *txdesc);
+diff --git a/drivers/net/wireless/realtek/rtw89/pci.c b/drivers/net/wireless/realtek/rtw89/pci.c
+index 4e3034b44f5641..cb9682f306a6ae 100644
+--- a/drivers/net/wireless/realtek/rtw89/pci.c
++++ b/drivers/net/wireless/realtek/rtw89/pci.c
+@@ -1372,7 +1372,6 @@ static int rtw89_pci_txwd_submit(struct rtw89_dev *rtwdev,
+ struct pci_dev *pdev = rtwpci->pdev;
+ struct sk_buff *skb = tx_req->skb;
+ struct rtw89_pci_tx_data *tx_data = RTW89_PCI_TX_SKB_CB(skb);
+- struct rtw89_tx_skb_data *skb_data = RTW89_TX_SKB_CB(skb);
+ bool en_wd_info = desc_info->en_wd_info;
+ u32 txwd_len;
+ u32 txwp_len;
+@@ -1388,7 +1387,6 @@ static int rtw89_pci_txwd_submit(struct rtw89_dev *rtwdev,
+ }
+
+ tx_data->dma = dma;
+- rcu_assign_pointer(skb_data->wait, NULL);
+
+ txwp_len = sizeof(*txwp_info);
+ txwd_len = chip->txwd_body_size;
+diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
+index 2c6d9506b17250..8ed5f1941f05c5 100644
+--- a/drivers/nvme/host/pci.c
++++ b/drivers/nvme/host/pci.c
+@@ -3324,10 +3324,12 @@ static unsigned long check_vendor_combination_bug(struct pci_dev *pdev)
+ * Exclude Samsung 990 Evo from NVME_QUIRK_SIMPLE_SUSPEND
+ * because of high power consumption (> 2 Watt) in s2idle
+ * sleep. Only some boards with Intel CPU are affected.
++ * (Note for testing: Samsung 990 Evo Plus has same PCI ID)
+ */
+ if (dmi_match(DMI_BOARD_NAME, "DN50Z-140HC-YD") ||
+ dmi_match(DMI_BOARD_NAME, "GMxPXxx") ||
+ dmi_match(DMI_BOARD_NAME, "GXxMRXx") ||
++ dmi_match(DMI_BOARD_NAME, "NS5X_NS7XAU") ||
+ dmi_match(DMI_BOARD_NAME, "PH4PG31") ||
+ dmi_match(DMI_BOARD_NAME, "PH4PRX1_PH6PRX1") ||
+ dmi_match(DMI_BOARD_NAME, "PH6PG01_PH6PG71"))
+diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
+index e3503ec20f6cc7..388e9ec2cccf83 100644
+--- a/drivers/of/unittest.c
++++ b/drivers/of/unittest.c
+@@ -4300,6 +4300,7 @@ static int of_unittest_pci_node_verify(struct pci_dev *pdev, bool add)
+ unittest(!np, "Child device tree node is not removed\n");
+ child_dev = device_find_any_child(&pdev->dev);
+ unittest(!child_dev, "Child device is not removed\n");
++ put_device(child_dev);
+ }
+
+ failed:
+diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
+index b77fd30bbfd9da..f401e296d15d22 100644
+--- a/drivers/pci/bus.c
++++ b/drivers/pci/bus.c
+@@ -361,11 +361,15 @@ void pci_bus_add_device(struct pci_dev *dev)
+ * before PCI client drivers.
+ */
+ pdev = of_find_device_by_node(dn);
+- if (pdev && of_pci_supply_present(dn)) {
+- if (!device_link_add(&dev->dev, &pdev->dev,
+- DL_FLAG_AUTOREMOVE_CONSUMER))
+- pci_err(dev, "failed to add device link to power control device %s\n",
+- pdev->name);
++ if (pdev) {
++ if (of_pci_supply_present(dn)) {
++ if (!device_link_add(&dev->dev, &pdev->dev,
++ DL_FLAG_AUTOREMOVE_CONSUMER)) {
++ pci_err(dev, "failed to add device link to power control device %s\n",
++ pdev->name);
++ }
++ }
++ put_device(&pdev->dev);
+ }
+
+ if (!dn || of_device_is_available(dn))
+diff --git a/drivers/pci/controller/cadence/pci-j721e.c b/drivers/pci/controller/cadence/pci-j721e.c
+index 5e445a7bda3328..5bc5ab20aa6d96 100644
+--- a/drivers/pci/controller/cadence/pci-j721e.c
++++ b/drivers/pci/controller/cadence/pci-j721e.c
+@@ -284,6 +284,25 @@ static int j721e_pcie_ctrl_init(struct j721e_pcie *pcie)
+ if (!ret)
+ offset = args.args[0];
+
++ /*
++ * The PCIe Controller's registers have different "reset-values"
++ * depending on the "strap" settings programmed into the PCIEn_CTRL
++ * register within the CTRL_MMR memory-mapped register space.
++ * The registers latch onto a "reset-value" based on the "strap"
++ * settings sampled after the PCIe Controller is powered on.
++ * To ensure that the "reset-values" are sampled accurately, power
++ * off the PCIe Controller before programming the "strap" settings
++ * and power it on after that. The runtime PM APIs namely
++ * pm_runtime_put_sync() and pm_runtime_get_sync() will decrement and
++ * increment the usage counter respectively, causing GENPD to power off
++ * and power on the PCIe Controller.
++ */
++ ret = pm_runtime_put_sync(dev);
++ if (ret < 0) {
++ dev_err(dev, "Failed to power off PCIe Controller\n");
++ return ret;
++ }
++
+ ret = j721e_pcie_set_mode(pcie, syscon, offset);
+ if (ret < 0) {
+ dev_err(dev, "Failed to set pci mode\n");
+@@ -302,6 +321,12 @@ static int j721e_pcie_ctrl_init(struct j721e_pcie *pcie)
+ return ret;
+ }
+
++ ret = pm_runtime_get_sync(dev);
++ if (ret < 0) {
++ dev_err(dev, "Failed to power on PCIe Controller\n");
++ return ret;
++ }
++
+ /* Enable ACSPCIE refclk output if the optional property exists */
+ syscon = syscon_regmap_lookup_by_phandle_optional(node,
+ "ti,syscon-acspcie-proxy-ctrl");
+@@ -440,6 +465,7 @@ static const struct of_device_id of_j721e_pcie_match[] = {
+ },
+ {},
+ };
++MODULE_DEVICE_TABLE(of, of_j721e_pcie_match);
+
+ static int j721e_pcie_probe(struct platform_device *pdev)
+ {
+diff --git a/drivers/pci/controller/dwc/pci-keystone.c b/drivers/pci/controller/dwc/pci-keystone.c
+index 2b2632e513b52f..21808a9e51586d 100644
+--- a/drivers/pci/controller/dwc/pci-keystone.c
++++ b/drivers/pci/controller/dwc/pci-keystone.c
+@@ -1201,8 +1201,8 @@ static int ks_pcie_probe(struct platform_device *pdev)
+ if (irq < 0)
+ return irq;
+
+- ret = request_irq(irq, ks_pcie_err_irq_handler, IRQF_SHARED,
+- "ks-pcie-error-irq", ks_pcie);
++ ret = devm_request_irq(dev, irq, ks_pcie_err_irq_handler, IRQF_SHARED,
++ "ks-pcie-error-irq", ks_pcie);
+ if (ret < 0) {
+ dev_err(dev, "failed to request error IRQ %d\n",
+ irq);
+diff --git a/drivers/pci/controller/dwc/pcie-rcar-gen4.c b/drivers/pci/controller/dwc/pcie-rcar-gen4.c
+index c16c4c2be4993a..1ac71ee0ac2553 100644
+--- a/drivers/pci/controller/dwc/pcie-rcar-gen4.c
++++ b/drivers/pci/controller/dwc/pcie-rcar-gen4.c
+@@ -723,7 +723,7 @@ static int rcar_gen4_pcie_ltssm_control(struct rcar_gen4_pcie *rcar, bool enable
+ rcar_gen4_pcie_phy_reg_update_bits(rcar, 0x148, GENMASK(23, 22), BIT(22));
+ rcar_gen4_pcie_phy_reg_update_bits(rcar, 0x148, GENMASK(18, 16), GENMASK(17, 16));
+ rcar_gen4_pcie_phy_reg_update_bits(rcar, 0x148, GENMASK(7, 6), BIT(6));
+- rcar_gen4_pcie_phy_reg_update_bits(rcar, 0x148, GENMASK(2, 0), GENMASK(11, 0));
++ rcar_gen4_pcie_phy_reg_update_bits(rcar, 0x148, GENMASK(2, 0), GENMASK(1, 0));
+ rcar_gen4_pcie_phy_reg_update_bits(rcar, 0x1d4, GENMASK(16, 15), GENMASK(16, 15));
+ rcar_gen4_pcie_phy_reg_update_bits(rcar, 0x514, BIT(26), BIT(26));
+ rcar_gen4_pcie_phy_reg_update_bits(rcar, 0x0f8, BIT(16), 0);
+diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
+index 0c0734aa14b680..f1cbc12dcaf147 100644
+--- a/drivers/pci/controller/dwc/pcie-tegra194.c
++++ b/drivers/pci/controller/dwc/pcie-tegra194.c
+@@ -1214,6 +1214,7 @@ static int tegra_pcie_bpmp_set_ctrl_state(struct tegra_pcie_dw *pcie,
+ struct mrq_uphy_response resp;
+ struct tegra_bpmp_message msg;
+ struct mrq_uphy_request req;
++ int err;
+
+ /*
+ * Controller-5 doesn't need to have its state set by BPMP-FW in
+@@ -1236,7 +1237,13 @@ static int tegra_pcie_bpmp_set_ctrl_state(struct tegra_pcie_dw *pcie,
+ msg.rx.data = &resp;
+ msg.rx.size = sizeof(resp);
+
+- return tegra_bpmp_transfer(pcie->bpmp, &msg);
++ err = tegra_bpmp_transfer(pcie->bpmp, &msg);
++ if (err)
++ return err;
++ if (msg.rx.ret)
++ return -EINVAL;
++
++ return 0;
+ }
+
+ static int tegra_pcie_bpmp_set_pll_state(struct tegra_pcie_dw *pcie,
+@@ -1245,6 +1252,7 @@ static int tegra_pcie_bpmp_set_pll_state(struct tegra_pcie_dw *pcie,
+ struct mrq_uphy_response resp;
+ struct tegra_bpmp_message msg;
+ struct mrq_uphy_request req;
++ int err;
+
+ memset(&req, 0, sizeof(req));
+ memset(&resp, 0, sizeof(resp));
+@@ -1264,7 +1272,13 @@ static int tegra_pcie_bpmp_set_pll_state(struct tegra_pcie_dw *pcie,
+ msg.rx.data = &resp;
+ msg.rx.size = sizeof(resp);
+
+- return tegra_bpmp_transfer(pcie->bpmp, &msg);
++ err = tegra_bpmp_transfer(pcie->bpmp, &msg);
++ if (err)
++ return err;
++ if (msg.rx.ret)
++ return -EINVAL;
++
++ return 0;
+ }
+
+ static void tegra_pcie_downstream_dev_to_D0(struct tegra_pcie_dw *pcie)
+@@ -1941,6 +1955,15 @@ static irqreturn_t tegra_pcie_ep_pex_rst_irq(int irq, void *arg)
+ return IRQ_HANDLED;
+ }
+
++static void tegra_pcie_ep_init(struct dw_pcie_ep *ep)
++{
++ struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
++ enum pci_barno bar;
++
++ for (bar = 0; bar < PCI_STD_NUM_BARS; bar++)
++ dw_pcie_ep_reset_bar(pci, bar);
++};
++
+ static int tegra_pcie_ep_raise_intx_irq(struct tegra_pcie_dw *pcie, u16 irq)
+ {
+ /* Tegra194 supports only INTA */
+@@ -1955,10 +1978,10 @@ static int tegra_pcie_ep_raise_intx_irq(struct tegra_pcie_dw *pcie, u16 irq)
+
+ static int tegra_pcie_ep_raise_msi_irq(struct tegra_pcie_dw *pcie, u16 irq)
+ {
+- if (unlikely(irq > 31))
++ if (unlikely(irq > 32))
+ return -EINVAL;
+
+- appl_writel(pcie, BIT(irq), APPL_MSI_CTRL_1);
++ appl_writel(pcie, BIT(irq - 1), APPL_MSI_CTRL_1);
+
+ return 0;
+ }
+@@ -2017,6 +2040,7 @@ tegra_pcie_ep_get_features(struct dw_pcie_ep *ep)
+ }
+
+ static const struct dw_pcie_ep_ops pcie_ep_ops = {
++ .init = tegra_pcie_ep_init,
+ .raise_irq = tegra_pcie_ep_raise_irq,
+ .get_features = tegra_pcie_ep_get_features,
+ };
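
The Tegra194 endpoint fix treats the MSI number as 1-based, so values 1..32 are legal and map to bit positions 0..31. A tiny, driver-independent check of that mapping:

    #include <stdio.h>

    /* Illustrative mapping only, not the driver's register layout:
     * MSI vectors are numbered 1..32, bit positions are 0..31. */
    static int msi_to_bit(unsigned int irq)
    {
        if (irq < 1 || irq > 32)
            return -1;
        return (int)(irq - 1);
    }

    int main(void)
    {
        printf("irq 1  -> bit %d\n", msi_to_bit(1));     /* 0 */
        printf("irq 32 -> bit %d\n", msi_to_bit(32));    /* 31 */
        printf("irq 33 -> bit %d\n", msi_to_bit(33));    /* -1 (rejected) */
        return 0;
    }
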
+diff --git a/drivers/pci/controller/pci-tegra.c b/drivers/pci/controller/pci-tegra.c
+index bb88767a379795..942ddfca3bf6b7 100644
+--- a/drivers/pci/controller/pci-tegra.c
++++ b/drivers/pci/controller/pci-tegra.c
+@@ -14,6 +14,7 @@
+ */
+
+ #include <linux/clk.h>
++#include <linux/cleanup.h>
+ #include <linux/debugfs.h>
+ #include <linux/delay.h>
+ #include <linux/export.h>
+@@ -270,7 +271,7 @@ struct tegra_msi {
+ DECLARE_BITMAP(used, INT_PCI_MSI_NR);
+ struct irq_domain *domain;
+ struct mutex map_lock;
+- spinlock_t mask_lock;
++ raw_spinlock_t mask_lock;
+ void *virt;
+ dma_addr_t phys;
+ int irq;
+@@ -1581,14 +1582,13 @@ static void tegra_msi_irq_mask(struct irq_data *d)
+ struct tegra_msi *msi = irq_data_get_irq_chip_data(d);
+ struct tegra_pcie *pcie = msi_to_pcie(msi);
+ unsigned int index = d->hwirq / 32;
+- unsigned long flags;
+ u32 value;
+
+- spin_lock_irqsave(&msi->mask_lock, flags);
+- value = afi_readl(pcie, AFI_MSI_EN_VEC(index));
+- value &= ~BIT(d->hwirq % 32);
+- afi_writel(pcie, value, AFI_MSI_EN_VEC(index));
+- spin_unlock_irqrestore(&msi->mask_lock, flags);
++ scoped_guard(raw_spinlock_irqsave, &msi->mask_lock) {
++ value = afi_readl(pcie, AFI_MSI_EN_VEC(index));
++ value &= ~BIT(d->hwirq % 32);
++ afi_writel(pcie, value, AFI_MSI_EN_VEC(index));
++ }
+ }
+
+ static void tegra_msi_irq_unmask(struct irq_data *d)
+@@ -1596,14 +1596,13 @@ static void tegra_msi_irq_unmask(struct irq_data *d)
+ struct tegra_msi *msi = irq_data_get_irq_chip_data(d);
+ struct tegra_pcie *pcie = msi_to_pcie(msi);
+ unsigned int index = d->hwirq / 32;
+- unsigned long flags;
+ u32 value;
+
+- spin_lock_irqsave(&msi->mask_lock, flags);
+- value = afi_readl(pcie, AFI_MSI_EN_VEC(index));
+- value |= BIT(d->hwirq % 32);
+- afi_writel(pcie, value, AFI_MSI_EN_VEC(index));
+- spin_unlock_irqrestore(&msi->mask_lock, flags);
++ scoped_guard(raw_spinlock_irqsave, &msi->mask_lock) {
++ value = afi_readl(pcie, AFI_MSI_EN_VEC(index));
++ value |= BIT(d->hwirq % 32);
++ afi_writel(pcie, value, AFI_MSI_EN_VEC(index));
++ }
+ }
+
+ static void tegra_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
+@@ -1711,7 +1710,7 @@ static int tegra_pcie_msi_setup(struct tegra_pcie *pcie)
+ int err;
+
+ mutex_init(&msi->map_lock);
+- spin_lock_init(&msi->mask_lock);
++ raw_spin_lock_init(&msi->mask_lock);
+
+ if (IS_ENABLED(CONFIG_PCI_MSI)) {
+ err = tegra_allocate_domains(msi);
+diff --git a/drivers/pci/controller/pcie-rcar-host.c b/drivers/pci/controller/pcie-rcar-host.c
+index 4780e0109e5834..213028052aa589 100644
+--- a/drivers/pci/controller/pcie-rcar-host.c
++++ b/drivers/pci/controller/pcie-rcar-host.c
+@@ -12,6 +12,7 @@
+ */
+
+ #include <linux/bitops.h>
++#include <linux/cleanup.h>
+ #include <linux/clk.h>
+ #include <linux/clk-provider.h>
+ #include <linux/delay.h>
+@@ -38,7 +39,7 @@ struct rcar_msi {
+ DECLARE_BITMAP(used, INT_PCI_MSI_NR);
+ struct irq_domain *domain;
+ struct mutex map_lock;
+- spinlock_t mask_lock;
++ raw_spinlock_t mask_lock;
+ int irq1;
+ int irq2;
+ };
+@@ -52,20 +53,13 @@ struct rcar_pcie_host {
+ int (*phy_init_fn)(struct rcar_pcie_host *host);
+ };
+
+-static DEFINE_SPINLOCK(pmsr_lock);
+-
+ static int rcar_pcie_wakeup(struct device *pcie_dev, void __iomem *pcie_base)
+ {
+- unsigned long flags;
+ u32 pmsr, val;
+ int ret = 0;
+
+- spin_lock_irqsave(&pmsr_lock, flags);
+-
+- if (!pcie_base || pm_runtime_suspended(pcie_dev)) {
+- ret = -EINVAL;
+- goto unlock_exit;
+- }
++ if (!pcie_base || pm_runtime_suspended(pcie_dev))
++ return -EINVAL;
+
+ pmsr = readl(pcie_base + PMSR);
+
+@@ -87,8 +81,6 @@ static int rcar_pcie_wakeup(struct device *pcie_dev, void __iomem *pcie_base)
+ writel(L1FAEG | PMEL1RX, pcie_base + PMSR);
+ }
+
+-unlock_exit:
+- spin_unlock_irqrestore(&pmsr_lock, flags);
+ return ret;
+ }
+
+@@ -611,28 +603,26 @@ static void rcar_msi_irq_mask(struct irq_data *d)
+ {
+ struct rcar_msi *msi = irq_data_get_irq_chip_data(d);
+ struct rcar_pcie *pcie = &msi_to_host(msi)->pcie;
+- unsigned long flags;
+ u32 value;
+
+- spin_lock_irqsave(&msi->mask_lock, flags);
+- value = rcar_pci_read_reg(pcie, PCIEMSIIER);
+- value &= ~BIT(d->hwirq);
+- rcar_pci_write_reg(pcie, value, PCIEMSIIER);
+- spin_unlock_irqrestore(&msi->mask_lock, flags);
++ scoped_guard(raw_spinlock_irqsave, &msi->mask_lock) {
++ value = rcar_pci_read_reg(pcie, PCIEMSIIER);
++ value &= ~BIT(d->hwirq);
++ rcar_pci_write_reg(pcie, value, PCIEMSIIER);
++ }
+ }
+
+ static void rcar_msi_irq_unmask(struct irq_data *d)
+ {
+ struct rcar_msi *msi = irq_data_get_irq_chip_data(d);
+ struct rcar_pcie *pcie = &msi_to_host(msi)->pcie;
+- unsigned long flags;
+ u32 value;
+
+- spin_lock_irqsave(&msi->mask_lock, flags);
+- value = rcar_pci_read_reg(pcie, PCIEMSIIER);
+- value |= BIT(d->hwirq);
+- rcar_pci_write_reg(pcie, value, PCIEMSIIER);
+- spin_unlock_irqrestore(&msi->mask_lock, flags);
++ scoped_guard(raw_spinlock_irqsave, &msi->mask_lock) {
++ value = rcar_pci_read_reg(pcie, PCIEMSIIER);
++ value |= BIT(d->hwirq);
++ rcar_pci_write_reg(pcie, value, PCIEMSIIER);
++ }
+ }
+
+ static void rcar_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
+@@ -745,7 +735,7 @@ static int rcar_pcie_enable_msi(struct rcar_pcie_host *host)
+ int err;
+
+ mutex_init(&msi->map_lock);
+- spin_lock_init(&msi->mask_lock);
++ raw_spin_lock_init(&msi->mask_lock);
+
+ err = of_address_to_resource(dev->of_node, 0, &res);
+ if (err)
+diff --git a/drivers/pci/controller/pcie-xilinx-nwl.c b/drivers/pci/controller/pcie-xilinx-nwl.c
+index 05b8c205493cd8..7db2c96c6cec2b 100644
+--- a/drivers/pci/controller/pcie-xilinx-nwl.c
++++ b/drivers/pci/controller/pcie-xilinx-nwl.c
+@@ -718,9 +718,10 @@ static int nwl_pcie_bridge_init(struct nwl_pcie *pcie)
+ nwl_bridge_writel(pcie, nwl_bridge_readl(pcie, E_ECAM_CONTROL) |
+ E_ECAM_CR_ENABLE, E_ECAM_CONTROL);
+
+- nwl_bridge_writel(pcie, nwl_bridge_readl(pcie, E_ECAM_CONTROL) |
+- (NWL_ECAM_MAX_SIZE << E_ECAM_SIZE_SHIFT),
+- E_ECAM_CONTROL);
++ ecam_val = nwl_bridge_readl(pcie, E_ECAM_CONTROL);
++ ecam_val &= ~E_ECAM_SIZE_LOC;
++ ecam_val |= NWL_ECAM_MAX_SIZE << E_ECAM_SIZE_SHIFT;
++ nwl_bridge_writel(pcie, ecam_val, E_ECAM_CONTROL);
+
+ nwl_bridge_writel(pcie, lower_32_bits(pcie->phys_ecam_base),
+ E_ECAM_BASE_LO);
+diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
+index ac4375954c9479..77dee43b785838 100644
+--- a/drivers/pci/iov.c
++++ b/drivers/pci/iov.c
+@@ -629,15 +629,18 @@ static int sriov_add_vfs(struct pci_dev *dev, u16 num_vfs)
+ if (dev->no_vf_scan)
+ return 0;
+
++ pci_lock_rescan_remove();
+ for (i = 0; i < num_vfs; i++) {
+ rc = pci_iov_add_virtfn(dev, i);
+ if (rc)
+ goto failed;
+ }
++ pci_unlock_rescan_remove();
+ return 0;
+ failed:
+ while (i--)
+ pci_iov_remove_virtfn(dev, i);
++ pci_unlock_rescan_remove();
+
+ return rc;
+ }
+@@ -762,8 +765,10 @@ static void sriov_del_vfs(struct pci_dev *dev)
+ struct pci_sriov *iov = dev->sriov;
+ int i;
+
++ pci_lock_rescan_remove();
+ for (i = 0; i < iov->num_VFs; i++)
+ pci_iov_remove_virtfn(dev, i);
++ pci_unlock_rescan_remove();
+ }
+
+ static void sriov_disable(struct pci_dev *dev)
+diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
+index 63665240ae87f4..6405acdb5d0f38 100644
+--- a/drivers/pci/pci-driver.c
++++ b/drivers/pci/pci-driver.c
+@@ -1596,6 +1596,7 @@ void pci_uevent_ers(struct pci_dev *pdev, enum pci_ers_result err_type)
+ switch (err_type) {
+ case PCI_ERS_RESULT_NONE:
+ case PCI_ERS_RESULT_CAN_RECOVER:
++ case PCI_ERS_RESULT_NEED_RESET:
+ envp[idx++] = "ERROR_EVENT=BEGIN_RECOVERY";
+ envp[idx++] = "DEVICE_ONLINE=0";
+ break;
+diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
+index 5eea14c1f7f5f7..2b231ef1dac94d 100644
+--- a/drivers/pci/pci-sysfs.c
++++ b/drivers/pci/pci-sysfs.c
+@@ -201,8 +201,14 @@ static ssize_t max_link_width_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+ {
+ struct pci_dev *pdev = to_pci_dev(dev);
++ ssize_t ret;
+
+- return sysfs_emit(buf, "%u\n", pcie_get_width_cap(pdev));
++ /* We read PCI_EXP_LNKCAP, so we need the device to be accessible. */
++ pci_config_pm_runtime_get(pdev);
++ ret = sysfs_emit(buf, "%u\n", pcie_get_width_cap(pdev));
++ pci_config_pm_runtime_put(pdev);
++
++ return ret;
+ }
+ static DEVICE_ATTR_RO(max_link_width);
+
+@@ -214,7 +220,10 @@ static ssize_t current_link_speed_show(struct device *dev,
+ int err;
+ enum pci_bus_speed speed;
+
++ pci_config_pm_runtime_get(pci_dev);
+ err = pcie_capability_read_word(pci_dev, PCI_EXP_LNKSTA, &linkstat);
++ pci_config_pm_runtime_put(pci_dev);
++
+ if (err)
+ return -EINVAL;
+
+@@ -231,7 +240,10 @@ static ssize_t current_link_width_show(struct device *dev,
+ u16 linkstat;
+ int err;
+
++ pci_config_pm_runtime_get(pci_dev);
+ err = pcie_capability_read_word(pci_dev, PCI_EXP_LNKSTA, &linkstat);
++ pci_config_pm_runtime_put(pci_dev);
++
+ if (err)
+ return -EINVAL;
+
+@@ -247,7 +259,10 @@ static ssize_t secondary_bus_number_show(struct device *dev,
+ u8 sec_bus;
+ int err;
+
++ pci_config_pm_runtime_get(pci_dev);
+ err = pci_read_config_byte(pci_dev, PCI_SECONDARY_BUS, &sec_bus);
++ pci_config_pm_runtime_put(pci_dev);
++
+ if (err)
+ return -EINVAL;
+
+@@ -263,7 +278,10 @@ static ssize_t subordinate_bus_number_show(struct device *dev,
+ u8 sub_bus;
+ int err;
+
++ pci_config_pm_runtime_get(pci_dev);
+ err = pci_read_config_byte(pci_dev, PCI_SUBORDINATE_BUS, &sub_bus);
++ pci_config_pm_runtime_put(pci_dev);
++
+ if (err)
+ return -EINVAL;
+
+diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
+index 55abc5e17b8b10..9d23294ceb2f6c 100644
+--- a/drivers/pci/pcie/aer.c
++++ b/drivers/pci/pcie/aer.c
+@@ -43,7 +43,7 @@
+ #define AER_ERROR_SOURCES_MAX 128
+
+ #define AER_MAX_TYPEOF_COR_ERRS 16 /* as per PCI_ERR_COR_STATUS */
+-#define AER_MAX_TYPEOF_UNCOR_ERRS 27 /* as per PCI_ERR_UNCOR_STATUS*/
++#define AER_MAX_TYPEOF_UNCOR_ERRS 32 /* as per PCI_ERR_UNCOR_STATUS*/
+
+ struct aer_err_source {
+ u32 status; /* PCI_ERR_ROOT_STATUS */
+@@ -525,11 +525,11 @@ static const char *aer_uncorrectable_error_string[] = {
+ "AtomicOpBlocked", /* Bit Position 24 */
+ "TLPBlockedErr", /* Bit Position 25 */
+ "PoisonTLPBlocked", /* Bit Position 26 */
+- NULL, /* Bit Position 27 */
+- NULL, /* Bit Position 28 */
+- NULL, /* Bit Position 29 */
+- NULL, /* Bit Position 30 */
+- NULL, /* Bit Position 31 */
++ "DMWrReqBlocked", /* Bit Position 27 */
++ "IDECheck", /* Bit Position 28 */
++ "MisIDETLP", /* Bit Position 29 */
++ "PCRC_CHECK", /* Bit Position 30 */
++ "TLPXlatBlocked", /* Bit Position 31 */
+ };
+
+ static const char *aer_agent_string[] = {
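
With bits 27..31 now named, a decoder that walks the uncorrectable-status register and prints the table entry for each set bit no longer hits empty slots for those positions. A stand-alone sketch of such a decoder; the table here is truncated to the high bits and any unnamed bit falls back to "Unknown":

    #include <stdio.h>

    /* Truncated, illustrative table; only the high bits are named. */
    static const char *err_strings[32] = {
        [24] = "AtomicOpBlocked",
        [25] = "TLPBlockedErr",
        [26] = "PoisonTLPBlocked",
        [27] = "DMWrReqBlocked",
        [28] = "IDECheck",
        [29] = "MisIDETLP",
        [30] = "PCRC_CHECK",
        [31] = "TLPXlatBlocked",
    };

    static void decode_status(unsigned int status)
    {
        for (int bit = 0; bit < 32; bit++) {
            if (!(status & (1u << bit)))
                continue;
            printf("bit %2d: %s\n", bit,
                   err_strings[bit] ? err_strings[bit] : "Unknown");
        }
    }

    int main(void)
    {
        decode_status((1u << 27) | (1u << 31));
        return 0;
    }
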
+diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
+index de6381c690f5c2..a4990c9ad493ac 100644
+--- a/drivers/pci/pcie/err.c
++++ b/drivers/pci/pcie/err.c
+@@ -108,6 +108,12 @@ static int report_normal_detected(struct pci_dev *dev, void *data)
+ return report_error_detected(dev, pci_channel_io_normal, data);
+ }
+
++static int report_perm_failure_detected(struct pci_dev *dev, void *data)
++{
++ pci_uevent_ers(dev, PCI_ERS_RESULT_DISCONNECT);
++ return 0;
++}
++
+ static int report_mmio_enabled(struct pci_dev *dev, void *data)
+ {
+ struct pci_driver *pdrv;
+@@ -269,7 +275,7 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
+ failed:
+ pci_walk_bridge(bridge, pci_pm_runtime_put, NULL);
+
+- pci_uevent_ers(bridge, PCI_ERS_RESULT_DISCONNECT);
++ pci_walk_bridge(bridge, report_perm_failure_detected, NULL);
+
+ pci_info(bridge, "device recovery failed\n");
+
+diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
+index f41128f91ca76a..a56dfa1c9b6ffe 100644
+--- a/drivers/pci/probe.c
++++ b/drivers/pci/probe.c
+@@ -2516,9 +2516,15 @@ static struct platform_device *pci_pwrctrl_create_device(struct pci_bus *bus, in
+ struct device_node *np;
+
+ np = of_pci_find_child_device(dev_of_node(&bus->dev), devfn);
+- if (!np || of_find_device_by_node(np))
++ if (!np)
+ return NULL;
+
++ pdev = of_find_device_by_node(np);
++ if (pdev) {
++ put_device(&pdev->dev);
++ goto err_put_of_node;
++ }
++
+ /*
+ * First check whether the pwrctrl device really needs to be created or
+ * not. This is decided based on at least one of the power supplies
+@@ -2526,17 +2532,24 @@ static struct platform_device *pci_pwrctrl_create_device(struct pci_bus *bus, in
+ */
+ if (!of_pci_supply_present(np)) {
+ pr_debug("PCI/pwrctrl: Skipping OF node: %s\n", np->name);
+- return NULL;
++ goto err_put_of_node;
+ }
+
+ /* Now create the pwrctrl device */
+ pdev = of_platform_device_create(np, NULL, &host->dev);
+ if (!pdev) {
+ pr_err("PCI/pwrctrl: Failed to create pwrctrl device for node: %s\n", np->name);
+- return NULL;
++ goto err_put_of_node;
+ }
+
++ of_node_put(np);
++
+ return pdev;
++
++err_put_of_node:
++ of_node_put(np);
++
++ return NULL;
+ }
+ #else
+ static struct platform_device *pci_pwrctrl_create_device(struct pci_bus *bus, int devfn)
+diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
+index 445afdfa6498ed..16f21edbc29d40 100644
+--- a/drivers/pci/remove.c
++++ b/drivers/pci/remove.c
+@@ -31,6 +31,8 @@ static void pci_pwrctrl_unregister(struct device *dev)
+ return;
+
+ of_device_unregister(pdev);
++ put_device(&pdev->dev);
++
+ of_node_clear_flag(np, OF_POPULATED);
+ }
+
+diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
+index 7853ac6999e2ca..77a566aeae6013 100644
+--- a/drivers/pci/setup-bus.c
++++ b/drivers/pci/setup-bus.c
+@@ -28,6 +28,10 @@
+ #include <linux/acpi.h>
+ #include "pci.h"
+
++#define PCI_RES_TYPE_MASK \
++ (IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH |\
++ IORESOURCE_MEM_64)
++
+ unsigned int pci_flags;
+ EXPORT_SYMBOL_GPL(pci_flags);
+
+@@ -384,13 +388,19 @@ static bool pci_need_to_release(unsigned long mask, struct resource *res)
+ }
+
+ /* Return: @true if assignment of a required resource failed. */
+-static bool pci_required_resource_failed(struct list_head *fail_head)
++static bool pci_required_resource_failed(struct list_head *fail_head,
++ unsigned long type)
+ {
+ struct pci_dev_resource *fail_res;
+
++ type &= PCI_RES_TYPE_MASK;
++
+ list_for_each_entry(fail_res, fail_head, list) {
+ int idx = pci_resource_num(fail_res->dev, fail_res->res);
+
++ if (type && (fail_res->flags & PCI_RES_TYPE_MASK) != type)
++ continue;
++
+ if (!pci_resource_is_optional(fail_res->dev, idx))
+ return true;
+ }
+@@ -504,7 +514,7 @@ static void __assign_resources_sorted(struct list_head *head,
+ }
+
+ /* Without realloc_head and only optional fails, nothing more to do. */
+- if (!pci_required_resource_failed(&local_fail_head) &&
++ if (!pci_required_resource_failed(&local_fail_head, 0) &&
+ list_empty(realloc_head)) {
+ list_for_each_entry(save_res, &save_head, list) {
+ struct resource *res = save_res->res;
+@@ -1169,6 +1179,7 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
+ resource_size_t children_add_size = 0;
+ resource_size_t children_add_align = 0;
+ resource_size_t add_align = 0;
++ resource_size_t relaxed_align;
+
+ if (!b_res)
+ return -ENOSPC;
+@@ -1246,8 +1257,9 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
+ if (bus->self && size0 &&
+ !pbus_upstream_space_available(bus, mask | IORESOURCE_PREFETCH, type,
+ size0, min_align)) {
+- min_align = 1ULL << (max_order + __ffs(SZ_1M));
+- min_align = max(min_align, win_align);
++ relaxed_align = 1ULL << (max_order + __ffs(SZ_1M));
++ relaxed_align = max(relaxed_align, win_align);
++ min_align = min(min_align, relaxed_align);
+ size0 = calculate_memsize(size, min_size, 0, 0, resource_size(b_res), win_align);
+ pci_info(bus->self, "bridge window %pR to %pR requires relaxed alignment rules\n",
+ b_res, &bus->busn_res);
+@@ -1261,8 +1273,9 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
+ if (bus->self && size1 &&
+ !pbus_upstream_space_available(bus, mask | IORESOURCE_PREFETCH, type,
+ size1, add_align)) {
+- min_align = 1ULL << (max_order + __ffs(SZ_1M));
+- min_align = max(min_align, win_align);
++ relaxed_align = 1ULL << (max_order + __ffs(SZ_1M));
++ relaxed_align = max(relaxed_align, win_align);
++ min_align = min(min_align, relaxed_align);
+ size1 = calculate_memsize(size, min_size, add_size, children_add_size,
+ resource_size(b_res), win_align);
+ pci_info(bus->self,
+@@ -1704,10 +1717,6 @@ static void __pci_bridge_assign_resources(const struct pci_dev *bridge,
+ }
+ }
+
+-#define PCI_RES_TYPE_MASK \
+- (IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH |\
+- IORESOURCE_MEM_64)
+-
+ static void pci_bridge_release_resources(struct pci_bus *bus,
+ unsigned long type)
+ {
+@@ -2446,8 +2455,12 @@ int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type)
+ free_list(&added);
+
+ if (!list_empty(&failed)) {
+- ret = -ENOSPC;
+- goto cleanup;
++ if (pci_required_resource_failed(&failed, type)) {
++ ret = -ENOSPC;
++ goto cleanup;
++ }
++ /* Only resources with unrelated types failed (again) */
++ free_list(&failed);
+ }
+
+ list_for_each_entry(dev_res, &saved, list) {
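
pci_required_resource_failed() now takes a resource type and, after masking with PCI_RES_TYPE_MASK, ignores failures of unrelated types. A simplified sketch of that filter over a plain array instead of the kernel's list (flag values and names are invented):

    #include <stdbool.h>
    #include <stdio.h>

    /* Invented flag values; they only stand in for IORESOURCE_* bits. */
    #define RES_IO   0x1
    #define RES_MEM  0x2
    #define RES_PREF 0x4
    #define RES_TYPE_MASK (RES_IO | RES_MEM | RES_PREF)

    struct failed_res {
        unsigned long flags;
        bool optional;
    };

    static bool required_failure(const struct failed_res *res, int n,
                                 unsigned long type)
    {
        type &= RES_TYPE_MASK;

        for (int i = 0; i < n; i++) {
            /* Skip failures of an unrelated resource type. */
            if (type && (res[i].flags & RES_TYPE_MASK) != type)
                continue;
            if (!res[i].optional)
                return true;
        }
        return false;
    }

    int main(void)
    {
        struct failed_res failed[] = {
            { .flags = RES_IO,  .optional = false },
            { .flags = RES_MEM, .optional = true  },
        };

        /* Only MEM failures matter here, and the MEM one is optional. */
        printf("%d\n", required_failure(failed, 2, RES_MEM));    /* 0 */
        printf("%d\n", required_failure(failed, 2, RES_IO));     /* 1 */
        return 0;
    }
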
+diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
+index 11fb2234b10fcf..23245352a3fc0a 100644
+--- a/drivers/perf/arm-cmn.c
++++ b/drivers/perf/arm-cmn.c
+@@ -65,7 +65,7 @@
+ /* PMU registers occupy the 3rd 4KB page of each node's region */
+ #define CMN_PMU_OFFSET 0x2000
+ /* ...except when they don't :( */
+-#define CMN_S3_DTM_OFFSET 0xa000
++#define CMN_S3_R1_DTM_OFFSET 0xa000
+ #define CMN_S3_PMU_OFFSET 0xd900
+
+ /* For most nodes, this is all there is */
+@@ -233,6 +233,9 @@ enum cmn_revision {
+ REV_CMN700_R1P0,
+ REV_CMN700_R2P0,
+ REV_CMN700_R3P0,
++ REV_CMNS3_R0P0 = 0,
++ REV_CMNS3_R0P1,
++ REV_CMNS3_R1P0,
+ REV_CI700_R0P0 = 0,
+ REV_CI700_R1P0,
+ REV_CI700_R2P0,
+@@ -425,8 +428,8 @@ static enum cmn_model arm_cmn_model(const struct arm_cmn *cmn)
+ static int arm_cmn_pmu_offset(const struct arm_cmn *cmn, const struct arm_cmn_node *dn)
+ {
+ if (cmn->part == PART_CMN_S3) {
+- if (dn->type == CMN_TYPE_XP)
+- return CMN_S3_DTM_OFFSET;
++ if (cmn->rev >= REV_CMNS3_R1P0 && dn->type == CMN_TYPE_XP)
++ return CMN_S3_R1_DTM_OFFSET;
+ return CMN_S3_PMU_OFFSET;
+ }
+ return CMN_PMU_OFFSET;
+diff --git a/drivers/pinctrl/samsung/pinctrl-samsung.h b/drivers/pinctrl/samsung/pinctrl-samsung.h
+index 1cabcbe1401a61..a51aee8c5f89f5 100644
+--- a/drivers/pinctrl/samsung/pinctrl-samsung.h
++++ b/drivers/pinctrl/samsung/pinctrl-samsung.h
+@@ -402,10 +402,6 @@ extern const struct samsung_pinctrl_of_match_data exynosautov920_of_data;
+ extern const struct samsung_pinctrl_of_match_data fsd_of_data;
+ extern const struct samsung_pinctrl_of_match_data gs101_of_data;
+ extern const struct samsung_pinctrl_of_match_data s3c64xx_of_data;
+-extern const struct samsung_pinctrl_of_match_data s3c2412_of_data;
+-extern const struct samsung_pinctrl_of_match_data s3c2416_of_data;
+-extern const struct samsung_pinctrl_of_match_data s3c2440_of_data;
+-extern const struct samsung_pinctrl_of_match_data s3c2450_of_data;
+ extern const struct samsung_pinctrl_of_match_data s5pv210_of_data;
+
+ #endif /* __PINCTRL_SAMSUNG_H */
+diff --git a/drivers/power/supply/max77976_charger.c b/drivers/power/supply/max77976_charger.c
+index e6fe68cebc32b6..3d6ff400553305 100644
+--- a/drivers/power/supply/max77976_charger.c
++++ b/drivers/power/supply/max77976_charger.c
+@@ -292,10 +292,10 @@ static int max77976_get_property(struct power_supply *psy,
+ case POWER_SUPPLY_PROP_ONLINE:
+ err = max77976_get_online(chg, &val->intval);
+ break;
+- case POWER_SUPPLY_PROP_CHARGE_CONTROL_LIMIT_MAX:
++ case POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT_MAX:
+ val->intval = MAX77976_CHG_CC_MAX;
+ break;
+- case POWER_SUPPLY_PROP_CHARGE_CONTROL_LIMIT:
++ case POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT:
+ err = max77976_get_integer(chg, CHG_CC,
+ MAX77976_CHG_CC_MIN,
+ MAX77976_CHG_CC_MAX,
+@@ -330,7 +330,7 @@ static int max77976_set_property(struct power_supply *psy,
+ int err = 0;
+
+ switch (psp) {
+- case POWER_SUPPLY_PROP_CHARGE_CONTROL_LIMIT:
++ case POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT:
+ err = max77976_set_integer(chg, CHG_CC,
+ MAX77976_CHG_CC_MIN,
+ MAX77976_CHG_CC_MAX,
+@@ -355,7 +355,7 @@ static int max77976_property_is_writeable(struct power_supply *psy,
+ enum power_supply_property psp)
+ {
+ switch (psp) {
+- case POWER_SUPPLY_PROP_CHARGE_CONTROL_LIMIT:
++ case POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT:
+ case POWER_SUPPLY_PROP_INPUT_CURRENT_LIMIT:
+ return true;
+ default:
+@@ -368,8 +368,8 @@ static enum power_supply_property max77976_psy_props[] = {
+ POWER_SUPPLY_PROP_CHARGE_TYPE,
+ POWER_SUPPLY_PROP_HEALTH,
+ POWER_SUPPLY_PROP_ONLINE,
+- POWER_SUPPLY_PROP_CHARGE_CONTROL_LIMIT,
+- POWER_SUPPLY_PROP_CHARGE_CONTROL_LIMIT_MAX,
++ POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT,
++ POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT_MAX,
+ POWER_SUPPLY_PROP_INPUT_CURRENT_LIMIT,
+ POWER_SUPPLY_PROP_MODEL_NAME,
+ POWER_SUPPLY_PROP_MANUFACTURER,
+diff --git a/drivers/pwm/core.c b/drivers/pwm/core.c
+index 0d66376a83ec35..bff0057d87a42b 100644
+--- a/drivers/pwm/core.c
++++ b/drivers/pwm/core.c
+@@ -276,7 +276,7 @@ int pwm_round_waveform_might_sleep(struct pwm_device *pwm, struct pwm_waveform *
+
+ if (IS_ENABLED(CONFIG_PWM_DEBUG) && ret_fromhw > 0)
+ dev_err(&chip->dev, "Unexpected return value from __pwm_round_waveform_fromhw: requested %llu/%llu [+%llu], return value %d\n",
+- wf_req.duty_length_ns, wf_req.period_length_ns, wf_req.duty_offset_ns, ret_tohw);
++ wf_req.duty_length_ns, wf_req.period_length_ns, wf_req.duty_offset_ns, ret_fromhw);
+
+ if (IS_ENABLED(CONFIG_PWM_DEBUG) &&
+ (ret_tohw == 0) != pwm_check_rounding(&wf_req, wf))
+diff --git a/drivers/pwm/pwm-berlin.c b/drivers/pwm/pwm-berlin.c
+index 831aed228cafcb..858d369913742c 100644
+--- a/drivers/pwm/pwm-berlin.c
++++ b/drivers/pwm/pwm-berlin.c
+@@ -234,7 +234,7 @@ static int berlin_pwm_suspend(struct device *dev)
+ for (i = 0; i < chip->npwm; i++) {
+ struct berlin_pwm_channel *channel = &bpc->channel[i];
+
+- channel->enable = berlin_pwm_readl(bpc, i, BERLIN_PWM_ENABLE);
++ channel->enable = berlin_pwm_readl(bpc, i, BERLIN_PWM_EN);
+ channel->ctrl = berlin_pwm_readl(bpc, i, BERLIN_PWM_CONTROL);
+ channel->duty = berlin_pwm_readl(bpc, i, BERLIN_PWM_DUTY);
+ channel->tcnt = berlin_pwm_readl(bpc, i, BERLIN_PWM_TCNT);
+@@ -262,7 +262,7 @@ static int berlin_pwm_resume(struct device *dev)
+ berlin_pwm_writel(bpc, i, channel->ctrl, BERLIN_PWM_CONTROL);
+ berlin_pwm_writel(bpc, i, channel->duty, BERLIN_PWM_DUTY);
+ berlin_pwm_writel(bpc, i, channel->tcnt, BERLIN_PWM_TCNT);
+- berlin_pwm_writel(bpc, i, channel->enable, BERLIN_PWM_ENABLE);
++ berlin_pwm_writel(bpc, i, channel->enable, BERLIN_PWM_EN);
+ }
+
+ return 0;
+diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
+index dc741ba29fa35f..b8b298efd9a9c3 100644
+--- a/drivers/rtc/interface.c
++++ b/drivers/rtc/interface.c
+@@ -443,6 +443,29 @@ static int __rtc_set_alarm(struct rtc_device *rtc, struct rtc_wkalrm *alarm)
+ else
+ err = rtc->ops->set_alarm(rtc->dev.parent, alarm);
+
++ /*
++ * Check for potential race described above. If the waiting for next
++ * second, and the second just ticked since the check above, either
++ *
++ * 1) It ticked after the alarm was set, and an alarm irq should be
++ * generated.
++ *
++ * 2) It ticked before the alarm was set, and alarm irq most likely will
++ * not be generated.
++ *
++ * While we cannot easily check for which of these two scenarios we
++ * are in, we can return -ETIME to signal that the timer has already
++ * expired, which is true in both cases.
++ */
++ if ((scheduled - now) <= 1) {
++ err = __rtc_read_time(rtc, &tm);
++ if (err)
++ return err;
++ now = rtc_tm_to_time64(&tm);
++ if (scheduled <= now)
++ return -ETIME;
++ }
++
+ trace_rtc_set_alarm(rtc_tm_to_time64(&alarm->time), err);
+ return err;
+ }
+@@ -594,6 +617,10 @@ int rtc_update_irq_enable(struct rtc_device *rtc, unsigned int enabled)
+ rtc->uie_rtctimer.node.expires = ktime_add(now, onesec);
+ rtc->uie_rtctimer.period = ktime_set(1, 0);
+ err = rtc_timer_enqueue(rtc, &rtc->uie_rtctimer);
++ if (!err && rtc->ops && rtc->ops->alarm_irq_enable)
++ err = rtc->ops->alarm_irq_enable(rtc->dev.parent, 1);
++ if (err)
++ goto out;
+ } else {
+ rtc_timer_remove(rtc, &rtc->uie_rtctimer);
+ }
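
The __rtc_set_alarm() hunk re-reads the clock after programming the alarm and returns -ETIME when the target second has already passed, closing the race spelled out in the comment. A user-space sketch of the same check-after-arm idea against wall-clock time (the arm step is a stub):

    #include <errno.h>
    #include <stdio.h>
    #include <time.h>

    static int arm_alarm(time_t scheduled) { (void)scheduled; return 0; }    /* stub */

    static int set_alarm(time_t scheduled)
    {
        time_t now = time(NULL);
        int err;

        if (scheduled <= now)
            return -ETIME;

        err = arm_alarm(scheduled);
        if (err)
            return err;

        /* Close the race: if the deadline was within a second of "now",
         * re-read the clock and report expiry if it already passed. */
        if (scheduled - now <= 1) {
            now = time(NULL);
            if (scheduled <= now)
                return -ETIME;
        }
        return 0;
    }

    int main(void)
    {
        printf("%d\n", set_alarm(time(NULL) + 5));    /* 0 */
        printf("%d\n", set_alarm(time(NULL)));        /* -ETIME */
        return 0;
    }
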
+diff --git a/drivers/rtc/rtc-isl12022.c b/drivers/rtc/rtc-isl12022.c
+index 9b44839a7402c9..5fc52dc6421305 100644
+--- a/drivers/rtc/rtc-isl12022.c
++++ b/drivers/rtc/rtc-isl12022.c
+@@ -413,6 +413,7 @@ static int isl12022_setup_irq(struct device *dev, int irq)
+ if (ret)
+ return ret;
+
++ isl12022->irq_enabled = true;
+ ret = devm_request_threaded_irq(dev, irq, NULL,
+ isl12022_rtc_interrupt,
+ IRQF_SHARED | IRQF_ONESHOT,
+diff --git a/drivers/rtc/rtc-optee.c b/drivers/rtc/rtc-optee.c
+index 9f8b5d4a8f6b65..6b77c122fdc109 100644
+--- a/drivers/rtc/rtc-optee.c
++++ b/drivers/rtc/rtc-optee.c
+@@ -320,6 +320,7 @@ static int optee_rtc_remove(struct device *dev)
+ {
+ struct optee_rtc *priv = dev_get_drvdata(dev);
+
++ tee_shm_free(priv->shm);
+ tee_client_close_session(priv->ctx, priv->session_id);
+ tee_client_close_context(priv->ctx);
+
+diff --git a/drivers/rtc/rtc-x1205.c b/drivers/rtc/rtc-x1205.c
+index 4bcd7ca32f27bf..b8a0fccef14e03 100644
+--- a/drivers/rtc/rtc-x1205.c
++++ b/drivers/rtc/rtc-x1205.c
+@@ -669,7 +669,7 @@ static const struct i2c_device_id x1205_id[] = {
+ MODULE_DEVICE_TABLE(i2c, x1205_id);
+
+ static const struct of_device_id x1205_dt_ids[] = {
+- { .compatible = "xircom,x1205", },
++ { .compatible = "xicor,x1205", },
+ {},
+ };
+ MODULE_DEVICE_TABLE(of, x1205_dt_ids);
+diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c
+index 506a947d00a51b..417a2dcf15878a 100644
+--- a/drivers/s390/block/dasd.c
++++ b/drivers/s390/block/dasd.c
+@@ -334,6 +334,11 @@ static int dasd_state_basic_to_ready(struct dasd_device *device)
+ lim.max_dev_sectors = device->discipline->max_sectors(block);
+ lim.max_hw_sectors = lim.max_dev_sectors;
+ lim.logical_block_size = block->bp_block;
++ /*
++ * Adjust dma_alignment to match block_size - 1
++ * to ensure proper buffer alignment checks in the block layer.
++ */
++ lim.dma_alignment = lim.logical_block_size - 1;
+
+ if (device->discipline->has_discard) {
+ unsigned int max_bytes;
+@@ -3114,12 +3119,14 @@ static blk_status_t do_dasd_request(struct blk_mq_hw_ctx *hctx,
+ PTR_ERR(cqr) == -ENOMEM ||
+ PTR_ERR(cqr) == -EAGAIN) {
+ rc = BLK_STS_RESOURCE;
+- goto out;
++ } else if (PTR_ERR(cqr) == -EINVAL) {
++ rc = BLK_STS_INVAL;
++ } else {
++ DBF_DEV_EVENT(DBF_ERR, basedev,
++ "CCW creation failed (rc=%ld) on request %p",
++ PTR_ERR(cqr), req);
++ rc = BLK_STS_IOERR;
+ }
+- DBF_DEV_EVENT(DBF_ERR, basedev,
+- "CCW creation failed (rc=%ld) on request %p",
+- PTR_ERR(cqr), req);
+- rc = BLK_STS_IOERR;
+ goto out;
+ }
+ /*
+diff --git a/drivers/s390/cio/device.c b/drivers/s390/cio/device.c
+index fb2c07cb4d3dd3..4b2dae6eb37609 100644
+--- a/drivers/s390/cio/device.c
++++ b/drivers/s390/cio/device.c
+@@ -1316,23 +1316,34 @@ void ccw_device_schedule_recovery(void)
+ spin_unlock_irqrestore(&recovery_lock, flags);
+ }
+
+-static int purge_fn(struct device *dev, void *data)
++static int purge_fn(struct subchannel *sch, void *data)
+ {
+- struct ccw_device *cdev = to_ccwdev(dev);
+- struct ccw_dev_id *id = &cdev->private->dev_id;
+- struct subchannel *sch = to_subchannel(cdev->dev.parent);
++ struct ccw_device *cdev;
+
+- spin_lock_irq(cdev->ccwlock);
+- if (is_blacklisted(id->ssid, id->devno) &&
+- (cdev->private->state == DEV_STATE_OFFLINE) &&
+- (atomic_cmpxchg(&cdev->private->onoff, 0, 1) == 0)) {
+- CIO_MSG_EVENT(3, "ccw: purging 0.%x.%04x\n", id->ssid,
+- id->devno);
++ spin_lock_irq(&sch->lock);
++ if (sch->st != SUBCHANNEL_TYPE_IO || !sch->schib.pmcw.dnv)
++ goto unlock;
++
++ if (!is_blacklisted(sch->schid.ssid, sch->schib.pmcw.dev))
++ goto unlock;
++
++ cdev = sch_get_cdev(sch);
++ if (cdev) {
++ if (cdev->private->state != DEV_STATE_OFFLINE)
++ goto unlock;
++
++ if (atomic_cmpxchg(&cdev->private->onoff, 0, 1) != 0)
++ goto unlock;
+ ccw_device_sched_todo(cdev, CDEV_TODO_UNREG);
+- css_sched_sch_todo(sch, SCH_TODO_UNREG);
+ atomic_set(&cdev->private->onoff, 0);
+ }
+- spin_unlock_irq(cdev->ccwlock);
++
++ css_sched_sch_todo(sch, SCH_TODO_UNREG);
++ CIO_MSG_EVENT(3, "ccw: purging 0.%x.%04x%s\n", sch->schid.ssid,
++ sch->schib.pmcw.dev, cdev ? "" : " (no cdev)");
++
++unlock:
++ spin_unlock_irq(&sch->lock);
+ /* Abort loop in case of pending signal. */
+ if (signal_pending(current))
+ return -EINTR;
+@@ -1348,7 +1359,7 @@ static int purge_fn(struct device *dev, void *data)
+ int ccw_purge_blacklisted(void)
+ {
+ CIO_MSG_EVENT(2, "ccw: purging blacklisted devices\n");
+- bus_for_each_dev(&ccw_bus_type, NULL, NULL, purge_fn);
++ for_each_subchannel_staged(purge_fn, NULL, NULL);
+ return 0;
+ }
+
+diff --git a/drivers/s390/cio/ioasm.c b/drivers/s390/cio/ioasm.c
+index a540045b64a6ef..8b06b234e1101c 100644
+--- a/drivers/s390/cio/ioasm.c
++++ b/drivers/s390/cio/ioasm.c
+@@ -253,11 +253,10 @@ static inline int __xsch(struct subchannel_id schid)
+ asm volatile(
+ " lgr 1,%[r1]\n"
+ " xsch\n"
+- " ipm %[cc]\n"
+- " srl %[cc],28\n"
+- : [cc] "=&d" (ccode)
++ CC_IPM(cc)
++ : CC_OUT(cc, ccode)
+ : [r1] "d" (r1)
+- : "cc", "1");
++ : CC_CLOBBER_LIST("1"));
+ return CC_TRANSFORM(ccode);
+ }
+
+diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
+index c73a71ac3c2901..1c6161d0b85c26 100644
+--- a/drivers/scsi/hpsa.c
++++ b/drivers/scsi/hpsa.c
+@@ -6522,18 +6522,21 @@ static int hpsa_big_passthru_ioctl(struct ctlr_info *h,
+ while (left) {
+ sz = (left > ioc->malloc_size) ? ioc->malloc_size : left;
+ buff_size[sg_used] = sz;
+- buff[sg_used] = kmalloc(sz, GFP_KERNEL);
+- if (buff[sg_used] == NULL) {
+- status = -ENOMEM;
+- goto cleanup1;
+- }
++
+ if (ioc->Request.Type.Direction & XFER_WRITE) {
+- if (copy_from_user(buff[sg_used], data_ptr, sz)) {
+- status = -EFAULT;
++ buff[sg_used] = memdup_user(data_ptr, sz);
++ if (IS_ERR(buff[sg_used])) {
++ status = PTR_ERR(buff[sg_used]);
+ goto cleanup1;
+ }
+- } else
+- memset(buff[sg_used], 0, sz);
++ } else {
++ buff[sg_used] = kzalloc(sz, GFP_KERNEL);
++ if (!buff[sg_used]) {
++ status = -ENOMEM;
++ goto cleanup1;
++ }
++ }
++
+ left -= sz;
+ data_ptr += sz;
+ sg_used++;
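
The hpsa hunk duplicates user data with memdup_user() only for the write direction and takes a zeroed allocation otherwise, instead of kmalloc() followed by copy or memset. A user-space analogue of that split, with malloc()+memcpy() standing in for memdup_user() and calloc() for kzalloc():

    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* User-space stand-in: duplicate src when sending data out, hand back a
     * zeroed buffer when the transfer only receives data. */
    static void *prepare_buffer(const void *src, size_t len, bool is_write)
    {
        if (is_write) {
            void *buf = malloc(len);

            if (!buf)
                return NULL;
            memcpy(buf, src, len);
            return buf;
        }
        return calloc(1, len);
    }

    int main(void)
    {
        const char data[] = "payload";
        char *out = prepare_buffer(data, sizeof(data), true);
        char *in  = prepare_buffer(NULL, 16, false);

        if (!out || !in)
            return 1;
        printf("out=%s in[0]=%d\n", out, in[0]);
        free(out);
        free(in);
        return 0;
    }
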
+diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c
+index 2c72da6b8cf0c0..7f1ad305eee634 100644
+--- a/drivers/scsi/mvsas/mv_init.c
++++ b/drivers/scsi/mvsas/mv_init.c
+@@ -124,7 +124,7 @@ static void mvs_free(struct mvs_info *mvi)
+ if (mvi->shost)
+ scsi_host_put(mvi->shost);
+ list_for_each_entry(mwq, &mvi->wq_list, entry)
+- cancel_delayed_work(&mwq->work_q);
++ cancel_delayed_work_sync(&mwq->work_q);
+ kfree(mvi->rsvd_tags);
+ kfree(mvi);
+ }
+diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
+index 5b8668accf8e8d..bf12e23f121213 100644
+--- a/drivers/scsi/sd.c
++++ b/drivers/scsi/sd.c
+@@ -3696,10 +3696,10 @@ static int sd_revalidate_disk(struct gendisk *disk)
+ struct scsi_disk *sdkp = scsi_disk(disk);
+ struct scsi_device *sdp = sdkp->device;
+ sector_t old_capacity = sdkp->capacity;
+- struct queue_limits lim;
+- unsigned char *buffer;
++ struct queue_limits *lim = NULL;
++ unsigned char *buffer = NULL;
+ unsigned int dev_max;
+- int err;
++ int err = 0;
+
+ SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp,
+ "sd_revalidate_disk\n"));
+@@ -3711,6 +3711,10 @@ static int sd_revalidate_disk(struct gendisk *disk)
+ if (!scsi_device_online(sdp))
+ goto out;
+
++ lim = kmalloc(sizeof(*lim), GFP_KERNEL);
++ if (!lim)
++ goto out;
++
+ buffer = kmalloc(SD_BUF_SIZE, GFP_KERNEL);
+ if (!buffer) {
+ sd_printk(KERN_WARNING, sdkp, "sd_revalidate_disk: Memory "
+@@ -3720,14 +3724,14 @@ static int sd_revalidate_disk(struct gendisk *disk)
+
+ sd_spinup_disk(sdkp);
+
+- lim = queue_limits_start_update(sdkp->disk->queue);
++ *lim = queue_limits_start_update(sdkp->disk->queue);
+
+ /*
+ * Without media there is no reason to ask; moreover, some devices
+ * react badly if we do.
+ */
+ if (sdkp->media_present) {
+- sd_read_capacity(sdkp, &lim, buffer);
++ sd_read_capacity(sdkp, lim, buffer);
+ /*
+ * Some USB/UAS devices return generic values for mode pages
+ * until the media has been accessed. Trigger a READ operation
+@@ -3741,17 +3745,17 @@ static int sd_revalidate_disk(struct gendisk *disk)
+ * cause this to be updated correctly and any device which
+ * doesn't support it should be treated as rotational.
+ */
+- lim.features |= (BLK_FEAT_ROTATIONAL | BLK_FEAT_ADD_RANDOM);
++ lim->features |= (BLK_FEAT_ROTATIONAL | BLK_FEAT_ADD_RANDOM);
+
+ if (scsi_device_supports_vpd(sdp)) {
+ sd_read_block_provisioning(sdkp);
+- sd_read_block_limits(sdkp, &lim);
++ sd_read_block_limits(sdkp, lim);
+ sd_read_block_limits_ext(sdkp);
+- sd_read_block_characteristics(sdkp, &lim);
+- sd_zbc_read_zones(sdkp, &lim, buffer);
++ sd_read_block_characteristics(sdkp, lim);
++ sd_zbc_read_zones(sdkp, lim, buffer);
+ }
+
+- sd_config_discard(sdkp, &lim, sd_discard_mode(sdkp));
++ sd_config_discard(sdkp, lim, sd_discard_mode(sdkp));
+
+ sd_print_capacity(sdkp, old_capacity);
+
+@@ -3761,47 +3765,46 @@ static int sd_revalidate_disk(struct gendisk *disk)
+ sd_read_app_tag_own(sdkp, buffer);
+ sd_read_write_same(sdkp, buffer);
+ sd_read_security(sdkp, buffer);
+- sd_config_protection(sdkp, &lim);
++ sd_config_protection(sdkp, lim);
+ }
+
+ /*
+ * We now have all cache related info, determine how we deal
+ * with flush requests.
+ */
+- sd_set_flush_flag(sdkp, &lim);
++ sd_set_flush_flag(sdkp, lim);
+
+ /* Initial block count limit based on CDB TRANSFER LENGTH field size. */
+ dev_max = sdp->use_16_for_rw ? SD_MAX_XFER_BLOCKS : SD_DEF_XFER_BLOCKS;
+
+ /* Some devices report a maximum block count for READ/WRITE requests. */
+ dev_max = min_not_zero(dev_max, sdkp->max_xfer_blocks);
+- lim.max_dev_sectors = logical_to_sectors(sdp, dev_max);
++ lim->max_dev_sectors = logical_to_sectors(sdp, dev_max);
+
+ if (sd_validate_min_xfer_size(sdkp))
+- lim.io_min = logical_to_bytes(sdp, sdkp->min_xfer_blocks);
++ lim->io_min = logical_to_bytes(sdp, sdkp->min_xfer_blocks);
+ else
+- lim.io_min = 0;
++ lim->io_min = 0;
+
+ /*
+ * Limit default to SCSI host optimal sector limit if set. There may be
+ * an impact on performance for when the size of a request exceeds this
+ * host limit.
+ */
+- lim.io_opt = sdp->host->opt_sectors << SECTOR_SHIFT;
++ lim->io_opt = sdp->host->opt_sectors << SECTOR_SHIFT;
+ if (sd_validate_opt_xfer_size(sdkp, dev_max)) {
+- lim.io_opt = min_not_zero(lim.io_opt,
++ lim->io_opt = min_not_zero(lim->io_opt,
+ logical_to_bytes(sdp, sdkp->opt_xfer_blocks));
+ }
+
+ sdkp->first_scan = 0;
+
+ set_capacity_and_notify(disk, logical_to_sectors(sdp, sdkp->capacity));
+- sd_config_write_same(sdkp, &lim);
+- kfree(buffer);
++ sd_config_write_same(sdkp, lim);
+
+- err = queue_limits_commit_update_frozen(sdkp->disk->queue, &lim);
++ err = queue_limits_commit_update_frozen(sdkp->disk->queue, lim);
+ if (err)
+- return err;
++ goto out;
+
+ /*
+ * Query concurrent positioning ranges after
+@@ -3820,7 +3823,10 @@ static int sd_revalidate_disk(struct gendisk *disk)
+ set_capacity_and_notify(disk, 0);
+
+ out:
+- return 0;
++ kfree(buffer);
++ kfree(lim);
++
++ return err;
+ }
+
+ /**
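
The sd_revalidate_disk() rework moves the queue_limits structure off the stack into a kmalloc() allocation and routes every exit through one cleanup label that frees both buffers. A compact sketch of that shape with placeholder types and helpers:

    #include <stdio.h>
    #include <stdlib.h>

    struct big_limits {
        unsigned long fields[256];    /* stands in for a large on-stack struct */
    };

    static int apply_limits(struct big_limits *lim) { (void)lim; return 0; }    /* stub */

    static int revalidate(void)
    {
        struct big_limits *lim = NULL;
        char *buffer = NULL;
        int err = 0;

        lim = malloc(sizeof(*lim));
        if (!lim)
            goto out;

        buffer = malloc(512);
        if (!buffer)
            goto out;

        err = apply_limits(lim);

    out:
        free(buffer);
        free(lim);
        return err;
    }

    int main(void)
    {
        printf("revalidate: %d\n", revalidate());
        return 0;
    }
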
+diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
+index d288e9d9c18739..d1a59120d38457 100644
+--- a/drivers/spi/spi-cadence-quadspi.c
++++ b/drivers/spi/spi-cadence-quadspi.c
+@@ -720,6 +720,7 @@ static int cqspi_read_setup(struct cqspi_flash_pdata *f_pdata,
+ reg &= ~CQSPI_REG_SIZE_ADDRESS_MASK;
+ reg |= (op->addr.nbytes - 1);
+ writel(reg, reg_base + CQSPI_REG_SIZE);
++ readl(reg_base + CQSPI_REG_SIZE); /* Flush posted write. */
+ return 0;
+ }
+
+@@ -765,6 +766,7 @@ static int cqspi_indirect_read_execute(struct cqspi_flash_pdata *f_pdata,
+ reinit_completion(&cqspi->transfer_complete);
+ writel(CQSPI_REG_INDIRECTRD_START_MASK,
+ reg_base + CQSPI_REG_INDIRECTRD);
++ readl(reg_base + CQSPI_REG_INDIRECTRD); /* Flush posted write. */
+
+ while (remaining > 0) {
+ if (use_irq &&
+@@ -1063,6 +1065,7 @@ static int cqspi_write_setup(struct cqspi_flash_pdata *f_pdata,
+ reg &= ~CQSPI_REG_SIZE_ADDRESS_MASK;
+ reg |= (op->addr.nbytes - 1);
+ writel(reg, reg_base + CQSPI_REG_SIZE);
++ readl(reg_base + CQSPI_REG_SIZE); /* Flush posted write. */
+ return 0;
+ }
+
+@@ -1091,6 +1094,8 @@ static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
+ reinit_completion(&cqspi->transfer_complete);
+ writel(CQSPI_REG_INDIRECTWR_START_MASK,
+ reg_base + CQSPI_REG_INDIRECTWR);
++ readl(reg_base + CQSPI_REG_INDIRECTWR); /* Flush posted write. */
++
+ /*
+ * As per 66AK2G02 TRM SPRUHY8F section 11.15.5.3 Indirect Access
+ * Controller programming sequence, couple of cycles of
+@@ -1722,12 +1727,10 @@ static const struct spi_controller_mem_caps cqspi_mem_caps = {
+
+ static int cqspi_setup_flash(struct cqspi_st *cqspi)
+ {
+- unsigned int max_cs = cqspi->num_chipselect - 1;
+ struct platform_device *pdev = cqspi->pdev;
+ struct device *dev = &pdev->dev;
+ struct cqspi_flash_pdata *f_pdata;
+- unsigned int cs;
+- int ret;
++ int ret, cs, max_cs = -1;
+
+ /* Get flash device data */
+ for_each_available_child_of_node_scoped(dev->of_node, np) {
+@@ -1740,10 +1743,10 @@ static int cqspi_setup_flash(struct cqspi_st *cqspi)
+ if (cs >= cqspi->num_chipselect) {
+ dev_err(dev, "Chip select %d out of range.\n", cs);
+ return -EINVAL;
+- } else if (cs < max_cs) {
+- max_cs = cs;
+ }
+
++ max_cs = max_t(int, cs, max_cs);
++
+ f_pdata = &cqspi->f_pdata[cs];
+ f_pdata->cqspi = cqspi;
+ f_pdata->cs = cs;
+@@ -1753,6 +1756,11 @@ static int cqspi_setup_flash(struct cqspi_st *cqspi)
+ return ret;
+ }
+
++ if (max_cs < 0) {
++ dev_err(dev, "No flash device declared\n");
++ return -ENODEV;
++ }
++
+ cqspi->num_chipselect = max_cs + 1;
+ return 0;
+ }
+diff --git a/drivers/staging/media/ipu7/ipu7-isys-video.c b/drivers/staging/media/ipu7/ipu7-isys-video.c
+index 8756da3a8fb0bf..173afd405d9bad 100644
+--- a/drivers/staging/media/ipu7/ipu7-isys-video.c
++++ b/drivers/staging/media/ipu7/ipu7-isys-video.c
+@@ -946,6 +946,7 @@ void ipu7_isys_fw_close(struct ipu7_isys *isys)
+ ipu7_fw_isys_close(isys);
+
+ mutex_unlock(&isys->mutex);
++ pm_runtime_put(&isys->adev->auxdev.dev);
+ }
+
+ int ipu7_isys_setup_video(struct ipu7_isys_video *av,
+diff --git a/drivers/ufs/core/ufs-sysfs.c b/drivers/ufs/core/ufs-sysfs.c
+index 0086816b27cd90..c040afc6668e8c 100644
+--- a/drivers/ufs/core/ufs-sysfs.c
++++ b/drivers/ufs/core/ufs-sysfs.c
+@@ -1949,7 +1949,7 @@ static umode_t ufs_sysfs_hid_is_visible(struct kobject *kobj,
+ return hba->dev_info.hid_sup ? attr->mode : 0;
+ }
+
+-static const struct attribute_group ufs_sysfs_hid_group = {
++const struct attribute_group ufs_sysfs_hid_group = {
+ .name = "hid",
+ .attrs = ufs_sysfs_hid,
+ .is_visible = ufs_sysfs_hid_is_visible,
+diff --git a/drivers/ufs/core/ufs-sysfs.h b/drivers/ufs/core/ufs-sysfs.h
+index 8d94af3b807719..6efb82a082fdd3 100644
+--- a/drivers/ufs/core/ufs-sysfs.h
++++ b/drivers/ufs/core/ufs-sysfs.h
+@@ -14,5 +14,6 @@ void ufs_sysfs_remove_nodes(struct device *dev);
+
+ extern const struct attribute_group ufs_sysfs_unit_descriptor_group;
+ extern const struct attribute_group ufs_sysfs_lun_attributes_group;
++extern const struct attribute_group ufs_sysfs_hid_group;
+
+ #endif
+diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
+index 96a0f5fcc0e577..465e66dbe08e89 100644
+--- a/drivers/ufs/core/ufshcd.c
++++ b/drivers/ufs/core/ufshcd.c
+@@ -8482,6 +8482,8 @@ static int ufs_get_device_desc(struct ufs_hba *hba)
+ DEVICE_DESC_PARAM_EXT_UFS_FEATURE_SUP) &
+ UFS_DEV_HID_SUPPORT;
+
++ sysfs_update_group(&hba->dev->kobj, &ufs_sysfs_hid_group);
++
+ model_index = desc_buf[DEVICE_DESC_PARAM_PRDCT_NAME];
+
+ err = ufshcd_read_string_desc(hba, model_index,
+diff --git a/drivers/video/fbdev/core/fb_cmdline.c b/drivers/video/fbdev/core/fb_cmdline.c
+index 4d1634c492ec4d..594b60424d1c64 100644
+--- a/drivers/video/fbdev/core/fb_cmdline.c
++++ b/drivers/video/fbdev/core/fb_cmdline.c
+@@ -40,7 +40,7 @@ int fb_get_options(const char *name, char **option)
+ bool enabled;
+
+ if (name)
+- is_of = strncmp(name, "offb", 4);
++ is_of = !strncmp(name, "offb", 4);
+
+ enabled = __video_get_options(name, &options, is_of);
+
+diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
+index 41309d38f78c3c..9478fae014e50f 100644
+--- a/drivers/xen/events/events_base.c
++++ b/drivers/xen/events/events_base.c
+@@ -1314,14 +1314,17 @@ int bind_interdomain_evtchn_to_irq_lateeoi(struct xenbus_device *dev,
+ }
+ EXPORT_SYMBOL_GPL(bind_interdomain_evtchn_to_irq_lateeoi);
+
+-static int find_virq(unsigned int virq, unsigned int cpu, evtchn_port_t *evtchn)
++static int find_virq(unsigned int virq, unsigned int cpu, evtchn_port_t *evtchn,
++ bool percpu)
+ {
+ struct evtchn_status status;
+ evtchn_port_t port;
+- int rc = -ENOENT;
++ bool exists = false;
+
+ memset(&status, 0, sizeof(status));
+ for (port = 0; port < xen_evtchn_max_channels(); port++) {
++ int rc;
++
+ status.dom = DOMID_SELF;
+ status.port = port;
+ rc = HYPERVISOR_event_channel_op(EVTCHNOP_status, &status);
+@@ -1329,12 +1332,16 @@ static int find_virq(unsigned int virq, unsigned int cpu, evtchn_port_t *evtchn)
+ continue;
+ if (status.status != EVTCHNSTAT_virq)
+ continue;
+- if (status.u.virq == virq && status.vcpu == xen_vcpu_nr(cpu)) {
++ if (status.u.virq != virq)
++ continue;
++ if (status.vcpu == xen_vcpu_nr(cpu)) {
+ *evtchn = port;
+- break;
++ return 0;
++ } else if (!percpu) {
++ exists = true;
+ }
+ }
+- return rc;
++ return exists ? -EEXIST : -ENOENT;
+ }
+
+ /**
+@@ -1381,8 +1388,11 @@ int bind_virq_to_irq(unsigned int virq, unsigned int cpu, bool percpu)
+ evtchn = bind_virq.port;
+ else {
+ if (ret == -EEXIST)
+- ret = find_virq(virq, cpu, &evtchn);
+- BUG_ON(ret < 0);
++ ret = find_virq(virq, cpu, &evtchn, percpu);
++ if (ret) {
++ __unbind_from_irq(info, info->irq);
++ goto out;
++ }
+ }
+
+ ret = xen_irq_info_virq_setup(info, cpu, evtchn, virq);
+@@ -1787,9 +1797,20 @@ static int xen_rebind_evtchn_to_cpu(struct irq_info *info, unsigned int tcpu)
+ * virq or IPI channel, which don't actually need to be rebound. Ignore
+ * it, but don't do the xenlinux-level rebind in that case.
+ */
+- if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0)
++ if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0) {
++ int old_cpu = info->cpu;
++
+ bind_evtchn_to_cpu(info, tcpu, false);
+
++ if (info->type == IRQT_VIRQ) {
++ int virq = info->u.virq;
++ int irq = per_cpu(virq_to_irq, old_cpu)[virq];
++
++ per_cpu(virq_to_irq, old_cpu)[virq] = -1;
++ per_cpu(virq_to_irq, tcpu)[virq] = irq;
++ }
++ }
++
+ do_unmask(info, EVT_MASK_REASON_TEMPORARY);
+
+ return 0;
+diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
+index 841afa4933c7a6..e20c40a62e64e2 100644
+--- a/drivers/xen/manage.c
++++ b/drivers/xen/manage.c
+@@ -11,6 +11,7 @@
+ #include <linux/reboot.h>
+ #include <linux/sysrq.h>
+ #include <linux/stop_machine.h>
++#include <linux/suspend.h>
+ #include <linux/freezer.h>
+ #include <linux/syscore_ops.h>
+ #include <linux/export.h>
+@@ -95,10 +96,16 @@ static void do_suspend(void)
+
+ shutting_down = SHUTDOWN_SUSPEND;
+
++ if (!mutex_trylock(&system_transition_mutex))
++ {
++ pr_err("%s: failed to take system_transition_mutex\n", __func__);
++ goto out;
++ }
++
+ err = freeze_processes();
+ if (err) {
+ pr_err("%s: freeze processes failed %d\n", __func__, err);
+- goto out;
++ goto out_unlock;
+ }
+
+ err = freeze_kernel_threads();
+@@ -110,7 +117,7 @@ static void do_suspend(void)
+ err = dpm_suspend_start(PMSG_FREEZE);
+ if (err) {
+ pr_err("%s: dpm_suspend_start %d\n", __func__, err);
+- goto out_thaw;
++ goto out_resume_end;
+ }
+
+ printk(KERN_DEBUG "suspending xenstore...\n");
+@@ -150,10 +157,13 @@ static void do_suspend(void)
+ else
+ xs_suspend_cancel();
+
++out_resume_end:
+ dpm_resume_end(si.cancelled ? PMSG_THAW : PMSG_RESTORE);
+
+ out_thaw:
+ thaw_processes();
++out_unlock:
++ mutex_unlock(&system_transition_mutex);
+ out:
+ shutting_down = SHUTDOWN_INVALID;
+ }
+diff --git a/fs/attr.c b/fs/attr.c
+index 5425c1dbbff92f..795f231d00e8ea 100644
+--- a/fs/attr.c
++++ b/fs/attr.c
+@@ -286,20 +286,12 @@ static void setattr_copy_mgtime(struct inode *inode, const struct iattr *attr)
+ unsigned int ia_valid = attr->ia_valid;
+ struct timespec64 now;
+
+- if (ia_valid & ATTR_CTIME) {
+- /*
+- * In the case of an update for a write delegation, we must respect
+- * the value in ia_ctime and not use the current time.
+- */
+- if (ia_valid & ATTR_DELEG)
+- now = inode_set_ctime_deleg(inode, attr->ia_ctime);
+- else
+- now = inode_set_ctime_current(inode);
+- } else {
+- /* If ATTR_CTIME isn't set, then ATTR_MTIME shouldn't be either. */
+- WARN_ON_ONCE(ia_valid & ATTR_MTIME);
++ if (ia_valid & ATTR_CTIME_SET)
++ now = inode_set_ctime_deleg(inode, attr->ia_ctime);
++ else if (ia_valid & ATTR_CTIME)
++ now = inode_set_ctime_current(inode);
++ else
+ now = current_time(inode);
+- }
+
+ if (ia_valid & ATTR_ATIME_SET)
+ inode_set_atime_to_ts(inode, attr->ia_atime);
+@@ -359,12 +351,11 @@ void setattr_copy(struct mnt_idmap *idmap, struct inode *inode,
+ inode_set_atime_to_ts(inode, attr->ia_atime);
+ if (ia_valid & ATTR_MTIME)
+ inode_set_mtime_to_ts(inode, attr->ia_mtime);
+- if (ia_valid & ATTR_CTIME) {
+- if (ia_valid & ATTR_DELEG)
+- inode_set_ctime_deleg(inode, attr->ia_ctime);
+- else
+- inode_set_ctime_to_ts(inode, attr->ia_ctime);
+- }
++
++ if (ia_valid & ATTR_CTIME_SET)
++ inode_set_ctime_deleg(inode, attr->ia_ctime);
++ else if (ia_valid & ATTR_CTIME)
++ inode_set_ctime_to_ts(inode, attr->ia_ctime);
+ }
+ EXPORT_SYMBOL(setattr_copy);
+
+@@ -463,15 +454,18 @@ int notify_change(struct mnt_idmap *idmap, struct dentry *dentry,
+
+ now = current_time(inode);
+
+- attr->ia_ctime = now;
+- if (!(ia_valid & ATTR_ATIME_SET))
+- attr->ia_atime = now;
+- else
++ if (ia_valid & ATTR_ATIME_SET)
+ attr->ia_atime = timestamp_truncate(attr->ia_atime, inode);
+- if (!(ia_valid & ATTR_MTIME_SET))
+- attr->ia_mtime = now;
+ else
++ attr->ia_atime = now;
++ if (ia_valid & ATTR_CTIME_SET)
++ attr->ia_ctime = timestamp_truncate(attr->ia_ctime, inode);
++ else
++ attr->ia_ctime = now;
++ if (ia_valid & ATTR_MTIME_SET)
+ attr->ia_mtime = timestamp_truncate(attr->ia_mtime, inode);
++ else
++ attr->ia_mtime = now;
+
+ if (ia_valid & ATTR_KILL_PRIV) {
+ error = security_inode_need_killpriv(dentry);
+diff --git a/fs/btrfs/export.c b/fs/btrfs/export.c
+index 7fc8a3200b4005..851ac862f0e55e 100644
+--- a/fs/btrfs/export.c
++++ b/fs/btrfs/export.c
+@@ -23,7 +23,11 @@ static int btrfs_encode_fh(struct inode *inode, u32 *fh, int *max_len,
+ int type;
+
+ if (parent && (len < BTRFS_FID_SIZE_CONNECTABLE)) {
+- *max_len = BTRFS_FID_SIZE_CONNECTABLE;
++ if (btrfs_root_id(BTRFS_I(inode)->root) !=
++ btrfs_root_id(BTRFS_I(parent)->root))
++ *max_len = BTRFS_FID_SIZE_CONNECTABLE_ROOT;
++ else
++ *max_len = BTRFS_FID_SIZE_CONNECTABLE;
+ return FILEID_INVALID;
+ } else if (len < BTRFS_FID_SIZE_NON_CONNECTABLE) {
+ *max_len = BTRFS_FID_SIZE_NON_CONNECTABLE;
+@@ -45,6 +49,8 @@ static int btrfs_encode_fh(struct inode *inode, u32 *fh, int *max_len,
+ parent_root_id = btrfs_root_id(BTRFS_I(parent)->root);
+
+ if (parent_root_id != fid->root_objectid) {
++ if (*max_len < BTRFS_FID_SIZE_CONNECTABLE_ROOT)
++ return FILEID_INVALID;
+ fid->parent_root_objectid = parent_root_id;
+ len = BTRFS_FID_SIZE_CONNECTABLE_ROOT;
+ type = FILEID_BTRFS_WITH_PARENT_ROOT;
+diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
+index 4eafe3817e11c8..e6d2557ac37b09 100644
+--- a/fs/btrfs/extent_io.c
++++ b/fs/btrfs/extent_io.c
+@@ -345,6 +345,13 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode,
+ /* step one, find a bunch of delalloc bytes starting at start */
+ delalloc_start = *start;
+ delalloc_end = 0;
++
++ /*
++ * If @max_bytes is smaller than a block, btrfs_find_delalloc_range() can
++ * return early without handling any dirty ranges.
++ */
++ ASSERT(max_bytes >= fs_info->sectorsize);
++
+ found = btrfs_find_delalloc_range(tree, &delalloc_start, &delalloc_end,
+ max_bytes, &cached_state);
+ if (!found || delalloc_end <= *start || delalloc_start > orig_end) {
+@@ -375,13 +382,14 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode,
+ delalloc_end);
+ ASSERT(!ret || ret == -EAGAIN);
+ if (ret == -EAGAIN) {
+- /* some of the folios are gone, lets avoid looping by
+- * shortening the size of the delalloc range we're searching
++ /*
++ * Some of the folios are gone, let's avoid looping by
++ * shortening the size of the delalloc range we're searching.
+ */
+ btrfs_free_extent_state(cached_state);
+ cached_state = NULL;
+ if (!loops) {
+- max_bytes = PAGE_SIZE;
++ max_bytes = fs_info->sectorsize;
+ loops = 1;
+ goto again;
+ } else {
+diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
+index 56c8005b24a344..ca54bf24b719f1 100644
+--- a/fs/cramfs/inode.c
++++ b/fs/cramfs/inode.c
+@@ -116,9 +116,18 @@ static struct inode *get_cramfs_inode(struct super_block *sb,
+ inode_nohighmem(inode);
+ inode->i_data.a_ops = &cramfs_aops;
+ break;
+- default:
++ case S_IFCHR:
++ case S_IFBLK:
++ case S_IFIFO:
++ case S_IFSOCK:
+ init_special_inode(inode, cramfs_inode->mode,
+ old_decode_dev(cramfs_inode->size));
++ break;
++ default:
++ printk(KERN_DEBUG "CRAMFS: Invalid file type 0%04o for inode %lu.\n",
++ inode->i_mode, inode->i_ino);
++ iget_failed(inode);
++ return ERR_PTR(-EIO);
+ }
+
+ inode->i_mode = cramfs_inode->mode;
+diff --git a/fs/eventpoll.c b/fs/eventpoll.c
+index b22d6f819f782d..ee7c4b683ec3d2 100644
+--- a/fs/eventpoll.c
++++ b/fs/eventpoll.c
+@@ -46,10 +46,10 @@
+ *
+ * 1) epnested_mutex (mutex)
+ * 2) ep->mtx (mutex)
+- * 3) ep->lock (rwlock)
++ * 3) ep->lock (spinlock)
+ *
+ * The acquire order is the one listed above, from 1 to 3.
+- * We need a rwlock (ep->lock) because we manipulate objects
++ * We need a spinlock (ep->lock) because we manipulate objects
+ * from inside the poll callback, that might be triggered from
+ * a wake_up() that in turn might be called from IRQ context.
+ * So we can't sleep inside the poll callback and hence we need
+@@ -195,7 +195,7 @@ struct eventpoll {
+ struct list_head rdllist;
+
+ /* Lock which protects rdllist and ovflist */
+- rwlock_t lock;
++ spinlock_t lock;
+
+ /* RB tree root used to store monitored fd structs */
+ struct rb_root_cached rbr;
+@@ -741,10 +741,10 @@ static void ep_start_scan(struct eventpoll *ep, struct list_head *txlist)
+ * in a lockless way.
+ */
+ lockdep_assert_irqs_enabled();
+- write_lock_irq(&ep->lock);
++ spin_lock_irq(&ep->lock);
+ list_splice_init(&ep->rdllist, txlist);
+ WRITE_ONCE(ep->ovflist, NULL);
+- write_unlock_irq(&ep->lock);
++ spin_unlock_irq(&ep->lock);
+ }
+
+ static void ep_done_scan(struct eventpoll *ep,
+@@ -752,7 +752,7 @@ static void ep_done_scan(struct eventpoll *ep,
+ {
+ struct epitem *epi, *nepi;
+
+- write_lock_irq(&ep->lock);
++ spin_lock_irq(&ep->lock);
+ /*
+ * During the time we spent inside the "sproc" callback, some
+ * other events might have been queued by the poll callback.
+@@ -793,7 +793,7 @@ static void ep_done_scan(struct eventpoll *ep,
+ wake_up(&ep->wq);
+ }
+
+- write_unlock_irq(&ep->lock);
++ spin_unlock_irq(&ep->lock);
+ }
+
+ static void ep_get(struct eventpoll *ep)
+@@ -868,10 +868,10 @@ static bool __ep_remove(struct eventpoll *ep, struct epitem *epi, bool force)
+
+ rb_erase_cached(&epi->rbn, &ep->rbr);
+
+- write_lock_irq(&ep->lock);
++ spin_lock_irq(&ep->lock);
+ if (ep_is_linked(epi))
+ list_del_init(&epi->rdllink);
+- write_unlock_irq(&ep->lock);
++ spin_unlock_irq(&ep->lock);
+
+ wakeup_source_unregister(ep_wakeup_source(epi));
+ /*
+@@ -1152,7 +1152,7 @@ static int ep_alloc(struct eventpoll **pep)
+ return -ENOMEM;
+
+ mutex_init(&ep->mtx);
+- rwlock_init(&ep->lock);
++ spin_lock_init(&ep->lock);
+ init_waitqueue_head(&ep->wq);
+ init_waitqueue_head(&ep->poll_wait);
+ INIT_LIST_HEAD(&ep->rdllist);
+@@ -1239,100 +1239,10 @@ struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd,
+ }
+ #endif /* CONFIG_KCMP */
+
+-/*
+- * Adds a new entry to the tail of the list in a lockless way, i.e.
+- * multiple CPUs are allowed to call this function concurrently.
+- *
+- * Beware: it is necessary to prevent any other modifications of the
+- * existing list until all changes are completed, in other words
+- * concurrent list_add_tail_lockless() calls should be protected
+- * with a read lock, where write lock acts as a barrier which
+- * makes sure all list_add_tail_lockless() calls are fully
+- * completed.
+- *
+- * Also an element can be locklessly added to the list only in one
+- * direction i.e. either to the tail or to the head, otherwise
+- * concurrent access will corrupt the list.
+- *
+- * Return: %false if element has been already added to the list, %true
+- * otherwise.
+- */
+-static inline bool list_add_tail_lockless(struct list_head *new,
+- struct list_head *head)
+-{
+- struct list_head *prev;
+-
+- /*
+- * This is simple 'new->next = head' operation, but cmpxchg()
+- * is used in order to detect that same element has been just
+- * added to the list from another CPU: the winner observes
+- * new->next == new.
+- */
+- if (!try_cmpxchg(&new->next, &new, head))
+- return false;
+-
+- /*
+- * Initially ->next of a new element must be updated with the head
+- * (we are inserting to the tail) and only then pointers are atomically
+- * exchanged. XCHG guarantees memory ordering, thus ->next should be
+- * updated before pointers are actually swapped and pointers are
+- * swapped before prev->next is updated.
+- */
+-
+- prev = xchg(&head->prev, new);
+-
+- /*
+- * It is safe to modify prev->next and new->prev, because a new element
+- * is added only to the tail and new->next is updated before XCHG.
+- */
+-
+- prev->next = new;
+- new->prev = prev;
+-
+- return true;
+-}
+-
+-/*
+- * Chains a new epi entry to the tail of the ep->ovflist in a lockless way,
+- * i.e. multiple CPUs are allowed to call this function concurrently.
+- *
+- * Return: %false if epi element has been already chained, %true otherwise.
+- */
+-static inline bool chain_epi_lockless(struct epitem *epi)
+-{
+- struct eventpoll *ep = epi->ep;
+-
+- /* Fast preliminary check */
+- if (epi->next != EP_UNACTIVE_PTR)
+- return false;
+-
+- /* Check that the same epi has not been just chained from another CPU */
+- if (cmpxchg(&epi->next, EP_UNACTIVE_PTR, NULL) != EP_UNACTIVE_PTR)
+- return false;
+-
+- /* Atomically exchange tail */
+- epi->next = xchg(&ep->ovflist, epi);
+-
+- return true;
+-}
+-
+ /*
+ * This is the callback that is passed to the wait queue wakeup
+ * mechanism. It is called by the stored file descriptors when they
+ * have events to report.
+- *
+- * This callback takes a read lock in order not to contend with concurrent
+- * events from another file descriptor, thus all modifications to ->rdllist
+- * or ->ovflist are lockless. Read lock is paired with the write lock from
+- * ep_start/done_scan(), which stops all list modifications and guarantees
+- * that lists state is seen correctly.
+- *
+- * Another thing worth to mention is that ep_poll_callback() can be called
+- * concurrently for the same @epi from different CPUs if poll table was inited
+- * with several wait queues entries. Plural wakeup from different CPUs of a
+- * single wait queue is serialized by wq.lock, but the case when multiple wait
+- * queues are used should be detected accordingly. This is detected using
+- * cmpxchg() operation.
+ */
+ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, void *key)
+ {
+@@ -1343,7 +1253,7 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
+ unsigned long flags;
+ int ewake = 0;
+
+- read_lock_irqsave(&ep->lock, flags);
++ spin_lock_irqsave(&ep->lock, flags);
+
+ ep_set_busy_poll_napi_id(epi);
+
+@@ -1372,12 +1282,15 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
+ * chained in ep->ovflist and requeued later on.
+ */
+ if (READ_ONCE(ep->ovflist) != EP_UNACTIVE_PTR) {
+- if (chain_epi_lockless(epi))
++ if (epi->next == EP_UNACTIVE_PTR) {
++ epi->next = READ_ONCE(ep->ovflist);
++ WRITE_ONCE(ep->ovflist, epi);
+ ep_pm_stay_awake_rcu(epi);
++ }
+ } else if (!ep_is_linked(epi)) {
+ /* In the usual case, add event to ready list. */
+- if (list_add_tail_lockless(&epi->rdllink, &ep->rdllist))
+- ep_pm_stay_awake_rcu(epi);
++ list_add_tail(&epi->rdllink, &ep->rdllist);
++ ep_pm_stay_awake_rcu(epi);
+ }
+
+ /*
+@@ -1410,7 +1323,7 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
+ pwake++;
+
+ out_unlock:
+- read_unlock_irqrestore(&ep->lock, flags);
++ spin_unlock_irqrestore(&ep->lock, flags);
+
+ /* We have to call this outside the lock */
+ if (pwake)
+@@ -1745,7 +1658,7 @@ static int ep_insert(struct eventpoll *ep, const struct epoll_event *event,
+ }
+
+ /* We have to drop the new item inside our item list to keep track of it */
+- write_lock_irq(&ep->lock);
++ spin_lock_irq(&ep->lock);
+
+ /* record NAPI ID of new item if present */
+ ep_set_busy_poll_napi_id(epi);
+@@ -1762,7 +1675,7 @@ static int ep_insert(struct eventpoll *ep, const struct epoll_event *event,
+ pwake++;
+ }
+
+- write_unlock_irq(&ep->lock);
++ spin_unlock_irq(&ep->lock);
+
+ /* We have to call this outside the lock */
+ if (pwake)
+@@ -1826,7 +1739,7 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi,
+ * list, push it inside.
+ */
+ if (ep_item_poll(epi, &pt, 1)) {
+- write_lock_irq(&ep->lock);
++ spin_lock_irq(&ep->lock);
+ if (!ep_is_linked(epi)) {
+ list_add_tail(&epi->rdllink, &ep->rdllist);
+ ep_pm_stay_awake(epi);
+@@ -1837,7 +1750,7 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi,
+ if (waitqueue_active(&ep->poll_wait))
+ pwake++;
+ }
+- write_unlock_irq(&ep->lock);
++ spin_unlock_irq(&ep->lock);
+ }
+
+ /* We have to call this outside the lock */
+@@ -2089,7 +2002,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
+ init_wait(&wait);
+ wait.func = ep_autoremove_wake_function;
+
+- write_lock_irq(&ep->lock);
++ spin_lock_irq(&ep->lock);
+ /*
+ * Barrierless variant, waitqueue_active() is called under
+ * the same lock on wakeup ep_poll_callback() side, so it
+@@ -2108,7 +2021,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
+ if (!eavail)
+ __add_wait_queue_exclusive(&ep->wq, &wait);
+
+- write_unlock_irq(&ep->lock);
++ spin_unlock_irq(&ep->lock);
+
+ if (!eavail)
+ timed_out = !ep_schedule_timeout(to) ||
+@@ -2124,7 +2037,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
+ eavail = 1;
+
+ if (!list_empty_careful(&wait.entry)) {
+- write_lock_irq(&ep->lock);
++ spin_lock_irq(&ep->lock);
+ /*
+ * If the thread timed out and is not on the wait queue,
+ * it means that the thread was woken up after its
+@@ -2135,7 +2048,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
+ if (timed_out)
+ eavail = list_empty(&wait.entry);
+ __remove_wait_queue(&ep->wq, &wait);
+- write_unlock_irq(&ep->lock);
++ spin_unlock_irq(&ep->lock);
+ }
+ }
+ }
+diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
+index 72e02df72c4c94..fd4b28300b4af5 100644
+--- a/fs/ext4/ext4.h
++++ b/fs/ext4/ext4.h
+@@ -3144,6 +3144,8 @@ extern struct buffer_head *ext4_sb_bread(struct super_block *sb,
+ sector_t block, blk_opf_t op_flags);
+ extern struct buffer_head *ext4_sb_bread_unmovable(struct super_block *sb,
+ sector_t block);
++extern struct buffer_head *ext4_sb_bread_nofail(struct super_block *sb,
++ sector_t block);
+ extern void ext4_read_bh_nowait(struct buffer_head *bh, blk_opf_t op_flags,
+ bh_end_io_t *end_io, bool simu_fail);
+ extern int ext4_read_bh(struct buffer_head *bh, blk_opf_t op_flags,
+diff --git a/fs/ext4/fsmap.c b/fs/ext4/fsmap.c
+index 91185c40f755a5..22fc333244ef73 100644
+--- a/fs/ext4/fsmap.c
++++ b/fs/ext4/fsmap.c
+@@ -74,7 +74,8 @@ static int ext4_getfsmap_dev_compare(const void *p1, const void *p2)
+ static bool ext4_getfsmap_rec_before_low_key(struct ext4_getfsmap_info *info,
+ struct ext4_fsmap *rec)
+ {
+- return rec->fmr_physical < info->gfi_low.fmr_physical;
++ return rec->fmr_physical + rec->fmr_length <=
++ info->gfi_low.fmr_physical;
+ }
+
+ /*
+@@ -200,15 +201,18 @@ static int ext4_getfsmap_meta_helper(struct super_block *sb,
+ ext4_group_first_block_no(sb, agno));
+ fs_end = fs_start + EXT4_C2B(sbi, len);
+
+- /* Return relevant extents from the meta_list */
++ /*
++ * Return relevant extents from the meta_list. We emit all extents that
++ * partially/fully overlap with the query range
++ */
+ list_for_each_entry_safe(p, tmp, &info->gfi_meta_list, fmr_list) {
+- if (p->fmr_physical < info->gfi_next_fsblk) {
++ if (p->fmr_physical + p->fmr_length <= info->gfi_next_fsblk) {
+ list_del(&p->fmr_list);
+ kfree(p);
+ continue;
+ }
+- if (p->fmr_physical <= fs_start ||
+- p->fmr_physical + p->fmr_length <= fs_end) {
++ if (p->fmr_physical <= fs_end &&
++ p->fmr_physical + p->fmr_length > fs_start) {
+ /* Emit the retained free extent record if present */
+ if (info->gfi_lastfree.fmr_owner) {
+ error = ext4_getfsmap_helper(sb, info,
+diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
+index d45124318200d8..da76353b3a5750 100644
+--- a/fs/ext4/indirect.c
++++ b/fs/ext4/indirect.c
+@@ -1025,7 +1025,7 @@ static void ext4_free_branches(handle_t *handle, struct inode *inode,
+ }
+
+ /* Go read the buffer for the next level down */
+- bh = ext4_sb_bread(inode->i_sb, nr, 0);
++ bh = ext4_sb_bread_nofail(inode->i_sb, nr);
+
+ /*
+ * A read failure? Report error and clear slot
+diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
+index 5230452e29dd8b..f9e4ac87211ec1 100644
+--- a/fs/ext4/inode.c
++++ b/fs/ext4/inode.c
+@@ -3872,47 +3872,12 @@ static int ext4_iomap_overwrite_begin(struct inode *inode, loff_t offset,
+ return ret;
+ }
+
+-static inline bool ext4_want_directio_fallback(unsigned flags, ssize_t written)
+-{
+- /* must be a directio to fall back to buffered */
+- if ((flags & (IOMAP_WRITE | IOMAP_DIRECT)) !=
+- (IOMAP_WRITE | IOMAP_DIRECT))
+- return false;
+-
+- /* atomic writes are all-or-nothing */
+- if (flags & IOMAP_ATOMIC)
+- return false;
+-
+- /* can only try again if we wrote nothing */
+- return written == 0;
+-}
+-
+-static int ext4_iomap_end(struct inode *inode, loff_t offset, loff_t length,
+- ssize_t written, unsigned flags, struct iomap *iomap)
+-{
+- /*
+- * Check to see whether an error occurred while writing out the data to
+- * the allocated blocks. If so, return the magic error code for
+- * non-atomic write so that we fallback to buffered I/O and attempt to
+- * complete the remainder of the I/O.
+- * For non-atomic writes, any blocks that may have been
+- * allocated in preparation for the direct I/O will be reused during
+- * buffered I/O. For atomic write, we never fallback to buffered-io.
+- */
+- if (ext4_want_directio_fallback(flags, written))
+- return -ENOTBLK;
+-
+- return 0;
+-}
+-
+ const struct iomap_ops ext4_iomap_ops = {
+ .iomap_begin = ext4_iomap_begin,
+- .iomap_end = ext4_iomap_end,
+ };
+
+ const struct iomap_ops ext4_iomap_overwrite_ops = {
+ .iomap_begin = ext4_iomap_overwrite_begin,
+- .iomap_end = ext4_iomap_end,
+ };
+
+ static int ext4_iomap_begin_report(struct inode *inode, loff_t offset,
+@@ -4287,7 +4252,11 @@ int ext4_can_truncate(struct inode *inode)
+ * We have to make sure i_disksize gets properly updated before we truncate
+ * page cache due to hole punching or zero range. Otherwise i_disksize update
+ * can get lost as it may have been postponed to submission of writeback but
+- * that will never happen after we truncate page cache.
++ * that will never happen if we remove the folio containing i_size from the
++ * page cache. Also if we punch hole within i_size but above i_disksize,
++ * following ext4_page_mkwrite() may mistakenly allocate written blocks over
++ * the hole and thus introduce allocated blocks beyond i_disksize which is
++ * not allowed (e2fsck would complain in case of crash).
+ */
+ int ext4_update_disksize_before_punch(struct inode *inode, loff_t offset,
+ loff_t len)
+@@ -4298,9 +4267,11 @@ int ext4_update_disksize_before_punch(struct inode *inode, loff_t offset,
+ loff_t size = i_size_read(inode);
+
+ WARN_ON(!inode_is_locked(inode));
+- if (offset > size || offset + len < size)
++ if (offset > size)
+ return 0;
+
++ if (offset + len < size)
++ size = offset + len;
+ if (EXT4_I(inode)->i_disksize >= size)
+ return 0;
+
+diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
+index adae3caf175a93..4b091c21908fda 100644
+--- a/fs/ext4/move_extent.c
++++ b/fs/ext4/move_extent.c
+@@ -225,7 +225,7 @@ static int mext_page_mkuptodate(struct folio *folio, size_t from, size_t to)
+ do {
+ if (bh_offset(bh) + blocksize <= from)
+ continue;
+- if (bh_offset(bh) > to)
++ if (bh_offset(bh) >= to)
+ break;
+ wait_on_buffer(bh);
+ if (buffer_uptodate(bh))
+diff --git a/fs/ext4/orphan.c b/fs/ext4/orphan.c
+index 0fbcce67ffd4e4..82d5e750145559 100644
+--- a/fs/ext4/orphan.c
++++ b/fs/ext4/orphan.c
+@@ -513,7 +513,7 @@ void ext4_release_orphan_info(struct super_block *sb)
+ return;
+ for (i = 0; i < oi->of_blocks; i++)
+ brelse(oi->of_binfo[i].ob_bh);
+- kfree(oi->of_binfo);
++ kvfree(oi->of_binfo);
+ }
+
+ static struct ext4_orphan_block_tail *ext4_orphan_block_tail(
+@@ -583,9 +583,20 @@ int ext4_init_orphan_info(struct super_block *sb)
+ ext4_msg(sb, KERN_ERR, "get orphan inode failed");
+ return PTR_ERR(inode);
+ }
++ /*
++ * This is just an artificial limit to prevent corrupted fs from
++ * consuming absurd amounts of memory when pinning blocks of orphan
++ * file in memory.
++ */
++ if (inode->i_size > 8 << 20) {
++ ext4_msg(sb, KERN_ERR, "orphan file too big: %llu",
++ (unsigned long long)inode->i_size);
++ ret = -EFSCORRUPTED;
++ goto out_put;
++ }
+ oi->of_blocks = inode->i_size >> sb->s_blocksize_bits;
+ oi->of_csum_seed = EXT4_I(inode)->i_csum_seed;
+- oi->of_binfo = kmalloc_array(oi->of_blocks,
++ oi->of_binfo = kvmalloc_array(oi->of_blocks,
+ sizeof(struct ext4_orphan_block),
+ GFP_KERNEL);
+ if (!oi->of_binfo) {
+@@ -626,7 +637,7 @@ int ext4_init_orphan_info(struct super_block *sb)
+ out_free:
+ for (i--; i >= 0; i--)
+ brelse(oi->of_binfo[i].ob_bh);
+- kfree(oi->of_binfo);
++ kvfree(oi->of_binfo);
+ out_put:
+ iput(inode);
+ return ret;
+diff --git a/fs/ext4/super.c b/fs/ext4/super.c
+index ba497387b9c863..4e4d068b761d71 100644
+--- a/fs/ext4/super.c
++++ b/fs/ext4/super.c
+@@ -265,6 +265,15 @@ struct buffer_head *ext4_sb_bread_unmovable(struct super_block *sb,
+ return __ext4_sb_bread_gfp(sb, block, 0, gfp);
+ }
+
++struct buffer_head *ext4_sb_bread_nofail(struct super_block *sb,
++ sector_t block)
++{
++ gfp_t gfp = mapping_gfp_constraint(sb->s_bdev->bd_mapping,
++ ~__GFP_FS) | __GFP_MOVABLE | __GFP_NOFAIL;
++
++ return __ext4_sb_bread_gfp(sb, block, 0, gfp);
++}
++
+ void ext4_sb_breadahead_unmovable(struct super_block *sb, sector_t block)
+ {
+ struct buffer_head *bh = bdev_getblk(sb->s_bdev, block,
+@@ -2460,7 +2469,7 @@ static int parse_apply_sb_mount_options(struct super_block *sb,
+ struct ext4_fs_context *m_ctx)
+ {
+ struct ext4_sb_info *sbi = EXT4_SB(sb);
+- char *s_mount_opts = NULL;
++ char s_mount_opts[65];
+ struct ext4_fs_context *s_ctx = NULL;
+ struct fs_context *fc = NULL;
+ int ret = -ENOMEM;
+@@ -2468,15 +2477,11 @@ static int parse_apply_sb_mount_options(struct super_block *sb,
+ if (!sbi->s_es->s_mount_opts[0])
+ return 0;
+
+- s_mount_opts = kstrndup(sbi->s_es->s_mount_opts,
+- sizeof(sbi->s_es->s_mount_opts),
+- GFP_KERNEL);
+- if (!s_mount_opts)
+- return ret;
++ strscpy_pad(s_mount_opts, sbi->s_es->s_mount_opts);
+
+ fc = kzalloc(sizeof(struct fs_context), GFP_KERNEL);
+ if (!fc)
+- goto out_free;
++ return -ENOMEM;
+
+ s_ctx = kzalloc(sizeof(struct ext4_fs_context), GFP_KERNEL);
+ if (!s_ctx)
+@@ -2508,11 +2513,8 @@ static int parse_apply_sb_mount_options(struct super_block *sb,
+ ret = 0;
+
+ out_free:
+- if (fc) {
+- ext4_fc_free(fc);
+- kfree(fc);
+- }
+- kfree(s_mount_opts);
++ ext4_fc_free(fc);
++ kfree(fc);
+ return ret;
+ }
+
+diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
+index 5a6fe1513fd205..b0e60a44dae9dc 100644
+--- a/fs/ext4/xattr.c
++++ b/fs/ext4/xattr.c
+@@ -251,6 +251,10 @@ check_xattrs(struct inode *inode, struct buffer_head *bh,
+ err_str = "invalid ea_ino";
+ goto errout;
+ }
++ if (ea_ino && !size) {
++ err_str = "invalid size in ea xattr";
++ goto errout;
++ }
+ if (size > EXT4_XATTR_SIZE_MAX) {
+ err_str = "e_value size too large";
+ goto errout;
+@@ -1019,7 +1023,7 @@ static int ext4_xattr_inode_update_ref(handle_t *handle, struct inode *ea_inode,
+ int ref_change)
+ {
+ struct ext4_iloc iloc;
+- s64 ref_count;
++ u64 ref_count;
+ int ret;
+
+ inode_lock_nested(ea_inode, I_MUTEX_XATTR);
+@@ -1029,13 +1033,17 @@ static int ext4_xattr_inode_update_ref(handle_t *handle, struct inode *ea_inode,
+ goto out;
+
+ ref_count = ext4_xattr_inode_get_ref(ea_inode);
++ if ((ref_count == 0 && ref_change < 0) || (ref_count == U64_MAX && ref_change > 0)) {
++ ext4_error_inode(ea_inode, __func__, __LINE__, 0,
++ "EA inode %lu ref wraparound: ref_count=%lld ref_change=%d",
++ ea_inode->i_ino, ref_count, ref_change);
++ ret = -EFSCORRUPTED;
++ goto out;
++ }
+ ref_count += ref_change;
+ ext4_xattr_inode_set_ref(ea_inode, ref_count);
+
+ if (ref_change > 0) {
+- WARN_ONCE(ref_count <= 0, "EA inode %lu ref_count=%lld",
+- ea_inode->i_ino, ref_count);
+-
+ if (ref_count == 1) {
+ WARN_ONCE(ea_inode->i_nlink, "EA inode %lu i_nlink=%u",
+ ea_inode->i_ino, ea_inode->i_nlink);
+@@ -1044,9 +1052,6 @@ static int ext4_xattr_inode_update_ref(handle_t *handle, struct inode *ea_inode,
+ ext4_orphan_del(handle, ea_inode);
+ }
+ } else {
+- WARN_ONCE(ref_count < 0, "EA inode %lu ref_count=%lld",
+- ea_inode->i_ino, ref_count);
+-
+ if (ref_count == 0) {
+ WARN_ONCE(ea_inode->i_nlink != 1,
+ "EA inode %lu i_nlink=%u",
+diff --git a/fs/file.c b/fs/file.c
+index 6d2275c3be9c69..28743b742e3cf6 100644
+--- a/fs/file.c
++++ b/fs/file.c
+@@ -1330,7 +1330,10 @@ int replace_fd(unsigned fd, struct file *file, unsigned flags)
+ err = expand_files(files, fd);
+ if (unlikely(err < 0))
+ goto out_unlock;
+- return do_dup2(files, file, fd, flags);
++ err = do_dup2(files, file, fd, flags);
++ if (err < 0)
++ return err;
++ return 0;
+
+ out_unlock:
+ spin_unlock(&files->file_lock);
+diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
+index a07b8cf73ae271..3bfc430ef74dcf 100644
+--- a/fs/fs-writeback.c
++++ b/fs/fs-writeback.c
+@@ -445,22 +445,23 @@ static bool inode_do_switch_wbs(struct inode *inode,
+ * Transfer to @new_wb's IO list if necessary. If the @inode is dirty,
+ * the specific list @inode was on is ignored and the @inode is put on
+ * ->b_dirty which is always correct including from ->b_dirty_time.
+- * The transfer preserves @inode->dirtied_when ordering. If the @inode
+- * was clean, it means it was on the b_attached list, so move it onto
+- * the b_attached list of @new_wb.
++ * If the @inode was clean, it means it was on the b_attached list, so
++ * move it onto the b_attached list of @new_wb.
+ */
+ if (!list_empty(&inode->i_io_list)) {
+ inode->i_wb = new_wb;
+
+ if (inode->i_state & I_DIRTY_ALL) {
+- struct inode *pos;
+-
+- list_for_each_entry(pos, &new_wb->b_dirty, i_io_list)
+- if (time_after_eq(inode->dirtied_when,
+- pos->dirtied_when))
+- break;
++ /*
++ * We need to keep b_dirty list sorted by
++ * dirtied_time_when. However, properly sorting the
++ * inode in the list gets too expensive when switching
++ * many inodes. So just attach inode at the end of the
++ * dirty list and clobber the dirtied_time_when.
++ */
++ inode->dirtied_time_when = jiffies;
+ inode_io_list_move_locked(inode, new_wb,
+- pos->i_io_list.prev);
++ &new_wb->b_dirty);
+ } else {
+ inode_cgwb_move_to_attached(inode, new_wb);
+ }
+@@ -502,6 +503,7 @@ static void inode_switch_wbs_work_fn(struct work_struct *work)
+ */
+ down_read(&bdi->wb_switch_rwsem);
+
++ inodep = isw->inodes;
+ /*
+ * By the time control reaches here, RCU grace period has passed
+ * since I_WB_SWITCH assertion and all wb stat update transactions
+@@ -512,6 +514,7 @@ static void inode_switch_wbs_work_fn(struct work_struct *work)
+ * gives us exclusion against all wb related operations on @inode
+ * including IO list manipulations and stat updates.
+ */
++relock:
+ if (old_wb < new_wb) {
+ spin_lock(&old_wb->list_lock);
+ spin_lock_nested(&new_wb->list_lock, SINGLE_DEPTH_NESTING);
+@@ -520,10 +523,17 @@ static void inode_switch_wbs_work_fn(struct work_struct *work)
+ spin_lock_nested(&old_wb->list_lock, SINGLE_DEPTH_NESTING);
+ }
+
+- for (inodep = isw->inodes; *inodep; inodep++) {
++ while (*inodep) {
+ WARN_ON_ONCE((*inodep)->i_wb != old_wb);
+ if (inode_do_switch_wbs(*inodep, old_wb, new_wb))
+ nr_switched++;
++ inodep++;
++ if (*inodep && need_resched()) {
++ spin_unlock(&new_wb->list_lock);
++ spin_unlock(&old_wb->list_lock);
++ cond_resched();
++ goto relock;
++ }
+ }
+
+ spin_unlock(&new_wb->list_lock);
+diff --git a/fs/fsopen.c b/fs/fsopen.c
+index 1aaf4cb2afb29e..f645c99204eb06 100644
+--- a/fs/fsopen.c
++++ b/fs/fsopen.c
+@@ -18,50 +18,56 @@
+ #include "internal.h"
+ #include "mount.h"
+
++static inline const char *fetch_message_locked(struct fc_log *log, size_t len,
++ bool *need_free)
++{
++ const char *p;
++ int index;
++
++ if (unlikely(log->head == log->tail))
++ return ERR_PTR(-ENODATA);
++
++ index = log->tail & (ARRAY_SIZE(log->buffer) - 1);
++ p = log->buffer[index];
++ if (unlikely(strlen(p) > len))
++ return ERR_PTR(-EMSGSIZE);
++
++ log->buffer[index] = NULL;
++ *need_free = log->need_free & (1 << index);
++ log->need_free &= ~(1 << index);
++ log->tail++;
++
++ return p;
++}
++
+ /*
+ * Allow the user to read back any error, warning or informational messages.
++ * Only one message is returned for each read(2) call.
+ */
+ static ssize_t fscontext_read(struct file *file,
+ char __user *_buf, size_t len, loff_t *pos)
+ {
+ struct fs_context *fc = file->private_data;
+- struct fc_log *log = fc->log.log;
+- unsigned int logsize = ARRAY_SIZE(log->buffer);
+- ssize_t ret;
+- char *p;
++ ssize_t err;
++ const char *p __free(kfree) = NULL, *message;
+ bool need_free;
+- int index, n;
++ int n;
+
+- ret = mutex_lock_interruptible(&fc->uapi_mutex);
+- if (ret < 0)
+- return ret;
+-
+- if (log->head == log->tail) {
+- mutex_unlock(&fc->uapi_mutex);
+- return -ENODATA;
+- }
+-
+- index = log->tail & (logsize - 1);
+- p = log->buffer[index];
+- need_free = log->need_free & (1 << index);
+- log->buffer[index] = NULL;
+- log->need_free &= ~(1 << index);
+- log->tail++;
++ err = mutex_lock_interruptible(&fc->uapi_mutex);
++ if (err < 0)
++ return err;
++ message = fetch_message_locked(fc->log.log, len, &need_free);
+ mutex_unlock(&fc->uapi_mutex);
++ if (IS_ERR(message))
++ return PTR_ERR(message);
+
+- ret = -EMSGSIZE;
+- n = strlen(p);
+- if (n > len)
+- goto err_free;
+- ret = -EFAULT;
+- if (copy_to_user(_buf, p, n) != 0)
+- goto err_free;
+- ret = n;
+-
+-err_free:
+ if (need_free)
+- kfree(p);
+- return ret;
++ p = message;
++
++ n = strlen(message);
++ if (copy_to_user(_buf, message, n))
++ return -EFAULT;
++ return n;
+ }
+
+ static int fscontext_release(struct inode *inode, struct file *file)
+diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
+index 5150aa25e64be9..dbf53c7bc85354 100644
+--- a/fs/fuse/dev.c
++++ b/fs/fuse/dev.c
+@@ -2156,7 +2156,7 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud,
+ */
+ if (!oh.unique) {
+ err = fuse_notify(fc, oh.error, nbytes - sizeof(oh), cs);
+- goto out;
++ goto copy_finish;
+ }
+
+ err = -EINVAL;
+diff --git a/fs/fuse/file.c b/fs/fuse/file.c
+index c7351ca0706524..a52cf1b9cfc650 100644
+--- a/fs/fuse/file.c
++++ b/fs/fuse/file.c
+@@ -356,8 +356,14 @@ void fuse_file_release(struct inode *inode, struct fuse_file *ff,
+ * Make the release synchronous if this is a fuseblk mount,
+ * synchronous RELEASE is allowed (and desirable) in this case
+ * because the server can be trusted not to screw up.
++ *
++ * Always use the asynchronous file put because the current thread
++ * might be the fuse server. This can happen if a process starts some
++ * aio and closes the fd before the aio completes. Since aio takes its
++ * own ref to the file, the IO completion has to drop the ref, which is
++ * how the fuse server can end up closing its clients' files.
+ */
+- fuse_file_put(ff, ff->fm->fc->destroy);
++ fuse_file_put(ff, false);
+ }
+
+ void fuse_release_common(struct file *file, bool isdir)
+diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
+index fd827398afd2ff..6fa653d83f703a 100644
+--- a/fs/iomap/buffered-io.c
++++ b/fs/iomap/buffered-io.c
+@@ -304,6 +304,9 @@ static int iomap_read_inline_data(const struct iomap_iter *iter,
+ size_t size = i_size_read(iter->inode) - iomap->offset;
+ size_t offset = offset_in_folio(folio, iomap->offset);
+
++ if (WARN_ON_ONCE(!iomap->inline_data))
++ return -EIO;
++
+ if (folio_test_uptodate(folio))
+ return 0;
+
+@@ -894,7 +897,7 @@ static bool __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
+ return true;
+ }
+
+-static void iomap_write_end_inline(const struct iomap_iter *iter,
++static bool iomap_write_end_inline(const struct iomap_iter *iter,
+ struct folio *folio, loff_t pos, size_t copied)
+ {
+ const struct iomap *iomap = &iter->iomap;
+@@ -903,12 +906,16 @@ static void iomap_write_end_inline(const struct iomap_iter *iter,
+ WARN_ON_ONCE(!folio_test_uptodate(folio));
+ BUG_ON(!iomap_inline_data_valid(iomap));
+
++ if (WARN_ON_ONCE(!iomap->inline_data))
++ return false;
++
+ flush_dcache_folio(folio);
+ addr = kmap_local_folio(folio, pos);
+ memcpy(iomap_inline_data(iomap, pos), addr, copied);
+ kunmap_local(addr);
+
+ mark_inode_dirty(iter->inode);
++ return true;
+ }
+
+ /*
+@@ -921,10 +928,8 @@ static bool iomap_write_end(struct iomap_iter *iter, size_t len, size_t copied,
+ const struct iomap *srcmap = iomap_iter_srcmap(iter);
+ loff_t pos = iter->pos;
+
+- if (srcmap->type == IOMAP_INLINE) {
+- iomap_write_end_inline(iter, folio, pos, copied);
+- return true;
+- }
++ if (srcmap->type == IOMAP_INLINE)
++ return iomap_write_end_inline(iter, folio, pos, copied);
+
+ if (srcmap->flags & IOMAP_F_BUFFER_HEAD) {
+ size_t bh_written;
+diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
+index b84f6af2eb4c88..46aa85af13dc56 100644
+--- a/fs/iomap/direct-io.c
++++ b/fs/iomap/direct-io.c
+@@ -519,6 +519,9 @@ static int iomap_dio_inline_iter(struct iomap_iter *iomi, struct iomap_dio *dio)
+ loff_t pos = iomi->pos;
+ u64 copied;
+
++ if (WARN_ON_ONCE(!inline_data))
++ return -EIO;
++
+ if (WARN_ON_ONCE(!iomap_inline_data_valid(iomap)))
+ return -EIO;
+
+diff --git a/fs/minix/inode.c b/fs/minix/inode.c
+index df9d11479caf1e..32db676127a9ed 100644
+--- a/fs/minix/inode.c
++++ b/fs/minix/inode.c
+@@ -492,8 +492,14 @@ void minix_set_inode(struct inode *inode, dev_t rdev)
+ inode->i_op = &minix_symlink_inode_operations;
+ inode_nohighmem(inode);
+ inode->i_mapping->a_ops = &minix_aops;
+- } else
++ } else if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode) ||
++ S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode)) {
+ init_special_inode(inode, inode->i_mode, rdev);
++ } else {
++ printk(KERN_DEBUG "MINIX-fs: Invalid file type 0%04o for inode %lu.\n",
++ inode->i_mode, inode->i_ino);
++ make_bad_inode(inode);
++ }
+ }
+
+ /*
+diff --git a/fs/namei.c b/fs/namei.c
+index cd43ff89fbaa38..35b8b3e6672df0 100644
+--- a/fs/namei.c
++++ b/fs/namei.c
+@@ -1449,6 +1449,10 @@ static int follow_automount(struct path *path, int *count, unsigned lookup_flags
+ dentry->d_inode)
+ return -EISDIR;
+
++ /* No need to trigger automounts if mountpoint crossing is disabled. */
++ if (lookup_flags & LOOKUP_NO_XDEV)
++ return -EXDEV;
++
+ if (count && (*count)++ >= MAXSYMLINKS)
+ return -ELOOP;
+
+@@ -1472,6 +1476,10 @@ static int __traverse_mounts(struct path *path, unsigned flags, bool *jumped,
+ /* Allow the filesystem to manage the transit without i_rwsem
+ * being held. */
+ if (flags & DCACHE_MANAGE_TRANSIT) {
++ if (lookup_flags & LOOKUP_NO_XDEV) {
++ ret = -EXDEV;
++ break;
++ }
+ ret = path->dentry->d_op->d_manage(path, false);
+ flags = smp_load_acquire(&path->dentry->d_flags);
+ if (ret < 0)
+diff --git a/fs/namespace.c b/fs/namespace.c
+index 51f77c65c0c61e..c8c2376bb24245 100644
+--- a/fs/namespace.c
++++ b/fs/namespace.c
+@@ -65,6 +65,15 @@ static int __init set_mphash_entries(char *str)
+ }
+ __setup("mphash_entries=", set_mphash_entries);
+
++static char * __initdata initramfs_options;
++static int __init initramfs_options_setup(char *str)
++{
++ initramfs_options = str;
++ return 1;
++}
++
++__setup("initramfs_options=", initramfs_options_setup);
++
+ static u64 event;
+ static DEFINE_XARRAY_FLAGS(mnt_id_xa, XA_FLAGS_ALLOC);
+ static DEFINE_IDA(mnt_group_ida);
+@@ -171,7 +180,7 @@ static void mnt_ns_tree_add(struct mnt_namespace *ns)
+ static void mnt_ns_release(struct mnt_namespace *ns)
+ {
+ /* keep alive for {list,stat}mount() */
+- if (refcount_dec_and_test(&ns->passive)) {
++ if (ns && refcount_dec_and_test(&ns->passive)) {
+ fsnotify_mntns_delete(ns);
+ put_user_ns(ns->user_ns);
+ kfree(ns);
+@@ -187,7 +196,7 @@ static void mnt_ns_release_rcu(struct rcu_head *rcu)
+ static void mnt_ns_tree_remove(struct mnt_namespace *ns)
+ {
+ /* remove from global mount namespace list */
+- if (!is_anon_ns(ns)) {
++ if (!RB_EMPTY_NODE(&ns->mnt_ns_tree_node)) {
+ mnt_ns_tree_write_lock();
+ rb_erase(&ns->mnt_ns_tree_node, &mnt_ns_tree);
+ list_bidir_del_rcu(&ns->mnt_ns_list);
+@@ -5711,7 +5720,6 @@ static int grab_requested_root(struct mnt_namespace *ns, struct path *root)
+ static int do_statmount(struct kstatmount *s, u64 mnt_id, u64 mnt_ns_id,
+ struct mnt_namespace *ns)
+ {
+- struct path root __free(path_put) = {};
+ struct mount *m;
+ int err;
+
+@@ -5723,7 +5731,7 @@ static int do_statmount(struct kstatmount *s, u64 mnt_id, u64 mnt_ns_id,
+ if (!s->mnt)
+ return -ENOENT;
+
+- err = grab_requested_root(ns, &root);
++ err = grab_requested_root(ns, &s->root);
+ if (err)
+ return err;
+
+@@ -5732,7 +5740,7 @@ static int do_statmount(struct kstatmount *s, u64 mnt_id, u64 mnt_ns_id,
+ * mounts to show users.
+ */
+ m = real_mount(s->mnt);
+- if (!is_path_reachable(m, m->mnt.mnt_root, &root) &&
++ if (!is_path_reachable(m, m->mnt.mnt_root, &s->root) &&
+ !ns_capable_noaudit(ns->user_ns, CAP_SYS_ADMIN))
+ return -EPERM;
+
+@@ -5740,8 +5748,6 @@ static int do_statmount(struct kstatmount *s, u64 mnt_id, u64 mnt_ns_id,
+ if (err)
+ return err;
+
+- s->root = root;
+-
+ /*
+ * Note that mount properties in mnt->mnt_flags, mnt->mnt_idmap
+ * can change concurrently as we only hold the read-side of the
+@@ -5963,28 +5969,40 @@ SYSCALL_DEFINE4(statmount, const struct mnt_id_req __user *, req,
+ if (!ret)
+ ret = copy_statmount_to_user(ks);
+ kvfree(ks->seq.buf);
++ path_put(&ks->root);
+ if (retry_statmount(ret, &seq_size))
+ goto retry;
+ return ret;
+ }
+
+-static ssize_t do_listmount(struct mnt_namespace *ns, u64 mnt_parent_id,
+- u64 last_mnt_id, u64 *mnt_ids, size_t nr_mnt_ids,
+- bool reverse)
++struct klistmount {
++ u64 last_mnt_id;
++ u64 mnt_parent_id;
++ u64 *kmnt_ids;
++ u32 nr_mnt_ids;
++ struct mnt_namespace *ns;
++ struct path root;
++};
++
++static ssize_t do_listmount(struct klistmount *kls, bool reverse)
+ {
+- struct path root __free(path_put) = {};
++ struct mnt_namespace *ns = kls->ns;
++ u64 mnt_parent_id = kls->mnt_parent_id;
++ u64 last_mnt_id = kls->last_mnt_id;
++ u64 *mnt_ids = kls->kmnt_ids;
++ size_t nr_mnt_ids = kls->nr_mnt_ids;
+ struct path orig;
+ struct mount *r, *first;
+ ssize_t ret;
+
+ rwsem_assert_held(&namespace_sem);
+
+- ret = grab_requested_root(ns, &root);
++ ret = grab_requested_root(ns, &kls->root);
+ if (ret)
+ return ret;
+
+ if (mnt_parent_id == LSMT_ROOT) {
+- orig = root;
++ orig = kls->root;
+ } else {
+ orig.mnt = lookup_mnt_in_ns(mnt_parent_id, ns);
+ if (!orig.mnt)
+@@ -5996,7 +6014,7 @@ static ssize_t do_listmount(struct mnt_namespace *ns, u64 mnt_parent_id,
+ * Don't trigger audit denials. We just want to determine what
+ * mounts to show users.
+ */
+- if (!is_path_reachable(real_mount(orig.mnt), orig.dentry, &root) &&
++ if (!is_path_reachable(real_mount(orig.mnt), orig.dentry, &kls->root) &&
+ !ns_capable_noaudit(ns->user_ns, CAP_SYS_ADMIN))
+ return -EPERM;
+
+@@ -6029,14 +6047,45 @@ static ssize_t do_listmount(struct mnt_namespace *ns, u64 mnt_parent_id,
+ return ret;
+ }
+
++static void __free_klistmount_free(const struct klistmount *kls)
++{
++ path_put(&kls->root);
++ kvfree(kls->kmnt_ids);
++ mnt_ns_release(kls->ns);
++}
++
++static inline int prepare_klistmount(struct klistmount *kls, struct mnt_id_req *kreq,
++ size_t nr_mnt_ids)
++{
++
++ u64 last_mnt_id = kreq->param;
++
++ /* The first valid unique mount id is MNT_UNIQUE_ID_OFFSET + 1. */
++ if (last_mnt_id != 0 && last_mnt_id <= MNT_UNIQUE_ID_OFFSET)
++ return -EINVAL;
++
++ kls->last_mnt_id = last_mnt_id;
++
++ kls->nr_mnt_ids = nr_mnt_ids;
++ kls->kmnt_ids = kvmalloc_array(nr_mnt_ids, sizeof(*kls->kmnt_ids),
++ GFP_KERNEL_ACCOUNT);
++ if (!kls->kmnt_ids)
++ return -ENOMEM;
++
++ kls->ns = grab_requested_mnt_ns(kreq);
++ if (!kls->ns)
++ return -ENOENT;
++
++ kls->mnt_parent_id = kreq->mnt_id;
++ return 0;
++}
++
+ SYSCALL_DEFINE4(listmount, const struct mnt_id_req __user *, req,
+ u64 __user *, mnt_ids, size_t, nr_mnt_ids, unsigned int, flags)
+ {
+- u64 *kmnt_ids __free(kvfree) = NULL;
++ struct klistmount kls __free(klistmount_free) = {};
+ const size_t maxcount = 1000000;
+- struct mnt_namespace *ns __free(mnt_ns_release) = NULL;
+ struct mnt_id_req kreq;
+- u64 last_mnt_id;
+ ssize_t ret;
+
+ if (flags & ~LISTMOUNT_REVERSE)
+@@ -6057,22 +6106,12 @@ SYSCALL_DEFINE4(listmount, const struct mnt_id_req __user *, req,
+ if (ret)
+ return ret;
+
+- last_mnt_id = kreq.param;
+- /* The first valid unique mount id is MNT_UNIQUE_ID_OFFSET + 1. */
+- if (last_mnt_id != 0 && last_mnt_id <= MNT_UNIQUE_ID_OFFSET)
+- return -EINVAL;
+-
+- kmnt_ids = kvmalloc_array(nr_mnt_ids, sizeof(*kmnt_ids),
+- GFP_KERNEL_ACCOUNT);
+- if (!kmnt_ids)
+- return -ENOMEM;
+-
+- ns = grab_requested_mnt_ns(&kreq);
+- if (!ns)
+- return -ENOENT;
++ ret = prepare_klistmount(&kls, &kreq, nr_mnt_ids);
++ if (ret)
++ return ret;
+
+- if (kreq.mnt_ns_id && (ns != current->nsproxy->mnt_ns) &&
+- !ns_capable_noaudit(ns->user_ns, CAP_SYS_ADMIN))
++ if (kreq.mnt_ns_id && (kls.ns != current->nsproxy->mnt_ns) &&
++ !ns_capable_noaudit(kls.ns->user_ns, CAP_SYS_ADMIN))
+ return -ENOENT;
+
+ /*
+@@ -6080,12 +6119,11 @@ SYSCALL_DEFINE4(listmount, const struct mnt_id_req __user *, req,
+ * listmount() doesn't care about any mount properties.
+ */
+ scoped_guard(rwsem_read, &namespace_sem)
+- ret = do_listmount(ns, kreq.mnt_id, last_mnt_id, kmnt_ids,
+- nr_mnt_ids, (flags & LISTMOUNT_REVERSE));
++ ret = do_listmount(&kls, (flags & LISTMOUNT_REVERSE));
+ if (ret <= 0)
+ return ret;
+
+- if (copy_to_user(mnt_ids, kmnt_ids, ret * sizeof(*mnt_ids)))
++ if (copy_to_user(mnt_ids, kls.kmnt_ids, ret * sizeof(*mnt_ids)))
+ return -EFAULT;
+
+ return ret;
+@@ -6098,7 +6136,7 @@ static void __init init_mount_tree(void)
+ struct mnt_namespace *ns;
+ struct path root;
+
+- mnt = vfs_kern_mount(&rootfs_fs_type, 0, "rootfs", NULL);
++ mnt = vfs_kern_mount(&rootfs_fs_type, 0, "rootfs", initramfs_options);
+ if (IS_ERR(mnt))
+ panic("Can't create rootfs");
+
+diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
+index cadfc2bae60ea1..95b5681152c47a 100644
+--- a/fs/nfsd/export.c
++++ b/fs/nfsd/export.c
+@@ -1082,50 +1082,62 @@ static struct svc_export *exp_find(struct cache_detail *cd,
+ }
+
+ /**
+- * check_nfsd_access - check if access to export is allowed.
++ * check_xprtsec_policy - check if access to export is allowed by the
++ * xprtsec policy
+ * @exp: svc_export that is being accessed.
+- * @rqstp: svc_rqst attempting to access @exp (will be NULL for LOCALIO).
+- * @may_bypass_gss: reduce strictness of authorization check
++ * @rqstp: svc_rqst attempting to access @exp.
++ *
++ * Helper function for check_nfsd_access(). Note that callers should be
++ * using check_nfsd_access() instead of calling this function directly. The
++ * one exception is __fh_verify() since it has logic that may result in one
++ * or both of the helpers being skipped.
+ *
+ * Return values:
+ * %nfs_ok if access is granted, or
+ * %nfserr_wrongsec if access is denied
+ */
+-__be32 check_nfsd_access(struct svc_export *exp, struct svc_rqst *rqstp,
+- bool may_bypass_gss)
++__be32 check_xprtsec_policy(struct svc_export *exp, struct svc_rqst *rqstp)
+ {
+- struct exp_flavor_info *f, *end = exp->ex_flavors + exp->ex_nflavors;
+- struct svc_xprt *xprt;
+-
+- /*
+- * If rqstp is NULL, this is a LOCALIO request which will only
+- * ever use a filehandle/credential pair for which access has
+- * been affirmed (by ACCESS or OPEN NFS requests) over the
+- * wire. So there is no need for further checks here.
+- */
+- if (!rqstp)
+- return nfs_ok;
+-
+- xprt = rqstp->rq_xprt;
++ struct svc_xprt *xprt = rqstp->rq_xprt;
+
+ if (exp->ex_xprtsec_modes & NFSEXP_XPRTSEC_NONE) {
+ if (!test_bit(XPT_TLS_SESSION, &xprt->xpt_flags))
+- goto ok;
++ return nfs_ok;
+ }
+ if (exp->ex_xprtsec_modes & NFSEXP_XPRTSEC_TLS) {
+ if (test_bit(XPT_TLS_SESSION, &xprt->xpt_flags) &&
+ !test_bit(XPT_PEER_AUTH, &xprt->xpt_flags))
+- goto ok;
++ return nfs_ok;
+ }
+ if (exp->ex_xprtsec_modes & NFSEXP_XPRTSEC_MTLS) {
+ if (test_bit(XPT_TLS_SESSION, &xprt->xpt_flags) &&
+ test_bit(XPT_PEER_AUTH, &xprt->xpt_flags))
+- goto ok;
++ return nfs_ok;
+ }
+- if (!may_bypass_gss)
+- goto denied;
++ return nfserr_wrongsec;
++}
++
++/**
++ * check_security_flavor - check if access to export is allowed by the
++ * security flavor
++ * @exp: svc_export that is being accessed.
++ * @rqstp: svc_rqst attempting to access @exp.
++ * @may_bypass_gss: reduce strictness of authorization check
++ *
++ * Helper function for check_nfsd_access(). Note that callers should be
++ * using check_nfsd_access() instead of calling this function directly. The
++ * one exception is __fh_verify() since it has logic that may result in one
++ * or both of the helpers being skipped.
++ *
++ * Return values:
++ * %nfs_ok if access is granted, or
++ * %nfserr_wrongsec if access is denied
++ */
++__be32 check_security_flavor(struct svc_export *exp, struct svc_rqst *rqstp,
++ bool may_bypass_gss)
++{
++ struct exp_flavor_info *f, *end = exp->ex_flavors + exp->ex_nflavors;
+
+-ok:
+ /* legacy gss-only clients are always OK: */
+ if (exp->ex_client == rqstp->rq_gssclient)
+ return nfs_ok;
+@@ -1167,10 +1179,30 @@ __be32 check_nfsd_access(struct svc_export *exp, struct svc_rqst *rqstp,
+ }
+ }
+
+-denied:
+ return nfserr_wrongsec;
+ }
+
++/**
++ * check_nfsd_access - check if access to export is allowed.
++ * @exp: svc_export that is being accessed.
++ * @rqstp: svc_rqst attempting to access @exp.
++ * @may_bypass_gss: reduce strictness of authorization check
++ *
++ * Return values:
++ * %nfs_ok if access is granted, or
++ * %nfserr_wrongsec if access is denied
++ */
++__be32 check_nfsd_access(struct svc_export *exp, struct svc_rqst *rqstp,
++ bool may_bypass_gss)
++{
++ __be32 status;
++
++ status = check_xprtsec_policy(exp, rqstp);
++ if (status != nfs_ok)
++ return status;
++ return check_security_flavor(exp, rqstp, may_bypass_gss);
++}
++
+ /*
+ * Uses rq_client and rq_gssclient to find an export; uses rq_client (an
+ * auth_unix client) if it's available and has secinfo information;
+diff --git a/fs/nfsd/export.h b/fs/nfsd/export.h
+index b9c0adb3ce0918..ef5581911d5b87 100644
+--- a/fs/nfsd/export.h
++++ b/fs/nfsd/export.h
+@@ -101,6 +101,9 @@ struct svc_expkey {
+
+ struct svc_cred;
+ int nfsexp_flags(struct svc_cred *cred, struct svc_export *exp);
++__be32 check_xprtsec_policy(struct svc_export *exp, struct svc_rqst *rqstp);
++__be32 check_security_flavor(struct svc_export *exp, struct svc_rqst *rqstp,
++ bool may_bypass_gss);
+ __be32 check_nfsd_access(struct svc_export *exp, struct svc_rqst *rqstp,
+ bool may_bypass_gss);
+
+diff --git a/fs/nfsd/lockd.c b/fs/nfsd/lockd.c
+index edc9f75dc75c6d..6b042218668b82 100644
+--- a/fs/nfsd/lockd.c
++++ b/fs/nfsd/lockd.c
+@@ -57,6 +57,21 @@ nlm_fopen(struct svc_rqst *rqstp, struct nfs_fh *f, struct file **filp,
+ switch (nfserr) {
+ case nfs_ok:
+ return 0;
++ case nfserr_jukebox:
++ /* this error can indicate a presence of a conflicting
++ * delegation to an NLM lock request. Options are:
++ * (1) For now, drop this request and make the client
++ * retry. When delegation is returned, client's lock retry
++ * will complete.
++ * (2) NLM4_DENIED as per "spec" signals to the client
++ * that the lock is unavailable now but client can retry.
++ * Linux client implementation does not. It treats
++ * NLM4_DENIED the same as NLM4_FAILED and errors the request.
++ * (3) For the future, treat this as blocked lock and try
++ * to callback when the delegation is returned but might
++ * not have a proper lock request to block on.
++ */
++ fallthrough;
+ case nfserr_dropit:
+ return nlm_drop_reply;
+ case nfserr_stale:
+diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
+index 71b428efcbb594..75abdd7c6ef84b 100644
+--- a/fs/nfsd/nfs4proc.c
++++ b/fs/nfsd/nfs4proc.c
+@@ -1133,6 +1133,33 @@ nfsd4_secinfo_no_name_release(union nfsd4_op_u *u)
+ exp_put(u->secinfo_no_name.sin_exp);
+ }
+
++/*
++ * Validate that the requested timestamps are within the acceptable range. If
++ * timestamp appears to be in the future, then it will be clamped to
++ * current_time().
++ */
++static void
++vet_deleg_attrs(struct nfsd4_setattr *setattr, struct nfs4_delegation *dp)
++{
++ struct timespec64 now = current_time(dp->dl_stid.sc_file->fi_inode);
++ struct iattr *iattr = &setattr->sa_iattr;
++
++ if ((setattr->sa_bmval[2] & FATTR4_WORD2_TIME_DELEG_ACCESS) &&
++ !nfsd4_vet_deleg_time(&iattr->ia_atime, &dp->dl_atime, &now))
++ iattr->ia_valid &= ~(ATTR_ATIME | ATTR_ATIME_SET);
++
++ if (setattr->sa_bmval[2] & FATTR4_WORD2_TIME_DELEG_MODIFY) {
++ if (nfsd4_vet_deleg_time(&iattr->ia_mtime, &dp->dl_mtime, &now)) {
++ iattr->ia_ctime = iattr->ia_mtime;
++ if (!nfsd4_vet_deleg_time(&iattr->ia_ctime, &dp->dl_ctime, &now))
++ iattr->ia_valid &= ~(ATTR_CTIME | ATTR_CTIME_SET);
++ } else {
++ iattr->ia_valid &= ~(ATTR_CTIME | ATTR_CTIME_SET |
++ ATTR_MTIME | ATTR_MTIME_SET);
++ }
++ }
++}
++
+ static __be32
+ nfsd4_setattr(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
+ union nfsd4_op_u *u)
+@@ -1170,8 +1197,10 @@ nfsd4_setattr(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
+ struct nfs4_delegation *dp = delegstateid(st);
+
+ /* Only for *_ATTRS_DELEG flavors */
+- if (deleg_attrs_deleg(dp->dl_type))
++ if (deleg_attrs_deleg(dp->dl_type)) {
++ vet_deleg_attrs(setattr, dp);
+ status = nfs_ok;
++ }
+ }
+ }
+ if (st)
+@@ -1469,7 +1498,7 @@ static __be32 nfsd4_ssc_setup_dul(struct nfsd_net *nn, char *ipaddr,
+ return 0;
+ }
+ if (work) {
+- strscpy(work->nsui_ipaddr, ipaddr, sizeof(work->nsui_ipaddr) - 1);
++ strscpy(work->nsui_ipaddr, ipaddr, sizeof(work->nsui_ipaddr));
+ refcount_set(&work->nsui_refcnt, 2);
+ work->nsui_busy = true;
+ list_add_tail(&work->nsui_list, &nn->nfsd_ssc_mount_list);
+diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
+index 88c347957da5b8..205ee8cc6fa2b9 100644
+--- a/fs/nfsd/nfs4state.c
++++ b/fs/nfsd/nfs4state.c
+@@ -6157,7 +6157,8 @@ nfs4_delegation_stat(struct nfs4_delegation *dp, struct svc_fh *currentfh,
+ path.dentry = file_dentry(nf->nf_file);
+
+ rc = vfs_getattr(&path, stat,
+- (STATX_MODE | STATX_SIZE | STATX_CTIME | STATX_CHANGE_COOKIE),
++ STATX_MODE | STATX_SIZE | STATX_ATIME |
++ STATX_MTIME | STATX_CTIME | STATX_CHANGE_COOKIE,
+ AT_STATX_SYNC_AS_STAT);
+
+ nfsd_file_put(nf);
+@@ -6274,10 +6275,14 @@ nfs4_open_delegation(struct svc_rqst *rqstp, struct nfsd4_open *open,
+ OPEN_DELEGATE_WRITE;
+ dp->dl_cb_fattr.ncf_cur_fsize = stat.size;
+ dp->dl_cb_fattr.ncf_initial_cinfo = nfsd4_change_attribute(&stat);
++ dp->dl_atime = stat.atime;
++ dp->dl_ctime = stat.ctime;
++ dp->dl_mtime = stat.mtime;
+ trace_nfsd_deleg_write(&dp->dl_stid.sc_stateid);
+ } else {
+- open->op_delegate_type = deleg_ts ? OPEN_DELEGATE_READ_ATTRS_DELEG :
+- OPEN_DELEGATE_READ;
++ open->op_delegate_type = deleg_ts && nfs4_delegation_stat(dp, currentfh, &stat) ?
++ OPEN_DELEGATE_READ_ATTRS_DELEG : OPEN_DELEGATE_READ;
++ dp->dl_atime = stat.atime;
+ trace_nfsd_deleg_read(&dp->dl_stid.sc_stateid);
+ }
+ nfs4_put_stid(&dp->dl_stid);
+@@ -9130,25 +9135,25 @@ nfsd4_get_writestateid(struct nfsd4_compound_state *cstate,
+ }
+
+ /**
+- * set_cb_time - vet and set the timespec for a cb_getattr update
+- * @cb: timestamp from the CB_GETATTR response
++ * nfsd4_vet_deleg_time - vet and set the timespec for a delegated timestamp update
++ * @req: timestamp from the client
+ * @orig: original timestamp in the inode
+ * @now: current time
+ *
+- * Given a timestamp in a CB_GETATTR response, check it against the
++ * Given a timestamp from the client response, check it against the
+ * current timestamp in the inode and the current time. Returns true
+ * if the inode's timestamp needs to be updated, and false otherwise.
+- * @cb may also be changed if the timestamp needs to be clamped.
++ * @req may also be changed if the timestamp needs to be clamped.
+ */
+-static bool set_cb_time(struct timespec64 *cb, const struct timespec64 *orig,
+- const struct timespec64 *now)
++bool nfsd4_vet_deleg_time(struct timespec64 *req, const struct timespec64 *orig,
++ const struct timespec64 *now)
+ {
+
+ /*
+ * "When the time presented is before the original time, then the
+ * update is ignored." Also no need to update if there is no change.
+ */
+- if (timespec64_compare(cb, orig) <= 0)
++ if (timespec64_compare(req, orig) <= 0)
+ return false;
+
+ /*
+@@ -9156,10 +9161,8 @@ static bool set_cb_time(struct timespec64 *cb, const struct timespec64 *orig,
+ * clamp the new time to the current time, or it may
+ * return NFS4ERR_DELAY to the client, allowing it to retry."
+ */
+- if (timespec64_compare(cb, now) > 0) {
+- /* clamp it */
+- *cb = *now;
+- }
++ if (timespec64_compare(req, now) > 0)
++ *req = *now;
+
+ return true;
+ }
+@@ -9167,28 +9170,27 @@ static bool set_cb_time(struct timespec64 *cb, const struct timespec64 *orig,
+ static int cb_getattr_update_times(struct dentry *dentry, struct nfs4_delegation *dp)
+ {
+ struct inode *inode = d_inode(dentry);
+- struct timespec64 now = current_time(inode);
+ struct nfs4_cb_fattr *ncf = &dp->dl_cb_fattr;
+ struct iattr attrs = { };
+ int ret;
+
+ if (deleg_attrs_deleg(dp->dl_type)) {
+- struct timespec64 atime = inode_get_atime(inode);
+- struct timespec64 mtime = inode_get_mtime(inode);
++ struct timespec64 now = current_time(inode);
+
+ attrs.ia_atime = ncf->ncf_cb_atime;
+ attrs.ia_mtime = ncf->ncf_cb_mtime;
+
+- if (set_cb_time(&attrs.ia_atime, &atime, &now))
++ if (nfsd4_vet_deleg_time(&attrs.ia_atime, &dp->dl_atime, &now))
+ attrs.ia_valid |= ATTR_ATIME | ATTR_ATIME_SET;
+
+- if (set_cb_time(&attrs.ia_mtime, &mtime, &now)) {
+- attrs.ia_valid |= ATTR_CTIME | ATTR_MTIME | ATTR_MTIME_SET;
++ if (nfsd4_vet_deleg_time(&attrs.ia_mtime, &dp->dl_mtime, &now)) {
++ attrs.ia_valid |= ATTR_MTIME | ATTR_MTIME_SET;
+ attrs.ia_ctime = attrs.ia_mtime;
++ if (nfsd4_vet_deleg_time(&attrs.ia_ctime, &dp->dl_ctime, &now))
++ attrs.ia_valid |= ATTR_CTIME | ATTR_CTIME_SET;
+ }
+ } else {
+ attrs.ia_valid |= ATTR_MTIME | ATTR_CTIME;
+- attrs.ia_mtime = attrs.ia_ctime = now;
+ }
+
+ if (!attrs.ia_valid)
+diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
+index ea91bad4eee2cd..a00300b2877541 100644
+--- a/fs/nfsd/nfs4xdr.c
++++ b/fs/nfsd/nfs4xdr.c
+@@ -538,8 +538,9 @@ nfsd4_decode_fattr4(struct nfsd4_compoundargs *argp, u32 *bmval, u32 bmlen,
+ iattr->ia_mtime.tv_sec = modify.seconds;
+ iattr->ia_mtime.tv_nsec = modify.nseconds;
+ iattr->ia_ctime.tv_sec = modify.seconds;
+- iattr->ia_ctime.tv_nsec = modify.seconds;
+- iattr->ia_valid |= ATTR_CTIME | ATTR_MTIME | ATTR_MTIME_SET | ATTR_DELEG;
++ iattr->ia_ctime.tv_nsec = modify.nseconds;
++ iattr->ia_valid |= ATTR_CTIME | ATTR_CTIME_SET |
++ ATTR_MTIME | ATTR_MTIME_SET | ATTR_DELEG;
+ }
+
+ /* request sanity: did attrlist4 contain the expected number of words? */
+diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c
+index 74cf1f4de17410..1078a4c763b071 100644
+--- a/fs/nfsd/nfsfh.c
++++ b/fs/nfsd/nfsfh.c
+@@ -364,10 +364,30 @@ __fh_verify(struct svc_rqst *rqstp,
+ if (error)
+ goto out;
+
++ /*
++ * If rqstp is NULL, this is a LOCALIO request which will only
++ * ever use a filehandle/credential pair for which access has
++ * been affirmed (by ACCESS or OPEN NFS requests) over the
++ * wire. Skip both the xprtsec policy and the security flavor
++ * checks.
++ */
++ if (!rqstp)
++ goto check_permissions;
++
+ if ((access & NFSD_MAY_NLM) && (exp->ex_flags & NFSEXP_NOAUTHNLM))
+ /* NLM is allowed to fully bypass authentication */
+ goto out;
+
++ /*
++ * NLM is allowed to bypass the xprtsec policy check because lockd
++ * doesn't support xprtsec.
++ */
++ if (!(access & NFSD_MAY_NLM)) {
++ error = check_xprtsec_policy(exp, rqstp);
++ if (error)
++ goto out;
++ }
++
+ if (access & NFSD_MAY_BYPASS_GSS)
+ may_bypass_gss = true;
+ /*
+@@ -379,13 +399,15 @@ __fh_verify(struct svc_rqst *rqstp,
+ && exp->ex_path.dentry == dentry)
+ may_bypass_gss = true;
+
+- error = check_nfsd_access(exp, rqstp, may_bypass_gss);
++ error = check_security_flavor(exp, rqstp, may_bypass_gss);
+ if (error)
+ goto out;
++
+ /* During LOCALIO call to fh_verify will be called with a NULL rqstp */
+ if (rqstp)
+ svc_xprt_set_valid(rqstp->rq_xprt);
+
++check_permissions:
+ /* Finally, check access permissions. */
+ error = nfsd_permission(cred, exp, dentry, access);
+ out:
+diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
+index 8adc2550129e67..bf9436cdb93c5d 100644
+--- a/fs/nfsd/state.h
++++ b/fs/nfsd/state.h
+@@ -224,6 +224,11 @@ struct nfs4_delegation {
+
+ /* for CB_GETATTR */
+ struct nfs4_cb_fattr dl_cb_fattr;
++
++ /* For delegated timestamps */
++ struct timespec64 dl_atime;
++ struct timespec64 dl_mtime;
++ struct timespec64 dl_ctime;
+ };
+
+ static inline bool deleg_is_read(u32 dl_type)
+@@ -242,6 +247,9 @@ static inline bool deleg_attrs_deleg(u32 dl_type)
+ dl_type == OPEN_DELEGATE_WRITE_ATTRS_DELEG;
+ }
+
++bool nfsd4_vet_deleg_time(struct timespec64 *cb, const struct timespec64 *orig,
++ const struct timespec64 *now);
++
+ #define cb_to_delegation(cb) \
+ container_of(cb, struct nfs4_delegation, dl_recall)
+
+diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
+index edf050766e5705..3cd3b9e069f4af 100644
+--- a/fs/nfsd/vfs.c
++++ b/fs/nfsd/vfs.c
+@@ -467,7 +467,7 @@ static int __nfsd_setattr(struct dentry *dentry, struct iattr *iap)
+ return 0;
+ }
+
+- if (!iap->ia_valid)
++ if ((iap->ia_valid & ~ATTR_DELEG) == 0)
+ return 0;
+
+ /*
+diff --git a/fs/nsfs.c b/fs/nsfs.c
+index 59aa801347a7de..34f0b35d3ead76 100644
+--- a/fs/nsfs.c
++++ b/fs/nsfs.c
+@@ -169,9 +169,11 @@ static bool nsfs_ioctl_valid(unsigned int cmd)
+ /* Extensible ioctls require some extra handling. */
+ switch (_IOC_NR(cmd)) {
+ case _IOC_NR(NS_MNT_GET_INFO):
++ return extensible_ioctl_valid(cmd, NS_MNT_GET_INFO, MNT_NS_INFO_SIZE_VER0);
+ case _IOC_NR(NS_MNT_GET_NEXT):
++ return extensible_ioctl_valid(cmd, NS_MNT_GET_NEXT, MNT_NS_INFO_SIZE_VER0);
+ case _IOC_NR(NS_MNT_GET_PREV):
+- return (_IOC_TYPE(cmd) == _IOC_TYPE(cmd));
++ return extensible_ioctl_valid(cmd, NS_MNT_GET_PREV, MNT_NS_INFO_SIZE_VER0);
+ }
+
+ return false;
+diff --git a/fs/ntfs3/bitmap.c b/fs/ntfs3/bitmap.c
+index 04107b95071707..65d05e6a056650 100644
+--- a/fs/ntfs3/bitmap.c
++++ b/fs/ntfs3/bitmap.c
+@@ -1371,6 +1371,7 @@ int wnd_extend(struct wnd_bitmap *wnd, size_t new_bits)
+ mark_buffer_dirty(bh);
+ unlock_buffer(bh);
+ /* err = sync_dirty_buffer(bh); */
++ put_bh(bh);
+
+ b0 = 0;
+ bits -= op;
+diff --git a/fs/pidfs.c b/fs/pidfs.c
+index 108e7527f837fd..2c9c7636253af0 100644
+--- a/fs/pidfs.c
++++ b/fs/pidfs.c
+@@ -440,7 +440,7 @@ static bool pidfs_ioctl_valid(unsigned int cmd)
+ * erronously mistook the file descriptor for a pidfd.
+ * This is not perfect but will catch most cases.
+ */
+- return (_IOC_TYPE(cmd) == _IOC_TYPE(PIDFD_GET_INFO));
++ return extensible_ioctl_valid(cmd, PIDFD_GET_INFO, PIDFD_INFO_SIZE_VER0);
+ }
+
+ return false;
+diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
+index df4a9b34876965..6c4a6ee1fa2b6f 100644
+--- a/fs/quota/dquot.c
++++ b/fs/quota/dquot.c
+@@ -162,6 +162,9 @@ static struct quota_module_name module_names[] = INIT_QUOTA_MODULE_NAMES;
+ /* SLAB cache for dquot structures */
+ static struct kmem_cache *dquot_cachep;
+
++/* workqueue for work quota_release_work*/
++static struct workqueue_struct *quota_unbound_wq;
++
+ void register_quota_format(struct quota_format_type *fmt)
+ {
+ spin_lock(&dq_list_lock);
+@@ -881,7 +884,7 @@ void dqput(struct dquot *dquot)
+ put_releasing_dquots(dquot);
+ atomic_dec(&dquot->dq_count);
+ spin_unlock(&dq_list_lock);
+- queue_delayed_work(system_unbound_wq, &quota_release_work, 1);
++ queue_delayed_work(quota_unbound_wq, &quota_release_work, 1);
+ }
+ EXPORT_SYMBOL(dqput);
+
+@@ -3041,6 +3044,11 @@ static int __init dquot_init(void)
+
+ shrinker_register(dqcache_shrinker);
+
++ quota_unbound_wq = alloc_workqueue("quota_events_unbound",
++ WQ_UNBOUND | WQ_MEM_RECLAIM, WQ_MAX_ACTIVE);
++ if (!quota_unbound_wq)
++ panic("Cannot create quota_unbound_wq\n");
++
+ return 0;
+ }
+ fs_initcall(dquot_init);
+diff --git a/fs/read_write.c b/fs/read_write.c
+index c5b6265d984bae..833bae068770a4 100644
+--- a/fs/read_write.c
++++ b/fs/read_write.c
+@@ -1576,6 +1576,13 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
+ if (len == 0)
+ return 0;
+
++ /*
++ * Make sure return value doesn't overflow in 32bit compat mode. Also
++ * limit the size for all cases except when calling ->copy_file_range().
++ */
++ if (splice || !file_out->f_op->copy_file_range || in_compat_syscall())
++ len = min_t(size_t, MAX_RW_COUNT, len);
++
+ file_start_write(file_out);
+
+ /*
+@@ -1589,9 +1596,7 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
+ len, flags);
+ } else if (!splice && file_in->f_op->remap_file_range && samesb) {
+ ret = file_in->f_op->remap_file_range(file_in, pos_in,
+- file_out, pos_out,
+- min_t(loff_t, MAX_RW_COUNT, len),
+- REMAP_FILE_CAN_SHORTEN);
++ file_out, pos_out, len, REMAP_FILE_CAN_SHORTEN);
+ /* fallback to splice */
+ if (ret <= 0)
+ splice = true;
+@@ -1624,8 +1629,7 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
+ * to splicing from input file, while file_start_write() is held on
+ * the output file on a different sb.
+ */
+- ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
+- min_t(size_t, len, MAX_RW_COUNT), 0);
++ ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out, len, 0);
+ done:
+ if (ret > 0) {
+ fsnotify_access(file_in);
+diff --git a/fs/smb/client/dir.c b/fs/smb/client/dir.c
+index 5223edf6d11a5b..26117b147eac6e 100644
+--- a/fs/smb/client/dir.c
++++ b/fs/smb/client/dir.c
+@@ -329,6 +329,7 @@ static int cifs_do_create(struct inode *inode, struct dentry *direntry, unsigned
+ parent_cfid->fid.lease_key,
+ SMB2_LEASE_KEY_SIZE);
+ parent_cfid->dirents.is_valid = false;
++ parent_cfid->dirents.is_failed = true;
+ }
+ break;
+ }
+diff --git a/fs/smb/client/smb1ops.c b/fs/smb/client/smb1ops.c
+index a02d41d1ce4a3f..3fdbb71036cff2 100644
+--- a/fs/smb/client/smb1ops.c
++++ b/fs/smb/client/smb1ops.c
+@@ -651,14 +651,72 @@ static int cifs_query_path_info(const unsigned int xid,
+ }
+
+ #ifdef CONFIG_CIFS_XATTR
++ /*
++ * For non-symlink WSL reparse points it is required to fetch
++ * EA $LXMOD which contains in its S_DT part the mandatory file type.
++ */
++ if (!rc && data->reparse_point) {
++ struct smb2_file_full_ea_info *ea;
++ u32 next = 0;
++
++ ea = (struct smb2_file_full_ea_info *)data->wsl.eas;
++ do {
++ ea = (void *)((u8 *)ea + next);
++ next = le32_to_cpu(ea->next_entry_offset);
++ } while (next);
++ if (le16_to_cpu(ea->ea_value_length)) {
++ ea->next_entry_offset = cpu_to_le32(ALIGN(sizeof(*ea) +
++ ea->ea_name_length + 1 +
++ le16_to_cpu(ea->ea_value_length), 4));
++ ea = (void *)((u8 *)ea + le32_to_cpu(ea->next_entry_offset));
++ }
++
++ rc = CIFSSMBQAllEAs(xid, tcon, full_path, SMB2_WSL_XATTR_MODE,
++ &ea->ea_data[SMB2_WSL_XATTR_NAME_LEN + 1],
++ SMB2_WSL_XATTR_MODE_SIZE, cifs_sb);
++ if (rc == SMB2_WSL_XATTR_MODE_SIZE) {
++ ea->next_entry_offset = cpu_to_le32(0);
++ ea->flags = 0;
++ ea->ea_name_length = SMB2_WSL_XATTR_NAME_LEN;
++ ea->ea_value_length = cpu_to_le16(SMB2_WSL_XATTR_MODE_SIZE);
++ memcpy(&ea->ea_data[0], SMB2_WSL_XATTR_MODE, SMB2_WSL_XATTR_NAME_LEN + 1);
++ data->wsl.eas_len += ALIGN(sizeof(*ea) + SMB2_WSL_XATTR_NAME_LEN + 1 +
++ SMB2_WSL_XATTR_MODE_SIZE, 4);
++ rc = 0;
++ } else if (rc >= 0) {
++ /* It is an error if EA $LXMOD has wrong size. */
++ rc = -EINVAL;
++ } else {
++ /*
++ * In all other cases ignore error if fetching
++ * of EA $LXMOD failed. It is needed only for
++ * non-symlink WSL reparse points and wsl_to_fattr()
++ * handle the case when EA is missing.
++ */
++ rc = 0;
++ }
++ }
++
+ /*
+ * For WSL CHR and BLK reparse points it is required to fetch
+ * EA $LXDEV which contains major and minor device numbers.
+ */
+ if (!rc && data->reparse_point) {
+ struct smb2_file_full_ea_info *ea;
++ u32 next = 0;
+
+ ea = (struct smb2_file_full_ea_info *)data->wsl.eas;
++ do {
++ ea = (void *)((u8 *)ea + next);
++ next = le32_to_cpu(ea->next_entry_offset);
++ } while (next);
++ if (le16_to_cpu(ea->ea_value_length)) {
++ ea->next_entry_offset = cpu_to_le32(ALIGN(sizeof(*ea) +
++ ea->ea_name_length + 1 +
++ le16_to_cpu(ea->ea_value_length), 4));
++ ea = (void *)((u8 *)ea + le32_to_cpu(ea->next_entry_offset));
++ }
++
+ rc = CIFSSMBQAllEAs(xid, tcon, full_path, SMB2_WSL_XATTR_DEV,
+ &ea->ea_data[SMB2_WSL_XATTR_NAME_LEN + 1],
+ SMB2_WSL_XATTR_DEV_SIZE, cifs_sb);
+@@ -668,8 +726,8 @@ static int cifs_query_path_info(const unsigned int xid,
+ ea->ea_name_length = SMB2_WSL_XATTR_NAME_LEN;
+ ea->ea_value_length = cpu_to_le16(SMB2_WSL_XATTR_DEV_SIZE);
+ memcpy(&ea->ea_data[0], SMB2_WSL_XATTR_DEV, SMB2_WSL_XATTR_NAME_LEN + 1);
+- data->wsl.eas_len = sizeof(*ea) + SMB2_WSL_XATTR_NAME_LEN + 1 +
+- SMB2_WSL_XATTR_DEV_SIZE;
++ data->wsl.eas_len += ALIGN(sizeof(*ea) + SMB2_WSL_XATTR_NAME_LEN + 1 +
++ SMB2_WSL_XATTR_MODE_SIZE, 4);
+ rc = 0;
+ } else if (rc >= 0) {
+ /* It is an error if EA $LXDEV has wrong size. */
+diff --git a/fs/smb/client/smb2inode.c b/fs/smb/client/smb2inode.c
+index 0985db9f86e510..e441fa2e768979 100644
+--- a/fs/smb/client/smb2inode.c
++++ b/fs/smb/client/smb2inode.c
+@@ -1382,31 +1382,33 @@ int
+ smb2_set_file_info(struct inode *inode, const char *full_path,
+ FILE_BASIC_INFO *buf, const unsigned int xid)
+ {
+- struct cifs_open_parms oparms;
++ struct kvec in_iov = { .iov_base = buf, .iov_len = sizeof(*buf), };
+ struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb);
++ struct cifsFileInfo *cfile = NULL;
++ struct cifs_open_parms oparms;
+ struct tcon_link *tlink;
+ struct cifs_tcon *tcon;
+- struct cifsFileInfo *cfile;
+- struct kvec in_iov = { .iov_base = buf, .iov_len = sizeof(*buf), };
+- int rc;
+-
+- if ((buf->CreationTime == 0) && (buf->LastAccessTime == 0) &&
+- (buf->LastWriteTime == 0) && (buf->ChangeTime == 0) &&
+- (buf->Attributes == 0))
+- return 0; /* would be a no op, no sense sending this */
++ int rc = 0;
+
+ tlink = cifs_sb_tlink(cifs_sb);
+ if (IS_ERR(tlink))
+ return PTR_ERR(tlink);
+ tcon = tlink_tcon(tlink);
+
+- cifs_get_writable_path(tcon, full_path, FIND_WR_ANY, &cfile);
++ if ((buf->CreationTime == 0) && (buf->LastAccessTime == 0) &&
++ (buf->LastWriteTime == 0) && (buf->ChangeTime == 0)) {
++ if (buf->Attributes == 0)
++ goto out; /* would be a no op, no sense sending this */
++ cifs_get_writable_path(tcon, full_path, FIND_WR_ANY, &cfile);
++ }
++
+ oparms = CIFS_OPARMS(cifs_sb, tcon, full_path, FILE_WRITE_ATTRIBUTES,
+ FILE_OPEN, 0, ACL_NO_MODE);
+ rc = smb2_compound_op(xid, tcon, cifs_sb,
+ full_path, &oparms, &in_iov,
+ &(int){SMB2_OP_SET_INFO}, 1,
+ cfile, NULL, NULL, NULL);
++out:
+ cifs_put_tlink(tlink);
+ return rc;
+ }
+diff --git a/fs/smb/client/smb2ops.c b/fs/smb/client/smb2ops.c
+index 68286673afc999..328fdeecae29a6 100644
+--- a/fs/smb/client/smb2ops.c
++++ b/fs/smb/client/smb2ops.c
+@@ -4653,7 +4653,7 @@ handle_read_data(struct TCP_Server_Info *server, struct mid_q_entry *mid,
+ unsigned int pad_len;
+ struct cifs_io_subrequest *rdata = mid->callback_data;
+ struct smb2_hdr *shdr = (struct smb2_hdr *)buf;
+- int length;
++ size_t copied;
+ bool use_rdma_mr = false;
+
+ if (shdr->Command != SMB2_READ) {
+@@ -4766,10 +4766,10 @@ handle_read_data(struct TCP_Server_Info *server, struct mid_q_entry *mid,
+ } else if (buf_len >= data_offset + data_len) {
+ /* read response payload is in buf */
+ WARN_ONCE(buffer, "read data can be either in buf or in buffer");
+- length = copy_to_iter(buf + data_offset, data_len, &rdata->subreq.io_iter);
+- if (length < 0)
+- return length;
+- rdata->got_bytes = data_len;
++ copied = copy_to_iter(buf + data_offset, data_len, &rdata->subreq.io_iter);
++ if (copied == 0)
++ return -EIO;
++ rdata->got_bytes = copied;
+ } else {
+ /* read response payload cannot be in both buf and pages */
+ WARN_ONCE(1, "buf can not contain only a part of read data");
+diff --git a/fs/squashfs/inode.c b/fs/squashfs/inode.c
+index 53104f25de5116..f5dcb8353f862f 100644
+--- a/fs/squashfs/inode.c
++++ b/fs/squashfs/inode.c
+@@ -140,8 +140,17 @@ int squashfs_read_inode(struct inode *inode, long long ino)
+ if (err < 0)
+ goto failed_read;
+
++ inode->i_size = le32_to_cpu(sqsh_ino->file_size);
+ frag = le32_to_cpu(sqsh_ino->fragment);
+ if (frag != SQUASHFS_INVALID_FRAG) {
++ /*
++ * the file cannot have a fragment (tailend) and have a
++ * file size a multiple of the block size
++ */
++ if ((inode->i_size & (msblk->block_size - 1)) == 0) {
++ err = -EINVAL;
++ goto failed_read;
++ }
+ frag_offset = le32_to_cpu(sqsh_ino->offset);
+ frag_size = squashfs_frag_lookup(sb, frag, &frag_blk);
+ if (frag_size < 0) {
+@@ -155,7 +164,6 @@ int squashfs_read_inode(struct inode *inode, long long ino)
+ }
+
+ set_nlink(inode, 1);
+- inode->i_size = le32_to_cpu(sqsh_ino->file_size);
+ inode->i_fop = &generic_ro_fops;
+ inode->i_mode |= S_IFREG;
+ inode->i_blocks = ((inode->i_size - 1) >> 9) + 1;
+@@ -184,8 +192,21 @@ int squashfs_read_inode(struct inode *inode, long long ino)
+ if (err < 0)
+ goto failed_read;
+
++ inode->i_size = le64_to_cpu(sqsh_ino->file_size);
++ if (inode->i_size < 0) {
++ err = -EINVAL;
++ goto failed_read;
++ }
+ frag = le32_to_cpu(sqsh_ino->fragment);
+ if (frag != SQUASHFS_INVALID_FRAG) {
++ /*
++ * the file cannot have a fragment (tailend) and have a
++ * file size a multiple of the block size
++ */
++ if ((inode->i_size & (msblk->block_size - 1)) == 0) {
++ err = -EINVAL;
++ goto failed_read;
++ }
+ frag_offset = le32_to_cpu(sqsh_ino->offset);
+ frag_size = squashfs_frag_lookup(sb, frag, &frag_blk);
+ if (frag_size < 0) {
+@@ -200,7 +221,6 @@ int squashfs_read_inode(struct inode *inode, long long ino)
+
+ xattr_id = le32_to_cpu(sqsh_ino->xattr);
+ set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
+- inode->i_size = le64_to_cpu(sqsh_ino->file_size);
+ inode->i_op = &squashfs_inode_ops;
+ inode->i_fop = &generic_ro_fops;
+ inode->i_mode |= S_IFREG;
+diff --git a/fs/xfs/scrub/reap.c b/fs/xfs/scrub/reap.c
+index 8703897c0a9ccb..86d3d104b8d950 100644
+--- a/fs/xfs/scrub/reap.c
++++ b/fs/xfs/scrub/reap.c
+@@ -416,8 +416,6 @@ xreap_agextent_iter(
+ trace_xreap_dispose_unmap_extent(pag_group(sc->sa.pag), agbno,
+ *aglenp);
+
+- rs->force_roll = true;
+-
+ if (rs->oinfo == &XFS_RMAP_OINFO_COW) {
+ /*
+ * If we're unmapping CoW staging extents, remove the
+@@ -426,11 +424,14 @@ xreap_agextent_iter(
+ */
+ xfs_refcount_free_cow_extent(sc->tp, false, fsbno,
+ *aglenp);
++ rs->force_roll = true;
+ return 0;
+ }
+
+- return xfs_rmap_free(sc->tp, sc->sa.agf_bp, sc->sa.pag, agbno,
+- *aglenp, rs->oinfo);
++ xfs_rmap_free_extent(sc->tp, false, fsbno, *aglenp,
++ rs->oinfo->oi_owner);
++ rs->deferred++;
++ return 0;
+ }
+
+ trace_xreap_dispose_free_extent(pag_group(sc->sa.pag), agbno, *aglenp);
+diff --git a/include/acpi/acpixf.h b/include/acpi/acpixf.h
+index b49396aa405812..97c25ae8a36e3d 100644
+--- a/include/acpi/acpixf.h
++++ b/include/acpi/acpixf.h
+@@ -213,6 +213,12 @@ ACPI_INIT_GLOBAL(u8, acpi_gbl_osi_data, 0);
+ */
+ ACPI_INIT_GLOBAL(u8, acpi_gbl_reduced_hardware, FALSE);
+
++/*
++ * ACPI Global Lock is mainly used for systems with SMM, so no-SMM systems
++ * (such as loong_arch) may not have and not use Global Lock.
++ */
++ACPI_INIT_GLOBAL(u8, acpi_gbl_use_global_lock, TRUE);
++
+ /*
+ * Maximum timeout for While() loop iterations before forced method abort.
+ * This mechanism is intended to prevent infinite loops during interpreter
+diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h
+index 11abad6c87e155..ca5a1ce6f0f891 100644
+--- a/include/asm-generic/io.h
++++ b/include/asm-generic/io.h
+@@ -75,6 +75,7 @@
+ #if IS_ENABLED(CONFIG_TRACE_MMIO_ACCESS) && !(defined(__DISABLE_TRACE_MMIO__))
+ #include <linux/tracepoint-defs.h>
+
++#define rwmmio_tracepoint_enabled(tracepoint) tracepoint_enabled(tracepoint)
+ DECLARE_TRACEPOINT(rwmmio_write);
+ DECLARE_TRACEPOINT(rwmmio_post_write);
+ DECLARE_TRACEPOINT(rwmmio_read);
+@@ -91,6 +92,7 @@ void log_post_read_mmio(u64 val, u8 width, const volatile void __iomem *addr,
+
+ #else
+
++#define rwmmio_tracepoint_enabled(tracepoint) false
+ static inline void log_write_mmio(u64 val, u8 width, volatile void __iomem *addr,
+ unsigned long caller_addr, unsigned long caller_addr0) {}
+ static inline void log_post_write_mmio(u64 val, u8 width, volatile void __iomem *addr,
+@@ -189,11 +191,13 @@ static inline u8 readb(const volatile void __iomem *addr)
+ {
+ u8 val;
+
+- log_read_mmio(8, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_read))
++ log_read_mmio(8, addr, _THIS_IP_, _RET_IP_);
+ __io_br();
+ val = __raw_readb(addr);
+ __io_ar(val);
+- log_post_read_mmio(val, 8, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_post_read))
++ log_post_read_mmio(val, 8, addr, _THIS_IP_, _RET_IP_);
+ return val;
+ }
+ #endif
+@@ -204,11 +208,13 @@ static inline u16 readw(const volatile void __iomem *addr)
+ {
+ u16 val;
+
+- log_read_mmio(16, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_read))
++ log_read_mmio(16, addr, _THIS_IP_, _RET_IP_);
+ __io_br();
+ val = __le16_to_cpu((__le16 __force)__raw_readw(addr));
+ __io_ar(val);
+- log_post_read_mmio(val, 16, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_post_read))
++ log_post_read_mmio(val, 16, addr, _THIS_IP_, _RET_IP_);
+ return val;
+ }
+ #endif
+@@ -219,11 +225,13 @@ static inline u32 readl(const volatile void __iomem *addr)
+ {
+ u32 val;
+
+- log_read_mmio(32, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_read))
++ log_read_mmio(32, addr, _THIS_IP_, _RET_IP_);
+ __io_br();
+ val = __le32_to_cpu((__le32 __force)__raw_readl(addr));
+ __io_ar(val);
+- log_post_read_mmio(val, 32, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_post_read))
++ log_post_read_mmio(val, 32, addr, _THIS_IP_, _RET_IP_);
+ return val;
+ }
+ #endif
+@@ -235,11 +243,13 @@ static inline u64 readq(const volatile void __iomem *addr)
+ {
+ u64 val;
+
+- log_read_mmio(64, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_read))
++ log_read_mmio(64, addr, _THIS_IP_, _RET_IP_);
+ __io_br();
+ val = __le64_to_cpu((__le64 __force)__raw_readq(addr));
+ __io_ar(val);
+- log_post_read_mmio(val, 64, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_post_read))
++ log_post_read_mmio(val, 64, addr, _THIS_IP_, _RET_IP_);
+ return val;
+ }
+ #endif
+@@ -249,11 +259,13 @@ static inline u64 readq(const volatile void __iomem *addr)
+ #define writeb writeb
+ static inline void writeb(u8 value, volatile void __iomem *addr)
+ {
+- log_write_mmio(value, 8, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_write))
++ log_write_mmio(value, 8, addr, _THIS_IP_, _RET_IP_);
+ __io_bw();
+ __raw_writeb(value, addr);
+ __io_aw();
+- log_post_write_mmio(value, 8, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_post_write))
++ log_post_write_mmio(value, 8, addr, _THIS_IP_, _RET_IP_);
+ }
+ #endif
+
+@@ -261,11 +273,13 @@ static inline void writeb(u8 value, volatile void __iomem *addr)
+ #define writew writew
+ static inline void writew(u16 value, volatile void __iomem *addr)
+ {
+- log_write_mmio(value, 16, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_write))
++ log_write_mmio(value, 16, addr, _THIS_IP_, _RET_IP_);
+ __io_bw();
+ __raw_writew((u16 __force)cpu_to_le16(value), addr);
+ __io_aw();
+- log_post_write_mmio(value, 16, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_post_write))
++ log_post_write_mmio(value, 16, addr, _THIS_IP_, _RET_IP_);
+ }
+ #endif
+
+@@ -273,11 +287,13 @@ static inline void writew(u16 value, volatile void __iomem *addr)
+ #define writel writel
+ static inline void writel(u32 value, volatile void __iomem *addr)
+ {
+- log_write_mmio(value, 32, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_write))
++ log_write_mmio(value, 32, addr, _THIS_IP_, _RET_IP_);
+ __io_bw();
+ __raw_writel((u32 __force)__cpu_to_le32(value), addr);
+ __io_aw();
+- log_post_write_mmio(value, 32, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_post_write))
++ log_post_write_mmio(value, 32, addr, _THIS_IP_, _RET_IP_);
+ }
+ #endif
+
+@@ -286,11 +302,13 @@ static inline void writel(u32 value, volatile void __iomem *addr)
+ #define writeq writeq
+ static inline void writeq(u64 value, volatile void __iomem *addr)
+ {
+- log_write_mmio(value, 64, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_write))
++ log_write_mmio(value, 64, addr, _THIS_IP_, _RET_IP_);
+ __io_bw();
+ __raw_writeq((u64 __force)__cpu_to_le64(value), addr);
+ __io_aw();
+- log_post_write_mmio(value, 64, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_post_write))
++ log_post_write_mmio(value, 64, addr, _THIS_IP_, _RET_IP_);
+ }
+ #endif
+ #endif /* CONFIG_64BIT */
+@@ -306,9 +324,11 @@ static inline u8 readb_relaxed(const volatile void __iomem *addr)
+ {
+ u8 val;
+
+- log_read_mmio(8, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_read))
++ log_read_mmio(8, addr, _THIS_IP_, _RET_IP_);
+ val = __raw_readb(addr);
+- log_post_read_mmio(val, 8, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_post_read))
++ log_post_read_mmio(val, 8, addr, _THIS_IP_, _RET_IP_);
+ return val;
+ }
+ #endif
+@@ -319,9 +339,11 @@ static inline u16 readw_relaxed(const volatile void __iomem *addr)
+ {
+ u16 val;
+
+- log_read_mmio(16, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_read))
++ log_read_mmio(16, addr, _THIS_IP_, _RET_IP_);
+ val = __le16_to_cpu((__le16 __force)__raw_readw(addr));
+- log_post_read_mmio(val, 16, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_post_read))
++ log_post_read_mmio(val, 16, addr, _THIS_IP_, _RET_IP_);
+ return val;
+ }
+ #endif
+@@ -332,9 +354,11 @@ static inline u32 readl_relaxed(const volatile void __iomem *addr)
+ {
+ u32 val;
+
+- log_read_mmio(32, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_read))
++ log_read_mmio(32, addr, _THIS_IP_, _RET_IP_);
+ val = __le32_to_cpu((__le32 __force)__raw_readl(addr));
+- log_post_read_mmio(val, 32, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_post_read))
++ log_post_read_mmio(val, 32, addr, _THIS_IP_, _RET_IP_);
+ return val;
+ }
+ #endif
+@@ -345,9 +369,11 @@ static inline u64 readq_relaxed(const volatile void __iomem *addr)
+ {
+ u64 val;
+
+- log_read_mmio(64, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_read))
++ log_read_mmio(64, addr, _THIS_IP_, _RET_IP_);
+ val = __le64_to_cpu((__le64 __force)__raw_readq(addr));
+- log_post_read_mmio(val, 64, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_post_read))
++ log_post_read_mmio(val, 64, addr, _THIS_IP_, _RET_IP_);
+ return val;
+ }
+ #endif
+@@ -356,9 +382,11 @@ static inline u64 readq_relaxed(const volatile void __iomem *addr)
+ #define writeb_relaxed writeb_relaxed
+ static inline void writeb_relaxed(u8 value, volatile void __iomem *addr)
+ {
+- log_write_mmio(value, 8, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_write))
++ log_write_mmio(value, 8, addr, _THIS_IP_, _RET_IP_);
+ __raw_writeb(value, addr);
+- log_post_write_mmio(value, 8, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_post_write))
++ log_post_write_mmio(value, 8, addr, _THIS_IP_, _RET_IP_);
+ }
+ #endif
+
+@@ -366,9 +394,11 @@ static inline void writeb_relaxed(u8 value, volatile void __iomem *addr)
+ #define writew_relaxed writew_relaxed
+ static inline void writew_relaxed(u16 value, volatile void __iomem *addr)
+ {
+- log_write_mmio(value, 16, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_write))
++ log_write_mmio(value, 16, addr, _THIS_IP_, _RET_IP_);
+ __raw_writew((u16 __force)cpu_to_le16(value), addr);
+- log_post_write_mmio(value, 16, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_post_write))
++ log_post_write_mmio(value, 16, addr, _THIS_IP_, _RET_IP_);
+ }
+ #endif
+
+@@ -376,9 +406,11 @@ static inline void writew_relaxed(u16 value, volatile void __iomem *addr)
+ #define writel_relaxed writel_relaxed
+ static inline void writel_relaxed(u32 value, volatile void __iomem *addr)
+ {
+- log_write_mmio(value, 32, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_write))
++ log_write_mmio(value, 32, addr, _THIS_IP_, _RET_IP_);
+ __raw_writel((u32 __force)__cpu_to_le32(value), addr);
+- log_post_write_mmio(value, 32, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_post_write))
++ log_post_write_mmio(value, 32, addr, _THIS_IP_, _RET_IP_);
+ }
+ #endif
+
+@@ -386,9 +418,11 @@ static inline void writel_relaxed(u32 value, volatile void __iomem *addr)
+ #define writeq_relaxed writeq_relaxed
+ static inline void writeq_relaxed(u64 value, volatile void __iomem *addr)
+ {
+- log_write_mmio(value, 64, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_write))
++ log_write_mmio(value, 64, addr, _THIS_IP_, _RET_IP_);
+ __raw_writeq((u64 __force)__cpu_to_le64(value), addr);
+- log_post_write_mmio(value, 64, addr, _THIS_IP_, _RET_IP_);
++ if (rwmmio_tracepoint_enabled(rwmmio_post_write))
++ log_post_write_mmio(value, 64, addr, _THIS_IP_, _RET_IP_);
+ }
+ #endif
+
+diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
+index 8efbe8c4874ee8..d61a02fce72742 100644
+--- a/include/asm-generic/vmlinux.lds.h
++++ b/include/asm-generic/vmlinux.lds.h
+@@ -832,6 +832,7 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG)
+
+ /* Required sections not related to debugging. */
+ #define ELF_DETAILS \
++ .modinfo : { *(.modinfo) } \
+ .comment 0 : { *(.comment) } \
+ .symtab 0 : { *(.symtab) } \
+ .strtab 0 : { *(.strtab) } \
+@@ -1045,7 +1046,6 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG)
+ *(.discard.*) \
+ *(.export_symbol) \
+ *(.no_trim_symbol) \
+- *(.modinfo) \
+ /* ld.bfd warns about .gnu.version* even when not emitted */ \
+ *(.gnu.version*) \
+
+diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
+index 95f3807c8c551d..5ed8781f3905e5 100644
+--- a/include/linux/cpufreq.h
++++ b/include/linux/cpufreq.h
+@@ -32,6 +32,9 @@
+ */
+
+ #define CPUFREQ_ETERNAL (-1)
++
++#define CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS NSEC_PER_MSEC
++
+ #define CPUFREQ_NAME_LEN 16
+ /* Print length for names. Extra 1 space for accommodating '\n' in prints */
+ #define CPUFREQ_NAME_PLEN (CPUFREQ_NAME_LEN + 1)
+diff --git a/include/linux/fs.h b/include/linux/fs.h
+index 601d036a6c78ef..ed027152610369 100644
+--- a/include/linux/fs.h
++++ b/include/linux/fs.h
+@@ -238,6 +238,7 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
+ #define ATTR_ATIME_SET (1 << 7)
+ #define ATTR_MTIME_SET (1 << 8)
+ #define ATTR_FORCE (1 << 9) /* Not a change, but a change it */
++#define ATTR_CTIME_SET (1 << 10)
+ #define ATTR_KILL_SUID (1 << 11)
+ #define ATTR_KILL_SGID (1 << 12)
+ #define ATTR_FILE (1 << 13)
+@@ -4024,4 +4025,18 @@ static inline bool vfs_empty_path(int dfd, const char __user *path)
+
+ int generic_atomic_write_valid(struct kiocb *iocb, struct iov_iter *iter);
+
++static inline bool extensible_ioctl_valid(unsigned int cmd_a,
++ unsigned int cmd_b, size_t min_size)
++{
++ if (_IOC_DIR(cmd_a) != _IOC_DIR(cmd_b))
++ return false;
++ if (_IOC_TYPE(cmd_a) != _IOC_TYPE(cmd_b))
++ return false;
++ if (_IOC_NR(cmd_a) != _IOC_NR(cmd_b))
++ return false;
++ if (_IOC_SIZE(cmd_a) < min_size)
++ return false;
++ return true;
++}
++
+ #endif /* _LINUX_FS_H */
+diff --git a/include/linux/iio/frequency/adf4350.h b/include/linux/iio/frequency/adf4350.h
+index de45cf2ee1e4f8..ce2086f97e3fcf 100644
+--- a/include/linux/iio/frequency/adf4350.h
++++ b/include/linux/iio/frequency/adf4350.h
+@@ -51,7 +51,7 @@
+
+ /* REG3 Bit Definitions */
+ #define ADF4350_REG3_12BIT_CLKDIV(x) ((x) << 3)
+-#define ADF4350_REG3_12BIT_CLKDIV_MODE(x) ((x) << 16)
++#define ADF4350_REG3_12BIT_CLKDIV_MODE(x) ((x) << 15)
+ #define ADF4350_REG3_12BIT_CSR_EN (1 << 18)
+ #define ADF4351_REG3_CHARGE_CANCELLATION_EN (1 << 21)
+ #define ADF4351_REG3_ANTI_BACKLASH_3ns_EN (1 << 22)
+diff --git a/include/linux/ksm.h b/include/linux/ksm.h
+index c17b955e7b0b0e..8d61d045429369 100644
+--- a/include/linux/ksm.h
++++ b/include/linux/ksm.h
+@@ -56,8 +56,14 @@ static inline long mm_ksm_zero_pages(struct mm_struct *mm)
+ static inline void ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm)
+ {
+ /* Adding mm to ksm is best effort on fork. */
+- if (test_bit(MMF_VM_MERGEABLE, &oldmm->flags))
++ if (test_bit(MMF_VM_MERGEABLE, &oldmm->flags)) {
++ long nr_ksm_zero_pages = atomic_long_read(&mm->ksm_zero_pages);
++
++ mm->ksm_merging_pages = 0;
++ mm->ksm_rmap_items = 0;
++ atomic_long_add(nr_ksm_zero_pages, &ksm_zero_pages);
+ __ksm_enter(mm);
++ }
+ }
+
+ static inline int ksm_execve(struct mm_struct *mm)
+diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
+index 25921fbec68566..d3ebc9b73c6fb3 100644
+--- a/include/linux/memcontrol.h
++++ b/include/linux/memcontrol.h
+@@ -987,22 +987,28 @@ static inline void count_memcg_event_mm(struct mm_struct *mm,
+ count_memcg_events_mm(mm, idx, 1);
+ }
+
+-static inline void memcg_memory_event(struct mem_cgroup *memcg,
+- enum memcg_memory_event event)
++static inline void __memcg_memory_event(struct mem_cgroup *memcg,
++ enum memcg_memory_event event,
++ bool allow_spinning)
+ {
+ bool swap_event = event == MEMCG_SWAP_HIGH || event == MEMCG_SWAP_MAX ||
+ event == MEMCG_SWAP_FAIL;
+
++ /* For now only MEMCG_MAX can happen with !allow_spinning context. */
++ VM_WARN_ON_ONCE(!allow_spinning && event != MEMCG_MAX);
++
+ atomic_long_inc(&memcg->memory_events_local[event]);
+- if (!swap_event)
++ if (!swap_event && allow_spinning)
+ cgroup_file_notify(&memcg->events_local_file);
+
+ do {
+ atomic_long_inc(&memcg->memory_events[event]);
+- if (swap_event)
+- cgroup_file_notify(&memcg->swap_events_file);
+- else
+- cgroup_file_notify(&memcg->events_file);
++ if (allow_spinning) {
++ if (swap_event)
++ cgroup_file_notify(&memcg->swap_events_file);
++ else
++ cgroup_file_notify(&memcg->events_file);
++ }
+
+ if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
+ break;
+@@ -1012,6 +1018,12 @@ static inline void memcg_memory_event(struct mem_cgroup *memcg,
+ !mem_cgroup_is_root(memcg));
+ }
+
++static inline void memcg_memory_event(struct mem_cgroup *memcg,
++ enum memcg_memory_event event)
++{
++ __memcg_memory_event(memcg, event, true);
++}
++
+ static inline void memcg_memory_event_mm(struct mm_struct *mm,
+ enum memcg_memory_event event)
+ {
+diff --git a/include/linux/mm.h b/include/linux/mm.h
+index c6794d0e24eb6c..f23dd28f193ffe 100644
+--- a/include/linux/mm.h
++++ b/include/linux/mm.h
+@@ -4159,14 +4159,13 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
+ * since this value becomes part of PP_SIGNATURE; meaning we can just use the
+ * space between the PP_SIGNATURE value (without POISON_POINTER_DELTA), and the
+ * lowest bits of POISON_POINTER_DELTA. On arches where POISON_POINTER_DELTA is
+- * 0, we make sure that we leave the two topmost bits empty, as that guarantees
+- * we won't mistake a valid kernel pointer for a value we set, regardless of the
+- * VMSPLIT setting.
++ * 0, we use the lowest bit of PAGE_OFFSET as the boundary if that value is
++ * known at compile-time.
+ *
+- * Altogether, this means that the number of bits available is constrained by
+- * the size of an unsigned long (at the upper end, subtracting two bits per the
+- * above), and the definition of PP_SIGNATURE (with or without
+- * POISON_POINTER_DELTA).
++ * If the value of PAGE_OFFSET is not known at compile time, or if it is too
++ * small to leave at least 8 bits available above PP_SIGNATURE, we define the
++ * number of bits to be 0, which turns off the DMA index tracking altogether
++ * (see page_pool_register_dma_index()).
+ */
+ #define PP_DMA_INDEX_SHIFT (1 + __fls(PP_SIGNATURE - POISON_POINTER_DELTA))
+ #if POISON_POINTER_DELTA > 0
+@@ -4175,8 +4174,13 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
+ */
+ #define PP_DMA_INDEX_BITS MIN(32, __ffs(POISON_POINTER_DELTA) - PP_DMA_INDEX_SHIFT)
+ #else
+-/* Always leave out the topmost two; see above. */
+-#define PP_DMA_INDEX_BITS MIN(32, BITS_PER_LONG - PP_DMA_INDEX_SHIFT - 2)
++/* Use the lowest bit of PAGE_OFFSET if there's at least 8 bits available; see above */
++#define PP_DMA_INDEX_MIN_OFFSET (1 << (PP_DMA_INDEX_SHIFT + 8))
++#define PP_DMA_INDEX_BITS ((__builtin_constant_p(PAGE_OFFSET) && \
++ PAGE_OFFSET >= PP_DMA_INDEX_MIN_OFFSET && \
++ !(PAGE_OFFSET & (PP_DMA_INDEX_MIN_OFFSET - 1))) ? \
++ MIN(32, __ffs(PAGE_OFFSET) - PP_DMA_INDEX_SHIFT) : 0)
++
+ #endif
+
+ #define PP_DMA_INDEX_MASK GENMASK(PP_DMA_INDEX_BITS + PP_DMA_INDEX_SHIFT - 1, \
+diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h
+index d88d6b6ccf5b20..d1ff76e0e2d077 100644
+--- a/include/linux/pm_runtime.h
++++ b/include/linux/pm_runtime.h
+@@ -350,13 +350,12 @@ static inline int pm_runtime_force_resume(struct device *dev) { return -ENXIO; }
+ * * 0: Success.
+ * * -EINVAL: Runtime PM error.
+ * * -EACCES: Runtime PM disabled.
+- * * -EAGAIN: Runtime PM usage_count non-zero, Runtime PM status change ongoing
+- * or device not in %RPM_ACTIVE state.
++ * * -EAGAIN: Runtime PM usage counter non-zero, Runtime PM status change
++ * ongoing or device not in %RPM_ACTIVE state.
+ * * -EBUSY: Runtime PM child_count non-zero.
+ * * -EPERM: Device PM QoS resume latency 0.
+ * * -EINPROGRESS: Suspend already in progress.
+ * * -ENOSYS: CONFIG_PM not enabled.
+- * * 1: Device already suspended.
+ * Other values and conditions for the above values are possible as returned by
+ * Runtime PM idle and suspend callbacks.
+ */
+@@ -370,14 +369,15 @@ static inline int pm_runtime_idle(struct device *dev)
+ * @dev: Target device.
+ *
+ * Return:
++ * * 1: Success; device was already suspended.
+ * * 0: Success.
+ * * -EINVAL: Runtime PM error.
+ * * -EACCES: Runtime PM disabled.
+- * * -EAGAIN: Runtime PM usage_count non-zero or Runtime PM status change ongoing.
++ * * -EAGAIN: Runtime PM usage counter non-zero or Runtime PM status change
++ * ongoing.
+ * * -EBUSY: Runtime PM child_count non-zero.
+ * * -EPERM: Device PM QoS resume latency 0.
+ * * -ENOSYS: CONFIG_PM not enabled.
+- * * 1: Device already suspended.
+ * Other values and conditions for the above values are possible as returned by
+ * Runtime PM suspend callbacks.
+ */
+@@ -396,14 +396,15 @@ static inline int pm_runtime_suspend(struct device *dev)
+ * engaging its "idle check" callback.
+ *
+ * Return:
++ * * 1: Success; device was already suspended.
+ * * 0: Success.
+ * * -EINVAL: Runtime PM error.
+ * * -EACCES: Runtime PM disabled.
+- * * -EAGAIN: Runtime PM usage_count non-zero or Runtime PM status change ongoing.
++ * * -EAGAIN: Runtime PM usage counter non-zero or Runtime PM status change
++ * ongoing.
+ * * -EBUSY: Runtime PM child_count non-zero.
+ * * -EPERM: Device PM QoS resume latency 0.
+ * * -ENOSYS: CONFIG_PM not enabled.
+- * * 1: Device already suspended.
+ * Other values and conditions for the above values are possible as returned by
+ * Runtime PM suspend callbacks.
+ */
+@@ -433,13 +434,12 @@ static inline int pm_runtime_resume(struct device *dev)
+ * * 0: Success.
+ * * -EINVAL: Runtime PM error.
+ * * -EACCES: Runtime PM disabled.
+- * * -EAGAIN: Runtime PM usage_count non-zero, Runtime PM status change ongoing
+- * or device not in %RPM_ACTIVE state.
++ * * -EAGAIN: Runtime PM usage counter non-zero, Runtime PM status change
++ * ongoing or device not in %RPM_ACTIVE state.
+ * * -EBUSY: Runtime PM child_count non-zero.
+ * * -EPERM: Device PM QoS resume latency 0.
+ * * -EINPROGRESS: Suspend already in progress.
+ * * -ENOSYS: CONFIG_PM not enabled.
+- * * 1: Device already suspended.
+ */
+ static inline int pm_request_idle(struct device *dev)
+ {
+@@ -464,15 +464,16 @@ static inline int pm_request_resume(struct device *dev)
+ * equivalent pm_runtime_autosuspend() for @dev asynchronously.
+ *
+ * Return:
++ * * 1: Success; device was already suspended.
+ * * 0: Success.
+ * * -EINVAL: Runtime PM error.
+ * * -EACCES: Runtime PM disabled.
+- * * -EAGAIN: Runtime PM usage_count non-zero or Runtime PM status change ongoing.
++ * * -EAGAIN: Runtime PM usage counter non-zero or Runtime PM status change
++ * ongoing.
+ * * -EBUSY: Runtime PM child_count non-zero.
+ * * -EPERM: Device PM QoS resume latency 0.
+ * * -EINPROGRESS: Suspend already in progress.
+ * * -ENOSYS: CONFIG_PM not enabled.
+- * * 1: Device already suspended.
+ */
+ static inline int pm_request_autosuspend(struct device *dev)
+ {
+@@ -540,15 +541,16 @@ static inline int pm_runtime_resume_and_get(struct device *dev)
+ * equal to 0, queue up a work item for @dev like in pm_request_idle().
+ *
+ * Return:
++ * * 1: Success. Usage counter dropped to zero, but device was already suspended.
+ * * 0: Success.
+ * * -EINVAL: Runtime PM error.
+ * * -EACCES: Runtime PM disabled.
+- * * -EAGAIN: Runtime PM usage_count non-zero or Runtime PM status change ongoing.
++ * * -EAGAIN: Runtime PM usage counter became non-zero or Runtime PM status
++ * change ongoing.
+ * * -EBUSY: Runtime PM child_count non-zero.
+ * * -EPERM: Device PM QoS resume latency 0.
+ * * -EINPROGRESS: Suspend already in progress.
+ * * -ENOSYS: CONFIG_PM not enabled.
+- * * 1: Device already suspended.
+ */
+ static inline int pm_runtime_put(struct device *dev)
+ {
+@@ -565,15 +567,16 @@ DEFINE_FREE(pm_runtime_put, struct device *, if (_T) pm_runtime_put(_T))
+ * equal to 0, queue up a work item for @dev like in pm_request_autosuspend().
+ *
+ * Return:
++ * * 1: Success. Usage counter dropped to zero, but device was already suspended.
+ * * 0: Success.
+ * * -EINVAL: Runtime PM error.
+ * * -EACCES: Runtime PM disabled.
+- * * -EAGAIN: Runtime PM usage_count non-zero or Runtime PM status change ongoing.
++ * * -EAGAIN: Runtime PM usage counter became non-zero or Runtime PM status
++ * change ongoing.
+ * * -EBUSY: Runtime PM child_count non-zero.
+ * * -EPERM: Device PM QoS resume latency 0.
+ * * -EINPROGRESS: Suspend already in progress.
+ * * -ENOSYS: CONFIG_PM not enabled.
+- * * 1: Device already suspended.
+ */
+ static inline int __pm_runtime_put_autosuspend(struct device *dev)
+ {
+@@ -590,15 +593,16 @@ static inline int __pm_runtime_put_autosuspend(struct device *dev)
+ * in pm_request_autosuspend().
+ *
+ * Return:
++ * * 1: Success. Usage counter dropped to zero, but device was already suspended.
+ * * 0: Success.
+ * * -EINVAL: Runtime PM error.
+ * * -EACCES: Runtime PM disabled.
+- * * -EAGAIN: Runtime PM usage_count non-zero or Runtime PM status change ongoing.
++ * * -EAGAIN: Runtime PM usage counter became non-zero or Runtime PM status
++ * change ongoing.
+ * * -EBUSY: Runtime PM child_count non-zero.
+ * * -EPERM: Device PM QoS resume latency 0.
+ * * -EINPROGRESS: Suspend already in progress.
+ * * -ENOSYS: CONFIG_PM not enabled.
+- * * 1: Device already suspended.
+ */
+ static inline int pm_runtime_put_autosuspend(struct device *dev)
+ {
+@@ -619,14 +623,15 @@ static inline int pm_runtime_put_autosuspend(struct device *dev)
+ * if it returns an error code.
+ *
+ * Return:
++ * * 1: Success. Usage counter dropped to zero, but device was already suspended.
+ * * 0: Success.
+ * * -EINVAL: Runtime PM error.
+ * * -EACCES: Runtime PM disabled.
+- * * -EAGAIN: Runtime PM usage_count non-zero or Runtime PM status change ongoing.
++ * * -EAGAIN: Runtime PM usage counter became non-zero or Runtime PM status
++ * change ongoing.
+ * * -EBUSY: Runtime PM child_count non-zero.
+ * * -EPERM: Device PM QoS resume latency 0.
+ * * -ENOSYS: CONFIG_PM not enabled.
+- * * 1: Device already suspended.
+ * Other values and conditions for the above values are possible as returned by
+ * Runtime PM suspend callbacks.
+ */
+@@ -646,15 +651,15 @@ static inline int pm_runtime_put_sync(struct device *dev)
+ * if it returns an error code.
+ *
+ * Return:
++ * * 1: Success. Usage counter dropped to zero, but device was already suspended.
+ * * 0: Success.
+ * * -EINVAL: Runtime PM error.
+ * * -EACCES: Runtime PM disabled.
+- * * -EAGAIN: Runtime PM usage_count non-zero or Runtime PM status change ongoing.
+- * * -EAGAIN: usage_count non-zero or Runtime PM status change ongoing.
++ * * -EAGAIN: Runtime PM usage counter became non-zero or Runtime PM status
++ * change ongoing.
+ * * -EBUSY: Runtime PM child_count non-zero.
+ * * -EPERM: Device PM QoS resume latency 0.
+ * * -ENOSYS: CONFIG_PM not enabled.
+- * * 1: Device already suspended.
+ * Other values and conditions for the above values are possible as returned by
+ * Runtime PM suspend callbacks.
+ */
+@@ -677,15 +682,16 @@ static inline int pm_runtime_put_sync_suspend(struct device *dev)
+ * if it returns an error code.
+ *
+ * Return:
++ * * 1: Success. Usage counter dropped to zero, but device was already suspended.
+ * * 0: Success.
+ * * -EINVAL: Runtime PM error.
+ * * -EACCES: Runtime PM disabled.
+- * * -EAGAIN: Runtime PM usage_count non-zero or Runtime PM status change ongoing.
++ * * -EAGAIN: Runtime PM usage counter became non-zero or Runtime PM status
++ * change ongoing.
+ * * -EBUSY: Runtime PM child_count non-zero.
+ * * -EPERM: Device PM QoS resume latency 0.
+ * * -EINPROGRESS: Suspend already in progress.
+ * * -ENOSYS: CONFIG_PM not enabled.
+- * * 1: Device already suspended.
+ * Other values and conditions for the above values are possible as returned by
+ * Runtime PM suspend callbacks.
+ */
+diff --git a/include/linux/rseq.h b/include/linux/rseq.h
+index bc8af3eb559876..1fbeb61babeb8b 100644
+--- a/include/linux/rseq.h
++++ b/include/linux/rseq.h
+@@ -7,6 +7,12 @@
+ #include <linux/preempt.h>
+ #include <linux/sched.h>
+
++#ifdef CONFIG_MEMBARRIER
++# define RSEQ_EVENT_GUARD irq
++#else
++# define RSEQ_EVENT_GUARD preempt
++#endif
++
+ /*
+ * Map the event mask on the user-space ABI enum rseq_cs_flags
+ * for direct mask checks.
+@@ -41,9 +47,8 @@ static inline void rseq_handle_notify_resume(struct ksignal *ksig,
+ static inline void rseq_signal_deliver(struct ksignal *ksig,
+ struct pt_regs *regs)
+ {
+- preempt_disable();
+- __set_bit(RSEQ_EVENT_SIGNAL_BIT, &current->rseq_event_mask);
+- preempt_enable();
++ scoped_guard(RSEQ_EVENT_GUARD)
++ __set_bit(RSEQ_EVENT_SIGNAL_BIT, &current->rseq_event_mask);
+ rseq_handle_notify_resume(ksig, regs);
+ }
+
+diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
+index 369a89aea18618..2b886f7eb29562 100644
+--- a/include/linux/sunrpc/svc_xprt.h
++++ b/include/linux/sunrpc/svc_xprt.h
+@@ -104,6 +104,9 @@ enum {
+ * it has access to. It is NOT counted
+ * in ->sv_tmpcnt.
+ */
++ XPT_RPCB_UNREG, /* transport that needs unregistering
++ * with rpcbind (TCP, UDP) on destroy
++ */
+ };
+
+ /*
+diff --git a/include/media/v4l2-subdev.h b/include/media/v4l2-subdev.h
+index 5dcf4065708f32..398b574616771b 100644
+--- a/include/media/v4l2-subdev.h
++++ b/include/media/v4l2-subdev.h
+@@ -1962,19 +1962,23 @@ extern const struct v4l2_subdev_ops v4l2_subdev_call_wrappers;
+ *
+ * Note: only legacy non-MC drivers may need this macro.
+ */
+-#define v4l2_subdev_call_state_try(sd, o, f, args...) \
+- ({ \
+- int __result; \
+- static struct lock_class_key __key; \
+- const char *name = KBUILD_BASENAME \
+- ":" __stringify(__LINE__) ":state->lock"; \
+- struct v4l2_subdev_state *state = \
+- __v4l2_subdev_state_alloc(sd, name, &__key); \
+- v4l2_subdev_lock_state(state); \
+- __result = v4l2_subdev_call(sd, o, f, state, ##args); \
+- v4l2_subdev_unlock_state(state); \
+- __v4l2_subdev_state_free(state); \
+- __result; \
++#define v4l2_subdev_call_state_try(sd, o, f, args...) \
++ ({ \
++ int __result; \
++ static struct lock_class_key __key; \
++ const char *name = KBUILD_BASENAME \
++ ":" __stringify(__LINE__) ":state->lock"; \
++ struct v4l2_subdev_state *state = \
++ __v4l2_subdev_state_alloc(sd, name, &__key); \
++ if (IS_ERR(state)) { \
++ __result = PTR_ERR(state); \
++ } else { \
++ v4l2_subdev_lock_state(state); \
++ __result = v4l2_subdev_call(sd, o, f, state, ##args); \
++ v4l2_subdev_unlock_state(state); \
++ __v4l2_subdev_state_free(state); \
++ } \
++ __result; \
+ })
+
+ /**
+diff --git a/include/trace/events/dma.h b/include/trace/events/dma.h
+index d8ddc27b6a7c8a..945fcbaae77e9d 100644
+--- a/include/trace/events/dma.h
++++ b/include/trace/events/dma.h
+@@ -134,6 +134,7 @@ DECLARE_EVENT_CLASS(dma_alloc_class,
+ __entry->dma_addr = dma_addr;
+ __entry->size = size;
+ __entry->flags = flags;
++ __entry->dir = dir;
+ __entry->attrs = attrs;
+ ),
+
+diff --git a/init/main.c b/init/main.c
+index 5753e9539ae6fb..40d25d4f30999a 100644
+--- a/init/main.c
++++ b/init/main.c
+@@ -544,6 +544,12 @@ static int __init unknown_bootoption(char *param, char *val,
+ const char *unused, void *arg)
+ {
+ size_t len = strlen(param);
++ /*
++ * Well-known bootloader identifiers:
++ * 1. LILO/Grub pass "BOOT_IMAGE=...";
++ * 2. kexec/kdump (kexec-tools) pass "kexec".
++ */
++ const char *bootloader[] = { "BOOT_IMAGE=", "kexec", NULL };
+
+ /* Handle params aliased to sysctls */
+ if (sysctl_is_alias(param))
+@@ -551,6 +557,12 @@ static int __init unknown_bootoption(char *param, char *val,
+
+ repair_env_string(param, val);
+
++ /* Handle bootloader identifier */
++ for (int i = 0; bootloader[i]; i++) {
++ if (strstarts(param, bootloader[i]))
++ return 0;
++ }
++
+ /* Handle obsolete-style parameters */
+ if (obsolete_checksetup(param))
+ return 0;
+diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
+index 643a69f9ffe2ae..2035c77a163575 100644
+--- a/io_uring/zcrx.c
++++ b/io_uring/zcrx.c
+@@ -993,6 +993,7 @@ static ssize_t io_copy_page(struct io_copy_cache *cc, struct page *src_page,
+
+ cc->size -= n;
+ cc->offset += n;
++ src_offset += n;
+ len -= n;
+ copied += n;
+ }
+diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
+index 5c2e96b19392ae..1a31c87234877a 100644
+--- a/kernel/bpf/inode.c
++++ b/kernel/bpf/inode.c
+@@ -775,7 +775,7 @@ static int bpf_show_options(struct seq_file *m, struct dentry *root)
+ return 0;
+ }
+
+-static void bpf_free_inode(struct inode *inode)
++static void bpf_destroy_inode(struct inode *inode)
+ {
+ enum bpf_type type;
+
+@@ -790,7 +790,7 @@ const struct super_operations bpf_super_ops = {
+ .statfs = simple_statfs,
+ .drop_inode = generic_delete_inode,
+ .show_options = bpf_show_options,
+- .free_inode = bpf_free_inode,
++ .destroy_inode = bpf_destroy_inode,
+ };
+
+ enum {
+diff --git a/kernel/fork.c b/kernel/fork.c
+index 6ca8689a83b5bd..bb86c57cc0d987 100644
+--- a/kernel/fork.c
++++ b/kernel/fork.c
+@@ -1596,7 +1596,7 @@ static int copy_files(unsigned long clone_flags, struct task_struct *tsk,
+ return 0;
+ }
+
+-static int copy_sighand(unsigned long clone_flags, struct task_struct *tsk)
++static int copy_sighand(u64 clone_flags, struct task_struct *tsk)
+ {
+ struct sighand_struct *sig;
+
+diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c
+index ecd1ac210dbd74..c58afd23a241fe 100644
+--- a/kernel/kexec_handover.c
++++ b/kernel/kexec_handover.c
+@@ -1233,7 +1233,7 @@ int kho_fill_kimage(struct kimage *image)
+ int err = 0;
+ struct kexec_buf scratch;
+
+- if (!kho_enable)
++ if (!kho_out.finalized)
+ return 0;
+
+ image->kho.fdt = page_to_phys(kho_out.ser.fdt);
+diff --git a/kernel/padata.c b/kernel/padata.c
+index f85f8bd788d0da..833740d7548374 100644
+--- a/kernel/padata.c
++++ b/kernel/padata.c
+@@ -291,8 +291,12 @@ static void padata_reorder(struct padata_priv *padata)
+ struct padata_serial_queue *squeue;
+ int cb_cpu;
+
+- cpu = cpumask_next_wrap(cpu, pd->cpumask.pcpu);
+ processed++;
++ /* When sequence wraps around, reset to the first CPU. */
++ if (unlikely(processed == 0))
++ cpu = cpumask_first(pd->cpumask.pcpu);
++ else
++ cpu = cpumask_next_wrap(cpu, pd->cpumask.pcpu);
+
+ cb_cpu = padata->cb_cpu;
+ squeue = per_cpu_ptr(pd->squeue, cb_cpu);
+diff --git a/kernel/pid.c b/kernel/pid.c
+index d94ce025050127..296cd04c24bae0 100644
+--- a/kernel/pid.c
++++ b/kernel/pid.c
+@@ -491,7 +491,7 @@ pid_t pid_nr_ns(struct pid *pid, struct pid_namespace *ns)
+ struct upid *upid;
+ pid_t nr = 0;
+
+- if (pid && ns->level <= pid->level) {
++ if (pid && ns && ns->level <= pid->level) {
+ upid = &pid->numbers[ns->level];
+ if (upid->ns == ns)
+ nr = upid->nr;
+diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c
+index 8df55397414a12..5f17d2e8e95420 100644
+--- a/kernel/power/energy_model.c
++++ b/kernel/power/energy_model.c
+@@ -799,7 +799,7 @@ void em_adjust_cpu_capacity(unsigned int cpu)
+ static void em_check_capacity_update(void)
+ {
+ cpumask_var_t cpu_done_mask;
+- int cpu;
++ int cpu, failed_cpus = 0;
+
+ if (!zalloc_cpumask_var(&cpu_done_mask, GFP_KERNEL)) {
+ pr_warn("no free memory\n");
+@@ -817,10 +817,8 @@ static void em_check_capacity_update(void)
+
+ policy = cpufreq_cpu_get(cpu);
+ if (!policy) {
+- pr_debug("Accessing cpu%d policy failed\n", cpu);
+- schedule_delayed_work(&em_update_work,
+- msecs_to_jiffies(1000));
+- break;
++ failed_cpus++;
++ continue;
+ }
+ cpufreq_cpu_put(policy);
+
+@@ -835,6 +833,9 @@ static void em_check_capacity_update(void)
+ em_adjust_new_capacity(cpu, dev, pd);
+ }
+
++ if (failed_cpus)
++ schedule_delayed_work(&em_update_work, msecs_to_jiffies(1000));
++
+ free_cpumask_var(cpu_done_mask);
+ }
+
+diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
+index 2f66ab45382319..26e0e662e8f2a7 100644
+--- a/kernel/power/hibernate.c
++++ b/kernel/power/hibernate.c
+@@ -695,12 +695,16 @@ static void power_down(void)
+
+ #ifdef CONFIG_SUSPEND
+ if (hibernation_mode == HIBERNATION_SUSPEND) {
++ pm_restore_gfp_mask();
+ error = suspend_devices_and_enter(mem_sleep_current);
+ if (error) {
+ hibernation_mode = hibernation_ops ?
+ HIBERNATION_PLATFORM :
+ HIBERNATION_SHUTDOWN;
+ } else {
++ /* Match pm_restore_gfp_mask() call in hibernate() */
++ pm_restrict_gfp_mask();
++
+ /* Restore swap signature. */
+ error = swsusp_unmark();
+ if (error)
+@@ -718,6 +722,8 @@ static void power_down(void)
+ case HIBERNATION_PLATFORM:
+ error = hibernation_platform_enter();
+ if (error == -EAGAIN || error == -EBUSY) {
++ /* Match pm_restore_gfp_mask() in hibernate(). */
++ pm_restrict_gfp_mask();
+ swsusp_unmark();
+ events_check_enabled = false;
+ pr_info("Wakeup event detected during hibernation, rolling back.\n");
+diff --git a/kernel/rseq.c b/kernel/rseq.c
+index b7a1ec327e8117..2452b7366b00e9 100644
+--- a/kernel/rseq.c
++++ b/kernel/rseq.c
+@@ -342,12 +342,12 @@ static int rseq_need_restart(struct task_struct *t, u32 cs_flags)
+
+ /*
+ * Load and clear event mask atomically with respect to
+- * scheduler preemption.
++ * scheduler preemption and membarrier IPIs.
+ */
+- preempt_disable();
+- event_mask = t->rseq_event_mask;
+- t->rseq_event_mask = 0;
+- preempt_enable();
++ scoped_guard(RSEQ_EVENT_GUARD) {
++ event_mask = t->rseq_event_mask;
++ t->rseq_event_mask = 0;
++ }
+
+ return !!event_mask;
+ }
+diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
+index 72c1f72463c758..615411a0a8813d 100644
+--- a/kernel/sched/deadline.c
++++ b/kernel/sched/deadline.c
+@@ -2551,6 +2551,25 @@ static int find_later_rq(struct task_struct *task)
+ return -1;
+ }
+
++static struct task_struct *pick_next_pushable_dl_task(struct rq *rq)
++{
++ struct task_struct *p;
++
++ if (!has_pushable_dl_tasks(rq))
++ return NULL;
++
++ p = __node_2_pdl(rb_first_cached(&rq->dl.pushable_dl_tasks_root));
++
++ WARN_ON_ONCE(rq->cpu != task_cpu(p));
++ WARN_ON_ONCE(task_current(rq, p));
++ WARN_ON_ONCE(p->nr_cpus_allowed <= 1);
++
++ WARN_ON_ONCE(!task_on_rq_queued(p));
++ WARN_ON_ONCE(!dl_task(p));
++
++ return p;
++}
++
+ /* Locks the rq it finds */
+ static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq)
+ {
+@@ -2578,12 +2597,37 @@ static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq)
+
+ /* Retry if something changed. */
+ if (double_lock_balance(rq, later_rq)) {
+- if (unlikely(task_rq(task) != rq ||
++ /*
++ * double_lock_balance had to release rq->lock, in the
++ * meantime, task may no longer be fit to be migrated.
++ * Check the following to ensure that the task is
++ * still suitable for migration:
++ * 1. It is possible the task was scheduled,
++ * migrate_disabled was set and then got preempted,
++ * so we must check the task migration disable
++ * flag.
++ * 2. The CPU picked is in the task's affinity.
++ * 3. For throttled task (dl_task_offline_migration),
++ * check the following:
++ * - the task is not on the rq anymore (it was
++ * migrated)
++ * - the task is not on CPU anymore
++ * - the task is still a dl task
++ * - the task is not queued on the rq anymore
++ * 4. For the non-throttled task (push_dl_task), the
++ * check to ensure that this task is still at the
++ * head of the pushable tasks list is enough.
++ */
++ if (unlikely(is_migration_disabled(task) ||
+ !cpumask_test_cpu(later_rq->cpu, &task->cpus_mask) ||
+- task_on_cpu(rq, task) ||
+- !dl_task(task) ||
+- is_migration_disabled(task) ||
+- !task_on_rq_queued(task))) {
++ (task->dl.dl_throttled &&
++ (task_rq(task) != rq ||
++ task_on_cpu(rq, task) ||
++ !dl_task(task) ||
++ !task_on_rq_queued(task))) ||
++ (!task->dl.dl_throttled &&
++ task != pick_next_pushable_dl_task(rq)))) {
++
+ double_unlock_balance(rq, later_rq);
+ later_rq = NULL;
+ break;
+@@ -2606,25 +2650,6 @@ static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq)
+ return later_rq;
+ }
+
+-static struct task_struct *pick_next_pushable_dl_task(struct rq *rq)
+-{
+- struct task_struct *p;
+-
+- if (!has_pushable_dl_tasks(rq))
+- return NULL;
+-
+- p = __node_2_pdl(rb_first_cached(&rq->dl.pushable_dl_tasks_root));
+-
+- WARN_ON_ONCE(rq->cpu != task_cpu(p));
+- WARN_ON_ONCE(task_current(rq, p));
+- WARN_ON_ONCE(p->nr_cpus_allowed <= 1);
+-
+- WARN_ON_ONCE(!task_on_rq_queued(p));
+- WARN_ON_ONCE(!dl_task(p));
+-
+- return p;
+-}
+-
+ /*
+ * See if the non running -deadline tasks on this rq
+ * can be sent to some other CPU where they can preempt
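
The rewritten find_lock_later_rq() test above re-validates the task after
double_lock_balance() has dropped and re-taken rq->lock, with separate
criteria for throttled and non-throttled tasks. The underlying pattern is
"re-check your assumptions after re-locking"; a small pthread sketch with an
invented item_still_queued flag:

    #include <pthread.h>
    #include <stdbool.h>
    #include <stdio.h>

    static pthread_mutex_t a = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t b = PTHREAD_MUTEX_INITIALIZER;
    static bool item_still_queued = true;       /* protected by 'a' */

    static void lock_both_in_order(void)
    {
        /* Respect lock ordering: drop 'a', then take 'b' before 'a'. */
        pthread_mutex_unlock(&a);
        pthread_mutex_lock(&b);
        pthread_mutex_lock(&a);
    }

    int main(void)
    {
        pthread_mutex_lock(&a);
        lock_both_in_order();

        /*
         * 'a' was released above, so anything observed earlier may be
         * stale: re-validate before acting on the item.
         */
        if (!item_still_queued)
            puts("state changed while unlocked, bailing out");
        else
            puts("state re-validated, proceeding");

        pthread_mutex_unlock(&a);
        pthread_mutex_unlock(&b);
        return 0;
    }
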
+diff --git a/kernel/sys.c b/kernel/sys.c
+index 1e28b40053ce20..36d66ff4161176 100644
+--- a/kernel/sys.c
++++ b/kernel/sys.c
+@@ -1734,6 +1734,7 @@ SYSCALL_DEFINE4(prlimit64, pid_t, pid, unsigned int, resource,
+ struct rlimit old, new;
+ struct task_struct *tsk;
+ unsigned int checkflags = 0;
++ bool need_tasklist;
+ int ret;
+
+ if (old_rlim)
+@@ -1760,8 +1761,25 @@ SYSCALL_DEFINE4(prlimit64, pid_t, pid, unsigned int, resource,
+ get_task_struct(tsk);
+ rcu_read_unlock();
+
+- ret = do_prlimit(tsk, resource, new_rlim ? &new : NULL,
+- old_rlim ? &old : NULL);
++ need_tasklist = !same_thread_group(tsk, current);
++ if (need_tasklist) {
++ /*
++ * Ensure we can't race with group exit or de_thread(),
++ * so tsk->group_leader can't be freed or changed until
++ * read_unlock(tasklist_lock) below.
++ */
++ read_lock(&tasklist_lock);
++ if (!pid_alive(tsk))
++ ret = -ESRCH;
++ }
++
++ if (!ret) {
++ ret = do_prlimit(tsk, resource, new_rlim ? &new : NULL,
++ old_rlim ? &old : NULL);
++ }
++
++ if (need_tasklist)
++ read_unlock(&tasklist_lock);
+
+ if (!ret && old_rlim) {
+ rlim_to_rlim64(&old, &old64);
+diff --git a/lib/genalloc.c b/lib/genalloc.c
+index 4fa5635bf81bd6..841f2978383334 100644
+--- a/lib/genalloc.c
++++ b/lib/genalloc.c
+@@ -899,8 +899,11 @@ struct gen_pool *of_gen_pool_get(struct device_node *np,
+ if (!name)
+ name = of_node_full_name(np_pool);
+ }
+- if (pdev)
++ if (pdev) {
+ pool = gen_pool_get(&pdev->dev, name);
++ put_device(&pdev->dev);
++ }
++
+ of_node_put(np_pool);
+
+ return pool;
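
The genalloc hunk pairs the device reference taken by
of_find_device_by_node() with a put_device() once the pool has been looked
up, so every successful lookup drops its reference. A toy sketch of that
get/put pairing; struct obj and its helpers are made up for illustration:

    #include <stdio.h>
    #include <stdlib.h>

    struct obj { int refs; };

    static struct obj *obj_get(struct obj *o)
    {
        if (o)
            o->refs++;
        return o;
    }

    static void obj_put(struct obj *o)
    {
        if (o && --o->refs == 0) {
            puts("last reference dropped, freeing");
            free(o);
        }
    }

    /* Lookups that return a referenced object must be paired with a put. */
    static struct obj *lookup(struct obj *o)
    {
        return obj_get(o);
    }

    int main(void)
    {
        struct obj *o = calloc(1, sizeof(*o));
        struct obj *found;

        o->refs = 1;                    /* creator's reference */

        found = lookup(o);
        if (found) {
            printf("using object, refs=%d\n", found->refs);
            obj_put(found);             /* balance the lookup's get */
        }

        obj_put(o);                     /* drop the creator's reference */
        return 0;
    }
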
+diff --git a/mm/damon/lru_sort.c b/mm/damon/lru_sort.c
+index b5a5ed16a7a5db..26d6a2adb9cdd3 100644
+--- a/mm/damon/lru_sort.c
++++ b/mm/damon/lru_sort.c
+@@ -203,7 +203,7 @@ static int damon_lru_sort_apply_parameters(void)
+ goto out;
+ }
+
+- err = damon_set_attrs(ctx, &damon_lru_sort_mon_attrs);
++ err = damon_set_attrs(param_ctx, &damon_lru_sort_mon_attrs);
+ if (err)
+ goto out;
+
+diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
+index 87e825349bdf4c..1172390b76b492 100644
+--- a/mm/damon/vaddr.c
++++ b/mm/damon/vaddr.c
+@@ -328,10 +328,8 @@ static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
+ }
+
+ pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+- if (!pte) {
+- walk->action = ACTION_AGAIN;
++ if (!pte)
+ return 0;
+- }
+ if (!pte_present(ptep_get(pte)))
+ goto out;
+ damon_ptep_mkold(pte, walk->vma, addr);
+@@ -481,10 +479,8 @@ static int damon_young_pmd_entry(pmd_t *pmd, unsigned long addr,
+ #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
+ pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+- if (!pte) {
+- walk->action = ACTION_AGAIN;
++ if (!pte)
+ return 0;
+- }
+ ptent = ptep_get(pte);
+ if (!pte_present(ptent))
+ goto out;
+diff --git a/mm/huge_memory.c b/mm/huge_memory.c
+index 9c38a95e9f091b..fceaf965f264ea 100644
+--- a/mm/huge_memory.c
++++ b/mm/huge_memory.c
+@@ -4115,32 +4115,23 @@ static unsigned long deferred_split_count(struct shrinker *shrink,
+ static bool thp_underused(struct folio *folio)
+ {
+ int num_zero_pages = 0, num_filled_pages = 0;
+- void *kaddr;
+ int i;
+
+ if (khugepaged_max_ptes_none == HPAGE_PMD_NR - 1)
+ return false;
+
+ for (i = 0; i < folio_nr_pages(folio); i++) {
+- kaddr = kmap_local_folio(folio, i * PAGE_SIZE);
+- if (!memchr_inv(kaddr, 0, PAGE_SIZE)) {
+- num_zero_pages++;
+- if (num_zero_pages > khugepaged_max_ptes_none) {
+- kunmap_local(kaddr);
++ if (pages_identical(folio_page(folio, i), ZERO_PAGE(0))) {
++ if (++num_zero_pages > khugepaged_max_ptes_none)
+ return true;
+- }
+ } else {
+ /*
+ * Another path for early exit once the number
+ * of non-zero filled pages exceeds threshold.
+ */
+- num_filled_pages++;
+- if (num_filled_pages >= HPAGE_PMD_NR - khugepaged_max_ptes_none) {
+- kunmap_local(kaddr);
++ if (++num_filled_pages >= HPAGE_PMD_NR - khugepaged_max_ptes_none)
+ return false;
+- }
+ }
+- kunmap_local(kaddr);
+ }
+ return false;
+ }
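
The thp_underused() change above (and the similar one in mm/migrate.c below)
replaces a kmap_local_folio()/memchr_inv() scan with
pages_identical(page, ZERO_PAGE(0)), i.e. "is this subpage identical to the
shared zero page". A userspace approximation of the check over a 4 KiB
buffer, with PAGE_SIZE and the zero page emulated locally:

    #include <stdio.h>
    #include <string.h>

    #define PAGE_SIZE 4096

    /* Stand-in for the kernel's shared zero page. */
    static const unsigned char zero_page[PAGE_SIZE];

    static int page_is_zero(const unsigned char *page)
    {
        return memcmp(page, zero_page, PAGE_SIZE) == 0;
    }

    int main(void)
    {
        unsigned char page[PAGE_SIZE] = { 0 };

        printf("all-zero page detected: %d\n", page_is_zero(page));
        page[123] = 0xff;
        printf("after dirtying one byte: %d\n", page_is_zero(page));
        return 0;
    }
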
+diff --git a/mm/hugetlb.c b/mm/hugetlb.c
+index 8f19d0f293e090..ef8b841286aa3e 100644
+--- a/mm/hugetlb.c
++++ b/mm/hugetlb.c
+@@ -3654,6 +3654,9 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
+ return;
+ }
+
++ if (!h->max_huge_pages)
++ return;
++
+ /* do node specific alloc */
+ if (hugetlb_hstate_alloc_pages_specific_nodes(h))
+ return;
+diff --git a/mm/memcontrol.c b/mm/memcontrol.c
+index 46713b9ece0638..2b9cc91eea1165 100644
+--- a/mm/memcontrol.c
++++ b/mm/memcontrol.c
+@@ -2309,12 +2309,13 @@ static int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
+ bool drained = false;
+ bool raised_max_event = false;
+ unsigned long pflags;
++ bool allow_spinning = gfpflags_allow_spinning(gfp_mask);
+
+ retry:
+ if (consume_stock(memcg, nr_pages))
+ return 0;
+
+- if (!gfpflags_allow_spinning(gfp_mask))
++ if (!allow_spinning)
+ /* Avoid the refill and flush of the older stock */
+ batch = nr_pages;
+
+@@ -2350,7 +2351,7 @@ static int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
+ if (!gfpflags_allow_blocking(gfp_mask))
+ goto nomem;
+
+- memcg_memory_event(mem_over_limit, MEMCG_MAX);
++ __memcg_memory_event(mem_over_limit, MEMCG_MAX, allow_spinning);
+ raised_max_event = true;
+
+ psi_memstall_enter(&pflags);
+@@ -2417,7 +2418,7 @@ static int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
+ * a MEMCG_MAX event.
+ */
+ if (!raised_max_event)
+- memcg_memory_event(mem_over_limit, MEMCG_MAX);
++ __memcg_memory_event(mem_over_limit, MEMCG_MAX, allow_spinning);
+
+ /*
+ * The allocation either can't fail or will lead to more memory
+diff --git a/mm/migrate.c b/mm/migrate.c
+index 9e5ef39ce73af0..4ff6eea0ef7ed6 100644
+--- a/mm/migrate.c
++++ b/mm/migrate.c
+@@ -297,19 +297,16 @@ bool isolate_folio_to_list(struct folio *folio, struct list_head *list)
+ }
+
+ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
+- struct folio *folio,
+- unsigned long idx)
++ struct folio *folio, pte_t old_pte, unsigned long idx)
+ {
+ struct page *page = folio_page(folio, idx);
+- bool contains_data;
+ pte_t newpte;
+- void *addr;
+
+ if (PageCompound(page))
+ return false;
+ VM_BUG_ON_PAGE(!PageAnon(page), page);
+ VM_BUG_ON_PAGE(!PageLocked(page), page);
+- VM_BUG_ON_PAGE(pte_present(ptep_get(pvmw->pte)), page);
++ VM_BUG_ON_PAGE(pte_present(old_pte), page);
+
+ if (folio_test_mlocked(folio) || (pvmw->vma->vm_flags & VM_LOCKED) ||
+ mm_forbids_zeropage(pvmw->vma->vm_mm))
+@@ -320,15 +317,17 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
+ * this subpage has been non present. If the subpage is only zero-filled
+ * then map it to the shared zeropage.
+ */
+- addr = kmap_local_page(page);
+- contains_data = memchr_inv(addr, 0, PAGE_SIZE);
+- kunmap_local(addr);
+-
+- if (contains_data)
++ if (!pages_identical(page, ZERO_PAGE(0)))
+ return false;
+
+ newpte = pte_mkspecial(pfn_pte(my_zero_pfn(pvmw->address),
+ pvmw->vma->vm_page_prot));
++
++ if (pte_swp_soft_dirty(old_pte))
++ newpte = pte_mksoft_dirty(newpte);
++ if (pte_swp_uffd_wp(old_pte))
++ newpte = pte_mkuffd_wp(newpte);
++
+ set_pte_at(pvmw->vma->vm_mm, pvmw->address, pvmw->pte, newpte);
+
+ dec_mm_counter(pvmw->vma->vm_mm, mm_counter(folio));
+@@ -371,13 +370,13 @@ static bool remove_migration_pte(struct folio *folio,
+ continue;
+ }
+ #endif
++ old_pte = ptep_get(pvmw.pte);
+ if (rmap_walk_arg->map_unused_to_zeropage &&
+- try_to_map_unused_to_zeropage(&pvmw, folio, idx))
++ try_to_map_unused_to_zeropage(&pvmw, folio, old_pte, idx))
+ continue;
+
+ folio_get(folio);
+ pte = mk_pte(new, READ_ONCE(vma->vm_page_prot));
+- old_pte = ptep_get(pvmw.pte);
+
+ entry = pte_to_swp_entry(old_pte);
+ if (!is_migration_entry_young(entry))
+diff --git a/mm/page_alloc.c b/mm/page_alloc.c
+index d1d037f97c5fc7..09241bb7663e03 100644
+--- a/mm/page_alloc.c
++++ b/mm/page_alloc.c
+@@ -4408,7 +4408,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask, unsigned int order)
+ if (!(gfp_mask & __GFP_NOMEMALLOC)) {
+ alloc_flags |= ALLOC_NON_BLOCK;
+
+- if (order > 0)
++ if (order > 0 && (alloc_flags & ALLOC_MIN_RESERVE))
+ alloc_flags |= ALLOC_HIGHATOMIC;
+ }
+
+diff --git a/mm/slab.h b/mm/slab.h
+index 248b34c839b7ca..0dd3f33c096345 100644
+--- a/mm/slab.h
++++ b/mm/slab.h
+@@ -526,8 +526,12 @@ static inline struct slabobj_ext *slab_obj_exts(struct slab *slab)
+ unsigned long obj_exts = READ_ONCE(slab->obj_exts);
+
+ #ifdef CONFIG_MEMCG
+- VM_BUG_ON_PAGE(obj_exts && !(obj_exts & MEMCG_DATA_OBJEXTS),
+- slab_page(slab));
++ /*
++ * obj_exts should be either NULL, a valid pointer with
++ * MEMCG_DATA_OBJEXTS bit set or be equal to OBJEXTS_ALLOC_FAIL.
++ */
++ VM_BUG_ON_PAGE(obj_exts && !(obj_exts & MEMCG_DATA_OBJEXTS) &&
++ obj_exts != OBJEXTS_ALLOC_FAIL, slab_page(slab));
+ VM_BUG_ON_PAGE(obj_exts & MEMCG_DATA_KMEM, slab_page(slab));
+ #endif
+ return (struct slabobj_ext *)(obj_exts & ~OBJEXTS_FLAGS_MASK);
+diff --git a/mm/slub.c b/mm/slub.c
+index 264fc76455d739..9bdadf9909e066 100644
+--- a/mm/slub.c
++++ b/mm/slub.c
+@@ -2034,8 +2034,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
+ slab_nid(slab));
+ if (!vec) {
+ /* Mark vectors which failed to allocate */
+- if (new_slab)
+- mark_failed_objexts_alloc(slab);
++ mark_failed_objexts_alloc(slab);
+
+ return -ENOMEM;
+ }
+diff --git a/mm/util.c b/mm/util.c
+index f814e6a59ab1d3..98d301875b7559 100644
+--- a/mm/util.c
++++ b/mm/util.c
+@@ -566,6 +566,7 @@ unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
+ unsigned long len, unsigned long prot,
+ unsigned long flag, unsigned long pgoff)
+ {
++ loff_t off = (loff_t)pgoff << PAGE_SHIFT;
+ unsigned long ret;
+ struct mm_struct *mm = current->mm;
+ unsigned long populate;
+@@ -573,7 +574,7 @@ unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
+
+ ret = security_mmap_file(file, prot, flag);
+ if (!ret)
+- ret = fsnotify_mmap_perm(file, prot, pgoff >> PAGE_SHIFT, len);
++ ret = fsnotify_mmap_perm(file, prot, off, len);
+ if (!ret) {
+ if (mmap_write_lock_killable(mm))
+ return -EINTR;
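
The vm_mmap_pgoff() fix computes the byte offset as
(loff_t)pgoff << PAGE_SHIFT rather than shifting in the wrong direction, and
widens before shifting so the result cannot be truncated where long is only
32 bits. A small standalone sketch of that conversion:

    #include <inttypes.h>
    #include <stdio.h>

    #define PAGE_SHIFT 12

    int main(void)
    {
        unsigned long pgoff = 0x00200000UL;  /* page index of an 8 GiB offset */

        /*
         * Widen first, then shift, so the byte offset fits in 64 bits even
         * where unsigned long is only 32 bits wide.
         */
        int64_t off = (int64_t)pgoff << PAGE_SHIFT;

        printf("pgoff=%#lx -> byte offset=%" PRId64 "\n", pgoff, off);
        return 0;
    }
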
+diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
+index 939a3aa78d5c46..54993a05037c14 100644
+--- a/net/bridge/br_vlan.c
++++ b/net/bridge/br_vlan.c
+@@ -1455,7 +1455,7 @@ void br_vlan_fill_forward_path_pvid(struct net_bridge *br,
+ if (!br_opt_get(br, BROPT_VLAN_ENABLED))
+ return;
+
+- vg = br_vlan_group(br);
++ vg = br_vlan_group_rcu(br);
+
+ if (idx >= 0 &&
+ ctx->vlan[idx].proto == br->vlan_proto) {
+diff --git a/net/core/filter.c b/net/core/filter.c
+index 2d326d35c38716..c5cdf3b08341a5 100644
+--- a/net/core/filter.c
++++ b/net/core/filter.c
+@@ -2281,6 +2281,7 @@ static int __bpf_redirect_neigh_v6(struct sk_buff *skb, struct net_device *dev,
+ if (IS_ERR(dst))
+ goto out_drop;
+
++ skb_dst_drop(skb);
+ skb_dst_set(skb, dst);
+ } else if (nh->nh_family != AF_INET6) {
+ goto out_drop;
+@@ -2389,6 +2390,7 @@ static int __bpf_redirect_neigh_v4(struct sk_buff *skb, struct net_device *dev,
+ goto out_drop;
+ }
+
++ skb_dst_drop(skb);
+ skb_dst_set(skb, &rt->dst);
+ }
+
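
Both __bpf_redirect_neigh hunks call skb_dst_drop() before skb_dst_set() so
that the dst reference already attached to the skb is released rather than
leaked. The general shape, release before overwrite, in plain C with heap
buffers; struct holder and set_buf() are invented for this sketch:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct holder { char *buf; };

    /*
     * Release whatever is currently held before installing the new buffer,
     * otherwise the old allocation leaks.
     */
    static void set_buf(struct holder *h, char *newbuf)
    {
        free(h->buf);
        h->buf = newbuf;
    }

    int main(void)
    {
        struct holder h = { .buf = strdup("old") };

        set_buf(&h, strdup("new"));
        printf("now holding: %s\n", h.buf);

        free(h.buf);
        return 0;
    }
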
+diff --git a/net/core/page_pool.c b/net/core/page_pool.c
+index ba70569bd4b051..19c92aa04e5491 100644
+--- a/net/core/page_pool.c
++++ b/net/core/page_pool.c
+@@ -472,11 +472,60 @@ page_pool_dma_sync_for_device(const struct page_pool *pool,
+ }
+ }
+
++static int page_pool_register_dma_index(struct page_pool *pool,
++ netmem_ref netmem, gfp_t gfp)
++{
++ int err = 0;
++ u32 id;
++
++ if (unlikely(!PP_DMA_INDEX_BITS))
++ goto out;
++
++ if (in_softirq())
++ err = xa_alloc(&pool->dma_mapped, &id, netmem_to_page(netmem),
++ PP_DMA_INDEX_LIMIT, gfp);
++ else
++ err = xa_alloc_bh(&pool->dma_mapped, &id, netmem_to_page(netmem),
++ PP_DMA_INDEX_LIMIT, gfp);
++ if (err) {
++ WARN_ONCE(err != -ENOMEM, "couldn't track DMA mapping, please report to netdev@");
++ goto out;
++ }
++
++ netmem_set_dma_index(netmem, id);
++out:
++ return err;
++}
++
++static int page_pool_release_dma_index(struct page_pool *pool,
++ netmem_ref netmem)
++{
++ struct page *old, *page = netmem_to_page(netmem);
++ unsigned long id;
++
++ if (unlikely(!PP_DMA_INDEX_BITS))
++ return 0;
++
++ id = netmem_get_dma_index(netmem);
++ if (!id)
++ return -1;
++
++ if (in_softirq())
++ old = xa_cmpxchg(&pool->dma_mapped, id, page, NULL, 0);
++ else
++ old = xa_cmpxchg_bh(&pool->dma_mapped, id, page, NULL, 0);
++ if (old != page)
++ return -1;
++
++ netmem_set_dma_index(netmem, 0);
++
++ return 0;
++}
++
+ static bool page_pool_dma_map(struct page_pool *pool, netmem_ref netmem, gfp_t gfp)
+ {
+ dma_addr_t dma;
+ int err;
+- u32 id;
+
+ /* Setup DMA mapping: use 'struct page' area for storing DMA-addr
+ * since dma_addr_t can be either 32 or 64 bits and does not always fit
+@@ -495,18 +544,10 @@ static bool page_pool_dma_map(struct page_pool *pool, netmem_ref netmem, gfp_t g
+ goto unmap_failed;
+ }
+
+- if (in_softirq())
+- err = xa_alloc(&pool->dma_mapped, &id, netmem_to_page(netmem),
+- PP_DMA_INDEX_LIMIT, gfp);
+- else
+- err = xa_alloc_bh(&pool->dma_mapped, &id, netmem_to_page(netmem),
+- PP_DMA_INDEX_LIMIT, gfp);
+- if (err) {
+- WARN_ONCE(err != -ENOMEM, "couldn't track DMA mapping, please report to netdev@");
++ err = page_pool_register_dma_index(pool, netmem, gfp);
++ if (err)
+ goto unset_failed;
+- }
+
+- netmem_set_dma_index(netmem, id);
+ page_pool_dma_sync_for_device(pool, netmem, pool->p.max_len);
+
+ return true;
+@@ -678,8 +719,6 @@ void page_pool_clear_pp_info(netmem_ref netmem)
+ static __always_inline void __page_pool_release_netmem_dma(struct page_pool *pool,
+ netmem_ref netmem)
+ {
+- struct page *old, *page = netmem_to_page(netmem);
+- unsigned long id;
+ dma_addr_t dma;
+
+ if (!pool->dma_map)
+@@ -688,15 +727,7 @@ static __always_inline void __page_pool_release_netmem_dma(struct page_pool *poo
+ */
+ return;
+
+- id = netmem_get_dma_index(netmem);
+- if (!id)
+- return;
+-
+- if (in_softirq())
+- old = xa_cmpxchg(&pool->dma_mapped, id, page, NULL, 0);
+- else
+- old = xa_cmpxchg_bh(&pool->dma_mapped, id, page, NULL, 0);
+- if (old != page)
++ if (page_pool_release_dma_index(pool, netmem))
+ return;
+
+ dma = page_pool_get_dma_addr_netmem(netmem);
+@@ -706,7 +737,6 @@ static __always_inline void __page_pool_release_netmem_dma(struct page_pool *poo
+ PAGE_SIZE << pool->p.order, pool->p.dma_dir,
+ DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_WEAK_ORDERING);
+ page_pool_set_dma_addr_netmem(netmem, 0);
+- netmem_set_dma_index(netmem, 0);
+ }
+
+ /* Disconnects a page (from a page_pool). API users can have a need
+diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
+index 89040007c7b709..ba36f558f144c6 100644
+--- a/net/ipv4/tcp.c
++++ b/net/ipv4/tcp.c
+@@ -1771,6 +1771,7 @@ EXPORT_IPV6_MOD(tcp_peek_len);
+ /* Make sure sk_rcvbuf is big enough to satisfy SO_RCVLOWAT hint */
+ int tcp_set_rcvlowat(struct sock *sk, int val)
+ {
++ struct tcp_sock *tp = tcp_sk(sk);
+ int space, cap;
+
+ if (sk->sk_userlocks & SOCK_RCVBUF_LOCK)
+@@ -1789,7 +1790,9 @@ int tcp_set_rcvlowat(struct sock *sk, int val)
+ space = tcp_space_from_win(sk, val);
+ if (space > sk->sk_rcvbuf) {
+ WRITE_ONCE(sk->sk_rcvbuf, space);
+- WRITE_ONCE(tcp_sk(sk)->window_clamp, val);
++
++ if (tp->window_clamp && tp->window_clamp < val)
++ WRITE_ONCE(tp->window_clamp, val);
+ }
+ return 0;
+ }
+diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
+index 64f93668a8452b..a88e82f7ec4858 100644
+--- a/net/ipv4/tcp_input.c
++++ b/net/ipv4/tcp_input.c
+@@ -7275,7 +7275,6 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
+ &foc, TCP_SYNACK_FASTOPEN, skb);
+ /* Add the child socket directly into the accept queue */
+ if (!inet_csk_reqsk_queue_add(sk, req, fastopen_sk)) {
+- reqsk_fastopen_remove(fastopen_sk, req, false);
+ bh_unlock_sock(fastopen_sk);
+ sock_put(fastopen_sk);
+ goto drop_and_free;
+diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
+index e8ffa62ec183f3..d96130e49942e2 100644
+--- a/net/mptcp/ctrl.c
++++ b/net/mptcp/ctrl.c
+@@ -507,7 +507,7 @@ void mptcp_active_enable(struct sock *sk)
+ rcu_read_lock();
+ dst = __sk_dst_get(sk);
+ dev = dst ? dst_dev_rcu(dst) : NULL;
+- if (dev && (dev->flags & IFF_LOOPBACK))
++ if (!(dev && (dev->flags & IFF_LOOPBACK)))
+ atomic_set(&pernet->active_disable_times, 0);
+ rcu_read_unlock();
+ }
+diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
+index 136a380602cae8..c31c4b19c54bf9 100644
+--- a/net/mptcp/pm.c
++++ b/net/mptcp/pm.c
+@@ -617,9 +617,12 @@ void mptcp_pm_add_addr_received(const struct sock *ssk,
+ } else {
+ __MPTCP_INC_STATS(sock_net((struct sock *)msk), MPTCP_MIB_ADDADDRDROP);
+ }
+- /* id0 should not have a different address */
++ /* - id0 should not have a different address
++ * - special case for C-flag: linked to fill_local_addresses_vec()
++ */
+ } else if ((addr->id == 0 && !mptcp_pm_is_init_remote_addr(msk, addr)) ||
+- (addr->id > 0 && !READ_ONCE(pm->accept_addr))) {
++ (addr->id > 0 && !READ_ONCE(pm->accept_addr) &&
++ !mptcp_pm_add_addr_c_flag_case(msk))) {
+ mptcp_pm_announce_addr(msk, addr, true);
+ mptcp_pm_add_addr_send_ack(msk);
+ } else if (mptcp_pm_schedule_work(msk, MPTCP_PM_ADD_ADDR_RECEIVED)) {
+diff --git a/net/mptcp/pm_kernel.c b/net/mptcp/pm_kernel.c
+index 667803d72b643a..8c46493a0835b0 100644
+--- a/net/mptcp/pm_kernel.c
++++ b/net/mptcp/pm_kernel.c
+@@ -389,10 +389,12 @@ static unsigned int fill_local_addresses_vec(struct mptcp_sock *msk,
+ struct mptcp_addr_info mpc_addr;
+ struct pm_nl_pernet *pernet;
+ unsigned int subflows_max;
++ bool c_flag_case;
+ int i = 0;
+
+ pernet = pm_nl_get_pernet_from_msk(msk);
+ subflows_max = mptcp_pm_get_subflows_max(msk);
++ c_flag_case = remote->id && mptcp_pm_add_addr_c_flag_case(msk);
+
+ mptcp_local_address((struct sock_common *)msk, &mpc_addr);
+
+@@ -405,12 +407,27 @@ static unsigned int fill_local_addresses_vec(struct mptcp_sock *msk,
+ continue;
+
+ if (msk->pm.subflows < subflows_max) {
++ bool is_id0;
++
+ locals[i].addr = entry->addr;
+ locals[i].flags = entry->flags;
+ locals[i].ifindex = entry->ifindex;
+
++ is_id0 = mptcp_addresses_equal(&locals[i].addr,
++ &mpc_addr,
++ locals[i].addr.port);
++
++ if (c_flag_case &&
++ (entry->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW)) {
++ __clear_bit(locals[i].addr.id,
++ msk->pm.id_avail_bitmap);
++
++ if (!is_id0)
++ msk->pm.local_addr_used++;
++ }
++
+ /* Special case for ID0: set the correct ID */
+- if (mptcp_addresses_equal(&locals[i].addr, &mpc_addr, locals[i].addr.port))
++ if (is_id0)
+ locals[i].addr.id = 0;
+
+ msk->pm.subflows++;
+@@ -419,6 +436,37 @@ static unsigned int fill_local_addresses_vec(struct mptcp_sock *msk,
+ }
+ rcu_read_unlock();
+
++ /* Special case: peer sets the C flag, accept one ADD_ADDR if default
++ * limits are used -- accepting no ADD_ADDR -- and use subflow endpoints
++ */
++ if (!i && c_flag_case) {
++ unsigned int local_addr_max = mptcp_pm_get_local_addr_max(msk);
++
++ while (msk->pm.local_addr_used < local_addr_max &&
++ msk->pm.subflows < subflows_max) {
++ struct mptcp_pm_local *local = &locals[i];
++
++ if (!select_local_address(pernet, msk, local))
++ break;
++
++ __clear_bit(local->addr.id, msk->pm.id_avail_bitmap);
++
++ if (!mptcp_pm_addr_families_match(sk, &local->addr,
++ remote))
++ continue;
++
++ if (mptcp_addresses_equal(&local->addr, &mpc_addr,
++ local->addr.port))
++ continue;
++
++ msk->pm.local_addr_used++;
++ msk->pm.subflows++;
++ i++;
++ }
++
++ return i;
++ }
++
+ /* If the array is empty, fill in the single
+ * 'IPADDRANY' local address
+ */
+diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
+index b15d7fab5c4b66..245428e2316196 100644
+--- a/net/mptcp/protocol.h
++++ b/net/mptcp/protocol.h
+@@ -1201,6 +1201,14 @@ static inline void mptcp_pm_close_subflow(struct mptcp_sock *msk)
+ spin_unlock_bh(&msk->pm.lock);
+ }
+
++static inline bool mptcp_pm_add_addr_c_flag_case(struct mptcp_sock *msk)
++{
++ return READ_ONCE(msk->pm.remote_deny_join_id0) &&
++ msk->pm.local_addr_used == 0 &&
++ mptcp_pm_get_add_addr_accept_max(msk) == 0 &&
++ msk->pm.subflows < mptcp_pm_get_subflows_max(msk);
++}
++
+ void mptcp_sockopt_sync_locked(struct mptcp_sock *msk, struct sock *ssk);
+
+ static inline struct mptcp_ext *mptcp_get_ext(const struct sk_buff *skb)
+diff --git a/net/netfilter/nft_objref.c b/net/netfilter/nft_objref.c
+index 8ee66a86c3bc75..1a62e384766a76 100644
+--- a/net/netfilter/nft_objref.c
++++ b/net/netfilter/nft_objref.c
+@@ -22,6 +22,35 @@ void nft_objref_eval(const struct nft_expr *expr,
+ obj->ops->eval(obj, regs, pkt);
+ }
+
++static int nft_objref_validate_obj_type(const struct nft_ctx *ctx, u32 type)
++{
++ unsigned int hooks;
++
++ switch (type) {
++ case NFT_OBJECT_SYNPROXY:
++ if (ctx->family != NFPROTO_IPV4 &&
++ ctx->family != NFPROTO_IPV6 &&
++ ctx->family != NFPROTO_INET)
++ return -EOPNOTSUPP;
++
++ hooks = (1 << NF_INET_LOCAL_IN) | (1 << NF_INET_FORWARD);
++
++ return nft_chain_validate_hooks(ctx->chain, hooks);
++ default:
++ break;
++ }
++
++ return 0;
++}
++
++static int nft_objref_validate(const struct nft_ctx *ctx,
++ const struct nft_expr *expr)
++{
++ struct nft_object *obj = nft_objref_priv(expr);
++
++ return nft_objref_validate_obj_type(ctx, obj->ops->type->type);
++}
++
+ static int nft_objref_init(const struct nft_ctx *ctx,
+ const struct nft_expr *expr,
+ const struct nlattr * const tb[])
+@@ -93,6 +122,7 @@ static const struct nft_expr_ops nft_objref_ops = {
+ .activate = nft_objref_activate,
+ .deactivate = nft_objref_deactivate,
+ .dump = nft_objref_dump,
++ .validate = nft_objref_validate,
+ .reduce = NFT_REDUCE_READONLY,
+ };
+
+@@ -197,6 +227,14 @@ static void nft_objref_map_destroy(const struct nft_ctx *ctx,
+ nf_tables_destroy_set(ctx, priv->set);
+ }
+
++static int nft_objref_map_validate(const struct nft_ctx *ctx,
++ const struct nft_expr *expr)
++{
++ const struct nft_objref_map *priv = nft_expr_priv(expr);
++
++ return nft_objref_validate_obj_type(ctx, priv->set->objtype);
++}
++
+ static const struct nft_expr_ops nft_objref_map_ops = {
+ .type = &nft_objref_type,
+ .size = NFT_EXPR_SIZE(sizeof(struct nft_objref_map)),
+@@ -206,6 +244,7 @@ static const struct nft_expr_ops nft_objref_map_ops = {
+ .deactivate = nft_objref_map_deactivate,
+ .destroy = nft_objref_map_destroy,
+ .dump = nft_objref_map_dump,
++ .validate = nft_objref_map_validate,
+ .reduce = NFT_REDUCE_READONLY,
+ };
+
+diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
+index 3ead591c72fd3c..d099b605e44a7f 100644
+--- a/net/sctp/sm_make_chunk.c
++++ b/net/sctp/sm_make_chunk.c
+@@ -31,6 +31,7 @@
+ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+ #include <crypto/hash.h>
++#include <crypto/utils.h>
+ #include <linux/types.h>
+ #include <linux/kernel.h>
+ #include <linux/ip.h>
+@@ -1788,7 +1789,7 @@ struct sctp_association *sctp_unpack_cookie(
+ }
+ }
+
+- if (memcmp(digest, cookie->signature, SCTP_SIGNATURE_SIZE)) {
++ if (crypto_memneq(digest, cookie->signature, SCTP_SIGNATURE_SIZE)) {
+ *error = -SCTP_IERROR_BAD_SIG;
+ goto fail;
+ }
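
This hunk, like the matching ones in sm_statefuns.c and trusted_tpm1.c
further down, compares authentication digests with crypto_memneq() instead of
memcmp(), so the comparison time does not reveal where the first mismatching
byte is. A minimal userspace constant-time comparison in the same spirit
(real helpers such as crypto_memneq also add compiler barriers, omitted
here):

    #include <stddef.h>
    #include <stdio.h>

    /*
     * Returns nonzero if the buffers differ; the runtime does not depend on
     * where the first difference is.
     */
    static unsigned int ct_memneq(const void *a, const void *b, size_t n)
    {
        const unsigned char *x = a, *y = b;
        unsigned char diff = 0;

        for (size_t i = 0; i < n; i++)
            diff |= x[i] ^ y[i];
        return diff;
    }

    int main(void)
    {
        unsigned char sig[20]    = "aaaaaaaaaaaaaaaaaaa";
        unsigned char expect[20] = "aaaaaaaaaaaaaaaaaaa";

        printf("differ: %u\n", ct_memneq(sig, expect, sizeof(sig)));
        expect[19] ^= 1;
        printf("differ after flipping the last byte: %u\n",
               ct_memneq(sig, expect, sizeof(sig)));
        return 0;
    }
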
+diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
+index a0524ba8d78781..dc66dff33d6d46 100644
+--- a/net/sctp/sm_statefuns.c
++++ b/net/sctp/sm_statefuns.c
+@@ -30,6 +30,7 @@
+
+ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
++#include <crypto/utils.h>
+ #include <linux/types.h>
+ #include <linux/kernel.h>
+ #include <linux/ip.h>
+@@ -885,7 +886,8 @@ enum sctp_disposition sctp_sf_do_5_1D_ce(struct net *net,
+ return SCTP_DISPOSITION_CONSUME;
+
+ nomem_authev:
+- sctp_ulpevent_free(ai_ev);
++ if (ai_ev)
++ sctp_ulpevent_free(ai_ev);
+ nomem_aiev:
+ sctp_ulpevent_free(ev);
+ nomem_ev:
+@@ -4416,7 +4418,7 @@ static enum sctp_ierror sctp_sf_authenticate(
+ sh_key, GFP_ATOMIC);
+
+ /* Discard the packet if the digests do not match */
+- if (memcmp(save_digest, digest, sig_len)) {
++ if (crypto_memneq(save_digest, digest, sig_len)) {
+ kfree(save_digest);
+ return SCTP_IERROR_BAD_SIG;
+ }
+diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
+index 8b1837228799c5..b800d704d80769 100644
+--- a/net/sunrpc/svc_xprt.c
++++ b/net/sunrpc/svc_xprt.c
+@@ -1014,6 +1014,19 @@ static void svc_delete_xprt(struct svc_xprt *xprt)
+ struct svc_serv *serv = xprt->xpt_server;
+ struct svc_deferred_req *dr;
+
++ /* Unregister with rpcbind when the transport type is TCP or UDP.
++ */
++ if (test_bit(XPT_RPCB_UNREG, &xprt->xpt_flags)) {
++ struct svc_sock *svsk = container_of(xprt, struct svc_sock,
++ sk_xprt);
++ struct socket *sock = svsk->sk_sock;
++
++ if (svc_register(serv, xprt->xpt_net, sock->sk->sk_family,
++ sock->sk->sk_protocol, 0) < 0)
++ pr_warn("failed to unregister %s with rpcbind\n",
++ xprt->xpt_class->xcl_name);
++ }
++
+ if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags))
+ return;
+
+diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
+index e2c5e0e626f948..b396c85ff07234 100644
+--- a/net/sunrpc/svcsock.c
++++ b/net/sunrpc/svcsock.c
+@@ -836,6 +836,7 @@ static void svc_udp_init(struct svc_sock *svsk, struct svc_serv *serv)
+ /* data might have come in before data_ready set up */
+ set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
+ set_bit(XPT_CHNGBUF, &svsk->sk_xprt.xpt_flags);
++ set_bit(XPT_RPCB_UNREG, &svsk->sk_xprt.xpt_flags);
+
+ /* make sure we get destination address info */
+ switch (svsk->sk_sk->sk_family) {
+@@ -1355,6 +1356,7 @@ static void svc_tcp_init(struct svc_sock *svsk, struct svc_serv *serv)
+ if (sk->sk_state == TCP_LISTEN) {
+ strcpy(svsk->sk_xprt.xpt_remotebuf, "listener");
+ set_bit(XPT_LISTENER, &svsk->sk_xprt.xpt_flags);
++ set_bit(XPT_RPCB_UNREG, &svsk->sk_xprt.xpt_flags);
+ sk->sk_data_ready = svc_tcp_listen_data_ready;
+ set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags);
+ } else {
+diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
+index f16f390370dc43..1eb8d9f8b1041b 100644
+--- a/net/xdp/xsk_queue.h
++++ b/net/xdp/xsk_queue.h
+@@ -143,14 +143,24 @@ static inline bool xp_unused_options_set(u32 options)
+ static inline bool xp_aligned_validate_desc(struct xsk_buff_pool *pool,
+ struct xdp_desc *desc)
+ {
+- u64 addr = desc->addr - pool->tx_metadata_len;
+- u64 len = desc->len + pool->tx_metadata_len;
+- u64 offset = addr & (pool->chunk_size - 1);
++ u64 len = desc->len;
++ u64 addr, offset;
+
+- if (!desc->len)
++ if (!len)
+ return false;
+
+- if (offset + len > pool->chunk_size)
++ /* Can overflow if desc->addr < pool->tx_metadata_len */
++ if (check_sub_overflow(desc->addr, pool->tx_metadata_len, &addr))
++ return false;
++
++ offset = addr & (pool->chunk_size - 1);
++
++ /*
++ * Can't overflow: @offset is guaranteed to be < ``U32_MAX``
++ * (pool->chunk_size is ``u32``), @len is guaranteed
++ * to be <= ``U32_MAX``.
++ */
++ if (offset + len + pool->tx_metadata_len > pool->chunk_size)
+ return false;
+
+ if (addr >= pool->addrs_cnt)
+@@ -158,27 +168,42 @@ static inline bool xp_aligned_validate_desc(struct xsk_buff_pool *pool,
+
+ if (xp_unused_options_set(desc->options))
+ return false;
++
+ return true;
+ }
+
+ static inline bool xp_unaligned_validate_desc(struct xsk_buff_pool *pool,
+ struct xdp_desc *desc)
+ {
+- u64 addr = xp_unaligned_add_offset_to_addr(desc->addr) - pool->tx_metadata_len;
+- u64 len = desc->len + pool->tx_metadata_len;
++ u64 len = desc->len;
++ u64 addr, end;
+
+- if (!desc->len)
++ if (!len)
+ return false;
+
++ /* Can't overflow: @len is guaranteed to be <= ``U32_MAX`` */
++ len += pool->tx_metadata_len;
+ if (len > pool->chunk_size)
+ return false;
+
+- if (addr >= pool->addrs_cnt || addr + len > pool->addrs_cnt ||
+- xp_desc_crosses_non_contig_pg(pool, addr, len))
++ /* Can overflow if desc->addr is close to 0 */
++ if (check_sub_overflow(xp_unaligned_add_offset_to_addr(desc->addr),
++ pool->tx_metadata_len, &addr))
++ return false;
++
++ if (addr >= pool->addrs_cnt)
++ return false;
++
++ /* Can overflow if pool->addrs_cnt is high enough */
++ if (check_add_overflow(addr, len, &end) || end > pool->addrs_cnt)
++ return false;
++
++ if (xp_desc_crosses_non_contig_pg(pool, addr, len))
+ return false;
+
+ if (xp_unused_options_set(desc->options))
+ return false;
++
+ return true;
+ }
+
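
The descriptor validation above now goes through
check_sub_overflow()/check_add_overflow(), rejecting descriptors whose
address is smaller than tx_metadata_len or whose range would run past the
pool, instead of letting the arithmetic wrap silently. Those helpers wrap the
compiler builtins; a standalone sketch with invented sizes:

    #include <inttypes.h>
    #include <stdio.h>

    int main(void)
    {
        uint64_t addr, end;
        uint64_t desc_addr = 8;             /* smaller than the metadata length */
        uint64_t metadata_len = 16;
        uint64_t len = 64, pool_size = 4096;

        /* Would wrap below zero: reject the descriptor. */
        if (__builtin_sub_overflow(desc_addr, metadata_len, &addr)) {
            puts("addr underflow, descriptor rejected");
            return 0;
        }

        /* Would wrap past UINT64_MAX or run past the pool: reject as well. */
        if (__builtin_add_overflow(addr, len, &end) || end > pool_size) {
            puts("range overflow, descriptor rejected");
            return 0;
        }

        printf("descriptor ok: [%" PRIu64 ", %" PRIu64 ")\n", addr, end);
        return 0;
    }
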
+diff --git a/rust/kernel/cpufreq.rs b/rust/kernel/cpufreq.rs
+index afc15e72a7c37a..b762ecdc22b00b 100644
+--- a/rust/kernel/cpufreq.rs
++++ b/rust/kernel/cpufreq.rs
+@@ -39,7 +39,8 @@
+ const CPUFREQ_NAME_LEN: usize = bindings::CPUFREQ_NAME_LEN as usize;
+
+ /// Default transition latency value in nanoseconds.
+-pub const ETERNAL_LATENCY_NS: u32 = bindings::CPUFREQ_ETERNAL as u32;
++pub const DEFAULT_TRANSITION_LATENCY_NS: u32 =
++ bindings::CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
+
+ /// CPU frequency driver flags.
+ pub mod flags {
+@@ -400,13 +401,13 @@ pub fn to_table(mut self) -> Result<TableBox> {
+ /// The following example demonstrates how to create a CPU frequency table.
+ ///
+ /// ```
+-/// use kernel::cpufreq::{ETERNAL_LATENCY_NS, Policy};
++/// use kernel::cpufreq::{DEFAULT_TRANSITION_LATENCY_NS, Policy};
+ ///
+ /// fn update_policy(policy: &mut Policy) {
+ /// policy
+ /// .set_dvfs_possible_from_any_cpu(true)
+ /// .set_fast_switch_possible(true)
+-/// .set_transition_latency_ns(ETERNAL_LATENCY_NS);
++/// .set_transition_latency_ns(DEFAULT_TRANSITION_LATENCY_NS);
+ ///
+ /// pr_info!("The policy details are: {:?}\n", (policy.cpu(), policy.cur()));
+ /// }
+diff --git a/scripts/Makefile.vmlinux b/scripts/Makefile.vmlinux
+index b64862dc6f08d4..ffc7b49e54f707 100644
+--- a/scripts/Makefile.vmlinux
++++ b/scripts/Makefile.vmlinux
+@@ -9,20 +9,6 @@ include $(srctree)/scripts/Makefile.lib
+
+ targets :=
+
+-ifdef CONFIG_ARCH_VMLINUX_NEEDS_RELOCS
+-vmlinux-final := vmlinux.unstripped
+-
+-quiet_cmd_strip_relocs = RSTRIP $@
+- cmd_strip_relocs = $(OBJCOPY) --remove-section='.rel*' --remove-section=!'.rel*.dyn' $< $@
+-
+-vmlinux: $(vmlinux-final) FORCE
+- $(call if_changed,strip_relocs)
+-
+-targets += vmlinux
+-else
+-vmlinux-final := vmlinux
+-endif
+-
+ %.o: %.c FORCE
+ $(call if_changed_rule,cc_o_c)
+
+@@ -61,19 +47,19 @@ targets += .builtin-dtbs-list
+
+ ifdef CONFIG_GENERIC_BUILTIN_DTB
+ targets += .builtin-dtbs.S .builtin-dtbs.o
+-$(vmlinux-final): .builtin-dtbs.o
++vmlinux.unstripped: .builtin-dtbs.o
+ endif
+
+-# vmlinux
++# vmlinux.unstripped
+ # ---------------------------------------------------------------------------
+
+ ifdef CONFIG_MODULES
+ targets += .vmlinux.export.o
+-$(vmlinux-final): .vmlinux.export.o
++vmlinux.unstripped: .vmlinux.export.o
+ endif
+
+ ifdef CONFIG_ARCH_WANTS_PRE_LINK_VMLINUX
+-$(vmlinux-final): arch/$(SRCARCH)/tools/vmlinux.arch.o
++vmlinux.unstripped: arch/$(SRCARCH)/tools/vmlinux.arch.o
+
+ arch/$(SRCARCH)/tools/vmlinux.arch.o: vmlinux.o FORCE
+ $(Q)$(MAKE) $(build)=arch/$(SRCARCH)/tools $@
+@@ -86,17 +72,36 @@ cmd_link_vmlinux = \
+ $< "$(LD)" "$(KBUILD_LDFLAGS)" "$(LDFLAGS_vmlinux)" "$@"; \
+ $(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
+
+-targets += $(vmlinux-final)
+-$(vmlinux-final): scripts/link-vmlinux.sh vmlinux.o $(KBUILD_LDS) FORCE
++targets += vmlinux.unstripped
++vmlinux.unstripped: scripts/link-vmlinux.sh vmlinux.o $(KBUILD_LDS) FORCE
+ +$(call if_changed_dep,link_vmlinux)
+ ifdef CONFIG_DEBUG_INFO_BTF
+-$(vmlinux-final): $(RESOLVE_BTFIDS)
++vmlinux.unstripped: $(RESOLVE_BTFIDS)
+ endif
+
+ ifdef CONFIG_BUILDTIME_TABLE_SORT
+-$(vmlinux-final): scripts/sorttable
++vmlinux.unstripped: scripts/sorttable
+ endif
+
++# vmlinux
++# ---------------------------------------------------------------------------
++
++remove-section-y := .modinfo
++remove-section-$(CONFIG_ARCH_VMLINUX_NEEDS_RELOCS) += '.rel*' '!.rel*.dyn'
++# for compatibility with binutils < 2.32
++# https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=c12d9fa2afe7abcbe407a00e15719e1a1350c2a7
++remove-section-$(CONFIG_ARCH_VMLINUX_NEEDS_RELOCS) += '.rel.*'
++
++# To avoid warnings: "empty loadable segment detected at ..." from GNU objcopy,
++# it is necessary to remove the PT_LOAD flag from the segment.
++quiet_cmd_strip_relocs = OBJCOPY $@
++ cmd_strip_relocs = $(OBJCOPY) $(patsubst %,--set-section-flags %=noload,$(remove-section-y)) $< $@; \
++ $(OBJCOPY) $(addprefix --remove-section=,$(remove-section-y)) $@
++
++targets += vmlinux
++vmlinux: vmlinux.unstripped FORCE
++ $(call if_changed,strip_relocs)
++
+ # modules.builtin.ranges
+ # ---------------------------------------------------------------------------
+ ifdef CONFIG_BUILTIN_MODULE_RANGES
+@@ -110,7 +115,7 @@ modules.builtin.ranges: $(srctree)/scripts/generate_builtin_ranges.awk \
+ modules.builtin vmlinux.map vmlinux.o.map FORCE
+ $(call if_changed,modules_builtin_ranges)
+
+-vmlinux.map: $(vmlinux-final)
++vmlinux.map: vmlinux.unstripped
+ @:
+
+ endif
+diff --git a/scripts/mksysmap b/scripts/mksysmap
+index 3accbdb269ac70..a607a0059d119e 100755
+--- a/scripts/mksysmap
++++ b/scripts/mksysmap
+@@ -79,6 +79,9 @@
+ / _SDA_BASE_$/d
+ / _SDA2_BASE_$/d
+
++# MODULE_INFO()
++/ __UNIQUE_ID_modinfo[0-9]*$/d
++
+ # ---------------------------------------------------------------------------
+ # Ignored patterns
+ # (symbols that contain the pattern are ignored)
+diff --git a/security/keys/trusted-keys/trusted_tpm1.c b/security/keys/trusted-keys/trusted_tpm1.c
+index 89c9798d180071..e73f2c6c817a07 100644
+--- a/security/keys/trusted-keys/trusted_tpm1.c
++++ b/security/keys/trusted-keys/trusted_tpm1.c
+@@ -7,6 +7,7 @@
+ */
+
+ #include <crypto/hash_info.h>
++#include <crypto/utils.h>
+ #include <linux/init.h>
+ #include <linux/slab.h>
+ #include <linux/parser.h>
+@@ -241,7 +242,7 @@ int TSS_checkhmac1(unsigned char *buffer,
+ if (ret < 0)
+ goto out;
+
+- if (memcmp(testhmac, authdata, SHA1_DIGEST_SIZE))
++ if (crypto_memneq(testhmac, authdata, SHA1_DIGEST_SIZE))
+ ret = -EINVAL;
+ out:
+ kfree_sensitive(sdesc);
+@@ -334,7 +335,7 @@ static int TSS_checkhmac2(unsigned char *buffer,
+ TPM_NONCE_SIZE, ononce, 1, continueflag1, 0, 0);
+ if (ret < 0)
+ goto out;
+- if (memcmp(testhmac1, authdata1, SHA1_DIGEST_SIZE)) {
++ if (crypto_memneq(testhmac1, authdata1, SHA1_DIGEST_SIZE)) {
+ ret = -EINVAL;
+ goto out;
+ }
+@@ -343,7 +344,7 @@ static int TSS_checkhmac2(unsigned char *buffer,
+ TPM_NONCE_SIZE, ononce, 1, continueflag2, 0, 0);
+ if (ret < 0)
+ goto out;
+- if (memcmp(testhmac2, authdata2, SHA1_DIGEST_SIZE))
++ if (crypto_memneq(testhmac2, authdata2, SHA1_DIGEST_SIZE))
+ ret = -EINVAL;
+ out:
+ kfree_sensitive(sdesc);
+diff --git a/sound/soc/sof/intel/hda-pcm.c b/sound/soc/sof/intel/hda-pcm.c
+index 1dd8d2092c3b4f..da6c1e7263cde1 100644
+--- a/sound/soc/sof/intel/hda-pcm.c
++++ b/sound/soc/sof/intel/hda-pcm.c
+@@ -29,6 +29,8 @@
+ #define SDnFMT_BITS(x) ((x) << 4)
+ #define SDnFMT_CHAN(x) ((x) << 0)
+
++#define HDA_MAX_PERIOD_TIME_HEADROOM 10
++
+ static bool hda_always_enable_dmi_l1;
+ module_param_named(always_enable_dmi_l1, hda_always_enable_dmi_l1, bool, 0444);
+ MODULE_PARM_DESC(always_enable_dmi_l1, "SOF HDA always enable DMI l1");
+@@ -291,19 +293,30 @@ int hda_dsp_pcm_open(struct snd_sof_dev *sdev,
+ * On playback start the DMA will transfer dsp_max_burst_size_in_ms
+ * amount of data in one initial burst to fill up the host DMA buffer.
+ * Consequent DMA burst sizes are shorter and their length can vary.
+- * To make sure that userspace allocate large enough ALSA buffer we need
+- * to place a constraint on the buffer time.
++ * To avoid immediate xrun by the initial burst we need to place
++ * constraint on the period size (via PERIOD_TIME) to cover the size of
++ * the host buffer.
++ * We need to add headroom of max 10ms as the firmware needs time to
++ * settle to the 1ms pacing and initially it can run faster for few
++ * internal periods.
+ *
+ * On capture the DMA will transfer 1ms chunks.
+- *
+- * Exact dsp_max_burst_size_in_ms constraint is racy, so set the
+- * constraint to a minimum of 2x dsp_max_burst_size_in_ms.
+ */
+- if (spcm->stream[direction].dsp_max_burst_size_in_ms)
++ if (spcm->stream[direction].dsp_max_burst_size_in_ms) {
++ unsigned int period_time = spcm->stream[direction].dsp_max_burst_size_in_ms;
++
++ /*
++ * add headroom over the maximum burst size to cover the time
++ * needed for the DMA pace to settle.
++ * Limit the headroom time to HDA_MAX_PERIOD_TIME_HEADROOM
++ */
++ period_time += min(period_time, HDA_MAX_PERIOD_TIME_HEADROOM);
++
+ snd_pcm_hw_constraint_minmax(substream->runtime,
+- SNDRV_PCM_HW_PARAM_BUFFER_TIME,
+- spcm->stream[direction].dsp_max_burst_size_in_ms * USEC_PER_MSEC * 2,
++ SNDRV_PCM_HW_PARAM_PERIOD_TIME,
++ period_time * USEC_PER_MSEC,
+ UINT_MAX);
++ }
+
+ /* binding pcm substream to hda stream */
+ substream->runtime->private_data = &dsp_stream->hstream;
+diff --git a/sound/soc/sof/intel/hda-stream.c b/sound/soc/sof/intel/hda-stream.c
+index a34f472ef1751f..9c3b3a9aaf83c9 100644
+--- a/sound/soc/sof/intel/hda-stream.c
++++ b/sound/soc/sof/intel/hda-stream.c
+@@ -1129,10 +1129,35 @@ u64 hda_dsp_get_stream_llp(struct snd_sof_dev *sdev,
+ struct snd_soc_component *component,
+ struct snd_pcm_substream *substream)
+ {
+- struct hdac_stream *hstream = substream->runtime->private_data;
+- struct hdac_ext_stream *hext_stream = stream_to_hdac_ext_stream(hstream);
++ struct snd_soc_pcm_runtime *rtd = snd_soc_substream_to_rtd(substream);
++ struct snd_soc_pcm_runtime *be_rtd = NULL;
++ struct hdac_ext_stream *hext_stream;
++ struct snd_soc_dai *cpu_dai;
++ struct snd_soc_dpcm *dpcm;
+ u32 llp_l, llp_u;
+
++ /*
++ * The LLP needs to be read from the Link DMA used for this FE as it is
++ * allowed to use any combination of Link and Host channels
++ */
++ for_each_dpcm_be(rtd, substream->stream, dpcm) {
++ if (dpcm->fe != rtd)
++ continue;
++
++ be_rtd = dpcm->be;
++ }
++
++ if (!be_rtd)
++ return 0;
++
++ cpu_dai = snd_soc_rtd_to_cpu(be_rtd, 0);
++ if (!cpu_dai)
++ return 0;
++
++ hext_stream = snd_soc_dai_get_dma_data(cpu_dai, substream);
++ if (!hext_stream)
++ return 0;
++
+ /*
+ * The pplc_addr have been calculated during probe in
+ * hda_dsp_stream_init():
+diff --git a/sound/soc/sof/ipc4-topology.c b/sound/soc/sof/ipc4-topology.c
+index c93db452bbc07a..16053d224dcdb3 100644
+--- a/sound/soc/sof/ipc4-topology.c
++++ b/sound/soc/sof/ipc4-topology.c
+@@ -623,8 +623,13 @@ static int sof_ipc4_widget_setup_pcm(struct snd_sof_widget *swidget)
+ swidget->tuples,
+ swidget->num_tuples, sizeof(u32), 1);
+ /* Set default DMA buffer size if it is not specified in topology */
+- if (!sps->dsp_max_burst_size_in_ms)
+- sps->dsp_max_burst_size_in_ms = SOF_IPC4_MIN_DMA_BUFFER_SIZE;
++ if (!sps->dsp_max_burst_size_in_ms) {
++ struct snd_sof_widget *pipe_widget = swidget->spipe->pipe_widget;
++ struct sof_ipc4_pipeline *pipeline = pipe_widget->private;
++
++ sps->dsp_max_burst_size_in_ms = pipeline->use_chain_dma ?
++ SOF_IPC4_CHAIN_DMA_BUFFER_SIZE : SOF_IPC4_MIN_DMA_BUFFER_SIZE;
++ }
+ } else {
+ /* Capture data is copied from DSP to host in 1ms bursts */
+ spcm->stream[dir].dsp_max_burst_size_in_ms = 1;
+diff --git a/sound/soc/sof/ipc4-topology.h b/sound/soc/sof/ipc4-topology.h
+index 659e1ae0a85f95..2a2afd0e83338c 100644
+--- a/sound/soc/sof/ipc4-topology.h
++++ b/sound/soc/sof/ipc4-topology.h
+@@ -61,8 +61,11 @@
+ #define SOF_IPC4_CHAIN_DMA_NODE_ID 0x7fffffff
+ #define SOF_IPC4_INVALID_NODE_ID 0xffffffff
+
+-/* FW requires minimum 2ms DMA buffer size */
+-#define SOF_IPC4_MIN_DMA_BUFFER_SIZE 2
++/* FW requires minimum 4ms DMA buffer size */
++#define SOF_IPC4_MIN_DMA_BUFFER_SIZE 4
++
++/* ChainDMA in fw uses 5ms DMA buffer */
++#define SOF_IPC4_CHAIN_DMA_BUFFER_SIZE 5
+
+ /*
+ * The base of multi-gateways. Multi-gateways addressing starts from
+diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
+index b41a42818d8ac2..bd615a708a0aa8 100644
+--- a/tools/build/feature/Makefile
++++ b/tools/build/feature/Makefile
+@@ -316,10 +316,10 @@ $(OUTPUT)test-libcapstone.bin:
+ $(BUILD) # -lcapstone provided by $(FEATURE_CHECK_LDFLAGS-libcapstone)
+
+ $(OUTPUT)test-compile-32.bin:
+- $(CC) -m32 -o $@ test-compile.c
++ $(CC) -m32 -Wall -Werror -o $@ test-compile.c
+
+ $(OUTPUT)test-compile-x32.bin:
+- $(CC) -mx32 -o $@ test-compile.c
++ $(CC) -mx32 -Wall -Werror -o $@ test-compile.c
+
+ $(OUTPUT)test-zlib.bin:
+ $(BUILD) -lz
+diff --git a/tools/lib/perf/include/perf/event.h b/tools/lib/perf/include/perf/event.h
+index 6608f1e3701b43..aa1e91c97a226e 100644
+--- a/tools/lib/perf/include/perf/event.h
++++ b/tools/lib/perf/include/perf/event.h
+@@ -291,6 +291,7 @@ struct perf_record_header_event_type {
+ struct perf_record_header_tracing_data {
+ struct perf_event_header header;
+ __u32 size;
++ __u32 pad;
+ };
+
+ #define PERF_RECORD_MISC_BUILD_ID_SIZE (1 << 15)
+diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
+index e2150acc2c1332..f561025d4085e7 100644
+--- a/tools/perf/Makefile.perf
++++ b/tools/perf/Makefile.perf
+@@ -941,7 +941,7 @@ $(OUTPUT)dlfilters/%.so: $(OUTPUT)dlfilters/%.o
+ ifndef NO_JVMTI
+ LIBJVMTI_IN := $(OUTPUT)jvmti/jvmti-in.o
+
+-$(LIBJVMTI_IN): FORCE
++$(LIBJVMTI_IN): prepare FORCE
+ $(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=jvmti obj=jvmti
+
+ $(OUTPUT)$(LIBJVMTI): $(LIBJVMTI_IN)
+diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
+index fe737b3ac6e67d..25c41b89f8abbe 100644
+--- a/tools/perf/builtin-trace.c
++++ b/tools/perf/builtin-trace.c
+@@ -4440,7 +4440,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
+
+ if (trace->summary_mode == SUMMARY__BY_TOTAL && !trace->summary_bpf) {
+ trace->syscall_stats = alloc_syscall_stats();
+- if (trace->syscall_stats == NULL)
++ if (IS_ERR(trace->syscall_stats))
+ goto out_delete_evlist;
+ }
+
+@@ -4748,7 +4748,7 @@ static int trace__replay(struct trace *trace)
+
+ if (trace->summary_mode == SUMMARY__BY_TOTAL) {
+ trace->syscall_stats = alloc_syscall_stats();
+- if (trace->syscall_stats == NULL)
++ if (IS_ERR(trace->syscall_stats))
+ goto out;
+ }
+
+diff --git a/tools/perf/perf.h b/tools/perf/perf.h
+index 3cb40965549f5e..e004178472d9ef 100644
+--- a/tools/perf/perf.h
++++ b/tools/perf/perf.h
+@@ -2,9 +2,7 @@
+ #ifndef _PERF_PERF_H
+ #define _PERF_PERF_H
+
+-#ifndef MAX_NR_CPUS
+ #define MAX_NR_CPUS 4096
+-#endif
+
+ enum perf_affinity {
+ PERF_AFFINITY_SYS = 0,
+diff --git a/tools/perf/pmu-events/arch/arm64/ampere/ampereonex/metrics.json b/tools/perf/pmu-events/arch/arm64/ampere/ampereonex/metrics.json
+index 5228f94a793f95..6817cac149e0bc 100644
+--- a/tools/perf/pmu-events/arch/arm64/ampere/ampereonex/metrics.json
++++ b/tools/perf/pmu-events/arch/arm64/ampere/ampereonex/metrics.json
+@@ -113,7 +113,7 @@
+ {
+ "MetricName": "load_store_spec_rate",
+ "MetricExpr": "LDST_SPEC / INST_SPEC",
+- "BriefDescription": "The rate of load or store instructions speculatively executed to overall instructions speclatively executed",
++ "BriefDescription": "The rate of load or store instructions speculatively executed to overall instructions speculatively executed",
+ "MetricGroup": "Operation_Mix",
+ "ScaleUnit": "100percent of operations"
+ },
+@@ -132,7 +132,7 @@
+ {
+ "MetricName": "pc_write_spec_rate",
+ "MetricExpr": "PC_WRITE_SPEC / INST_SPEC",
+- "BriefDescription": "The rate of software change of the PC speculatively executed to overall instructions speclatively executed",
++ "BriefDescription": "The rate of software change of the PC speculatively executed to overall instructions speculatively executed",
+ "MetricGroup": "Operation_Mix",
+ "ScaleUnit": "100percent of operations"
+ },
+@@ -195,14 +195,14 @@
+ {
+ "MetricName": "stall_frontend_cache_rate",
+ "MetricExpr": "STALL_FRONTEND_CACHE / CPU_CYCLES",
+- "BriefDescription": "Proportion of cycles stalled and no ops delivered from frontend and cache miss",
++ "BriefDescription": "Proportion of cycles stalled and no operations delivered from frontend and cache miss",
+ "MetricGroup": "Stall",
+ "ScaleUnit": "100percent of cycles"
+ },
+ {
+ "MetricName": "stall_frontend_tlb_rate",
+ "MetricExpr": "STALL_FRONTEND_TLB / CPU_CYCLES",
+- "BriefDescription": "Proportion of cycles stalled and no ops delivered from frontend and TLB miss",
++ "BriefDescription": "Proportion of cycles stalled and no operations delivered from frontend and TLB miss",
+ "MetricGroup": "Stall",
+ "ScaleUnit": "100percent of cycles"
+ },
+@@ -391,7 +391,7 @@
+ "ScaleUnit": "100percent of cache acceses"
+ },
+ {
+- "MetricName": "l1d_cache_access_prefetces",
++ "MetricName": "l1d_cache_access_prefetches",
+ "MetricExpr": "L1D_CACHE_PRFM / L1D_CACHE",
+ "BriefDescription": "L1D cache access - prefetch",
+ "MetricGroup": "Cache",
+diff --git a/tools/perf/tests/perf-record.c b/tools/perf/tests/perf-record.c
+index 0b3c37e668717c..8c79b5166a0581 100644
+--- a/tools/perf/tests/perf-record.c
++++ b/tools/perf/tests/perf-record.c
+@@ -115,6 +115,7 @@ static int test__PERF_RECORD(struct test_suite *test __maybe_unused, int subtest
+ if (err < 0) {
+ pr_debug("sched__get_first_possible_cpu: %s\n",
+ str_error_r(errno, sbuf, sizeof(sbuf)));
++ evlist__cancel_workload(evlist);
+ goto out_delete_evlist;
+ }
+
+@@ -126,6 +127,7 @@ static int test__PERF_RECORD(struct test_suite *test __maybe_unused, int subtest
+ if (sched_setaffinity(evlist->workload.pid, cpu_mask_size, &cpu_mask) < 0) {
+ pr_debug("sched_setaffinity: %s\n",
+ str_error_r(errno, sbuf, sizeof(sbuf)));
++ evlist__cancel_workload(evlist);
+ goto out_delete_evlist;
+ }
+
+@@ -137,6 +139,7 @@ static int test__PERF_RECORD(struct test_suite *test __maybe_unused, int subtest
+ if (err < 0) {
+ pr_debug("perf_evlist__open: %s\n",
+ str_error_r(errno, sbuf, sizeof(sbuf)));
++ evlist__cancel_workload(evlist);
+ goto out_delete_evlist;
+ }
+
+@@ -149,6 +152,7 @@ static int test__PERF_RECORD(struct test_suite *test __maybe_unused, int subtest
+ if (err < 0) {
+ pr_debug("evlist__mmap: %s\n",
+ str_error_r(errno, sbuf, sizeof(sbuf)));
++ evlist__cancel_workload(evlist);
+ goto out_delete_evlist;
+ }
+
+diff --git a/tools/perf/tests/shell/amd-ibs-swfilt.sh b/tools/perf/tests/shell/amd-ibs-swfilt.sh
+index 7045ec72ba4cff..e7f66df05c4b1b 100755
+--- a/tools/perf/tests/shell/amd-ibs-swfilt.sh
++++ b/tools/perf/tests/shell/amd-ibs-swfilt.sh
+@@ -1,6 +1,10 @@
+ #!/bin/bash
+ # AMD IBS software filtering
+
++ParanoidAndNotRoot() {
++ [ "$(id -u)" != 0 ] && [ "$(cat /proc/sys/kernel/perf_event_paranoid)" -gt $1 ]
++}
++
+ echo "check availability of IBS swfilt"
+
+ # check if IBS PMU is available
+@@ -16,6 +20,7 @@ if [ ! -f /sys/bus/event_source/devices/ibs_op/format/swfilt ]; then
+ fi
+
+ echo "run perf record with modifier and swfilt"
++err=0
+
+ # setting any modifiers should fail
+ perf record -B -e ibs_op//u -o /dev/null true 2> /dev/null
+@@ -31,11 +36,17 @@ if [ $? -ne 0 ]; then
+ exit 1
+ fi
+
+-# setting it with swfilt=1 should be fine
+-perf record -B -e ibs_op/swfilt=1/k -o /dev/null true
+-if [ $? -ne 0 ]; then
+- echo "[FAIL] IBS op PMU cannot handle swfilt for exclude_user"
+- exit 1
++if ! ParanoidAndNotRoot 1
++then
++ # setting it with swfilt=1 should be fine
++ perf record -B -e ibs_op/swfilt=1/k -o /dev/null true
++ if [ $? -ne 0 ]; then
++ echo "[FAIL] IBS op PMU cannot handle swfilt for exclude_user"
++ exit 1
++ fi
++else
++ echo "[SKIP] not root and perf_event_paranoid too high for exclude_user"
++ err=2
+ fi
+
+ # check ibs_fetch PMU as well
+@@ -46,10 +57,16 @@ if [ $? -ne 0 ]; then
+ fi
+
+ # check system wide recording
+-perf record -aB --synth=no -e ibs_op/swfilt/k -o /dev/null true
+-if [ $? -ne 0 ]; then
+- echo "[FAIL] IBS op PMU cannot handle swfilt in system-wide mode"
+- exit 1
++if ! ParanoidAndNotRoot 0
++then
++ perf record -aB --synth=no -e ibs_op/swfilt/k -o /dev/null true
++ if [ $? -ne 0 ]; then
++ echo "[FAIL] IBS op PMU cannot handle swfilt in system-wide mode"
++ exit 1
++ fi
++else
++ echo "[SKIP] not root and perf_event_paranoid too high for system-wide/exclude_user"
++ err=2
+ fi
+
+ echo "check number of samples with swfilt"
+@@ -60,8 +77,16 @@ if [ ${kernel_sample} -ne 0 ]; then
+ exit 1
+ fi
+
+-user_sample=$(perf record -e ibs_fetch/swfilt/k -o- true | perf script -i- -F misc | grep -c ^U)
+-if [ ${user_sample} -ne 0 ]; then
+- echo "[FAIL] unexpected user samples: " ${user_sample}
+- exit 1
++if ! ParanoidAndNotRoot 1
++then
++ user_sample=$(perf record -e ibs_fetch/swfilt/k -o- true | perf script -i- -F misc | grep -c ^U)
++ if [ ${user_sample} -ne 0 ]; then
++ echo "[FAIL] unexpected user samples: " ${user_sample}
++ exit 1
++ fi
++else
++ echo "[SKIP] not root and perf_event_paranoid too high for exclude_user"
++ err=2
+ fi
++
++exit $err
+diff --git a/tools/perf/tests/shell/record_lbr.sh b/tools/perf/tests/shell/record_lbr.sh
+index 6fcb5e52b9b4fc..78a02e90ece1e6 100755
+--- a/tools/perf/tests/shell/record_lbr.sh
++++ b/tools/perf/tests/shell/record_lbr.sh
+@@ -4,6 +4,10 @@
+
+ set -e
+
++ParanoidAndNotRoot() {
++ [ "$(id -u)" != 0 ] && [ "$(cat /proc/sys/kernel/perf_event_paranoid)" -gt $1 ]
++}
++
+ if [ ! -f /sys/bus/event_source/devices/cpu/caps/branches ] &&
+ [ ! -f /sys/bus/event_source/devices/cpu_core/caps/branches ]
+ then
+@@ -23,6 +27,7 @@ cleanup() {
+ }
+
+ trap_cleanup() {
++ echo "Unexpected signal in ${FUNCNAME[1]}"
+ cleanup
+ exit 1
+ }
+@@ -123,8 +128,11 @@ lbr_test "-j ind_call" "any indirect call" 2
+ lbr_test "-j ind_jmp" "any indirect jump" 100
+ lbr_test "-j call" "direct calls" 2
+ lbr_test "-j ind_call,u" "any indirect user call" 100
+-lbr_test "-a -b" "system wide any branch" 2
+-lbr_test "-a -j any_call" "system wide any call" 2
++if ! ParanoidAndNotRoot 1
++then
++ lbr_test "-a -b" "system wide any branch" 2
++ lbr_test "-a -j any_call" "system wide any call" 2
++fi
+
+ # Parallel
+ parallel_lbr_test "-b" "parallel any branch" 100 &
+@@ -141,10 +149,16 @@ parallel_lbr_test "-j call" "parallel direct calls" 100 &
+ pid6=$!
+ parallel_lbr_test "-j ind_call,u" "parallel any indirect user call" 100 &
+ pid7=$!
+-parallel_lbr_test "-a -b" "parallel system wide any branch" 100 &
+-pid8=$!
+-parallel_lbr_test "-a -j any_call" "parallel system wide any call" 100 &
+-pid9=$!
++if ParanoidAndNotRoot 1
++then
++ pid8=
++ pid9=
++else
++ parallel_lbr_test "-a -b" "parallel system wide any branch" 100 &
++ pid8=$!
++ parallel_lbr_test "-a -j any_call" "parallel system wide any call" 100 &
++ pid9=$!
++fi
+
+ for pid in $pid1 $pid2 $pid3 $pid4 $pid5 $pid6 $pid7 $pid8 $pid9
+ do
+diff --git a/tools/perf/tests/shell/stat+event_uniquifying.sh b/tools/perf/tests/shell/stat+event_uniquifying.sh
+index bf54bd6c3e2e61..b5dec6b6da3693 100755
+--- a/tools/perf/tests/shell/stat+event_uniquifying.sh
++++ b/tools/perf/tests/shell/stat+event_uniquifying.sh
+@@ -4,74 +4,63 @@
+
+ set -e
+
+-stat_output=$(mktemp /tmp/__perf_test.stat_output.XXXXX)
+-perf_tool=perf
+ err=0
++stat_output=$(mktemp /tmp/__perf_test.stat_output.XXXXX)
+
+-test_event_uniquifying() {
+- # We use `clockticks` in `uncore_imc` to verify the uniquify behavior.
+- pmu="uncore_imc"
+- event="clockticks"
+-
+- # If the `-A` option is added, the event should be uniquified.
+- #
+- # $perf list -v clockticks
+- #
+- # List of pre-defined events (to be used in -e or -M):
+- #
+- # uncore_imc_0/clockticks/ [Kernel PMU event]
+- # uncore_imc_1/clockticks/ [Kernel PMU event]
+- # uncore_imc_2/clockticks/ [Kernel PMU event]
+- # uncore_imc_3/clockticks/ [Kernel PMU event]
+- # uncore_imc_4/clockticks/ [Kernel PMU event]
+- # uncore_imc_5/clockticks/ [Kernel PMU event]
+- #
+- # ...
+- #
+- # $perf stat -e clockticks -A -- true
+- #
+- # Performance counter stats for 'system wide':
+- #
+- # CPU0 3,773,018 uncore_imc_0/clockticks/
+- # CPU0 3,609,025 uncore_imc_1/clockticks/
+- # CPU0 0 uncore_imc_2/clockticks/
+- # CPU0 3,230,009 uncore_imc_3/clockticks/
+- # CPU0 3,049,897 uncore_imc_4/clockticks/
+- # CPU0 0 uncore_imc_5/clockticks/
+- #
+- # 0.002029828 seconds time elapsed
+-
+- echo "stat event uniquifying test"
+- uniquified_event_array=()
++cleanup() {
++ rm -f "${stat_output}"
+
+- # Skip if the machine does not have `uncore_imc` device.
+- if ! ${perf_tool} list pmu | grep -q ${pmu}; then
+- echo "Target does not support PMU ${pmu} [Skipped]"
+- err=2
+- return
+- fi
++ trap - EXIT TERM INT
++}
+
+- # Check how many uniquified events.
+- while IFS= read -r line; do
+- uniquified_event=$(echo "$line" | awk '{print $1}')
+- uniquified_event_array+=("${uniquified_event}")
+- done < <(${perf_tool} list -v ${event} | grep ${pmu})
++trap_cleanup() {
++ echo "Unexpected signal in ${FUNCNAME[1]}"
++ cleanup
++ exit 1
++}
++trap trap_cleanup EXIT TERM INT
+
+- perf_command="${perf_tool} stat -e $event -A -o ${stat_output} -- true"
+- $perf_command
++test_event_uniquifying() {
++ echo "Uniquification of PMU sysfs events test"
+
+- # Check the output contains all uniquified events.
+- for uniquified_event in "${uniquified_event_array[@]}"; do
+- if ! cat "${stat_output}" | grep -q "${uniquified_event}"; then
+- echo "Event is not uniquified [Failed]"
+- echo "${perf_command}"
+- cat "${stat_output}"
+- err=1
+- break
+- fi
++ # Read events from perf list with and without -v. With -v the duplicate PMUs
++ # aren't deduplicated. Note, json events are listed by perf list without a
++ # PMU.
++ read -ra pmu_events <<< "$(perf list --raw pmu)"
++ read -ra pmu_v_events <<< "$(perf list -v --raw pmu)"
++ # For all non-deduplicated events.
++ for pmu_v_event in "${pmu_v_events[@]}"; do
++ # If the event matches an event in the deduplicated events then it mustn't
++ # be an event with duplicate PMUs, continue the outer loop.
++ for pmu_event in "${pmu_events[@]}"; do
++ if [[ "$pmu_v_event" == "$pmu_event" ]]; then
++ continue 2
++ fi
++ done
++ # Strip the suffix from the non-deduplicated event's PMU.
++ event=$(echo "$pmu_v_event" | sed -E 's/_[0-9]+//')
++ for pmu_event in "${pmu_events[@]}"; do
++ if [[ "$event" == "$pmu_event" ]]; then
++ echo "Testing event ${event} is uniquified to ${pmu_v_event}"
++ if ! perf stat -e "$event" -A -o ${stat_output} -- true; then
++ echo "Error running perf stat for event '$event' [Skip]"
++ if [ $err = 0 ]; then
++ err=2
++ fi
++ continue
++ fi
++ # Ensure the non-deduplicated event appears in the output.
++ if ! grep -q "${pmu_v_event}" "${stat_output}"; then
++ echo "Uniquification of PMU sysfs events test [Failed]"
++ cat "${stat_output}"
++ err=1
++ fi
++ break
++ fi
++ done
+ done
+ }
+
+ test_event_uniquifying
+-rm -f "${stat_output}"
++cleanup
+ exit $err
+diff --git a/tools/perf/tests/shell/trace_btf_enum.sh b/tools/perf/tests/shell/trace_btf_enum.sh
+index 572001d75d7815..03e9f680a4a690 100755
+--- a/tools/perf/tests/shell/trace_btf_enum.sh
++++ b/tools/perf/tests/shell/trace_btf_enum.sh
+@@ -23,6 +23,14 @@ check_vmlinux() {
+ fi
+ }
+
++check_permissions() {
++ if perf trace -e $syscall $TESTPROG 2>&1 | grep -q "Operation not permitted"
++ then
++ echo "trace+enum test [Skipped permissions]"
++ err=2
++ fi
++}
++
+ trace_landlock() {
+ echo "Tracing syscall ${syscall}"
+
+@@ -56,6 +64,9 @@ trace_non_syscall() {
+ }
+
+ check_vmlinux
++if [ $err = 0 ]; then
++ check_permissions
++fi
+
+ if [ $err = 0 ]; then
+ trace_landlock
+diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
+index 8942fa598a84fb..3086dad92965af 100644
+--- a/tools/perf/util/arm-spe.c
++++ b/tools/perf/util/arm-spe.c
+@@ -670,8 +670,8 @@ static void arm_spe__synth_data_source_common(const struct arm_spe_record *recor
+ * socket
+ */
+ case ARM_SPE_COMMON_DS_REMOTE:
+- data_src->mem_lvl = PERF_MEM_LVL_REM_CCE1;
+- data_src->mem_lvl_num = PERF_MEM_LVLNUM_ANY_CACHE;
++ data_src->mem_lvl = PERF_MEM_LVL_NA;
++ data_src->mem_lvl_num = PERF_MEM_LVLNUM_NA;
+ data_src->mem_remote = PERF_MEM_REMOTE_REMOTE;
+ data_src->mem_snoopx = PERF_MEM_SNOOPX_PEER;
+ break;
+@@ -839,7 +839,7 @@ static void arm_spe__synth_memory_level(const struct arm_spe_record *record,
+ }
+
+ if (record->type & ARM_SPE_REMOTE_ACCESS)
+- data_src->mem_lvl |= PERF_MEM_LVL_REM_CCE1;
++ data_src->mem_remote = PERF_MEM_REMOTE_REMOTE;
+ }
+
+ static bool arm_spe__synth_ds(struct arm_spe_queue *speq,
+diff --git a/tools/perf/util/bpf-filter.c b/tools/perf/util/bpf-filter.c
+index a0b11f35395f81..92308c38fbb567 100644
+--- a/tools/perf/util/bpf-filter.c
++++ b/tools/perf/util/bpf-filter.c
+@@ -443,6 +443,10 @@ static int create_idx_hash(struct evsel *evsel, struct perf_bpf_filter_entry *en
+ return -1;
+ }
+
++#define LIBBPF_CURRENT_VERSION_GEQ(major, minor) \
++ (LIBBPF_MAJOR_VERSION > (major) || \
++ (LIBBPF_MAJOR_VERSION == (major) && LIBBPF_MINOR_VERSION >= (minor)))
++
+ int perf_bpf_filter__prepare(struct evsel *evsel, struct target *target)
+ {
+ int i, x, y, fd, ret;
+@@ -451,8 +455,12 @@ int perf_bpf_filter__prepare(struct evsel *evsel, struct target *target)
+ struct bpf_link *link;
+ struct perf_bpf_filter_entry *entry;
+ bool needs_idx_hash = !target__has_cpu(target);
++#if LIBBPF_CURRENT_VERSION_GEQ(1, 7)
+ DECLARE_LIBBPF_OPTS(bpf_perf_event_opts, pe_opts,
+ .dont_enable = true);
++#else
++ DECLARE_LIBBPF_OPTS(bpf_perf_event_opts, pe_opts);
++#endif
+
+ entry = calloc(MAX_FILTERS, sizeof(*entry));
+ if (entry == NULL)
+diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
+index 73fcafbffc6a66..ed88ba570c80ae 100644
+--- a/tools/perf/util/bpf_counter.c
++++ b/tools/perf/util/bpf_counter.c
+@@ -278,6 +278,7 @@ static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu_map_idx
+ {
+ struct bpf_prog_profiler_bpf *skel;
+ struct bpf_counter *counter;
++ int cpu = perf_cpu_map__cpu(evsel->core.cpus, cpu_map_idx).cpu;
+ int ret;
+
+ list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
+@@ -285,7 +286,7 @@ static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu_map_idx
+ assert(skel != NULL);
+
+ ret = bpf_map_update_elem(bpf_map__fd(skel->maps.events),
+- &cpu_map_idx, &fd, BPF_ANY);
++ &cpu, &fd, BPF_ANY);
+ if (ret)
+ return ret;
+ }
+@@ -393,7 +394,6 @@ static int bperf_check_target(struct evsel *evsel,
+ return 0;
+ }
+
+-static struct perf_cpu_map *all_cpu_map;
+ static __u32 filter_entry_cnt;
+
+ static int bperf_reload_leader_program(struct evsel *evsel, int attr_map_fd,
+@@ -437,7 +437,7 @@ static int bperf_reload_leader_program(struct evsel *evsel, int attr_map_fd,
+ * following evsel__open_per_cpu call
+ */
+ evsel->leader_skel = skel;
+- evsel__open_per_cpu(evsel, all_cpu_map, -1);
++ evsel__open(evsel, evsel->core.cpus, evsel->core.threads);
+
+ out:
+ bperf_leader_bpf__destroy(skel);
+@@ -475,12 +475,6 @@ static int bperf__load(struct evsel *evsel, struct target *target)
+ if (bperf_check_target(evsel, target, &filter_type, &filter_entry_cnt))
+ return -1;
+
+- if (!all_cpu_map) {
+- all_cpu_map = perf_cpu_map__new_online_cpus();
+- if (!all_cpu_map)
+- return -1;
+- }
+-
+ evsel->bperf_leader_prog_fd = -1;
+ evsel->bperf_leader_link_fd = -1;
+
+@@ -598,9 +592,10 @@ static int bperf__load(struct evsel *evsel, struct target *target)
+ static int bperf__install_pe(struct evsel *evsel, int cpu_map_idx, int fd)
+ {
+ struct bperf_leader_bpf *skel = evsel->leader_skel;
++ int cpu = perf_cpu_map__cpu(evsel->core.cpus, cpu_map_idx).cpu;
+
+ return bpf_map_update_elem(bpf_map__fd(skel->maps.events),
+- &cpu_map_idx, &fd, BPF_ANY);
++ &cpu, &fd, BPF_ANY);
+ }
+
+ /*
+@@ -609,13 +604,12 @@ static int bperf__install_pe(struct evsel *evsel, int cpu_map_idx, int fd)
+ */
+ static int bperf_sync_counters(struct evsel *evsel)
+ {
+- int num_cpu, i, cpu;
++ struct perf_cpu cpu;
++ int idx;
++
++ perf_cpu_map__for_each_cpu(cpu, idx, evsel->core.cpus)
++ bperf_trigger_reading(evsel->bperf_leader_prog_fd, cpu.cpu);
+
+- num_cpu = perf_cpu_map__nr(all_cpu_map);
+- for (i = 0; i < num_cpu; i++) {
+- cpu = perf_cpu_map__cpu(all_cpu_map, i).cpu;
+- bperf_trigger_reading(evsel->bperf_leader_prog_fd, cpu);
+- }
+ return 0;
+ }
+
+diff --git a/tools/perf/util/bpf_counter_cgroup.c b/tools/perf/util/bpf_counter_cgroup.c
+index 6ff42619de12bd..883ce8a670bcd8 100644
+--- a/tools/perf/util/bpf_counter_cgroup.c
++++ b/tools/perf/util/bpf_counter_cgroup.c
+@@ -185,7 +185,8 @@ static int bperf_cgrp__load(struct evsel *evsel,
+ }
+
+ static int bperf_cgrp__install_pe(struct evsel *evsel __maybe_unused,
+- int cpu __maybe_unused, int fd __maybe_unused)
++ int cpu_map_idx __maybe_unused,
++ int fd __maybe_unused)
+ {
+ /* nothing to do */
+ return 0;
+diff --git a/tools/perf/util/bpf_skel/kwork_top.bpf.c b/tools/perf/util/bpf_skel/kwork_top.bpf.c
+index 73e32e06303015..6673386302e2fd 100644
+--- a/tools/perf/util/bpf_skel/kwork_top.bpf.c
++++ b/tools/perf/util/bpf_skel/kwork_top.bpf.c
+@@ -18,9 +18,7 @@ enum kwork_class_type {
+ };
+
+ #define MAX_ENTRIES 102400
+-#ifndef MAX_NR_CPUS
+ #define MAX_NR_CPUS 4096
+-#endif
+ #define PF_KTHREAD 0x00200000
+ #define MAX_COMMAND_LEN 16
+
+diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
+index bf7f3268b9a2f3..35505a1ffd1117 100644
+--- a/tools/perf/util/build-id.c
++++ b/tools/perf/util/build-id.c
+@@ -86,6 +86,13 @@ int build_id__snprintf(const struct build_id *build_id, char *bf, size_t bf_size
+ {
+ size_t offs = 0;
+
++ if (build_id->size == 0) {
++ /* Ensure bf is always \0 terminated. */
++ if (bf_size > 0)
++ bf[0] = '\0';
++ return 0;
++ }
++
+ for (size_t i = 0; i < build_id->size && offs < bf_size; ++i)
+ offs += snprintf(bf + offs, bf_size - offs, "%02x", build_id->data[i]);
+
+diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
+index b1e4919d016f1c..e257bd918c8914 100644
+--- a/tools/perf/util/disasm.c
++++ b/tools/perf/util/disasm.c
+@@ -390,13 +390,16 @@ static int jump__parse(struct arch *arch, struct ins_operands *ops, struct map_s
+ * skip over possible up to 2 operands to get to address, e.g.:
+ * tbnz w0, #26, ffff0000083cd190 <security_file_permission+0xd0>
+ */
+- if (c++ != NULL) {
++ if (c != NULL) {
++ c++;
+ ops->target.addr = strtoull(c, NULL, 16);
+ if (!ops->target.addr) {
+ c = strchr(c, ',');
+ c = validate_comma(c, ops);
+- if (c++ != NULL)
++ if (c != NULL) {
++ c++;
+ ops->target.addr = strtoull(c, NULL, 16);
++ }
+ }
+ } else {
+ ops->target.addr = strtoull(ops->raw, NULL, 16);
+diff --git a/tools/perf/util/drm_pmu.c b/tools/perf/util/drm_pmu.c
+index 988890f37ba7a4..98d4d2b556d4ed 100644
+--- a/tools/perf/util/drm_pmu.c
++++ b/tools/perf/util/drm_pmu.c
+@@ -458,8 +458,10 @@ static int for_each_drm_fdinfo_in_dir(int (*cb)(void *args, int fdinfo_dir_fd, c
+ }
+ ret = cb(args, fdinfo_dir_fd, fd_entry->d_name);
+ if (ret)
+- return ret;
++ goto close_fdinfo;
+ }
++
++close_fdinfo:
+ if (fdinfo_dir_fd != -1)
+ close(fdinfo_dir_fd);
+ closedir(fd_dir);
+diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
+index d264c143b59250..5df59812b80ca2 100644
+--- a/tools/perf/util/evsel.c
++++ b/tools/perf/util/evsel.c
+@@ -3562,7 +3562,7 @@ bool evsel__fallback(struct evsel *evsel, struct target *target, int err,
+
+ /* If event has exclude user then don't exclude kernel. */
+ if (evsel->core.attr.exclude_user)
+- return false;
++ goto no_fallback;
+
+ /* Is there already the separator in the name. */
+ if (strchr(name, '/') ||
+@@ -3570,7 +3570,7 @@ bool evsel__fallback(struct evsel *evsel, struct target *target, int err,
+ sep = "";
+
+ if (asprintf(&new_name, "%s%su", name, sep) < 0)
+- return false;
++ goto no_fallback;
+
+ free(evsel->name);
+ evsel->name = new_name;
+@@ -3593,17 +3593,19 @@ bool evsel__fallback(struct evsel *evsel, struct target *target, int err,
+ sep = "";
+
+ if (asprintf(&new_name, "%s%sH", name, sep) < 0)
+- return false;
++ goto no_fallback;
+
+ free(evsel->name);
+ evsel->name = new_name;
+ /* Apple M1 requires exclude_guest */
+- scnprintf(msg, msgsize, "trying to fall back to excluding guest samples");
++ scnprintf(msg, msgsize, "Trying to fall back to excluding guest samples");
+ evsel->core.attr.exclude_guest = 1;
+
+ return true;
+ }
+-
++no_fallback:
++ scnprintf(msg, msgsize, "No fallback found for '%s' for error %d",
++ evsel__name(evsel), err);
+ return false;
+ }
+
+@@ -3935,6 +3937,8 @@ bool evsel__is_hybrid(const struct evsel *evsel)
+
+ struct evsel *evsel__leader(const struct evsel *evsel)
+ {
++ if (evsel->core.leader == NULL)
++ return NULL;
+ return container_of(evsel->core.leader, struct evsel, core);
+ }
+
+@@ -4048,9 +4052,9 @@ bool evsel__set_needs_uniquify(struct evsel *counter, const struct perf_stat_con
+
+ void evsel__uniquify_counter(struct evsel *counter)
+ {
+- const char *name, *pmu_name;
+- char *new_name, *config;
+- int ret;
++ const char *name, *pmu_name, *config;
++ char *new_name;
++ int len, ret;
+
+ /* No uniquification necessary. */
+ if (!counter->needs_uniquify)
+@@ -4064,15 +4068,23 @@ void evsel__uniquify_counter(struct evsel *counter)
+ counter->uniquified_name = true;
+
+ name = evsel__name(counter);
++ config = strchr(name, '/');
+ pmu_name = counter->pmu->name;
+- /* Already prefixed by the PMU name. */
+- if (!strncmp(name, pmu_name, strlen(pmu_name)))
+- return;
+
+- config = strchr(name, '/');
+- if (config) {
+- int len = config - name;
++ /* Already prefixed by the PMU name? */
++ len = pmu_name_len_no_suffix(pmu_name);
++
++ if (!strncmp(name, pmu_name, len)) {
++ /*
++ * If the PMU name is there, then there is no sense in not
++ * having a slash. Do this for robustness.
++ */
++ if (config == NULL)
++ config = name - 1;
+
++ ret = asprintf(&new_name, "%s/%s", pmu_name, config + 1);
++ } else if (config) {
++ len = config - name;
+ if (config[1] == '/') {
+ /* case: event// */
+ ret = asprintf(&new_name, "%s/%.*s/%s", pmu_name, len, name, config + 2);
+@@ -4084,7 +4096,7 @@ void evsel__uniquify_counter(struct evsel *counter)
+ config = strchr(name, ':');
+ if (config) {
+ /* case: event:.. */
+- int len = config - name;
++ len = config - name;
+
+ ret = asprintf(&new_name, "%s/%.*s/%s", pmu_name, len, name, config + 1);
+ } else {
+diff --git a/tools/perf/util/lzma.c b/tools/perf/util/lzma.c
+index bbcd2ffcf4bd13..c355757ed3911d 100644
+--- a/tools/perf/util/lzma.c
++++ b/tools/perf/util/lzma.c
+@@ -120,7 +120,7 @@ bool lzma_is_compressed(const char *input)
+ ssize_t rc;
+
+ if (fd < 0)
+- return -1;
++ return false;
+
+ rc = read(fd, buf, sizeof(buf));
+ close(fd);
+diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
+index 8282ddf68b9832..0026cff4d69e41 100644
+--- a/tools/perf/util/parse-events.c
++++ b/tools/perf/util/parse-events.c
+@@ -126,7 +126,8 @@ static char *get_config_name(const struct parse_events_terms *head_terms)
+ return get_config_str(head_terms, PARSE_EVENTS__TERM_TYPE_NAME);
+ }
+
+-static struct perf_cpu_map *get_config_cpu(const struct parse_events_terms *head_terms)
++static struct perf_cpu_map *get_config_cpu(const struct parse_events_terms *head_terms,
++ bool fake_pmu)
+ {
+ struct parse_events_term *term;
+ struct perf_cpu_map *cpus = NULL;
+@@ -135,24 +136,33 @@ static struct perf_cpu_map *get_config_cpu(const struct parse_events_terms *head
+ return NULL;
+
+ list_for_each_entry(term, &head_terms->terms, list) {
+- if (term->type_term == PARSE_EVENTS__TERM_TYPE_CPU) {
+- struct perf_cpu_map *term_cpus;
++ struct perf_cpu_map *term_cpus;
+
+- if (term->type_val == PARSE_EVENTS__TERM_TYPE_NUM) {
+- term_cpus = perf_cpu_map__new_int(term->val.num);
++ if (term->type_term != PARSE_EVENTS__TERM_TYPE_CPU)
++ continue;
++
++ if (term->type_val == PARSE_EVENTS__TERM_TYPE_NUM) {
++ term_cpus = perf_cpu_map__new_int(term->val.num);
++ } else {
++ struct perf_pmu *pmu = perf_pmus__find(term->val.str);
++
++ if (pmu) {
++ term_cpus = pmu->is_core && perf_cpu_map__is_empty(pmu->cpus)
++ ? cpu_map__online()
++ : perf_cpu_map__get(pmu->cpus);
+ } else {
+- struct perf_pmu *pmu = perf_pmus__find(term->val.str);
+-
+- if (pmu && perf_cpu_map__is_empty(pmu->cpus))
+- term_cpus = pmu->is_core ? cpu_map__online() : NULL;
+- else if (pmu)
+- term_cpus = perf_cpu_map__get(pmu->cpus);
+- else
+- term_cpus = perf_cpu_map__new(term->val.str);
++ term_cpus = perf_cpu_map__new(term->val.str);
++ if (!term_cpus && fake_pmu) {
++ /*
++ * Assume the PMU string makes sense on a different
++ * machine and fake a value with all online CPUs.
++ */
++ term_cpus = cpu_map__online();
++ }
+ }
+- perf_cpu_map__merge(&cpus, term_cpus);
+- perf_cpu_map__put(term_cpus);
+ }
++ perf_cpu_map__merge(&cpus, term_cpus);
++ perf_cpu_map__put(term_cpus);
+ }
+
+ return cpus;
+@@ -369,13 +379,13 @@ static int parse_aliases(const char *str, const char *const names[][EVSEL__MAX_A
+
+ typedef int config_term_func_t(struct perf_event_attr *attr,
+ struct parse_events_term *term,
+- struct parse_events_error *err);
++ struct parse_events_state *parse_state);
+ static int config_term_common(struct perf_event_attr *attr,
+ struct parse_events_term *term,
+- struct parse_events_error *err);
++ struct parse_events_state *parse_state);
+ static int config_attr(struct perf_event_attr *attr,
+ const struct parse_events_terms *head,
+- struct parse_events_error *err,
++ struct parse_events_state *parse_state,
+ config_term_func_t config_term);
+
+ /**
+@@ -471,7 +481,7 @@ int parse_events_add_cache(struct list_head *list, int *idx, const char *name,
+ bool found_supported = false;
+ const char *config_name = get_config_name(parsed_terms);
+ const char *metric_id = get_config_metric_id(parsed_terms);
+- struct perf_cpu_map *cpus = get_config_cpu(parsed_terms);
++ struct perf_cpu_map *cpus = get_config_cpu(parsed_terms, parse_state->fake_pmu);
+ int ret = 0;
+ struct evsel *first_wildcard_match = NULL;
+
+@@ -514,8 +524,7 @@ int parse_events_add_cache(struct list_head *list, int *idx, const char *name,
+ found_supported = true;
+
+ if (parsed_terms) {
+- if (config_attr(&attr, parsed_terms, parse_state->error,
+- config_term_common)) {
++ if (config_attr(&attr, parsed_terms, parse_state, config_term_common)) {
+ ret = -EINVAL;
+ goto out_err;
+ }
+@@ -767,8 +776,7 @@ int parse_events_add_breakpoint(struct parse_events_state *parse_state,
+ attr.sample_period = 1;
+
+ if (head_config) {
+- if (config_attr(&attr, head_config, parse_state->error,
+- config_term_common))
++ if (config_attr(&attr, head_config, parse_state, config_term_common))
+ return -EINVAL;
+
+ if (get_config_terms(head_config, &config_terms))
+@@ -903,12 +911,12 @@ void parse_events__shrink_config_terms(void)
+
+ static int config_term_common(struct perf_event_attr *attr,
+ struct parse_events_term *term,
+- struct parse_events_error *err)
++ struct parse_events_state *parse_state)
+ {
+-#define CHECK_TYPE_VAL(type) \
+-do { \
+- if (check_type_val(term, err, PARSE_EVENTS__TERM_TYPE_ ## type)) \
+- return -EINVAL; \
++#define CHECK_TYPE_VAL(type) \
++do { \
++ if (check_type_val(term, parse_state->error, PARSE_EVENTS__TERM_TYPE_ ## type)) \
++ return -EINVAL; \
+ } while (0)
+
+ switch (term->type_term) {
+@@ -939,7 +947,7 @@ do { \
+ if (strcmp(term->val.str, "no") &&
+ parse_branch_str(term->val.str,
+ &attr->branch_sample_type)) {
+- parse_events_error__handle(err, term->err_val,
++ parse_events_error__handle(parse_state->error, term->err_val,
+ strdup("invalid branch sample type"),
+ NULL);
+ return -EINVAL;
+@@ -948,7 +956,7 @@ do { \
+ case PARSE_EVENTS__TERM_TYPE_TIME:
+ CHECK_TYPE_VAL(NUM);
+ if (term->val.num > 1) {
+- parse_events_error__handle(err, term->err_val,
++ parse_events_error__handle(parse_state->error, term->err_val,
+ strdup("expected 0 or 1"),
+ NULL);
+ return -EINVAL;
+@@ -990,7 +998,7 @@ do { \
+ case PARSE_EVENTS__TERM_TYPE_PERCORE:
+ CHECK_TYPE_VAL(NUM);
+ if ((unsigned int)term->val.num > 1) {
+- parse_events_error__handle(err, term->err_val,
++ parse_events_error__handle(parse_state->error, term->err_val,
+ strdup("expected 0 or 1"),
+ NULL);
+ return -EINVAL;
+@@ -1005,7 +1013,7 @@ do { \
+ case PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE:
+ CHECK_TYPE_VAL(NUM);
+ if (term->val.num > UINT_MAX) {
+- parse_events_error__handle(err, term->err_val,
++ parse_events_error__handle(parse_state->error, term->err_val,
+ strdup("too big"),
+ NULL);
+ return -EINVAL;
+@@ -1016,7 +1024,7 @@ do { \
+
+ if (term->type_val == PARSE_EVENTS__TERM_TYPE_NUM) {
+ if (term->val.num >= (u64)cpu__max_present_cpu().cpu) {
+- parse_events_error__handle(err, term->err_val,
++ parse_events_error__handle(parse_state->error, term->err_val,
+ strdup("too big"),
+ /*help=*/NULL);
+ return -EINVAL;
+@@ -1028,8 +1036,8 @@ do { \
+ break;
+
+ map = perf_cpu_map__new(term->val.str);
+- if (!map) {
+- parse_events_error__handle(err, term->err_val,
++ if (!map && !parse_state->fake_pmu) {
++ parse_events_error__handle(parse_state->error, term->err_val,
+ strdup("not a valid PMU or CPU number"),
+ /*help=*/NULL);
+ return -EINVAL;
+@@ -1042,7 +1050,7 @@ do { \
+ case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE:
+ case PARSE_EVENTS__TERM_TYPE_HARDWARE:
+ default:
+- parse_events_error__handle(err, term->err_term,
++ parse_events_error__handle(parse_state->error, term->err_term,
+ strdup(parse_events__term_type_str(term->type_term)),
+ parse_events_formats_error_string(NULL));
+ return -EINVAL;
+@@ -1057,7 +1065,7 @@ do { \
+ * if an invalid config term is provided for legacy events
+ * (for example, instructions/badterm/...), which is confusing.
+ */
+- if (!config_term_avail(term->type_term, err))
++ if (!config_term_avail(term->type_term, parse_state->error))
+ return -EINVAL;
+ return 0;
+ #undef CHECK_TYPE_VAL
+@@ -1065,7 +1073,7 @@ do { \
+
+ static int config_term_pmu(struct perf_event_attr *attr,
+ struct parse_events_term *term,
+- struct parse_events_error *err)
++ struct parse_events_state *parse_state)
+ {
+ if (term->type_term == PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE) {
+ struct perf_pmu *pmu = perf_pmus__find_by_type(attr->type);
+@@ -1074,7 +1082,7 @@ static int config_term_pmu(struct perf_event_attr *attr,
+ char *err_str;
+
+ if (asprintf(&err_str, "Failed to find PMU for type %d", attr->type) >= 0)
+- parse_events_error__handle(err, term->err_term,
++ parse_events_error__handle(parse_state->error, term->err_term,
+ err_str, /*help=*/NULL);
+ return -EINVAL;
+ }
+@@ -1100,7 +1108,7 @@ static int config_term_pmu(struct perf_event_attr *attr,
+ char *err_str;
+
+ if (asprintf(&err_str, "Failed to find PMU for type %d", attr->type) >= 0)
+- parse_events_error__handle(err, term->err_term,
++ parse_events_error__handle(parse_state->error, term->err_term,
+ err_str, /*help=*/NULL);
+ return -EINVAL;
+ }
+@@ -1128,12 +1136,12 @@ static int config_term_pmu(struct perf_event_attr *attr,
+ */
+ return 0;
+ }
+- return config_term_common(attr, term, err);
++ return config_term_common(attr, term, parse_state);
+ }
+
+ static int config_term_tracepoint(struct perf_event_attr *attr,
+ struct parse_events_term *term,
+- struct parse_events_error *err)
++ struct parse_events_state *parse_state)
+ {
+ switch (term->type_term) {
+ case PARSE_EVENTS__TERM_TYPE_CALLGRAPH:
+@@ -1147,7 +1155,7 @@ static int config_term_tracepoint(struct perf_event_attr *attr,
+ case PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT:
+ case PARSE_EVENTS__TERM_TYPE_AUX_ACTION:
+ case PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE:
+- return config_term_common(attr, term, err);
++ return config_term_common(attr, term, parse_state);
+ case PARSE_EVENTS__TERM_TYPE_USER:
+ case PARSE_EVENTS__TERM_TYPE_CONFIG:
+ case PARSE_EVENTS__TERM_TYPE_CONFIG1:
+@@ -1166,12 +1174,10 @@ static int config_term_tracepoint(struct perf_event_attr *attr,
+ case PARSE_EVENTS__TERM_TYPE_HARDWARE:
+ case PARSE_EVENTS__TERM_TYPE_CPU:
+ default:
+- if (err) {
+- parse_events_error__handle(err, term->err_term,
++ parse_events_error__handle(parse_state->error, term->err_term,
+ strdup(parse_events__term_type_str(term->type_term)),
+ strdup("valid terms: call-graph,stack-size\n")
+ );
+- }
+ return -EINVAL;
+ }
+
+@@ -1180,13 +1186,13 @@ static int config_term_tracepoint(struct perf_event_attr *attr,
+
+ static int config_attr(struct perf_event_attr *attr,
+ const struct parse_events_terms *head,
+- struct parse_events_error *err,
++ struct parse_events_state *parse_state,
+ config_term_func_t config_term)
+ {
+ struct parse_events_term *term;
+
+ list_for_each_entry(term, &head->terms, list)
+- if (config_term(attr, term, err))
++ if (config_term(attr, term, parse_state))
+ return -EINVAL;
+
+ return 0;
+@@ -1378,8 +1384,7 @@ int parse_events_add_tracepoint(struct parse_events_state *parse_state,
+ if (head_config) {
+ struct perf_event_attr attr;
+
+- if (config_attr(&attr, head_config, err,
+- config_term_tracepoint))
++ if (config_attr(&attr, head_config, parse_state, config_term_tracepoint))
+ return -EINVAL;
+ }
+
+@@ -1408,8 +1413,7 @@ static int __parse_events_add_numeric(struct parse_events_state *parse_state,
+ }
+
+ if (head_config) {
+- if (config_attr(&attr, head_config, parse_state->error,
+- config_term_common))
++ if (config_attr(&attr, head_config, parse_state, config_term_common))
+ return -EINVAL;
+
+ if (get_config_terms(head_config, &config_terms))
+@@ -1418,7 +1422,7 @@ static int __parse_events_add_numeric(struct parse_events_state *parse_state,
+
+ name = get_config_name(head_config);
+ metric_id = get_config_metric_id(head_config);
+- cpus = get_config_cpu(head_config);
++ cpus = get_config_cpu(head_config, parse_state->fake_pmu);
+ ret = __add_event(list, &parse_state->idx, &attr, /*init_attr*/true, name,
+ metric_id, pmu, &config_terms, first_wildcard_match,
+ cpus, /*alternate_hw_config=*/PERF_COUNT_HW_MAX) ? 0 : -ENOMEM;
+@@ -1531,7 +1535,7 @@ static int parse_events_add_pmu(struct parse_events_state *parse_state,
+ fix_raw(&parsed_terms, pmu);
+
+ /* Configure attr/terms with a known PMU, this will set hardcoded terms. */
+- if (config_attr(&attr, &parsed_terms, parse_state->error, config_term_pmu)) {
++ if (config_attr(&attr, &parsed_terms, parse_state, config_term_pmu)) {
+ parse_events_terms__exit(&parsed_terms);
+ return -EINVAL;
+ }
+@@ -1555,7 +1559,7 @@ static int parse_events_add_pmu(struct parse_events_state *parse_state,
+
+ /* Configure attr/terms again if an alias was expanded. */
+ if (alias_rewrote_terms &&
+- config_attr(&attr, &parsed_terms, parse_state->error, config_term_pmu)) {
++ config_attr(&attr, &parsed_terms, parse_state, config_term_pmu)) {
+ parse_events_terms__exit(&parsed_terms);
+ return -EINVAL;
+ }
+@@ -1583,7 +1587,7 @@ static int parse_events_add_pmu(struct parse_events_state *parse_state,
+ return -EINVAL;
+ }
+
+- term_cpu = get_config_cpu(&parsed_terms);
++ term_cpu = get_config_cpu(&parsed_terms, parse_state->fake_pmu);
+ evsel = __add_event(list, &parse_state->idx, &attr, /*init_attr=*/true,
+ get_config_name(&parsed_terms),
+ get_config_metric_id(&parsed_terms), pmu,
+diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
+index 26ae078278cd67..09af486c83e4ff 100644
+--- a/tools/perf/util/session.c
++++ b/tools/perf/util/session.c
+@@ -1402,7 +1402,7 @@ static s64 perf_session__process_user_event(struct perf_session *session,
+ const struct perf_tool *tool = session->tool;
+ struct perf_sample sample;
+ int fd = perf_data__fd(session->data);
+- int err;
++ s64 err;
+
+ perf_sample__init(&sample, /*all=*/true);
+ if ((event->header.type != PERF_RECORD_COMPRESSED &&
+diff --git a/tools/perf/util/setup.py b/tools/perf/util/setup.py
+index dd289d15acfd62..9cae2c472f4ad4 100644
+--- a/tools/perf/util/setup.py
++++ b/tools/perf/util/setup.py
+@@ -1,6 +1,7 @@
+ from os import getenv, path
+ from subprocess import Popen, PIPE
+ from re import sub
++import shlex
+
+ cc = getenv("CC")
+ assert cc, "Environment variable CC not set"
+@@ -22,7 +23,9 @@ assert srctree, "Environment variable srctree, for the Linux sources, not set"
+ src_feature_tests = f'{srctree}/tools/build/feature'
+
+ def clang_has_option(option):
+- cc_output = Popen([cc, cc_options + option, path.join(src_feature_tests, "test-hello.c") ], stderr=PIPE).stderr.readlines()
++ cmd = shlex.split(f"{cc} {cc_options} {option}")
++ cmd.append(path.join(src_feature_tests, "test-hello.c"))
++ cc_output = Popen(cmd, stderr=PIPE).stderr.readlines()
+ return [o for o in cc_output if ((b"unknown argument" in o) or (b"is not supported" in o) or (b"unknown warning option" in o))] == [ ]
+
+ if cc_is_clang:
+diff --git a/tools/perf/util/zlib.c b/tools/perf/util/zlib.c
+index 78d2297c1b6746..1f7c065230599d 100644
+--- a/tools/perf/util/zlib.c
++++ b/tools/perf/util/zlib.c
+@@ -88,7 +88,7 @@ bool gzip_is_compressed(const char *input)
+ ssize_t rc;
+
+ if (fd < 0)
+- return -1;
++ return false;
+
+ rc = read(fd, buf, sizeof(buf));
+ close(fd);
+diff --git a/tools/power/acpi/tools/acpidump/apfiles.c b/tools/power/acpi/tools/acpidump/apfiles.c
+index 75db0091e2758a..d6b8a201480b75 100644
+--- a/tools/power/acpi/tools/acpidump/apfiles.c
++++ b/tools/power/acpi/tools/acpidump/apfiles.c
+@@ -103,7 +103,7 @@ int ap_open_output_file(char *pathname)
+
+ int ap_write_to_binary_file(struct acpi_table_header *table, u32 instance)
+ {
+- char filename[ACPI_NAMESEG_SIZE + 16] ACPI_NONSTRING;
++ char filename[ACPI_NAMESEG_SIZE + 16];
+ char instance_str[16];
+ ACPI_FILE file;
+ acpi_size actual;
+diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
+index 7fd555b123b900..8e92dfead43bf5 100755
+--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
++++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
+@@ -3187,6 +3187,17 @@ deny_join_id0_tests()
+ run_tests $ns1 $ns2 10.0.1.1
+ chk_join_nr 1 1 1
+ fi
++
++ # default limits, server deny join id 0 + signal
++ if reset_with_allow_join_id0 "default limits, server deny join id 0" 0 1; then
++ pm_nl_set_limits $ns1 0 2
++ pm_nl_set_limits $ns2 0 2
++ pm_nl_add_endpoint $ns1 10.0.2.1 flags signal
++ pm_nl_add_endpoint $ns2 10.0.3.2 flags subflow
++ pm_nl_add_endpoint $ns2 10.0.4.2 flags subflow
++ run_tests $ns1 $ns2 10.0.1.1
++ chk_join_nr 2 2 2
++ fi
+ }
+
+ fullmesh_tests()
+diff --git a/tools/testing/selftests/net/netfilter/nf_nat_edemux.sh b/tools/testing/selftests/net/netfilter/nf_nat_edemux.sh
+index 1014551dd76945..6731fe1eaf2e99 100755
+--- a/tools/testing/selftests/net/netfilter/nf_nat_edemux.sh
++++ b/tools/testing/selftests/net/netfilter/nf_nat_edemux.sh
+@@ -17,9 +17,31 @@ cleanup()
+
+ checktool "socat -h" "run test without socat"
+ checktool "iptables --version" "run test without iptables"
++checktool "conntrack --version" "run test without conntrack"
+
+ trap cleanup EXIT
+
++connect_done()
++{
++ local ns="$1"
++ local port="$2"
++
++ ip netns exec "$ns" ss -nt -o state established "dport = :$port" | grep -q "$port"
++}
++
++check_ctstate()
++{
++ local ns="$1"
++ local dp="$2"
++
++ if ! ip netns exec "$ns" conntrack --get -s 192.168.1.2 -d 192.168.1.1 -p tcp \
++ --sport 10000 --dport "$dp" --state ESTABLISHED > /dev/null 2>&1;then
++ echo "FAIL: Did not find expected state for dport $2"
++ ip netns exec "$ns" bash -c 'conntrack -L; conntrack -S; ss -nt'
++ ret=1
++ fi
++}
++
+ setup_ns ns1 ns2
+
+ # Connect the namespaces using a veth pair
+@@ -44,15 +66,18 @@ socatpid=$!
+ ip netns exec "$ns2" sysctl -q net.ipv4.ip_local_port_range="10000 10000"
+
+ # add a virtual IP using DNAT
+-ip netns exec "$ns2" iptables -t nat -A OUTPUT -d 10.96.0.1/32 -p tcp --dport 443 -j DNAT --to-destination 192.168.1.1:5201
++ip netns exec "$ns2" iptables -t nat -A OUTPUT -d 10.96.0.1/32 -p tcp --dport 443 -j DNAT --to-destination 192.168.1.1:5201 || exit 1
+
+ # ... and route it to the other namespace
+ ip netns exec "$ns2" ip route add 10.96.0.1 via 192.168.1.1
+
+-# add a persistent connection from the other namespace
+-ip netns exec "$ns2" socat -t 10 - TCP:192.168.1.1:5201 > /dev/null &
++# listener should be up by now, wait if it isn't yet.
++wait_local_port_listen "$ns1" 5201 tcp
+
+-sleep 1
++# add a persistent connection from the other namespace
++sleep 10 | ip netns exec "$ns2" socat -t 10 - TCP:192.168.1.1:5201 > /dev/null &
++cpid0=$!
++busywait "$BUSYWAIT_TIMEOUT" connect_done "$ns2" "5201"
+
+ # ip daddr:dport will be rewritten to 192.168.1.1 5201
+ # NAT must reallocate source port 10000 because
+@@ -71,26 +96,25 @@ fi
+ ip netns exec "$ns1" iptables -t nat -A PREROUTING -p tcp --dport 5202 -j REDIRECT --to-ports 5201
+ ip netns exec "$ns1" iptables -t nat -A PREROUTING -p tcp --dport 5203 -j REDIRECT --to-ports 5201
+
+-sleep 5 | ip netns exec "$ns2" socat -t 5 -u STDIN TCP:192.168.1.1:5202,connect-timeout=5 >/dev/null &
++sleep 5 | ip netns exec "$ns2" socat -T 5 -u STDIN TCP:192.168.1.1:5202,connect-timeout=5 >/dev/null &
++cpid1=$!
+
+-# if connect succeeds, client closes instantly due to EOF on stdin.
+-# if connect hangs, it will time out after 5s.
+-echo | ip netns exec "$ns2" socat -t 3 -u STDIN TCP:192.168.1.1:5203,connect-timeout=5 >/dev/null &
++sleep 5 | ip netns exec "$ns2" socat -T 5 -u STDIN TCP:192.168.1.1:5203,connect-timeout=5 >/dev/null &
+ cpid2=$!
+
+-time_then=$(date +%s)
+-wait $cpid2
+-rv=$?
+-time_now=$(date +%s)
++busywait "$BUSYWAIT_TIMEOUT" connect_done "$ns2" 5202
++busywait "$BUSYWAIT_TIMEOUT" connect_done "$ns2" 5203
+
+-# Check how much time has elapsed, expectation is for
+-# 'cpid2' to connect and then exit (and no connect delay).
+-delta=$((time_now - time_then))
++check_ctstate "$ns1" 5202
++check_ctstate "$ns1" 5203
+
+-if [ $delta -lt 2 ] && [ $rv -eq 0 ]; then
++kill $socatpid $cpid0 $cpid1 $cpid2
++socatpid=0
++
++if [ $ret -eq 0 ]; then
+ echo "PASS: could connect to service via redirected ports"
+ else
+- echo "FAIL: socat cannot connect to service via redirect ($delta seconds elapsed, returned $rv)"
++ echo "FAIL: socat cannot connect to service via redirect"
+ ret=1
+ fi
+
+diff --git a/tools/testing/selftests/net/netfilter/nft_fib.sh b/tools/testing/selftests/net/netfilter/nft_fib.sh
+index 9929a9ffef6521..04544905c2164d 100755
+--- a/tools/testing/selftests/net/netfilter/nft_fib.sh
++++ b/tools/testing/selftests/net/netfilter/nft_fib.sh
+@@ -256,12 +256,12 @@ test_ping_unreachable() {
+ local daddr4=$1
+ local daddr6=$2
+
+- if ip netns exec "$ns1" ping -c 1 -w 1 -q "$daddr4" > /dev/null; then
++ if ip netns exec "$ns1" ping -c 1 -W 0.1 -q "$daddr4" > /dev/null; then
+ echo "FAIL: ${ns1} could reach $daddr4" 1>&2
+ return 1
+ fi
+
+- if ip netns exec "$ns1" ping -c 1 -w 1 -q "$daddr6" > /dev/null; then
++ if ip netns exec "$ns1" ping -c 1 -W 0.1 -q "$daddr6" > /dev/null; then
+ echo "FAIL: ${ns1} could reach $daddr6" 1>&2
+ return 1
+ fi
+@@ -437,14 +437,17 @@ check_type()
+ local addr="$3"
+ local type="$4"
+ local count="$5"
++ local lret=0
+
+ [ -z "$count" ] && count=1
+
+ if ! ip netns exec "$nsrouter" nft get element inet t "$setname" { "$iifname" . "$addr" . "$type" } |grep -q "counter packets $count";then
+- echo "FAIL: did not find $iifname . $addr . $type in $setname"
++ echo "FAIL: did not find $iifname . $addr . $type in $setname with $count packets"
+ ip netns exec "$nsrouter" nft list set inet t "$setname"
+ ret=1
+- return 1
++ # do not fail right away, delete entry if it exists so later test that
++ # checks for unwanted keys don't get confused by this *expected* key.
++ lret=1
+ fi
+
+ # delete the entry, this allows to check if anything unexpected appeared
+@@ -456,7 +459,7 @@ check_type()
+ return 1
+ fi
+
+- return 0
++ return $lret
+ }
+
+ check_local()
+diff --git a/tools/testing/selftests/net/ovpn/ovpn-cli.c b/tools/testing/selftests/net/ovpn/ovpn-cli.c
+index 9201f2905f2cee..8d0f2f61923c98 100644
+--- a/tools/testing/selftests/net/ovpn/ovpn-cli.c
++++ b/tools/testing/selftests/net/ovpn/ovpn-cli.c
+@@ -1586,6 +1586,7 @@ static int ovpn_listen_mcast(void)
+ sock = nl_socket_alloc();
+ if (!sock) {
+ fprintf(stderr, "cannot allocate netlink socket\n");
++ ret = -ENOMEM;
+ goto err_free;
+ }
+
+@@ -2105,6 +2106,7 @@ static int ovpn_run_cmd(struct ovpn_ctx *ovpn)
+ ret = ovpn_listen_mcast();
+ break;
+ case CMD_INVALID:
++ ret = -EINVAL;
+ break;
+ }
+
+diff --git a/tools/testing/selftests/rseq/rseq.c b/tools/testing/selftests/rseq/rseq.c
+index 663a9cef1952f0..dcac5cbe793370 100644
+--- a/tools/testing/selftests/rseq/rseq.c
++++ b/tools/testing/selftests/rseq/rseq.c
+@@ -40,9 +40,9 @@
+ * Define weak versions to play nice with binaries that are statically linked
+ * against a libc that doesn't support registering its own rseq.
+ */
+-__weak ptrdiff_t __rseq_offset;
+-__weak unsigned int __rseq_size;
+-__weak unsigned int __rseq_flags;
++extern __weak ptrdiff_t __rseq_offset;
++extern __weak unsigned int __rseq_size;
++extern __weak unsigned int __rseq_flags;
+
+ static const ptrdiff_t *libc_rseq_offset_p = &__rseq_offset;
+ static const unsigned int *libc_rseq_size_p = &__rseq_size;
+@@ -209,7 +209,7 @@ void rseq_init(void)
+ * libc not having registered a restartable sequence. Try to find the
+ * symbols if that's the case.
+ */
+- if (!*libc_rseq_size_p) {
++ if (!libc_rseq_size_p || !*libc_rseq_size_p) {
+ libc_rseq_offset_p = dlsym(RTLD_NEXT, "__rseq_offset");
+ libc_rseq_size_p = dlsym(RTLD_NEXT, "__rseq_size");
+ libc_rseq_flags_p = dlsym(RTLD_NEXT, "__rseq_flags");
* [gentoo-commits] proj/linux-patches:6.17 commit in: /
@ 2025-10-24 9:08 Arisu Tachibana
0 siblings, 0 replies; 20+ messages in thread
From: Arisu Tachibana @ 2025-10-24 9:08 UTC (permalink / raw
To: gentoo-commits
commit: bd62138bac93886fca454232d910f3135040e20e
Author: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
AuthorDate: Fri Oct 24 09:08:05 2025 +0000
Commit: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
CommitDate: Fri Oct 24 09:08:05 2025 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=bd62138b
Linux patch 6.17.5
Signed-off-by: Arisu Tachibana <alicef <AT> gentoo.org>
0000_README | 4 +
1004_linux-6.17.5.patch | 8365 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 8369 insertions(+)
diff --git a/0000_README b/0000_README
index 44b62f97..f15fc4f0 100644
--- a/0000_README
+++ b/0000_README
@@ -59,6 +59,10 @@ Patch: 1003_linux-6.17.4.patch
From: https://www.kernel.org
Desc: Linux 6.17.4
+Patch: 1004_linux-6.17.5.patch
+From: https://www.kernel.org
+Desc: Linux 6.17.5
+
Patch: 1510_fs-enable-link-security-restrictions-by-default.patch
From: http://sources.debian.net/src/linux/3.16.7-ckt4-3/debian/patches/debian/fs-enable-link-security-restrictions-by-default.patch/
Desc: Enable link security restrictions by default.
diff --git a/1004_linux-6.17.5.patch b/1004_linux-6.17.5.patch
new file mode 100644
index 00000000..e568bf25
--- /dev/null
+++ b/1004_linux-6.17.5.patch
@@ -0,0 +1,8365 @@
+diff --git a/Documentation/arch/arm64/silicon-errata.rst b/Documentation/arch/arm64/silicon-errata.rst
+index b18ef4064bc046..a7ec57060f64f5 100644
+--- a/Documentation/arch/arm64/silicon-errata.rst
++++ b/Documentation/arch/arm64/silicon-errata.rst
+@@ -200,6 +200,8 @@ stable kernels.
+ +----------------+-----------------+-----------------+-----------------------------+
+ | ARM | Neoverse-V3 | #3312417 | ARM64_ERRATUM_3194386 |
+ +----------------+-----------------+-----------------+-----------------------------+
++| ARM | Neoverse-V3AE | #3312417 | ARM64_ERRATUM_3194386 |
+++----------------+-----------------+-----------------+-----------------------------+
+ | ARM | MMU-500 | #841119,826419 | ARM_SMMU_MMU_500_CPRE_ERRATA|
+ | | | #562869,1047329 | |
+ +----------------+-----------------+-----------------+-----------------------------+
+diff --git a/Documentation/networking/seg6-sysctl.rst b/Documentation/networking/seg6-sysctl.rst
+index 07c20e470bafe6..1b6af4779be114 100644
+--- a/Documentation/networking/seg6-sysctl.rst
++++ b/Documentation/networking/seg6-sysctl.rst
+@@ -25,6 +25,9 @@ seg6_require_hmac - INTEGER
+
+ Default is 0.
+
++/proc/sys/net/ipv6/seg6_* variables:
++====================================
++
+ seg6_flowlabel - INTEGER
+ Controls the behaviour of computing the flowlabel of outer
+ IPv6 header in case of SR T.encaps
+diff --git a/Documentation/sphinx/kernel_feat.py b/Documentation/sphinx/kernel_feat.py
+index e3a51867f27bd5..aaac76892cebb0 100644
+--- a/Documentation/sphinx/kernel_feat.py
++++ b/Documentation/sphinx/kernel_feat.py
+@@ -40,9 +40,11 @@ import sys
+ from docutils import nodes, statemachine
+ from docutils.statemachine import ViewList
+ from docutils.parsers.rst import directives, Directive
+-from docutils.utils.error_reporting import ErrorString
+ from sphinx.util.docutils import switch_source_input
+
++def ErrorString(exc): # Shamelessly stolen from docutils
++ return f'{exc.__class__.__name__}: {exc}'
++
+ __version__ = '1.0'
+
+ def setup(app):
+diff --git a/Documentation/sphinx/kernel_include.py b/Documentation/sphinx/kernel_include.py
+index 1e566e87ebcdda..641e81c58a8c18 100755
+--- a/Documentation/sphinx/kernel_include.py
++++ b/Documentation/sphinx/kernel_include.py
+@@ -35,13 +35,15 @@
+ import os.path
+
+ from docutils import io, nodes, statemachine
+-from docutils.utils.error_reporting import SafeString, ErrorString
+ from docutils.parsers.rst import directives
+ from docutils.parsers.rst.directives.body import CodeBlock, NumberLines
+ from docutils.parsers.rst.directives.misc import Include
+
+ __version__ = '1.0'
+
++def ErrorString(exc): # Shamelessly stolen from docutils
++ return f'{exc.__class__.__name__}: {exc}'
++
+ # ==============================================================================
+ def setup(app):
+ # ==============================================================================
+@@ -112,7 +114,7 @@ class KernelInclude(Include):
+ raise self.severe('Problems with "%s" directive path:\n'
+ 'Cannot encode input file path "%s" '
+ '(wrong locale?).' %
+- (self.name, SafeString(path)))
++ (self.name, path))
+ except IOError as error:
+ raise self.severe('Problems with "%s" directive path:\n%s.' %
+ (self.name, ErrorString(error)))
+diff --git a/Documentation/sphinx/maintainers_include.py b/Documentation/sphinx/maintainers_include.py
+index d31cff8674367c..519ad18685b23f 100755
+--- a/Documentation/sphinx/maintainers_include.py
++++ b/Documentation/sphinx/maintainers_include.py
+@@ -22,10 +22,12 @@ import re
+ import os.path
+
+ from docutils import statemachine
+-from docutils.utils.error_reporting import ErrorString
+ from docutils.parsers.rst import Directive
+ from docutils.parsers.rst.directives.misc import Include
+
++def ErrorString(exc): # Shamelessly stolen from docutils
++ return f'{exc.__class__.__name__}: {exc}'
++
+ __version__ = '1.0'
+
+ def setup(app):
+diff --git a/Makefile b/Makefile
+index 4c3092dae03cf5..072a3be6255109 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 17
+-SUBLEVEL = 4
++SUBLEVEL = 5
+ EXTRAVERSION =
+ NAME = Baby Opossum Posse
+
+diff --git a/arch/Kconfig b/arch/Kconfig
+index d1b4ffd6e08564..880cddff5eda7f 100644
+--- a/arch/Kconfig
++++ b/arch/Kconfig
+@@ -917,6 +917,7 @@ config HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC
+ def_bool y
+ depends on HAVE_CFI_ICALL_NORMALIZE_INTEGERS_CLANG
+ depends on RUSTC_VERSION >= 107900
++ depends on ARM64 || X86_64
+ # With GCOV/KASAN we need this fix: https://github.com/rust-lang/rust/pull/129373
+ depends on (RUSTC_LLVM_VERSION >= 190103 && RUSTC_VERSION >= 108200) || \
+ (!GCOV_KERNEL && !KASAN_GENERIC && !KASAN_SW_TAGS)
+diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
+index e9bbfacc35a64d..93f391e67af151 100644
+--- a/arch/arm64/Kconfig
++++ b/arch/arm64/Kconfig
+@@ -1138,6 +1138,7 @@ config ARM64_ERRATUM_3194386
+ * ARM Neoverse-V1 erratum 3324341
+ * ARM Neoverse V2 erratum 3324336
+ * ARM Neoverse-V3 erratum 3312417
++ * ARM Neoverse-V3AE erratum 3312417
+
+ On affected cores "MSR SSBS, #0" instructions may not affect
+ subsequent speculative instructions, which may permit unexpected
+diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
+index 661735616787e2..eaec55dd3dbecc 100644
+--- a/arch/arm64/include/asm/cputype.h
++++ b/arch/arm64/include/asm/cputype.h
+@@ -93,6 +93,7 @@
+ #define ARM_CPU_PART_NEOVERSE_V2 0xD4F
+ #define ARM_CPU_PART_CORTEX_A720 0xD81
+ #define ARM_CPU_PART_CORTEX_X4 0xD82
++#define ARM_CPU_PART_NEOVERSE_V3AE 0xD83
+ #define ARM_CPU_PART_NEOVERSE_V3 0xD84
+ #define ARM_CPU_PART_CORTEX_X925 0xD85
+ #define ARM_CPU_PART_CORTEX_A725 0xD87
+@@ -182,6 +183,7 @@
+ #define MIDR_NEOVERSE_V2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V2)
+ #define MIDR_CORTEX_A720 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A720)
+ #define MIDR_CORTEX_X4 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X4)
++#define MIDR_NEOVERSE_V3AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3AE)
+ #define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3)
+ #define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925)
+ #define MIDR_CORTEX_A725 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A725)
+diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
+index 6604fd6f33f452..9effb4b68208da 100644
+--- a/arch/arm64/include/asm/sysreg.h
++++ b/arch/arm64/include/asm/sysreg.h
+@@ -1231,10 +1231,19 @@
+ __val; \
+ })
+
++/*
++ * The "Z" constraint combined with the "%x0" template should be enough
++ * to force XZR generation if (v) is a constant 0 value but LLVM does not
++ * yet understand that modifier/constraint combo so a conditional is required
++ * to nudge the compiler into using XZR as a source for a 0 constant value.
++ */
+ #define write_sysreg_s(v, r) do { \
+ u64 __val = (u64)(v); \
+ u32 __maybe_unused __check_r = (u32)(r); \
+- asm volatile(__msr_s(r, "%x0") : : "rZ" (__val)); \
++ if (__builtin_constant_p(__val) && __val == 0) \
++ asm volatile(__msr_s(r, "xzr")); \
++ else \
++ asm volatile(__msr_s(r, "%x0") : : "r" (__val)); \
+ } while (0)
+
+ /*
+diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
+index 59d723c9ab8f5a..21f86c160aab2b 100644
+--- a/arch/arm64/kernel/cpu_errata.c
++++ b/arch/arm64/kernel/cpu_errata.c
+@@ -545,6 +545,7 @@ static const struct midr_range erratum_spec_ssbs_list[] = {
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1),
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V2),
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V3),
++ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V3AE),
+ {}
+ };
+ #endif
+diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
+index 2b0c5925502e70..db116a62ac9556 100644
+--- a/arch/arm64/kernel/entry-common.c
++++ b/arch/arm64/kernel/entry-common.c
+@@ -832,6 +832,8 @@ static void noinstr el0_breakpt(struct pt_regs *regs, unsigned long esr)
+
+ static void noinstr el0_softstp(struct pt_regs *regs, unsigned long esr)
+ {
++ bool step_done;
++
+ if (!is_ttbr0_addr(regs->pc))
+ arm64_apply_bp_hardening();
+
+@@ -842,10 +844,10 @@ static void noinstr el0_softstp(struct pt_regs *regs, unsigned long esr)
+ * If we are stepping a suspended breakpoint there's nothing more to do:
+ * the single-step is complete.
+ */
+- if (!try_step_suspended_breakpoints(regs)) {
+- local_daif_restore(DAIF_PROCCTX);
++ step_done = try_step_suspended_breakpoints(regs);
++ local_daif_restore(DAIF_PROCCTX);
++ if (!step_done)
+ do_el0_softstep(esr, regs);
+- }
+ exit_to_user_mode(regs);
+ }
+
+diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
+index bd6b6a620a09ca..3036df0cc2013c 100644
+--- a/arch/arm64/kvm/arm.c
++++ b/arch/arm64/kvm/arm.c
+@@ -1789,6 +1789,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
+ case KVM_GET_VCPU_EVENTS: {
+ struct kvm_vcpu_events events;
+
++ if (!kvm_vcpu_initialized(vcpu))
++ return -ENOEXEC;
++
+ if (kvm_arm_vcpu_get_events(vcpu, &events))
+ return -EINVAL;
+
+@@ -1800,6 +1803,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
+ case KVM_SET_VCPU_EVENTS: {
+ struct kvm_vcpu_events events;
+
++ if (!kvm_vcpu_initialized(vcpu))
++ return -ENOEXEC;
++
+ if (copy_from_user(&events, argp, sizeof(events)))
+ return -EFAULT;
+
+diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
+index 5782e743fd2702..4ebc333dd786f6 100644
+--- a/arch/powerpc/kernel/fadump.c
++++ b/arch/powerpc/kernel/fadump.c
+@@ -1747,6 +1747,9 @@ void __init fadump_setup_param_area(void)
+ {
+ phys_addr_t range_start, range_end;
+
++ if (!fw_dump.fadump_enabled)
++ return;
++
+ if (!fw_dump.param_area_supported || fw_dump.dump_active)
+ return;
+
+diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
+index c0738d6c6498a5..8723390c7cad5f 100644
+--- a/arch/riscv/kernel/probes/kprobes.c
++++ b/arch/riscv/kernel/probes/kprobes.c
+@@ -49,10 +49,15 @@ static void __kprobes arch_simulate_insn(struct kprobe *p, struct pt_regs *regs)
+ post_kprobe_handler(p, kcb, regs);
+ }
+
+-static bool __kprobes arch_check_kprobe(struct kprobe *p)
++static bool __kprobes arch_check_kprobe(unsigned long addr)
+ {
+- unsigned long tmp = (unsigned long)p->addr - p->offset;
+- unsigned long addr = (unsigned long)p->addr;
++ unsigned long tmp, offset;
++
++ /* start iterating at the closest preceding symbol */
++ if (!kallsyms_lookup_size_offset(addr, NULL, &offset))
++ return false;
++
++ tmp = addr - offset;
+
+ while (tmp <= addr) {
+ if (tmp == addr)
+@@ -71,7 +76,7 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
+ if ((unsigned long)insn & 0x1)
+ return -EILSEQ;
+
+- if (!arch_check_kprobe(p))
++ if (!arch_check_kprobe((unsigned long)p->addr))
+ return -EILSEQ;
+
+ /* copy instruction */
+diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
+index a6f88ca1a6b495..a11e17f3b4b1b6 100644
+--- a/arch/x86/kernel/cpu/amd.c
++++ b/arch/x86/kernel/cpu/amd.c
+@@ -1338,11 +1338,23 @@ static __init int print_s5_reset_status_mmio(void)
+ return 0;
+
+ value = ioread32(addr);
+- iounmap(addr);
+
+ /* Value with "all bits set" is an error response and should be ignored. */
+- if (value == U32_MAX)
++ if (value == U32_MAX) {
++ iounmap(addr);
+ return 0;
++ }
++
++ /*
++ * Clear all reason bits so they won't be retained if the next reset
++ * does not update the register. Besides, some bits are never cleared by
++ * hardware so it's software's responsibility to clear them.
++ *
++ * Writing the value back effectively clears all reason bits as they are
++ * write-1-to-clear.
++ */
++ iowrite32(value, addr);
++ iounmap(addr);
+
+ for (i = 0; i < ARRAY_SIZE(s5_reset_reason_txt); i++) {
+ if (!(value & BIT(i)))
+diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
+index c261558276cdd4..eed0f8417b8c57 100644
+--- a/arch/x86/kernel/cpu/resctrl/monitor.c
++++ b/arch/x86/kernel/cpu/resctrl/monitor.c
+@@ -224,15 +224,35 @@ static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
+ return chunks >> shift;
+ }
+
++static u64 get_corrected_val(struct rdt_resource *r, struct rdt_mon_domain *d,
++ u32 rmid, enum resctrl_event_id eventid, u64 msr_val)
++{
++ struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
++ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
++ struct arch_mbm_state *am;
++ u64 chunks;
++
++ am = get_arch_mbm_state(hw_dom, rmid, eventid);
++ if (am) {
++ am->chunks += mbm_overflow_count(am->prev_msr, msr_val,
++ hw_res->mbm_width);
++ chunks = get_corrected_mbm_count(rmid, am->chunks);
++ am->prev_msr = msr_val;
++ } else {
++ chunks = msr_val;
++ }
++
++ return chunks * hw_res->mon_scale;
++}
++
+ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
+ u32 unused, u32 rmid, enum resctrl_event_id eventid,
+ u64 *val, void *ignored)
+ {
+ struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
+- struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+ int cpu = cpumask_any(&d->hdr.cpu_mask);
+ struct arch_mbm_state *am;
+- u64 msr_val, chunks;
++ u64 msr_val;
+ u32 prmid;
+ int ret;
+
+@@ -240,22 +260,16 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
+
+ prmid = logical_rmid_to_physical_rmid(cpu, rmid);
+ ret = __rmid_read_phys(prmid, eventid, &msr_val);
+- if (ret)
+- return ret;
+
+- am = get_arch_mbm_state(hw_dom, rmid, eventid);
+- if (am) {
+- am->chunks += mbm_overflow_count(am->prev_msr, msr_val,
+- hw_res->mbm_width);
+- chunks = get_corrected_mbm_count(rmid, am->chunks);
+- am->prev_msr = msr_val;
+- } else {
+- chunks = msr_val;
++ if (!ret) {
++ *val = get_corrected_val(r, d, rmid, eventid, msr_val);
++ } else if (ret == -EINVAL) {
++ am = get_arch_mbm_state(hw_dom, rmid, eventid);
++ if (am)
++ am->prev_msr = 0;
+ }
+
+- *val = chunks * hw_res->mon_scale;
+-
+- return 0;
++ return ret;
+ }
+
+ /*
+diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
+index 39f80111e6f175..5d221709353e0a 100644
+--- a/arch/x86/mm/tlb.c
++++ b/arch/x86/mm/tlb.c
+@@ -911,11 +911,31 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
+ * CR3 and cpu_tlbstate.loaded_mm are not all in sync.
+ */
+ this_cpu_write(cpu_tlbstate.loaded_mm, LOADED_MM_SWITCHING);
+- barrier();
+
+- /* Start receiving IPIs and then read tlb_gen (and LAM below) */
++ /*
++ * Make sure this CPU is set in mm_cpumask() such that we'll
++ * receive invalidation IPIs.
++ *
++ * Rely on the smp_mb() implied by cpumask_set_cpu()'s atomic
++ * operation, or explicitly provide one. Such that:
++ *
++ * switch_mm_irqs_off() flush_tlb_mm_range()
++ * smp_store_release(loaded_mm, SWITCHING); atomic64_inc_return(tlb_gen)
++ * smp_mb(); // here // smp_mb() implied
++ * atomic64_read(tlb_gen); this_cpu_read(loaded_mm);
++ *
++ * we properly order against flush_tlb_mm_range(), where the
++ * loaded_mm load can happen in native_flush_tlb_multi() ->
++ * should_flush_tlb().
++ *
++ * This way switch_mm() must see the new tlb_gen or
++ * flush_tlb_mm_range() must see the new loaded_mm, or both.
++ */
+ if (next != &init_mm && !cpumask_test_cpu(cpu, mm_cpumask(next)))
+ cpumask_set_cpu(cpu, mm_cpumask(next));
++ else
++ smp_mb();
++
+ next_tlb_gen = atomic64_read(&next->context.tlb_gen);
+
+ ns = choose_new_asid(next, next_tlb_gen);
+diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
+index 7246fc2563152c..091e9623bc294c 100644
+--- a/block/blk-cgroup.c
++++ b/block/blk-cgroup.c
+@@ -812,8 +812,7 @@ int blkg_conf_open_bdev(struct blkg_conf_ctx *ctx)
+ }
+ /*
+ * Similar to blkg_conf_open_bdev, but additionally freezes the queue,
+- * acquires q->elevator_lock, and ensures the correct locking order
+- * between q->elevator_lock and q->rq_qos_mutex.
++ * ensures the correct locking order between freeze queue and q->rq_qos_mutex.
+ *
+ * This function returns negative error on failure. On success it returns
+ * memflags which must be saved and later passed to blkg_conf_exit_frozen
+@@ -834,13 +833,11 @@ unsigned long __must_check blkg_conf_open_bdev_frozen(struct blkg_conf_ctx *ctx)
+ * At this point, we haven’t started protecting anything related to QoS,
+ * so we release q->rq_qos_mutex here, which was first acquired in blkg_
+ * conf_open_bdev. Later, we re-acquire q->rq_qos_mutex after freezing
+- * the queue and acquiring q->elevator_lock to maintain the correct
+- * locking order.
++ * the queue to maintain the correct locking order.
+ */
+ mutex_unlock(&ctx->bdev->bd_queue->rq_qos_mutex);
+
+ memflags = blk_mq_freeze_queue(ctx->bdev->bd_queue);
+- mutex_lock(&ctx->bdev->bd_queue->elevator_lock);
+ mutex_lock(&ctx->bdev->bd_queue->rq_qos_mutex);
+
+ return memflags;
+@@ -1002,9 +999,8 @@ void blkg_conf_exit(struct blkg_conf_ctx *ctx)
+ EXPORT_SYMBOL_GPL(blkg_conf_exit);
+
+ /*
+- * Similar to blkg_conf_exit, but also unfreezes the queue and releases
+- * q->elevator_lock. Should be used when blkg_conf_open_bdev_frozen
+- * is used to open the bdev.
++ * Similar to blkg_conf_exit, but also unfreezes the queue. Should be used
++ * when blkg_conf_open_bdev_frozen is used to open the bdev.
+ */
+ void blkg_conf_exit_frozen(struct blkg_conf_ctx *ctx, unsigned long memflags)
+ {
+@@ -1012,7 +1008,6 @@ void blkg_conf_exit_frozen(struct blkg_conf_ctx *ctx, unsigned long memflags)
+ struct request_queue *q = ctx->bdev->bd_queue;
+
+ blkg_conf_exit(ctx);
+- mutex_unlock(&q->elevator_lock);
+ blk_mq_unfreeze_queue(q, memflags);
+ }
+ }
+diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
+index d06bb137a74377..e0bed16485c346 100644
+--- a/block/blk-mq-sched.c
++++ b/block/blk-mq-sched.c
+@@ -557,7 +557,7 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e,
+ if (blk_mq_is_shared_tags(flags)) {
+ /* Shared tags are stored at index 0 in @et->tags. */
+ q->sched_shared_tags = et->tags[0];
+- blk_mq_tag_update_sched_shared_tags(q);
++ blk_mq_tag_update_sched_shared_tags(q, et->nr_requests);
+ }
+
+ queue_for_each_hw_ctx(q, hctx, i) {
+diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
+index aed84c5d5c2b22..12f48e7a0f7743 100644
+--- a/block/blk-mq-tag.c
++++ b/block/blk-mq-tag.c
+@@ -622,10 +622,11 @@ void blk_mq_tag_resize_shared_tags(struct blk_mq_tag_set *set, unsigned int size
+ sbitmap_queue_resize(&tags->bitmap_tags, size - set->reserved_tags);
+ }
+
+-void blk_mq_tag_update_sched_shared_tags(struct request_queue *q)
++void blk_mq_tag_update_sched_shared_tags(struct request_queue *q,
++ unsigned int nr)
+ {
+ sbitmap_queue_resize(&q->sched_shared_tags->bitmap_tags,
+- q->nr_requests - q->tag_set->reserved_tags);
++ nr - q->tag_set->reserved_tags);
+ }
+
+ /**
+diff --git a/block/blk-mq.c b/block/blk-mq.c
+index f8a8a23b904023..19f62b070ca9dc 100644
+--- a/block/blk-mq.c
++++ b/block/blk-mq.c
+@@ -4942,7 +4942,7 @@ struct elevator_tags *blk_mq_update_nr_requests(struct request_queue *q,
+ * tags can't grow, see blk_mq_alloc_sched_tags().
+ */
+ if (q->elevator)
+- blk_mq_tag_update_sched_shared_tags(q);
++ blk_mq_tag_update_sched_shared_tags(q, nr);
+ else
+ blk_mq_tag_resize_shared_tags(set, nr);
+ } else if (!q->elevator) {
+diff --git a/block/blk-mq.h b/block/blk-mq.h
+index 6c9d03625ba124..2fdc8eeb400403 100644
+--- a/block/blk-mq.h
++++ b/block/blk-mq.h
+@@ -188,7 +188,8 @@ int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
+ struct blk_mq_tags **tags, unsigned int depth);
+ void blk_mq_tag_resize_shared_tags(struct blk_mq_tag_set *set,
+ unsigned int size);
+-void blk_mq_tag_update_sched_shared_tags(struct request_queue *q);
++void blk_mq_tag_update_sched_shared_tags(struct request_queue *q,
++ unsigned int nr);
+
+ void blk_mq_tag_wakeup_all(struct blk_mq_tags *tags, bool);
+ void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_tag_iter_fn *fn,
+diff --git a/drivers/accel/qaic/qaic.h b/drivers/accel/qaic/qaic.h
+index c31081e42cee0a..820d133236dd19 100644
+--- a/drivers/accel/qaic/qaic.h
++++ b/drivers/accel/qaic/qaic.h
+@@ -97,6 +97,8 @@ struct dma_bridge_chan {
+ * response queue's head and tail pointer of this DBC.
+ */
+ void __iomem *dbc_base;
++ /* Synchronizes access to Request queue's head and tail pointer */
++ struct mutex req_lock;
+ /* Head of list where each node is a memory handle queued in request queue */
+ struct list_head xfer_list;
+ /* Synchronizes DBC readers during cleanup */
+diff --git a/drivers/accel/qaic/qaic_control.c b/drivers/accel/qaic/qaic_control.c
+index d8bdab69f80095..b86a8e48e731b7 100644
+--- a/drivers/accel/qaic/qaic_control.c
++++ b/drivers/accel/qaic/qaic_control.c
+@@ -407,7 +407,7 @@ static int find_and_map_user_pages(struct qaic_device *qdev,
+ return -EINVAL;
+ remaining = in_trans->size - resources->xferred_dma_size;
+ if (remaining == 0)
+- return 0;
++ return -EINVAL;
+
+ if (check_add_overflow(xfer_start_addr, remaining, &end))
+ return -EINVAL;
+diff --git a/drivers/accel/qaic/qaic_data.c b/drivers/accel/qaic/qaic_data.c
+index 797289e9d78064..c4f117edb266ec 100644
+--- a/drivers/accel/qaic/qaic_data.c
++++ b/drivers/accel/qaic/qaic_data.c
+@@ -1356,13 +1356,17 @@ static int __qaic_execute_bo_ioctl(struct drm_device *dev, void *data, struct dr
+ goto release_ch_rcu;
+ }
+
++ ret = mutex_lock_interruptible(&dbc->req_lock);
++ if (ret)
++ goto release_ch_rcu;
++
+ head = readl(dbc->dbc_base + REQHP_OFF);
+ tail = readl(dbc->dbc_base + REQTP_OFF);
+
+ if (head == U32_MAX || tail == U32_MAX) {
+ /* PCI link error */
+ ret = -ENODEV;
+- goto release_ch_rcu;
++ goto unlock_req_lock;
+ }
+
+ queue_level = head <= tail ? tail - head : dbc->nelem - (head - tail);
+@@ -1370,11 +1374,12 @@ static int __qaic_execute_bo_ioctl(struct drm_device *dev, void *data, struct dr
+ ret = send_bo_list_to_device(qdev, file_priv, exec, args->hdr.count, is_partial, dbc,
+ head, &tail);
+ if (ret)
+- goto release_ch_rcu;
++ goto unlock_req_lock;
+
+ /* Finalize commit to hardware */
+ submit_ts = ktime_get_ns();
+ writel(tail, dbc->dbc_base + REQTP_OFF);
++ mutex_unlock(&dbc->req_lock);
+
+ update_profiling_data(file_priv, exec, args->hdr.count, is_partial, received_ts,
+ submit_ts, queue_level);
+@@ -1382,6 +1387,9 @@ static int __qaic_execute_bo_ioctl(struct drm_device *dev, void *data, struct dr
+ if (datapath_polling)
+ schedule_work(&dbc->poll_work);
+
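++	/* Only error paths unlock here; the success path dropped req_lock right after committing the tail pointer */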
++unlock_req_lock:
++ if (ret)
++ mutex_unlock(&dbc->req_lock);
+ release_ch_rcu:
+ srcu_read_unlock(&dbc->ch_lock, rcu_id);
+ unlock_dev_srcu:
+diff --git a/drivers/accel/qaic/qaic_debugfs.c b/drivers/accel/qaic/qaic_debugfs.c
+index a991b8198dc40e..8dc4fe5bb560ed 100644
+--- a/drivers/accel/qaic/qaic_debugfs.c
++++ b/drivers/accel/qaic/qaic_debugfs.c
+@@ -218,6 +218,9 @@ static int qaic_bootlog_mhi_probe(struct mhi_device *mhi_dev, const struct mhi_d
+ if (ret)
+ goto destroy_workqueue;
+
++ dev_set_drvdata(&mhi_dev->dev, qdev);
++ qdev->bootlog_ch = mhi_dev;
++
+ for (i = 0; i < BOOTLOG_POOL_SIZE; i++) {
+ msg = devm_kzalloc(&qdev->pdev->dev, sizeof(*msg), GFP_KERNEL);
+ if (!msg) {
+@@ -233,8 +236,6 @@ static int qaic_bootlog_mhi_probe(struct mhi_device *mhi_dev, const struct mhi_d
+ goto mhi_unprepare;
+ }
+
+- dev_set_drvdata(&mhi_dev->dev, qdev);
+- qdev->bootlog_ch = mhi_dev;
+ return 0;
+
+ mhi_unprepare:
+diff --git a/drivers/accel/qaic/qaic_drv.c b/drivers/accel/qaic/qaic_drv.c
+index e31bcb0ecfc946..e162f4b8a262ab 100644
+--- a/drivers/accel/qaic/qaic_drv.c
++++ b/drivers/accel/qaic/qaic_drv.c
+@@ -454,6 +454,9 @@ static struct qaic_device *create_qdev(struct pci_dev *pdev,
+ return NULL;
+ init_waitqueue_head(&qdev->dbc[i].dbc_release);
+ INIT_LIST_HEAD(&qdev->dbc[i].bo_lists);
++ ret = drmm_mutex_init(drm, &qdev->dbc[i].req_lock);
++ if (ret)
++ return NULL;
+ }
+
+ return qdev;
+diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
+index ff53f5f029b404..2a210719c4ce5c 100644
+--- a/drivers/ata/libata-core.c
++++ b/drivers/ata/libata-core.c
+@@ -2174,13 +2174,10 @@ static int ata_read_log_directory(struct ata_device *dev)
+ }
+
+ version = get_unaligned_le16(&dev->gp_log_dir[0]);
+- if (version != 0x0001) {
+- ata_dev_err(dev, "Invalid log directory version 0x%04x\n",
+- version);
+- ata_clear_log_directory(dev);
+- dev->quirks |= ATA_QUIRK_NO_LOG_DIR;
+- return -EINVAL;
+- }
++ if (version != 0x0001)
++ ata_dev_warn_once(dev,
++ "Invalid log directory version 0x%04x\n",
++ version);
+
+ return 0;
+ }
+diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
+index 712624cba2b6e0..87f0ed3f3f51fa 100644
+--- a/drivers/cxl/acpi.c
++++ b/drivers/cxl/acpi.c
+@@ -345,7 +345,7 @@ static int cxl_acpi_set_cache_size(struct cxl_root_decoder *cxlrd)
+ struct resource res;
+ int nid, rc;
+
+- res = DEFINE_RES(start, size, 0);
++ res = DEFINE_RES_MEM(start, size);
+ nid = phys_to_target_node(start);
+
+ rc = hmat_get_extended_linear_cache_size(&res, nid, &cache_size);
+diff --git a/drivers/cxl/core/features.c b/drivers/cxl/core/features.c
+index 7c750599ea6906..4bc484b46f439f 100644
+--- a/drivers/cxl/core/features.c
++++ b/drivers/cxl/core/features.c
+@@ -371,6 +371,9 @@ cxl_feature_info(struct cxl_features_state *cxlfs,
+ {
+ struct cxl_feat_entry *feat;
+
++ if (!cxlfs || !cxlfs->entries)
++ return ERR_PTR(-EOPNOTSUPP);
++
+ for (int i = 0; i < cxlfs->entries->num_features; i++) {
+ feat = &cxlfs->entries->ent[i];
+ if (uuid_equal(uuid, &feat->uuid))
+diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
+index 71cc42d052481a..be452118432827 100644
+--- a/drivers/cxl/core/region.c
++++ b/drivers/cxl/core/region.c
+@@ -831,7 +831,7 @@ static int match_free_decoder(struct device *dev, const void *data)
+ }
+
+ static bool region_res_match_cxl_range(const struct cxl_region_params *p,
+- struct range *range)
++ const struct range *range)
+ {
+ if (!p->res)
+ return false;
+@@ -3287,10 +3287,7 @@ static int match_region_by_range(struct device *dev, const void *data)
+ p = &cxlr->params;
+
+ guard(rwsem_read)(&cxl_rwsem.region);
+- if (p->res && p->res->start == r->start && p->res->end == r->end)
+- return 1;
+-
+- return 0;
++ return region_res_match_cxl_range(p, r);
+ }
+
+ static int cxl_extended_linear_cache_resize(struct cxl_region *cxlr,
+diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
+index a53ec4798b12fb..a972e4ef193686 100644
+--- a/drivers/cxl/core/trace.h
++++ b/drivers/cxl/core/trace.h
+@@ -1068,7 +1068,7 @@ TRACE_EVENT(cxl_poison,
+ __entry->hpa = cxl_dpa_to_hpa(cxlr, cxlmd,
+ __entry->dpa);
+ if (__entry->hpa != ULLONG_MAX && cxlr->params.cache_size)
+- __entry->hpa_alias0 = __entry->hpa +
++ __entry->hpa_alias0 = __entry->hpa -
+ cxlr->params.cache_size;
+ else
+ __entry->hpa_alias0 = ULLONG_MAX;
+diff --git a/drivers/dpll/zl3073x/core.c b/drivers/dpll/zl3073x/core.c
+index 7ebcfc5ec1f090..59c75b470efbfa 100644
+--- a/drivers/dpll/zl3073x/core.c
++++ b/drivers/dpll/zl3073x/core.c
+@@ -809,21 +809,163 @@ zl3073x_dev_periodic_work(struct kthread_work *work)
+ msecs_to_jiffies(500));
+ }
+
++/**
++ * zl3073x_dev_phase_meas_setup - setup phase offset measurement
++ * @zldev: pointer to zl3073x_dev structure
++ *
++ * Enable the phase offset measurement block, set the measurement averaging
++ * factor and enable DPLL-to-its-reference phase measurement for all DPLLs.
++ *
++ * Return: 0 on success, <0 on error
++ */
++static int
++zl3073x_dev_phase_meas_setup(struct zl3073x_dev *zldev)
++{
++ struct zl3073x_dpll *zldpll;
++ u8 dpll_meas_ctrl, mask = 0;
++ int rc;
++
++ /* Read DPLL phase measurement control register */
++ rc = zl3073x_read_u8(zldev, ZL_REG_DPLL_MEAS_CTRL, &dpll_meas_ctrl);
++ if (rc)
++ return rc;
++
++ /* Setup phase measurement averaging factor */
++ dpll_meas_ctrl &= ~ZL_DPLL_MEAS_CTRL_AVG_FACTOR;
++ dpll_meas_ctrl |= FIELD_PREP(ZL_DPLL_MEAS_CTRL_AVG_FACTOR, 3);
++
++ /* Enable DPLL measurement block */
++ dpll_meas_ctrl |= ZL_DPLL_MEAS_CTRL_EN;
++
++ /* Update phase measurement control register */
++ rc = zl3073x_write_u8(zldev, ZL_REG_DPLL_MEAS_CTRL, dpll_meas_ctrl);
++ if (rc)
++ return rc;
++
++ /* Enable DPLL-to-connected-ref measurement for each channel */
++ list_for_each_entry(zldpll, &zldev->dplls, list)
++ mask |= BIT(zldpll->id);
++
++ return zl3073x_write_u8(zldev, ZL_REG_DPLL_PHASE_ERR_READ_MASK, mask);
++}
++
++/**
++ * zl3073x_dev_start - Start normal operation
++ * @zldev: zl3073x device pointer
++ * @full: perform full initialization
++ *
++ * The function starts normal operation, which means registering all DPLLs and
++ * their pins, and starting monitoring. If full initialization is requested,
++ * the function additionally initializes the phase offset measurement block and
++ * fetches hardware-invariant parameters.
++ *
++ * Return: 0 on success, <0 on error
++ */
++int zl3073x_dev_start(struct zl3073x_dev *zldev, bool full)
++{
++ struct zl3073x_dpll *zldpll;
++ u8 info;
++ int rc;
++
++ rc = zl3073x_read_u8(zldev, ZL_REG_INFO, &info);
++ if (rc) {
++ dev_err(zldev->dev, "Failed to read device status info\n");
++ return rc;
++ }
++
++ if (!FIELD_GET(ZL_INFO_READY, info)) {
++ /* The ready bit indicates that the firmware was successfully
++ * configured and is ready for normal operation. If it is
++ * cleared then the configuration stored in flash is wrong
++		 * or missing. In this situation the driver exposes
++		 * only the devlink interface, giving the user a chance
++		 * to flash a correct configuration.
++ */
++ dev_info(zldev->dev,
++ "FW not fully ready - missing or corrupted config\n");
++
++ return 0;
++ }
++
++ if (full) {
++ /* Fetch device state */
++ rc = zl3073x_dev_state_fetch(zldev);
++ if (rc)
++ return rc;
++
++ /* Setup phase offset measurement block */
++ rc = zl3073x_dev_phase_meas_setup(zldev);
++ if (rc) {
++ dev_err(zldev->dev,
++ "Failed to setup phase measurement\n");
++ return rc;
++ }
++ }
++
++ /* Register all DPLLs */
++ list_for_each_entry(zldpll, &zldev->dplls, list) {
++ rc = zl3073x_dpll_register(zldpll);
++ if (rc) {
++ dev_err_probe(zldev->dev, rc,
++ "Failed to register DPLL%u\n",
++ zldpll->id);
++ return rc;
++ }
++ }
++
++ /* Perform initial firmware fine phase correction */
++ rc = zl3073x_dpll_init_fine_phase_adjust(zldev);
++ if (rc) {
++ dev_err_probe(zldev->dev, rc,
++ "Failed to init fine phase correction\n");
++ return rc;
++ }
++
++ /* Start monitoring */
++ kthread_queue_delayed_work(zldev->kworker, &zldev->work, 0);
++
++ return 0;
++}
++
++/**
++ * zl3073x_dev_stop - Stop normal operation
++ * @zldev: zl3073x device pointer
++ *
++ * The function stops normal operation, which means deregistering all DPLLs
++ * and their pins and stopping the monitoring work.
++ */
++void zl3073x_dev_stop(struct zl3073x_dev *zldev)
++{
++ struct zl3073x_dpll *zldpll;
++
++ /* Stop monitoring */
++ kthread_cancel_delayed_work_sync(&zldev->work);
++
++ /* Unregister all DPLLs */
++ list_for_each_entry(zldpll, &zldev->dplls, list) {
++ if (zldpll->dpll_dev)
++ zl3073x_dpll_unregister(zldpll);
++ }
++}
++
+ static void zl3073x_dev_dpll_fini(void *ptr)
+ {
+ struct zl3073x_dpll *zldpll, *next;
+ struct zl3073x_dev *zldev = ptr;
+
+- /* Stop monitoring thread */
++ /* Stop monitoring and unregister DPLLs */
++ zl3073x_dev_stop(zldev);
++
++ /* Destroy monitoring thread */
+ if (zldev->kworker) {
+- kthread_cancel_delayed_work_sync(&zldev->work);
+ kthread_destroy_worker(zldev->kworker);
+ zldev->kworker = NULL;
+ }
+
+- /* Release DPLLs */
++ /* Free all DPLLs */
+ list_for_each_entry_safe(zldpll, next, &zldev->dplls, list) {
+- zl3073x_dpll_unregister(zldpll);
+ list_del(&zldpll->list);
+ zl3073x_dpll_free(zldpll);
+ }
+@@ -839,7 +981,7 @@ zl3073x_devm_dpll_init(struct zl3073x_dev *zldev, u8 num_dplls)
+
+ INIT_LIST_HEAD(&zldev->dplls);
+
+- /* Initialize all DPLLs */
++ /* Allocate all DPLLs */
+ for (i = 0; i < num_dplls; i++) {
+ zldpll = zl3073x_dpll_alloc(zldev, i);
+ if (IS_ERR(zldpll)) {
+@@ -849,25 +991,9 @@ zl3073x_devm_dpll_init(struct zl3073x_dev *zldev, u8 num_dplls)
+ goto error;
+ }
+
+- rc = zl3073x_dpll_register(zldpll);
+- if (rc) {
+- dev_err_probe(zldev->dev, rc,
+- "Failed to register DPLL%u\n", i);
+- zl3073x_dpll_free(zldpll);
+- goto error;
+- }
+-
+ list_add_tail(&zldpll->list, &zldev->dplls);
+ }
+
+- /* Perform initial firmware fine phase correction */
+- rc = zl3073x_dpll_init_fine_phase_adjust(zldev);
+- if (rc) {
+- dev_err_probe(zldev->dev, rc,
+- "Failed to init fine phase correction\n");
+- goto error;
+- }
+-
+ /* Initialize monitoring thread */
+ kthread_init_delayed_work(&zldev->work, zl3073x_dev_periodic_work);
+ kworker = kthread_run_worker(0, "zl3073x-%s", dev_name(zldev->dev));
+@@ -875,9 +1001,14 @@ zl3073x_devm_dpll_init(struct zl3073x_dev *zldev, u8 num_dplls)
+ rc = PTR_ERR(kworker);
+ goto error;
+ }
+-
+ zldev->kworker = kworker;
+- kthread_queue_delayed_work(zldev->kworker, &zldev->work, 0);
++
++ /* Start normal operation */
++ rc = zl3073x_dev_start(zldev, true);
++ if (rc) {
++ dev_err_probe(zldev->dev, rc, "Failed to start device\n");
++ goto error;
++ }
+
+ /* Add devres action to release DPLL related resources */
+ rc = devm_add_action_or_reset(zldev->dev, zl3073x_dev_dpll_fini, zldev);
+@@ -892,46 +1023,6 @@ zl3073x_devm_dpll_init(struct zl3073x_dev *zldev, u8 num_dplls)
+ return rc;
+ }
+
+-/**
+- * zl3073x_dev_phase_meas_setup - setup phase offset measurement
+- * @zldev: pointer to zl3073x_dev structure
+- * @num_channels: number of DPLL channels
+- *
+- * Enable phase offset measurement block, set measurement averaging factor
+- * and enable DPLL-to-its-ref phase measurement for all DPLLs.
+- *
+- * Returns: 0 on success, <0 on error
+- */
+-static int
+-zl3073x_dev_phase_meas_setup(struct zl3073x_dev *zldev, int num_channels)
+-{
+- u8 dpll_meas_ctrl, mask;
+- int i, rc;
+-
+- /* Read DPLL phase measurement control register */
+- rc = zl3073x_read_u8(zldev, ZL_REG_DPLL_MEAS_CTRL, &dpll_meas_ctrl);
+- if (rc)
+- return rc;
+-
+- /* Setup phase measurement averaging factor */
+- dpll_meas_ctrl &= ~ZL_DPLL_MEAS_CTRL_AVG_FACTOR;
+- dpll_meas_ctrl |= FIELD_PREP(ZL_DPLL_MEAS_CTRL_AVG_FACTOR, 3);
+-
+- /* Enable DPLL measurement block */
+- dpll_meas_ctrl |= ZL_DPLL_MEAS_CTRL_EN;
+-
+- /* Update phase measurement control register */
+- rc = zl3073x_write_u8(zldev, ZL_REG_DPLL_MEAS_CTRL, dpll_meas_ctrl);
+- if (rc)
+- return rc;
+-
+- /* Enable DPLL-to-connected-ref measurement for each channel */
+- for (i = 0, mask = 0; i < num_channels; i++)
+- mask |= BIT(i);
+-
+- return zl3073x_write_u8(zldev, ZL_REG_DPLL_PHASE_ERR_READ_MASK, mask);
+-}
+-
+ /**
+ * zl3073x_dev_probe - initialize zl3073x device
+ * @zldev: pointer to zl3073x device
+@@ -999,17 +1090,6 @@ int zl3073x_dev_probe(struct zl3073x_dev *zldev,
+ return dev_err_probe(zldev->dev, rc,
+ "Failed to initialize mutex\n");
+
+- /* Fetch device state */
+- rc = zl3073x_dev_state_fetch(zldev);
+- if (rc)
+- return rc;
+-
+- /* Setup phase offset measurement block */
+- rc = zl3073x_dev_phase_meas_setup(zldev, chip_info->num_channels);
+- if (rc)
+- return dev_err_probe(zldev->dev, rc,
+- "Failed to setup phase measurement\n");
+-
+ /* Register DPLL channels */
+ rc = zl3073x_devm_dpll_init(zldev, chip_info->num_channels);
+ if (rc)
+diff --git a/drivers/dpll/zl3073x/core.h b/drivers/dpll/zl3073x/core.h
+index 71af2c8001109e..84e52d5521a349 100644
+--- a/drivers/dpll/zl3073x/core.h
++++ b/drivers/dpll/zl3073x/core.h
+@@ -111,6 +111,9 @@ struct zl3073x_dev *zl3073x_devm_alloc(struct device *dev);
+ int zl3073x_dev_probe(struct zl3073x_dev *zldev,
+ const struct zl3073x_chip_info *chip_info);
+
++int zl3073x_dev_start(struct zl3073x_dev *zldev, bool full);
++void zl3073x_dev_stop(struct zl3073x_dev *zldev);
++
+ /**********************
+ * Registers operations
+ **********************/
+diff --git a/drivers/dpll/zl3073x/devlink.c b/drivers/dpll/zl3073x/devlink.c
+index 7e7fe726ee37a1..c2e9f7aca3c841 100644
+--- a/drivers/dpll/zl3073x/devlink.c
++++ b/drivers/dpll/zl3073x/devlink.c
+@@ -86,14 +86,12 @@ zl3073x_devlink_reload_down(struct devlink *devlink, bool netns_change,
+ struct netlink_ext_ack *extack)
+ {
+ struct zl3073x_dev *zldev = devlink_priv(devlink);
+- struct zl3073x_dpll *zldpll;
+
+ if (action != DEVLINK_RELOAD_ACTION_DRIVER_REINIT)
+ return -EOPNOTSUPP;
+
+- /* Unregister all DPLLs */
+- list_for_each_entry(zldpll, &zldev->dplls, list)
+- zl3073x_dpll_unregister(zldpll);
++ /* Stop normal operation */
++ zl3073x_dev_stop(zldev);
+
+ return 0;
+ }
+@@ -107,7 +105,6 @@ zl3073x_devlink_reload_up(struct devlink *devlink,
+ {
+ struct zl3073x_dev *zldev = devlink_priv(devlink);
+ union devlink_param_value val;
+- struct zl3073x_dpll *zldpll;
+ int rc;
+
+ if (action != DEVLINK_RELOAD_ACTION_DRIVER_REINIT)
+@@ -125,13 +122,10 @@ zl3073x_devlink_reload_up(struct devlink *devlink,
+ zldev->clock_id = val.vu64;
+ }
+
+- /* Re-register all DPLLs */
+- list_for_each_entry(zldpll, &zldev->dplls, list) {
+- rc = zl3073x_dpll_register(zldpll);
+- if (rc)
+- dev_warn(zldev->dev,
+- "Failed to re-register DPLL%u\n", zldpll->id);
+- }
++ /* Restart normal operation */
++ rc = zl3073x_dev_start(zldev, false);
++ if (rc)
++ dev_warn(zldev->dev, "Failed to re-start normal operation\n");
+
+ *actions_performed = BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT);
+
+diff --git a/drivers/dpll/zl3073x/regs.h b/drivers/dpll/zl3073x/regs.h
+index 614e33128a5c9a..bb9965b8e8c754 100644
+--- a/drivers/dpll/zl3073x/regs.h
++++ b/drivers/dpll/zl3073x/regs.h
+@@ -67,6 +67,9 @@
+ * Register Page 0, General
+ **************************/
+
++#define ZL_REG_INFO ZL_REG(0, 0x00, 1)
++#define ZL_INFO_READY BIT(7)
++
+ #define ZL_REG_ID ZL_REG(0, 0x01, 2)
+ #define ZL_REG_REVISION ZL_REG(0, 0x03, 2)
+ #define ZL_REG_FW_VER ZL_REG(0, 0x05, 2)
+diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile
+index 930de203d533c3..2d0fea87af79f1 100644
+--- a/drivers/gpu/drm/amd/amdgpu/Makefile
++++ b/drivers/gpu/drm/amd/amdgpu/Makefile
+@@ -84,7 +84,8 @@ amdgpu-y += \
+ vega20_reg_init.o nbio_v7_4.o nbio_v2_3.o nv.o arct_reg_init.o mxgpu_nv.o \
+ nbio_v7_2.o hdp_v4_0.o hdp_v5_0.o aldebaran_reg_init.o aldebaran.o soc21.o soc24.o \
+ sienna_cichlid.o smu_v13_0_10.o nbio_v4_3.o hdp_v6_0.o nbio_v7_7.o hdp_v5_2.o lsdma_v6_0.o \
+- nbio_v7_9.o aqua_vanjaram.o nbio_v7_11.o lsdma_v7_0.o hdp_v7_0.o nbif_v6_3_1.o
++ nbio_v7_9.o aqua_vanjaram.o nbio_v7_11.o lsdma_v7_0.o hdp_v7_0.o nbif_v6_3_1.o \
++ cyan_skillfish_reg_init.o
+
+ # add DF block
+ amdgpu-y += \
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+index d5f9d48bf8842d..902eac2c685f3c 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+@@ -2325,10 +2325,9 @@ void amdgpu_amdkfd_gpuvm_unmap_gtt_bo_from_kernel(struct kgd_mem *mem)
+ int amdgpu_amdkfd_gpuvm_get_vm_fault_info(struct amdgpu_device *adev,
+ struct kfd_vm_fault_info *mem)
+ {
+- if (atomic_read(&adev->gmc.vm_fault_info_updated) == 1) {
++ if (atomic_read_acquire(&adev->gmc.vm_fault_info_updated) == 1) {
+ *mem = *adev->gmc.vm_fault_info;
+- mb(); /* make sure read happened */
+- atomic_set(&adev->gmc.vm_fault_info_updated, 0);
++ atomic_set_release(&adev->gmc.vm_fault_info_updated, 0);
+ }
+ return 0;
+ }
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+index efe0058b48ca85..e814da2b14225b 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+@@ -1033,7 +1033,9 @@ static uint8_t amdgpu_discovery_get_harvest_info(struct amdgpu_device *adev,
+ /* Until a uniform way is figured, get mask based on hwid */
+ switch (hw_id) {
+ case VCN_HWID:
+- harvest = ((1 << inst) & adev->vcn.inst_mask) == 0;
++ /* VCN vs UVD+VCE */
++ if (!amdgpu_ip_version(adev, VCE_HWIP, 0))
++ harvest = ((1 << inst) & adev->vcn.inst_mask) == 0;
+ break;
+ case DMU_HWID:
+ if (adev->harvest_ip_mask & AMD_HARVEST_IP_DMU_MASK)
+@@ -2562,7 +2564,9 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device *adev)
+ amdgpu_discovery_init(adev);
+ vega10_reg_base_init(adev);
+ adev->sdma.num_instances = 2;
++ adev->sdma.sdma_mask = 3;
+ adev->gmc.num_umc = 4;
++ adev->gfx.xcc_mask = 1;
+ adev->ip_versions[MMHUB_HWIP][0] = IP_VERSION(9, 0, 0);
+ adev->ip_versions[ATHUB_HWIP][0] = IP_VERSION(9, 0, 0);
+ adev->ip_versions[OSSSYS_HWIP][0] = IP_VERSION(4, 0, 0);
+@@ -2589,7 +2593,9 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device *adev)
+ amdgpu_discovery_init(adev);
+ vega10_reg_base_init(adev);
+ adev->sdma.num_instances = 2;
++ adev->sdma.sdma_mask = 3;
+ adev->gmc.num_umc = 4;
++ adev->gfx.xcc_mask = 1;
+ adev->ip_versions[MMHUB_HWIP][0] = IP_VERSION(9, 3, 0);
+ adev->ip_versions[ATHUB_HWIP][0] = IP_VERSION(9, 3, 0);
+ adev->ip_versions[OSSSYS_HWIP][0] = IP_VERSION(4, 0, 1);
+@@ -2616,8 +2622,10 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device *adev)
+ amdgpu_discovery_init(adev);
+ vega10_reg_base_init(adev);
+ adev->sdma.num_instances = 1;
++ adev->sdma.sdma_mask = 1;
+ adev->vcn.num_vcn_inst = 1;
+ adev->gmc.num_umc = 2;
++ adev->gfx.xcc_mask = 1;
+ if (adev->apu_flags & AMD_APU_IS_RAVEN2) {
+ adev->ip_versions[MMHUB_HWIP][0] = IP_VERSION(9, 2, 0);
+ adev->ip_versions[ATHUB_HWIP][0] = IP_VERSION(9, 2, 0);
+@@ -2662,7 +2670,9 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device *adev)
+ amdgpu_discovery_init(adev);
+ vega20_reg_base_init(adev);
+ adev->sdma.num_instances = 2;
++ adev->sdma.sdma_mask = 3;
+ adev->gmc.num_umc = 8;
++ adev->gfx.xcc_mask = 1;
+ adev->ip_versions[MMHUB_HWIP][0] = IP_VERSION(9, 4, 0);
+ adev->ip_versions[ATHUB_HWIP][0] = IP_VERSION(9, 4, 0);
+ adev->ip_versions[OSSSYS_HWIP][0] = IP_VERSION(4, 2, 0);
+@@ -2690,8 +2700,10 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device *adev)
+ amdgpu_discovery_init(adev);
+ arct_reg_base_init(adev);
+ adev->sdma.num_instances = 8;
++ adev->sdma.sdma_mask = 0xff;
+ adev->vcn.num_vcn_inst = 2;
+ adev->gmc.num_umc = 8;
++ adev->gfx.xcc_mask = 1;
+ adev->ip_versions[MMHUB_HWIP][0] = IP_VERSION(9, 4, 1);
+ adev->ip_versions[ATHUB_HWIP][0] = IP_VERSION(9, 4, 1);
+ adev->ip_versions[OSSSYS_HWIP][0] = IP_VERSION(4, 2, 1);
+@@ -2723,8 +2735,10 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device *adev)
+ amdgpu_discovery_init(adev);
+ aldebaran_reg_base_init(adev);
+ adev->sdma.num_instances = 5;
++ adev->sdma.sdma_mask = 0x1f;
+ adev->vcn.num_vcn_inst = 2;
+ adev->gmc.num_umc = 4;
++ adev->gfx.xcc_mask = 1;
+ adev->ip_versions[MMHUB_HWIP][0] = IP_VERSION(9, 4, 2);
+ adev->ip_versions[ATHUB_HWIP][0] = IP_VERSION(9, 4, 2);
+ adev->ip_versions[OSSSYS_HWIP][0] = IP_VERSION(4, 4, 0);
+@@ -2746,6 +2760,38 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device *adev)
+ adev->ip_versions[UVD_HWIP][1] = IP_VERSION(2, 6, 0);
+ adev->ip_versions[XGMI_HWIP][0] = IP_VERSION(6, 1, 0);
+ break;
++ case CHIP_CYAN_SKILLFISH:
++ if (adev->apu_flags & AMD_APU_IS_CYAN_SKILLFISH2) {
++ r = amdgpu_discovery_reg_base_init(adev);
++ if (r)
++ return -EINVAL;
++
++ amdgpu_discovery_harvest_ip(adev);
++ amdgpu_discovery_get_gfx_info(adev);
++ amdgpu_discovery_get_mall_info(adev);
++ amdgpu_discovery_get_vcn_info(adev);
++ } else {
++ cyan_skillfish_reg_base_init(adev);
++ adev->sdma.num_instances = 2;
++ adev->sdma.sdma_mask = 3;
++ adev->gfx.xcc_mask = 1;
++ adev->ip_versions[MMHUB_HWIP][0] = IP_VERSION(2, 0, 3);
++ adev->ip_versions[ATHUB_HWIP][0] = IP_VERSION(2, 0, 3);
++ adev->ip_versions[OSSSYS_HWIP][0] = IP_VERSION(5, 0, 1);
++ adev->ip_versions[HDP_HWIP][0] = IP_VERSION(5, 0, 1);
++ adev->ip_versions[SDMA0_HWIP][0] = IP_VERSION(5, 0, 1);
++ adev->ip_versions[SDMA1_HWIP][1] = IP_VERSION(5, 0, 1);
++ adev->ip_versions[DF_HWIP][0] = IP_VERSION(3, 5, 0);
++ adev->ip_versions[NBIO_HWIP][0] = IP_VERSION(2, 1, 1);
++ adev->ip_versions[UMC_HWIP][0] = IP_VERSION(8, 1, 1);
++ adev->ip_versions[MP0_HWIP][0] = IP_VERSION(11, 0, 8);
++ adev->ip_versions[MP1_HWIP][0] = IP_VERSION(11, 0, 8);
++ adev->ip_versions[THM_HWIP][0] = IP_VERSION(11, 0, 1);
++ adev->ip_versions[SMUIO_HWIP][0] = IP_VERSION(11, 0, 8);
++ adev->ip_versions[GC_HWIP][0] = IP_VERSION(10, 1, 3);
++ adev->ip_versions[UVD_HWIP][0] = IP_VERSION(2, 0, 3);
++ }
++ break;
+ default:
+ r = amdgpu_discovery_reg_base_init(adev);
+ if (r) {
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+index dbbb3407fa13ba..65f4a76490eacc 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+@@ -2665,7 +2665,7 @@ static int amdgpu_pmops_thaw(struct device *dev)
+ struct drm_device *drm_dev = dev_get_drvdata(dev);
+
+ /* do not resume device if it's normal hibernation */
+- if (!pm_hibernate_is_recovering())
++ if (!pm_hibernate_is_recovering() && !pm_hibernation_mode_is_suspend())
+ return 0;
+
+ return amdgpu_device_resume(drm_dev, true);
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+index 9e7506965cab27..9f79f0cc5ff836 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+@@ -759,11 +759,42 @@ void amdgpu_fence_driver_force_completion(struct amdgpu_ring *ring)
+ * @fence: fence of the ring to signal
+ *
+ */
+-void amdgpu_fence_driver_guilty_force_completion(struct amdgpu_fence *fence)
++void amdgpu_fence_driver_guilty_force_completion(struct amdgpu_fence *af)
+ {
+- dma_fence_set_error(&fence->base, -ETIME);
+- amdgpu_fence_write(fence->ring, fence->seq);
+- amdgpu_fence_process(fence->ring);
++ struct dma_fence *unprocessed;
++ struct dma_fence __rcu **ptr;
++ struct amdgpu_fence *fence;
++ struct amdgpu_ring *ring = af->ring;
++ unsigned long flags;
++ u32 seq, last_seq;
++
++ last_seq = amdgpu_fence_read(ring) & ring->fence_drv.num_fences_mask;
++ seq = ring->fence_drv.sync_seq & ring->fence_drv.num_fences_mask;
++
++ /* mark all fences from the guilty context with an error */
++ spin_lock_irqsave(&ring->fence_drv.lock, flags);
++ do {
++ last_seq++;
++ last_seq &= ring->fence_drv.num_fences_mask;
++
++ ptr = &ring->fence_drv.fences[last_seq];
++ rcu_read_lock();
++ unprocessed = rcu_dereference(*ptr);
++
++ if (unprocessed && !dma_fence_is_signaled_locked(unprocessed)) {
++ fence = container_of(unprocessed, struct amdgpu_fence, base);
++
++ if (fence == af)
++ dma_fence_set_error(&fence->base, -ETIME);
++ else if (fence->context == af->context)
++ dma_fence_set_error(&fence->base, -ECANCELED);
++ }
++ rcu_read_unlock();
++ } while (last_seq != seq);
++ spin_unlock_irqrestore(&ring->fence_drv.lock, flags);
++ /* signal the guilty fence */
++ amdgpu_fence_write(ring, af->seq);
++ amdgpu_fence_process(ring);
+ }
+
+ void amdgpu_fence_save_wptr(struct dma_fence *fence)
+@@ -791,14 +822,19 @@ void amdgpu_ring_backup_unprocessed_commands(struct amdgpu_ring *ring,
+ struct dma_fence *unprocessed;
+ struct dma_fence __rcu **ptr;
+ struct amdgpu_fence *fence;
+- u64 wptr, i, seqno;
++ u64 wptr;
++ u32 seq, last_seq;
+
+- seqno = amdgpu_fence_read(ring);
++ last_seq = amdgpu_fence_read(ring) & ring->fence_drv.num_fences_mask;
++ seq = ring->fence_drv.sync_seq & ring->fence_drv.num_fences_mask;
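++	/* Walk the fence slots from the last processed fence up to the most recently emitted one (sync_seq) */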
+ wptr = ring->fence_drv.signalled_wptr;
+ ring->ring_backup_entries_to_copy = 0;
+
+- for (i = seqno + 1; i <= ring->fence_drv.sync_seq; ++i) {
+- ptr = &ring->fence_drv.fences[i & ring->fence_drv.num_fences_mask];
++ do {
++ last_seq++;
++ last_seq &= ring->fence_drv.num_fences_mask;
++
++ ptr = &ring->fence_drv.fences[last_seq];
+ rcu_read_lock();
+ unprocessed = rcu_dereference(*ptr);
+
+@@ -814,7 +850,7 @@ void amdgpu_ring_backup_unprocessed_commands(struct amdgpu_ring *ring,
+ wptr = fence->wptr;
+ }
+ rcu_read_unlock();
+- }
++ } while (last_seq != seq);
+ }
+
+ /*
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+index 693357caa9a8d7..d9d7fc4c33cba7 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+@@ -2350,7 +2350,7 @@ static int psp_securedisplay_initialize(struct psp_context *psp)
+ }
+
+ ret = psp_ta_load(psp, &psp->securedisplay_context.context);
+- if (!ret) {
++ if (!ret && !psp->securedisplay_context.context.resp_status) {
+ psp->securedisplay_context.context.initialized = true;
+ mutex_init(&psp->securedisplay_context.mutex);
+ } else
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+index 8f6ce948c6841d..5ec5c3ff22bb07 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+@@ -811,7 +811,7 @@ int amdgpu_ring_reset_helper_end(struct amdgpu_ring *ring,
+ if (r)
+ return r;
+
+- /* signal the fence of the bad job */
++ /* signal the guilty fence and set an error on all fences from the context */
+ if (guilty_fence)
+ amdgpu_fence_driver_guilty_force_completion(guilty_fence);
+ /* Re-emit the non-guilty commands */
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+index 12783ea3ba0f18..869b486168f3e0 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+@@ -155,7 +155,7 @@ extern const struct drm_sched_backend_ops amdgpu_sched_ops;
+ void amdgpu_fence_driver_clear_job_fences(struct amdgpu_ring *ring);
+ void amdgpu_fence_driver_set_error(struct amdgpu_ring *ring, int error);
+ void amdgpu_fence_driver_force_completion(struct amdgpu_ring *ring);
+-void amdgpu_fence_driver_guilty_force_completion(struct amdgpu_fence *fence);
++void amdgpu_fence_driver_guilty_force_completion(struct amdgpu_fence *af);
+ void amdgpu_fence_save_wptr(struct dma_fence *fence);
+
+ int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring);
+diff --git a/drivers/gpu/drm/amd/amdgpu/cyan_skillfish_reg_init.c b/drivers/gpu/drm/amd/amdgpu/cyan_skillfish_reg_init.c
+new file mode 100644
+index 00000000000000..96616a865aac71
+--- /dev/null
++++ b/drivers/gpu/drm/amd/amdgpu/cyan_skillfish_reg_init.c
+@@ -0,0 +1,56 @@
++// SPDX-License-Identifier: GPL-2.0
++/*
++ * Copyright 2018 Advanced Micro Devices, Inc.
++ *
++ * Permission is hereby granted, free of charge, to any person obtaining a
++ * copy of this software and associated documentation files (the "Software"),
++ * to deal in the Software without restriction, including without limitation
++ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
++ * and/or sell copies of the Software, and to permit persons to whom the
++ * Software is furnished to do so, subject to the following conditions:
++ *
++ * The above copyright notice and this permission notice shall be included in
++ * all copies or substantial portions of the Software.
++ *
++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
++ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
++ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
++ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
++ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
++ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
++ * OTHER DEALINGS IN THE SOFTWARE.
++ *
++ */
++#include "amdgpu.h"
++#include "nv.h"
++
++#include "soc15_common.h"
++#include "soc15_hw_ip.h"
++#include "cyan_skillfish_ip_offset.h"
++
++int cyan_skillfish_reg_base_init(struct amdgpu_device *adev)
++{
++	/* HW has more IP blocks; only initialize the blocks needed by the driver */
++ uint32_t i;
++
++ adev->gfx.xcc_mask = 1;
++ for (i = 0 ; i < MAX_INSTANCE ; ++i) {
++ adev->reg_offset[GC_HWIP][i] = (uint32_t *)(&(GC_BASE.instance[i]));
++ adev->reg_offset[HDP_HWIP][i] = (uint32_t *)(&(HDP_BASE.instance[i]));
++ adev->reg_offset[MMHUB_HWIP][i] = (uint32_t *)(&(MMHUB_BASE.instance[i]));
++ adev->reg_offset[ATHUB_HWIP][i] = (uint32_t *)(&(ATHUB_BASE.instance[i]));
++ adev->reg_offset[NBIO_HWIP][i] = (uint32_t *)(&(NBIO_BASE.instance[i]));
++ adev->reg_offset[MP0_HWIP][i] = (uint32_t *)(&(MP0_BASE.instance[i]));
++ adev->reg_offset[MP1_HWIP][i] = (uint32_t *)(&(MP1_BASE.instance[i]));
++ adev->reg_offset[VCN_HWIP][i] = (uint32_t *)(&(UVD0_BASE.instance[i]));
++ adev->reg_offset[DF_HWIP][i] = (uint32_t *)(&(DF_BASE.instance[i]));
++ adev->reg_offset[DCE_HWIP][i] = (uint32_t *)(&(DMU_BASE.instance[i]));
++ adev->reg_offset[OSSSYS_HWIP][i] = (uint32_t *)(&(OSSSYS_BASE.instance[i]));
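++		/* SDMA instances use the GC register base on this ASIC */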
++ adev->reg_offset[SDMA0_HWIP][i] = (uint32_t *)(&(GC_BASE.instance[i]));
++ adev->reg_offset[SDMA1_HWIP][i] = (uint32_t *)(&(GC_BASE.instance[i]));
++ adev->reg_offset[SMUIO_HWIP][i] = (uint32_t *)(&(SMUIO_BASE.instance[i]));
++ adev->reg_offset[THM_HWIP][i] = (uint32_t *)(&(THM_BASE.instance[i]));
++ adev->reg_offset[CLK_HWIP][i] = (uint32_t *)(&(CLK_BASE.instance[i]));
++ }
++ return 0;
++}
+diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+index a8d5795084fc97..cf30d333205078 100644
+--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
++++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+@@ -1066,7 +1066,7 @@ static int gmc_v7_0_sw_init(struct amdgpu_ip_block *ip_block)
+ GFP_KERNEL);
+ if (!adev->gmc.vm_fault_info)
+ return -ENOMEM;
+- atomic_set(&adev->gmc.vm_fault_info_updated, 0);
++ atomic_set_release(&adev->gmc.vm_fault_info_updated, 0);
+
+ return 0;
+ }
+@@ -1288,7 +1288,7 @@ static int gmc_v7_0_process_interrupt(struct amdgpu_device *adev,
+ vmid = REG_GET_FIELD(status, VM_CONTEXT1_PROTECTION_FAULT_STATUS,
+ VMID);
+ if (amdgpu_amdkfd_is_kfd_vmid(adev, vmid)
+- && !atomic_read(&adev->gmc.vm_fault_info_updated)) {
++ && !atomic_read_acquire(&adev->gmc.vm_fault_info_updated)) {
+ struct kfd_vm_fault_info *info = adev->gmc.vm_fault_info;
+ u32 protections = REG_GET_FIELD(status,
+ VM_CONTEXT1_PROTECTION_FAULT_STATUS,
+@@ -1304,8 +1304,7 @@ static int gmc_v7_0_process_interrupt(struct amdgpu_device *adev,
+ info->prot_read = protections & 0x8 ? true : false;
+ info->prot_write = protections & 0x10 ? true : false;
+ info->prot_exec = protections & 0x20 ? true : false;
+- mb();
+- atomic_set(&adev->gmc.vm_fault_info_updated, 1);
++ atomic_set_release(&adev->gmc.vm_fault_info_updated, 1);
+ }
+
+ return 0;
+diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+index b45fa0cea9d27d..0d4c93ff6f74c1 100644
+--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
++++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+@@ -1179,7 +1179,7 @@ static int gmc_v8_0_sw_init(struct amdgpu_ip_block *ip_block)
+ GFP_KERNEL);
+ if (!adev->gmc.vm_fault_info)
+ return -ENOMEM;
+- atomic_set(&adev->gmc.vm_fault_info_updated, 0);
++ atomic_set_release(&adev->gmc.vm_fault_info_updated, 0);
+
+ return 0;
+ }
+@@ -1474,7 +1474,7 @@ static int gmc_v8_0_process_interrupt(struct amdgpu_device *adev,
+ vmid = REG_GET_FIELD(status, VM_CONTEXT1_PROTECTION_FAULT_STATUS,
+ VMID);
+ if (amdgpu_amdkfd_is_kfd_vmid(adev, vmid)
+- && !atomic_read(&adev->gmc.vm_fault_info_updated)) {
++ && !atomic_read_acquire(&adev->gmc.vm_fault_info_updated)) {
+ struct kfd_vm_fault_info *info = adev->gmc.vm_fault_info;
+ u32 protections = REG_GET_FIELD(status,
+ VM_CONTEXT1_PROTECTION_FAULT_STATUS,
+@@ -1490,8 +1490,7 @@ static int gmc_v8_0_process_interrupt(struct amdgpu_device *adev,
+ info->prot_read = protections & 0x8 ? true : false;
+ info->prot_write = protections & 0x10 ? true : false;
+ info->prot_exec = protections & 0x20 ? true : false;
+- mb();
+- atomic_set(&adev->gmc.vm_fault_info_updated, 1);
++ atomic_set_release(&adev->gmc.vm_fault_info_updated, 1);
+ }
+
+ return 0;
+diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
+index 39caac14d5fe1c..1622b1cd6f2ef4 100644
+--- a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
++++ b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
+@@ -225,7 +225,12 @@ static int mes_v12_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
+ pipe, x_pkt->header.opcode);
+
+ r = amdgpu_fence_wait_polling(ring, seq, timeout);
+- if (r < 1 || !*status_ptr) {
++
++ /*
++ * status_ptr[31:0] == 0 (fail) or status_ptr[63:0] == 1 (success).
++ * If status_ptr[31:0] == 0 then status_ptr[63:32] will have debug error information.
++ */
++ if (r < 1 || !(lower_32_bits(*status_ptr))) {
+
+ if (misc_op_str)
+ dev_err(adev->dev, "MES(%d) failed to respond to msg=%s (%s)\n",
+diff --git a/drivers/gpu/drm/amd/amdgpu/nv.h b/drivers/gpu/drm/amd/amdgpu/nv.h
+index 83e9782aef39d6..8f4817404f10d0 100644
+--- a/drivers/gpu/drm/amd/amdgpu/nv.h
++++ b/drivers/gpu/drm/amd/amdgpu/nv.h
+@@ -31,5 +31,6 @@ extern const struct amdgpu_ip_block_version nv_common_ip_block;
+ void nv_grbm_select(struct amdgpu_device *adev,
+ u32 me, u32 pipe, u32 queue, u32 vmid);
+ void nv_set_virt_ops(struct amdgpu_device *adev);
++int cyan_skillfish_reg_base_init(struct amdgpu_device *adev);
+
+ #endif
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+index 58c4e57abc9e0f..163780030eb16e 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+@@ -2041,8 +2041,6 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
+
+ dc_hardware_init(adev->dm.dc);
+
+- adev->dm.restore_backlight = true;
+-
+ adev->dm.hpd_rx_offload_wq = hpd_rx_irq_create_workqueue(adev);
+ if (!adev->dm.hpd_rx_offload_wq) {
+ drm_err(adev_to_drm(adev), "failed to create hpd rx offload workqueue.\n");
+@@ -3405,7 +3403,6 @@ static int dm_resume(struct amdgpu_ip_block *ip_block)
+ dc_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D0);
+
+ dc_resume(dm->dc);
+- adev->dm.restore_backlight = true;
+
+ amdgpu_dm_irq_resume_early(adev);
+
+@@ -9836,6 +9833,7 @@ static void amdgpu_dm_commit_streams(struct drm_atomic_state *state,
+ bool mode_set_reset_required = false;
+ u32 i;
+ struct dc_commit_streams_params params = {dc_state->streams, dc_state->stream_count};
++ bool set_backlight_level = false;
+
+ /* Disable writeback */
+ for_each_old_connector_in_state(state, connector, old_con_state, i) {
+@@ -9955,6 +9953,7 @@ static void amdgpu_dm_commit_streams(struct drm_atomic_state *state,
+ acrtc->hw_mode = new_crtc_state->mode;
+ crtc->hwmode = new_crtc_state->mode;
+ mode_set_reset_required = true;
++ set_backlight_level = true;
+ } else if (modereset_required(new_crtc_state)) {
+ drm_dbg_atomic(dev,
+ "Atomic commit: RESET. crtc id %d:[%p]\n",
+@@ -10011,16 +10010,13 @@ static void amdgpu_dm_commit_streams(struct drm_atomic_state *state,
+ * to fix a flicker issue.
+ * It will cause the dm->actual_brightness is not the current panel brightness
+ * level. (the dm->brightness is the correct panel level)
+- * So we set the backlight level with dm->brightness value after initial
+- * set mode. Use restore_backlight flag to avoid setting backlight level
+- * for every subsequent mode set.
++ * So we set the backlight level with dm->brightness value after set mode
+ */
+- if (dm->restore_backlight) {
++ if (set_backlight_level) {
+ for (i = 0; i < dm->num_of_edps; i++) {
+ if (dm->backlight_dev[i])
+ amdgpu_dm_backlight_set_level(dm, i, dm->brightness[i]);
+ }
+- dm->restore_backlight = false;
+ }
+ }
+
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+index 6aae51c1beb363..b937da0a4e4a00 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+@@ -610,13 +610,6 @@ struct amdgpu_display_manager {
+ */
+ u32 actual_brightness[AMDGPU_DM_MAX_NUM_EDP];
+
+- /**
+- * @restore_backlight:
+- *
+- * Flag to indicate whether to restore backlight after modeset.
+- */
+- bool restore_backlight;
+-
+ /**
+ * @aux_hpd_discon_quirk:
+ *
+diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
+index 8da882c518565f..9b28c072826992 100644
+--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
++++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
+@@ -5444,8 +5444,7 @@ static int smu7_get_thermal_temperature_range(struct pp_hwmgr *hwmgr,
+ thermal_data->max = table_info->cac_dtp_table->usSoftwareShutdownTemp *
+ PP_TEMPERATURE_UNITS_PER_CENTIGRADES;
+ else if (hwmgr->pp_table_version == PP_TABLE_V0)
+- thermal_data->max = data->thermal_temp_setting.temperature_shutdown *
+- PP_TEMPERATURE_UNITS_PER_CENTIGRADES;
++ thermal_data->max = data->thermal_temp_setting.temperature_shutdown;
+
+ thermal_data->sw_ctf_threshold = thermal_data->max;
+
+diff --git a/drivers/gpu/drm/bridge/lontium-lt9211.c b/drivers/gpu/drm/bridge/lontium-lt9211.c
+index 399fa7eebd49cc..03fc8fd10f20aa 100644
+--- a/drivers/gpu/drm/bridge/lontium-lt9211.c
++++ b/drivers/gpu/drm/bridge/lontium-lt9211.c
+@@ -121,8 +121,7 @@ static int lt9211_read_chipid(struct lt9211 *ctx)
+ }
+
+ /* Test for known Chip ID. */
+- if (chipid[0] != REG_CHIPID0_VALUE || chipid[1] != REG_CHIPID1_VALUE ||
+- chipid[2] != REG_CHIPID2_VALUE) {
++ if (chipid[0] != REG_CHIPID0_VALUE || chipid[1] != REG_CHIPID1_VALUE) {
+ dev_err(ctx->dev, "Unknown Chip ID: 0x%02x 0x%02x 0x%02x\n",
+ chipid[0], chipid[1], chipid[2]);
+ return -EINVAL;
+diff --git a/drivers/gpu/drm/drm_draw.c b/drivers/gpu/drm/drm_draw.c
+index 9dc0408fbbeadb..5b956229c82fb6 100644
+--- a/drivers/gpu/drm/drm_draw.c
++++ b/drivers/gpu/drm/drm_draw.c
+@@ -127,7 +127,7 @@ EXPORT_SYMBOL(drm_draw_fill16);
+
+ void drm_draw_fill24(struct iosys_map *dmap, unsigned int dpitch,
+ unsigned int height, unsigned int width,
+- u16 color)
++ u32 color)
+ {
+ unsigned int y, x;
+
+diff --git a/drivers/gpu/drm/drm_draw_internal.h b/drivers/gpu/drm/drm_draw_internal.h
+index f121ee7339dc11..20cb404e23ea62 100644
+--- a/drivers/gpu/drm/drm_draw_internal.h
++++ b/drivers/gpu/drm/drm_draw_internal.h
+@@ -47,7 +47,7 @@ void drm_draw_fill16(struct iosys_map *dmap, unsigned int dpitch,
+
+ void drm_draw_fill24(struct iosys_map *dmap, unsigned int dpitch,
+ unsigned int height, unsigned int width,
+- u16 color);
++ u32 color);
+
+ void drm_draw_fill32(struct iosys_map *dmap, unsigned int dpitch,
+ unsigned int height, unsigned int width,
+diff --git a/drivers/gpu/drm/i915/display/intel_fb.c b/drivers/gpu/drm/i915/display/intel_fb.c
+index 0da842bd2f2f13..974e5b547d886e 100644
+--- a/drivers/gpu/drm/i915/display/intel_fb.c
++++ b/drivers/gpu/drm/i915/display/intel_fb.c
+@@ -2111,10 +2111,10 @@ static void intel_user_framebuffer_destroy(struct drm_framebuffer *fb)
+ if (intel_fb_uses_dpt(fb))
+ intel_dpt_destroy(intel_fb->dpt_vm);
+
+- intel_frontbuffer_put(intel_fb->frontbuffer);
+-
+ intel_fb_bo_framebuffer_fini(intel_fb_bo(fb));
+
++ intel_frontbuffer_put(intel_fb->frontbuffer);
++
+ kfree(intel_fb);
+ }
+
+@@ -2216,15 +2216,17 @@ int intel_framebuffer_init(struct intel_framebuffer *intel_fb,
+ int ret = -EINVAL;
+ int i;
+
++ /*
++ * intel_frontbuffer_get() must be done before
++ * intel_fb_bo_framebuffer_init() to avoid set_tiling vs. addfb race.
++ */
++ intel_fb->frontbuffer = intel_frontbuffer_get(obj);
++ if (!intel_fb->frontbuffer)
++ return -ENOMEM;
++
+ ret = intel_fb_bo_framebuffer_init(fb, obj, mode_cmd);
+ if (ret)
+- return ret;
+-
+- intel_fb->frontbuffer = intel_frontbuffer_get(obj);
+- if (!intel_fb->frontbuffer) {
+- ret = -ENOMEM;
+- goto err;
+- }
++ goto err_frontbuffer_put;
+
+ ret = -EINVAL;
+ if (!drm_any_plane_has_format(display->drm,
+@@ -2233,7 +2235,7 @@ int intel_framebuffer_init(struct intel_framebuffer *intel_fb,
+ drm_dbg_kms(display->drm,
+ "unsupported pixel format %p4cc / modifier 0x%llx\n",
+ &mode_cmd->pixel_format, mode_cmd->modifier[0]);
+- goto err_frontbuffer_put;
++ goto err_bo_framebuffer_fini;
+ }
+
+ max_stride = intel_fb_max_stride(display, mode_cmd->pixel_format,
+@@ -2244,7 +2246,7 @@ int intel_framebuffer_init(struct intel_framebuffer *intel_fb,
+ mode_cmd->modifier[0] != DRM_FORMAT_MOD_LINEAR ?
+ "tiled" : "linear",
+ mode_cmd->pitches[0], max_stride);
+- goto err_frontbuffer_put;
++ goto err_bo_framebuffer_fini;
+ }
+
+ /* FIXME need to adjust LINOFF/TILEOFF accordingly. */
+@@ -2252,7 +2254,7 @@ int intel_framebuffer_init(struct intel_framebuffer *intel_fb,
+ drm_dbg_kms(display->drm,
+ "plane 0 offset (0x%08x) must be 0\n",
+ mode_cmd->offsets[0]);
+- goto err_frontbuffer_put;
++ goto err_bo_framebuffer_fini;
+ }
+
+ drm_helper_mode_fill_fb_struct(display->drm, fb, info, mode_cmd);
+@@ -2262,7 +2264,7 @@ int intel_framebuffer_init(struct intel_framebuffer *intel_fb,
+
+ if (mode_cmd->handles[i] != mode_cmd->handles[0]) {
+ drm_dbg_kms(display->drm, "bad plane %d handle\n", i);
+- goto err_frontbuffer_put;
++ goto err_bo_framebuffer_fini;
+ }
+
+ stride_alignment = intel_fb_stride_alignment(fb, i);
+@@ -2270,7 +2272,7 @@ int intel_framebuffer_init(struct intel_framebuffer *intel_fb,
+ drm_dbg_kms(display->drm,
+ "plane %d pitch (%d) must be at least %u byte aligned\n",
+ i, fb->pitches[i], stride_alignment);
+- goto err_frontbuffer_put;
++ goto err_bo_framebuffer_fini;
+ }
+
+ if (intel_fb_is_gen12_ccs_aux_plane(fb, i)) {
+@@ -2280,7 +2282,7 @@ int intel_framebuffer_init(struct intel_framebuffer *intel_fb,
+ drm_dbg_kms(display->drm,
+ "ccs aux plane %d pitch (%d) must be %d\n",
+ i, fb->pitches[i], ccs_aux_stride);
+- goto err_frontbuffer_put;
++ goto err_bo_framebuffer_fini;
+ }
+ }
+
+@@ -2289,7 +2291,7 @@ int intel_framebuffer_init(struct intel_framebuffer *intel_fb,
+
+ ret = intel_fill_fb_info(display, intel_fb);
+ if (ret)
+- goto err_frontbuffer_put;
++ goto err_bo_framebuffer_fini;
+
+ if (intel_fb_uses_dpt(fb)) {
+ struct i915_address_space *vm;
+@@ -2315,10 +2317,10 @@ int intel_framebuffer_init(struct intel_framebuffer *intel_fb,
+ err_free_dpt:
+ if (intel_fb_uses_dpt(fb))
+ intel_dpt_destroy(intel_fb->dpt_vm);
++err_bo_framebuffer_fini:
++ intel_fb_bo_framebuffer_fini(obj);
+ err_frontbuffer_put:
+ intel_frontbuffer_put(intel_fb->frontbuffer);
+-err:
+- intel_fb_bo_framebuffer_fini(obj);
+ return ret;
+ }
+
+diff --git a/drivers/gpu/drm/i915/display/intel_frontbuffer.c b/drivers/gpu/drm/i915/display/intel_frontbuffer.c
+index 43be5377ddc1a0..73ed28ac957341 100644
+--- a/drivers/gpu/drm/i915/display/intel_frontbuffer.c
++++ b/drivers/gpu/drm/i915/display/intel_frontbuffer.c
+@@ -270,6 +270,8 @@ static void frontbuffer_release(struct kref *ref)
+ spin_unlock(&display->fb_tracking.lock);
+
+ i915_active_fini(&front->write);
++
++ drm_gem_object_put(obj);
+ kfree_rcu(front, rcu);
+ }
+
+@@ -287,6 +289,8 @@ intel_frontbuffer_get(struct drm_gem_object *obj)
+ if (!front)
+ return NULL;
+
++ drm_gem_object_get(obj);
++
+ front->obj = obj;
+ kref_init(&front->ref);
+ atomic_set(&front->bits, 0);
+@@ -299,8 +303,12 @@ intel_frontbuffer_get(struct drm_gem_object *obj)
+ spin_lock(&display->fb_tracking.lock);
+ cur = intel_bo_set_frontbuffer(obj, front);
+ spin_unlock(&display->fb_tracking.lock);
+- if (cur != front)
++
++ if (cur != front) {
++ drm_gem_object_put(obj);
+ kfree(front);
++ }
++
+ return cur;
+ }
+
+diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_frontbuffer.h b/drivers/gpu/drm/i915/gem/i915_gem_object_frontbuffer.h
+index b6dc3d1b9bb131..b682969e3a293c 100644
+--- a/drivers/gpu/drm/i915/gem/i915_gem_object_frontbuffer.h
++++ b/drivers/gpu/drm/i915/gem/i915_gem_object_frontbuffer.h
+@@ -89,12 +89,10 @@ i915_gem_object_set_frontbuffer(struct drm_i915_gem_object *obj,
+
+ if (!front) {
+ RCU_INIT_POINTER(obj->frontbuffer, NULL);
+- drm_gem_object_put(intel_bo_to_drm_bo(obj));
+ } else if (rcu_access_pointer(obj->frontbuffer)) {
+ cur = rcu_dereference_protected(obj->frontbuffer, true);
+ kref_get(&cur->ref);
+ } else {
+- drm_gem_object_get(intel_bo_to_drm_bo(obj));
+ rcu_assign_pointer(obj->frontbuffer, front);
+ }
+
+diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+index 0d5197c0824a91..5cf3a516ccfb38 100644
+--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
++++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+@@ -1324,9 +1324,16 @@ static int ct_receive(struct intel_guc_ct *ct)
+
+ static void ct_try_receive_message(struct intel_guc_ct *ct)
+ {
++ struct intel_guc *guc = ct_to_guc(ct);
+ int ret;
+
+- if (GEM_WARN_ON(!ct->enabled))
++ if (!ct->enabled) {
++ GEM_WARN_ON(!guc_to_gt(guc)->uc.reset_in_progress);
++ return;
++ }
++
++	/* When interrupts are disabled, message handling is not expected */
++ if (!guc->interrupts.enabled)
+ return;
+
+ ret = ct_receive(ct);
+diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
+index 36f1034839c273..44a99583518898 100644
+--- a/drivers/gpu/drm/panthor/panthor_fw.c
++++ b/drivers/gpu/drm/panthor/panthor_fw.c
+@@ -1099,6 +1099,7 @@ void panthor_fw_pre_reset(struct panthor_device *ptdev, bool on_hang)
+ }
+
+ panthor_job_irq_suspend(&ptdev->fw->irq);
++ panthor_fw_stop(ptdev);
+ }
+
+ /**
+diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop2.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop2.c
+index b50927a824b402..7ec7bea5e38e6c 100644
+--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop2.c
++++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop2.c
+@@ -1031,7 +1031,7 @@ static int vop2_plane_atomic_check(struct drm_plane *plane,
+ return format;
+
+ if (drm_rect_width(src) >> 16 < 4 || drm_rect_height(src) >> 16 < 4 ||
+- drm_rect_width(dest) < 4 || drm_rect_width(dest) < 4) {
++ drm_rect_width(dest) < 4 || drm_rect_height(dest) < 4) {
+ drm_err(vop2->drm, "Invalid size: %dx%d->%dx%d, min size is 4x4\n",
+ drm_rect_width(src) >> 16, drm_rect_height(src) >> 16,
+ drm_rect_width(dest), drm_rect_height(dest));
+diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
+index e2cda28a1af49d..5193be67b28e5c 100644
+--- a/drivers/gpu/drm/scheduler/sched_main.c
++++ b/drivers/gpu/drm/scheduler/sched_main.c
+@@ -986,13 +986,14 @@ int drm_sched_job_add_resv_dependencies(struct drm_sched_job *job,
+ dma_resv_assert_held(resv);
+
+ dma_resv_for_each_fence(&cursor, resv, usage, fence) {
+- /* Make sure to grab an additional ref on the added fence */
+- dma_fence_get(fence);
+- ret = drm_sched_job_add_dependency(job, fence);
+- if (ret) {
+- dma_fence_put(fence);
++ /*
++ * As drm_sched_job_add_dependency always consumes the fence
++ * reference (even when it fails), and dma_resv_for_each_fence
++		 * does not obtain one, we need to grab one before calling.
++ */
++ ret = drm_sched_job_add_dependency(job, dma_fence_get(fence));
++ if (ret)
+ return ret;
+- }
+ }
+ return 0;
+ }
+diff --git a/drivers/gpu/drm/xe/display/xe_fb_pin.c b/drivers/gpu/drm/xe/display/xe_fb_pin.c
+index c38fba18effe1c..f2cfba6748998e 100644
+--- a/drivers/gpu/drm/xe/display/xe_fb_pin.c
++++ b/drivers/gpu/drm/xe/display/xe_fb_pin.c
+@@ -16,6 +16,7 @@
+ #include "xe_device.h"
+ #include "xe_ggtt.h"
+ #include "xe_pm.h"
++#include "xe_vram_types.h"
+
+ static void
+ write_dpt_rotated(struct xe_bo *bo, struct iosys_map *map, u32 *dpt_ofs, u32 bo_ofs,
+@@ -289,7 +290,7 @@ static struct i915_vma *__xe_pin_fb_vma(const struct intel_framebuffer *fb,
+ if (IS_DGFX(to_xe_device(bo->ttm.base.dev)) &&
+ intel_fb_rc_ccs_cc_plane(&fb->base) >= 0 &&
+ !(bo->flags & XE_BO_FLAG_NEEDS_CPU_ACCESS)) {
+- struct xe_tile *tile = xe_device_get_root_tile(xe);
++ struct xe_vram_region *vram = xe_device_get_root_tile(xe)->mem.vram;
+
+ /*
+ * If we need to able to access the clear-color value stored in
+@@ -297,7 +298,7 @@ static struct i915_vma *__xe_pin_fb_vma(const struct intel_framebuffer *fb,
+ * accessible. This is important on small-bar systems where
+ * only some subset of VRAM is CPU accessible.
+ */
+- if (tile->mem.vram.io_size < tile->mem.vram.usable_size) {
++ if (xe_vram_region_io_size(vram) < xe_vram_region_usable_size(vram)) {
+ ret = -EINVAL;
+ goto err;
+ }
+diff --git a/drivers/gpu/drm/xe/display/xe_plane_initial.c b/drivers/gpu/drm/xe/display/xe_plane_initial.c
+index dcbc4b2d3fd944..b2d27458def525 100644
+--- a/drivers/gpu/drm/xe/display/xe_plane_initial.c
++++ b/drivers/gpu/drm/xe/display/xe_plane_initial.c
+@@ -21,6 +21,7 @@
+ #include "intel_plane.h"
+ #include "intel_plane_initial.h"
+ #include "xe_bo.h"
++#include "xe_vram_types.h"
+ #include "xe_wa.h"
+
+ #include <generated/xe_wa_oob.h>
+@@ -103,7 +104,7 @@ initial_plane_bo(struct xe_device *xe,
+ * We don't currently expect this to ever be placed in the
+ * stolen portion.
+ */
+- if (phys_base >= tile0->mem.vram.usable_size) {
++ if (phys_base >= xe_vram_region_usable_size(tile0->mem.vram)) {
+ drm_err(&xe->drm,
+ "Initial plane programming using invalid range, phys_base=%pa\n",
+ &phys_base);
+diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
+index 5cd5ab8529c5c0..9994887fc73f97 100644
+--- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
++++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
+@@ -342,6 +342,7 @@
+ #define POWERGATE_ENABLE XE_REG(0xa210)
+ #define RENDER_POWERGATE_ENABLE REG_BIT(0)
+ #define MEDIA_POWERGATE_ENABLE REG_BIT(1)
++#define MEDIA_SAMPLERS_POWERGATE_ENABLE REG_BIT(2)
+ #define VDN_HCP_POWERGATE_ENABLE(n) REG_BIT(3 + 2 * (n))
+ #define VDN_MFXVDENC_POWERGATE_ENABLE(n) REG_BIT(4 + 2 * (n))
+
+diff --git a/drivers/gpu/drm/xe/xe_assert.h b/drivers/gpu/drm/xe/xe_assert.h
+index 68fe70ce2be3ba..a818eaa05b7dcf 100644
+--- a/drivers/gpu/drm/xe/xe_assert.h
++++ b/drivers/gpu/drm/xe/xe_assert.h
+@@ -12,6 +12,7 @@
+
+ #include "xe_gt_types.h"
+ #include "xe_step.h"
++#include "xe_vram.h"
+
+ /**
+ * DOC: Xe Asserts
+@@ -145,7 +146,8 @@
+ const struct xe_tile *__tile = (tile); \
+ char __buf[10] __maybe_unused; \
+ xe_assert_msg(tile_to_xe(__tile), condition, "tile: %u VRAM %s\n" msg, \
+- __tile->id, ({ string_get_size(__tile->mem.vram.actual_physical_size, 1, \
++ __tile->id, ({ string_get_size( \
++ xe_vram_region_actual_physical_size(__tile->mem.vram), 1, \
+ STRING_UNITS_2, __buf, sizeof(__buf)); __buf; }), ## arg); \
+ })
+
+diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
+index bae7ff2e59276c..50c79049ccea0e 100644
+--- a/drivers/gpu/drm/xe/xe_bo.c
++++ b/drivers/gpu/drm/xe/xe_bo.c
+@@ -36,6 +36,7 @@
+ #include "xe_trace_bo.h"
+ #include "xe_ttm_stolen_mgr.h"
+ #include "xe_vm.h"
++#include "xe_vram_types.h"
+
+ const char *const xe_mem_type_to_name[TTM_NUM_MEM_TYPES] = {
+ [XE_PL_SYSTEM] = "system",
+diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
+index 9ce94d25201562..cfb1ec266a6da1 100644
+--- a/drivers/gpu/drm/xe/xe_bo.h
++++ b/drivers/gpu/drm/xe/xe_bo.h
+@@ -12,6 +12,7 @@
+ #include "xe_macros.h"
+ #include "xe_vm_types.h"
+ #include "xe_vm.h"
++#include "xe_vram_types.h"
+
+ #define XE_DEFAULT_GTT_SIZE_MB 3072ULL /* 3GB by default */
+
+@@ -23,8 +24,9 @@
+ #define XE_BO_FLAG_VRAM_MASK (XE_BO_FLAG_VRAM0 | XE_BO_FLAG_VRAM1)
+ /* -- */
+ #define XE_BO_FLAG_STOLEN BIT(4)
++#define XE_BO_FLAG_VRAM(vram) (XE_BO_FLAG_VRAM0 << ((vram)->id))
+ #define XE_BO_FLAG_VRAM_IF_DGFX(tile) (IS_DGFX(tile_to_xe(tile)) ? \
+- XE_BO_FLAG_VRAM0 << (tile)->id : \
++ XE_BO_FLAG_VRAM((tile)->mem.vram) : \
+ XE_BO_FLAG_SYSTEM)
+ #define XE_BO_FLAG_GGTT BIT(5)
+ #define XE_BO_FLAG_IGNORE_MIN_PAGE_SIZE BIT(6)
+diff --git a/drivers/gpu/drm/xe/xe_bo_evict.c b/drivers/gpu/drm/xe/xe_bo_evict.c
+index d5dbc51e8612d8..bc5b4c5fab8129 100644
+--- a/drivers/gpu/drm/xe/xe_bo_evict.c
++++ b/drivers/gpu/drm/xe/xe_bo_evict.c
+@@ -182,7 +182,6 @@ int xe_bo_evict_all(struct xe_device *xe)
+
+ static int xe_bo_restore_and_map_ggtt(struct xe_bo *bo)
+ {
+- struct xe_device *xe = xe_bo_device(bo);
+ int ret;
+
+ ret = xe_bo_restore_pinned(bo);
+@@ -201,13 +200,6 @@ static int xe_bo_restore_and_map_ggtt(struct xe_bo *bo)
+ }
+ }
+
+- /*
+- * We expect validate to trigger a move VRAM and our move code
+- * should setup the iosys map.
+- */
+- xe_assert(xe, !(bo->flags & XE_BO_FLAG_PINNED_LATE_RESTORE) ||
+- !iosys_map_is_null(&bo->vmap));
+-
+ return 0;
+ }
+
+diff --git a/drivers/gpu/drm/xe/xe_bo_types.h b/drivers/gpu/drm/xe/xe_bo_types.h
+index ff560d82496ff4..57d34698139ee2 100644
+--- a/drivers/gpu/drm/xe/xe_bo_types.h
++++ b/drivers/gpu/drm/xe/xe_bo_types.h
+@@ -9,6 +9,7 @@
+ #include <linux/iosys-map.h>
+
+ #include <drm/drm_gpusvm.h>
++#include <drm/drm_pagemap.h>
+ #include <drm/ttm/ttm_bo.h>
+ #include <drm/ttm/ttm_device.h>
+ #include <drm/ttm/ttm_placement.h>
+diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
+index 6ece4defa9df03..1c9907b8a4e9ec 100644
+--- a/drivers/gpu/drm/xe/xe_device.c
++++ b/drivers/gpu/drm/xe/xe_device.c
+@@ -64,6 +64,7 @@
+ #include "xe_ttm_sys_mgr.h"
+ #include "xe_vm.h"
+ #include "xe_vram.h"
++#include "xe_vram_types.h"
+ #include "xe_vsec.h"
+ #include "xe_wait_user_fence.h"
+ #include "xe_wa.h"
+@@ -688,6 +689,21 @@ static void sriov_update_device_info(struct xe_device *xe)
+ }
+ }
+
++static int xe_device_vram_alloc(struct xe_device *xe)
++{
++ struct xe_vram_region *vram;
++
++ if (!IS_DGFX(xe))
++ return 0;
++
++ vram = drmm_kzalloc(&xe->drm, sizeof(*vram), GFP_KERNEL);
++ if (!vram)
++ return -ENOMEM;
++
++ xe->mem.vram = vram;
++ return 0;
++}
++
+ /**
+ * xe_device_probe_early: Device early probe
+ * @xe: xe device instance
+@@ -735,6 +751,10 @@ int xe_device_probe_early(struct xe_device *xe)
+
+ xe->wedged.mode = xe_modparam.wedged_mode;
+
++ err = xe_device_vram_alloc(xe);
++ if (err)
++ return err;
++
+ return 0;
+ }
+ ALLOW_ERROR_INJECTION(xe_device_probe_early, ERRNO); /* See xe_pci_probe() */
+@@ -1029,7 +1049,7 @@ void xe_device_l2_flush(struct xe_device *xe)
+ spin_lock(>->global_invl_lock);
+
+ xe_mmio_write32(>->mmio, XE2_GLOBAL_INVAL, 0x1);
+- if (xe_mmio_wait32(>->mmio, XE2_GLOBAL_INVAL, 0x1, 0x0, 500, NULL, true))
++ if (xe_mmio_wait32(>->mmio, XE2_GLOBAL_INVAL, 0x1, 0x0, 1000, NULL, true))
+ xe_gt_err_once(gt, "Global invalidation timeout\n");
+
+ spin_unlock(>->global_invl_lock);
+diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
+index 7ceb0c90f3914c..ac6419f475733b 100644
+--- a/drivers/gpu/drm/xe/xe_device_types.h
++++ b/drivers/gpu/drm/xe/xe_device_types.h
+@@ -10,7 +10,6 @@
+
+ #include <drm/drm_device.h>
+ #include <drm/drm_file.h>
+-#include <drm/drm_pagemap.h>
+ #include <drm/ttm/ttm_device.h>
+
+ #include "xe_devcoredump_types.h"
+@@ -26,7 +25,6 @@
+ #include "xe_sriov_vf_types.h"
+ #include "xe_step_types.h"
+ #include "xe_survivability_mode_types.h"
+-#include "xe_ttm_vram_mgr_types.h"
+
+ #if IS_ENABLED(CONFIG_DRM_XE_DEBUG)
+ #define TEST_VM_OPS_ERROR
+@@ -39,6 +37,7 @@ struct xe_ggtt;
+ struct xe_i2c;
+ struct xe_pat_ops;
+ struct xe_pxp;
++struct xe_vram_region;
+
+ #define XE_BO_INVALID_OFFSET LONG_MAX
+
+@@ -71,61 +70,6 @@ struct xe_pxp;
+ const struct xe_tile * : (const struct xe_device *)((tile__)->xe), \
+ struct xe_tile * : (tile__)->xe)
+
+-/**
+- * struct xe_vram_region - memory region structure
+- * This is used to describe a memory region in xe
+- * device, such as HBM memory or CXL extension memory.
+- */
+-struct xe_vram_region {
+- /** @io_start: IO start address of this VRAM instance */
+- resource_size_t io_start;
+- /**
+- * @io_size: IO size of this VRAM instance
+- *
+- * This represents how much of this VRAM we can access
+- * via the CPU through the VRAM BAR. This can be smaller
+- * than @usable_size, in which case only part of VRAM is CPU
+- * accessible (typically the first 256M). This
+- * configuration is known as small-bar.
+- */
+- resource_size_t io_size;
+- /** @dpa_base: This memory regions's DPA (device physical address) base */
+- resource_size_t dpa_base;
+- /**
+- * @usable_size: usable size of VRAM
+- *
+- * Usable size of VRAM excluding reserved portions
+- * (e.g stolen mem)
+- */
+- resource_size_t usable_size;
+- /**
+- * @actual_physical_size: Actual VRAM size
+- *
+- * Actual VRAM size including reserved portions
+- * (e.g stolen mem)
+- */
+- resource_size_t actual_physical_size;
+- /** @mapping: pointer to VRAM mappable space */
+- void __iomem *mapping;
+- /** @ttm: VRAM TTM manager */
+- struct xe_ttm_vram_mgr ttm;
+-#if IS_ENABLED(CONFIG_DRM_XE_PAGEMAP)
+- /** @pagemap: Used to remap device memory as ZONE_DEVICE */
+- struct dev_pagemap pagemap;
+- /**
+- * @dpagemap: The struct drm_pagemap of the ZONE_DEVICE memory
+- * pages of this tile.
+- */
+- struct drm_pagemap dpagemap;
+- /**
+- * @hpa_base: base host physical address
+- *
+- * This is generated when remap device memory as ZONE_DEVICE
+- */
+- resource_size_t hpa_base;
+-#endif
+-};
+-
+ /**
+ * struct xe_mmio - register mmio structure
+ *
+@@ -216,7 +160,7 @@ struct xe_tile {
+ * Although VRAM is associated with a specific tile, it can
+ * still be accessed by all tiles' GTs.
+ */
+- struct xe_vram_region vram;
++ struct xe_vram_region *vram;
+
+ /** @mem.ggtt: Global graphics translation table */
+ struct xe_ggtt *ggtt;
+@@ -412,7 +356,7 @@ struct xe_device {
+ /** @mem: memory info for device */
+ struct {
+ /** @mem.vram: VRAM info for device */
+- struct xe_vram_region vram;
++ struct xe_vram_region *vram;
+ /** @mem.sys_mgr: system TTM manager */
+ struct ttm_resource_manager sys_mgr;
+ /** @mem.sys_mgr: system memory shrinker. */
+diff --git a/drivers/gpu/drm/xe/xe_gt_idle.c b/drivers/gpu/drm/xe/xe_gt_idle.c
+index ffb210216aa99f..9bd197da60279c 100644
+--- a/drivers/gpu/drm/xe/xe_gt_idle.c
++++ b/drivers/gpu/drm/xe/xe_gt_idle.c
+@@ -124,6 +124,9 @@ void xe_gt_idle_enable_pg(struct xe_gt *gt)
+ if (xe_gt_is_main_type(gt))
+ gtidle->powergate_enable |= RENDER_POWERGATE_ENABLE;
+
++ if (MEDIA_VERx100(xe) >= 1100 && MEDIA_VERx100(xe) < 1255)
++ gtidle->powergate_enable |= MEDIA_SAMPLERS_POWERGATE_ENABLE;
++
+ if (xe->info.platform != XE_DG1) {
+ for (i = XE_HW_ENGINE_VCS0, j = 0; i <= XE_HW_ENGINE_VCS7; ++i, ++j) {
+ if ((gt->info.engine_mask & BIT(i)))
+@@ -246,6 +249,11 @@ int xe_gt_idle_pg_print(struct xe_gt *gt, struct drm_printer *p)
+ drm_printf(p, "Media Slice%d Power Gate Status: %s\n", n,
+ str_up_down(pg_status & media_slices[n].status_bit));
+ }
++
++ if (MEDIA_VERx100(xe) >= 1100 && MEDIA_VERx100(xe) < 1255)
++ drm_printf(p, "Media Samplers Power Gating Enabled: %s\n",
++ str_yes_no(pg_enabled & MEDIA_SAMPLERS_POWERGATE_ENABLE));
++
+ return 0;
+ }
+
+diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
+index 5a75d56d8558dd..ab43dec5277689 100644
+--- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
++++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
+@@ -23,6 +23,7 @@
+ #include "xe_svm.h"
+ #include "xe_trace_bo.h"
+ #include "xe_vm.h"
++#include "xe_vram_types.h"
+
+ struct pagefault {
+ u64 page_addr;
+@@ -74,7 +75,7 @@ static bool vma_is_valid(struct xe_tile *tile, struct xe_vma *vma)
+ }
+
+ static int xe_pf_begin(struct drm_exec *exec, struct xe_vma *vma,
+- bool atomic, unsigned int id)
++ bool atomic, struct xe_vram_region *vram)
+ {
+ struct xe_bo *bo = xe_vma_bo(vma);
+ struct xe_vm *vm = xe_vma_vm(vma);
+@@ -84,14 +85,16 @@ static int xe_pf_begin(struct drm_exec *exec, struct xe_vma *vma,
+ if (err)
+ return err;
+
+- if (atomic && IS_DGFX(vm->xe)) {
++ if (atomic && vram) {
++ xe_assert(vm->xe, IS_DGFX(vm->xe));
++
+ if (xe_vma_is_userptr(vma)) {
+ err = -EACCES;
+ return err;
+ }
+
+ /* Migrate to VRAM, move should invalidate the VMA first */
+- err = xe_bo_migrate(bo, XE_PL_VRAM0 + id);
++ err = xe_bo_migrate(bo, vram->placement);
+ if (err)
+ return err;
+ } else if (bo) {
+@@ -138,7 +141,7 @@ static int handle_vma_pagefault(struct xe_gt *gt, struct xe_vma *vma,
+ /* Lock VM and BOs dma-resv */
+ drm_exec_init(&exec, 0, 0);
+ drm_exec_until_all_locked(&exec) {
+- err = xe_pf_begin(&exec, vma, atomic, tile->id);
++ err = xe_pf_begin(&exec, vma, atomic, tile->mem.vram);
+ drm_exec_retry_on_contention(&exec);
+ if (xe_vm_validate_should_retry(&exec, err, &end))
+ err = -EAGAIN;
+@@ -573,7 +576,7 @@ static int handle_acc(struct xe_gt *gt, struct acc *acc)
+ /* Lock VM and BOs dma-resv */
+ drm_exec_init(&exec, 0, 0);
+ drm_exec_until_all_locked(&exec) {
+- ret = xe_pf_begin(&exec, vma, true, tile->id);
++ ret = xe_pf_begin(&exec, vma, true, tile->mem.vram);
+ drm_exec_retry_on_contention(&exec);
+ if (ret)
+ break;
+diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
+index d84831a03610db..61a357946fe1ef 100644
+--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
++++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
+@@ -33,6 +33,7 @@
+ #include "xe_migrate.h"
+ #include "xe_sriov.h"
+ #include "xe_ttm_vram_mgr.h"
++#include "xe_vram_types.h"
+ #include "xe_wopcm.h"
+
+ #define make_u64_from_u32(hi, lo) ((u64)((u64)(u32)(hi) << 32 | (u32)(lo)))
+@@ -1604,7 +1605,7 @@ static u64 pf_query_free_lmem(struct xe_gt *gt)
+ {
+ struct xe_tile *tile = gt->tile;
+
+- return xe_ttm_vram_get_avail(&tile->mem.vram.ttm.manager);
++ return xe_ttm_vram_get_avail(&tile->mem.vram->ttm.manager);
+ }
+
+ static u64 pf_query_max_lmem(struct xe_gt *gt)
+diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
+index 0104afbc941c84..439725fc4fe61a 100644
+--- a/drivers/gpu/drm/xe/xe_guc_submit.c
++++ b/drivers/gpu/drm/xe/xe_guc_submit.c
+@@ -44,6 +44,7 @@
+ #include "xe_ring_ops_types.h"
+ #include "xe_sched_job.h"
+ #include "xe_trace.h"
++#include "xe_uc_fw.h"
+ #include "xe_vm.h"
+
+ static struct xe_guc *
+@@ -1413,7 +1414,17 @@ static void __guc_exec_queue_process_msg_cleanup(struct xe_sched_msg *msg)
+ xe_gt_assert(guc_to_gt(guc), !(q->flags & EXEC_QUEUE_FLAG_PERMANENT));
+ trace_xe_exec_queue_cleanup_entity(q);
+
+- if (exec_queue_registered(q))
++ /*
++ * Expected state transitions for cleanup:
++ * - If the exec queue is registered and GuC firmware is running, we must first
++ * disable scheduling and deregister the queue to ensure proper teardown and
++ * resource release in the GuC, then destroy the exec queue on driver side.
++ * - If the GuC is already stopped (e.g., during driver unload or GPU reset),
++ * we cannot expect a response for the deregister request. In this case,
++ * it is safe to directly destroy the exec queue on driver side, as the GuC
++ * will not process further requests and all resources must be cleaned up locally.
++ */
++ if (exec_queue_registered(q) && xe_uc_fw_is_running(&guc->fw))
+ disable_scheduling_deregister(guc, q);
+ else
+ __guc_exec_queue_destroy(guc, q);
+diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
+index 84f412fd3c5d2a..13e287e0370967 100644
+--- a/drivers/gpu/drm/xe/xe_migrate.c
++++ b/drivers/gpu/drm/xe/xe_migrate.c
+@@ -34,6 +34,7 @@
+ #include "xe_sync.h"
+ #include "xe_trace_bo.h"
+ #include "xe_vm.h"
++#include "xe_vram.h"
+
+ /**
+ * struct xe_migrate - migrate context.
+@@ -130,34 +131,36 @@ static u64 xe_migrate_vram_ofs(struct xe_device *xe, u64 addr, bool is_comp_pte)
+ u64 identity_offset = IDENTITY_OFFSET;
+
+ if (GRAPHICS_VER(xe) >= 20 && is_comp_pte)
+- identity_offset += DIV_ROUND_UP_ULL(xe->mem.vram.actual_physical_size, SZ_1G);
++ identity_offset += DIV_ROUND_UP_ULL(xe_vram_region_actual_physical_size
++ (xe->mem.vram), SZ_1G);
+
+- addr -= xe->mem.vram.dpa_base;
++ addr -= xe_vram_region_dpa_base(xe->mem.vram);
+ return addr + (identity_offset << xe_pt_shift(2));
+ }
+
+ static void xe_migrate_program_identity(struct xe_device *xe, struct xe_vm *vm, struct xe_bo *bo,
+ u64 map_ofs, u64 vram_offset, u16 pat_index, u64 pt_2m_ofs)
+ {
++ struct xe_vram_region *vram = xe->mem.vram;
++ resource_size_t dpa_base = xe_vram_region_dpa_base(vram);
+ u64 pos, ofs, flags;
+ u64 entry;
+ /* XXX: Unclear if this should be usable_size? */
+- u64 vram_limit = xe->mem.vram.actual_physical_size +
+- xe->mem.vram.dpa_base;
++ u64 vram_limit = xe_vram_region_actual_physical_size(vram) + dpa_base;
+ u32 level = 2;
+
+ ofs = map_ofs + XE_PAGE_SIZE * level + vram_offset * 8;
+ flags = vm->pt_ops->pte_encode_addr(xe, 0, pat_index, level,
+ true, 0);
+
+- xe_assert(xe, IS_ALIGNED(xe->mem.vram.usable_size, SZ_2M));
++ xe_assert(xe, IS_ALIGNED(xe_vram_region_usable_size(vram), SZ_2M));
+
+ /*
+ * Use 1GB pages when possible, last chunk always use 2M
+ * pages as mixing reserved memory (stolen, WOCPM) with a single
+ * mapping is not allowed on certain platforms.
+ */
+- for (pos = xe->mem.vram.dpa_base; pos < vram_limit;
++ for (pos = dpa_base; pos < vram_limit;
+ pos += SZ_1G, ofs += 8) {
+ if (pos + SZ_1G >= vram_limit) {
+ entry = vm->pt_ops->pde_encode_bo(bo, pt_2m_ofs,
+@@ -307,11 +310,11 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
+ /* Identity map the entire vram at 256GiB offset */
+ if (IS_DGFX(xe)) {
+ u64 pt30_ofs = xe_bo_size(bo) - 2 * XE_PAGE_SIZE;
++ resource_size_t actual_phy_size = xe_vram_region_actual_physical_size(xe->mem.vram);
+
+ xe_migrate_program_identity(xe, vm, bo, map_ofs, IDENTITY_OFFSET,
+ pat_index, pt30_ofs);
+- xe_assert(xe, xe->mem.vram.actual_physical_size <=
+- (MAX_NUM_PTE - IDENTITY_OFFSET) * SZ_1G);
++ xe_assert(xe, actual_phy_size <= (MAX_NUM_PTE - IDENTITY_OFFSET) * SZ_1G);
+
+ /*
+ * Identity map the entire vram for compressed pat_index for xe2+
+@@ -320,11 +323,11 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
+ if (GRAPHICS_VER(xe) >= 20 && xe_device_has_flat_ccs(xe)) {
+ u16 comp_pat_index = xe->pat.idx[XE_CACHE_NONE_COMPRESSION];
+ u64 vram_offset = IDENTITY_OFFSET +
+- DIV_ROUND_UP_ULL(xe->mem.vram.actual_physical_size, SZ_1G);
++ DIV_ROUND_UP_ULL(actual_phy_size, SZ_1G);
+ u64 pt31_ofs = xe_bo_size(bo) - XE_PAGE_SIZE;
+
+- xe_assert(xe, xe->mem.vram.actual_physical_size <= (MAX_NUM_PTE -
+- IDENTITY_OFFSET - IDENTITY_OFFSET / 2) * SZ_1G);
++ xe_assert(xe, actual_phy_size <= (MAX_NUM_PTE - IDENTITY_OFFSET -
++ IDENTITY_OFFSET / 2) * SZ_1G);
+ xe_migrate_program_identity(xe, vm, bo, map_ofs, vram_offset,
+ comp_pat_index, pt31_ofs);
+ }
+diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
+index 3c40ef426f0cb5..6c2637fc8f1abd 100644
+--- a/drivers/gpu/drm/xe/xe_pci.c
++++ b/drivers/gpu/drm/xe/xe_pci.c
+@@ -687,6 +687,8 @@ static int xe_info_init(struct xe_device *xe,
+ * All of these together determine the overall GT count.
+ */
+ for_each_tile(tile, xe, id) {
++ int err;
++
+ gt = tile->primary_gt;
+ gt->info.type = XE_GT_TYPE_MAIN;
+ gt->info.id = tile->id * xe->info.max_gt_per_tile;
+@@ -694,6 +696,10 @@ static int xe_info_init(struct xe_device *xe,
+ gt->info.engine_mask = graphics_desc->hw_engine_mask;
+ xe->info.gt_count++;
+
++ err = xe_tile_alloc_vram(tile);
++ if (err)
++ return err;
++
+ if (MEDIA_VER(xe) < 13 && media_desc)
+ gt->info.engine_mask |= media_desc->hw_engine_mask;
+
+@@ -799,6 +805,8 @@ static int xe_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
+ if (err)
+ return err;
+
++ xe_vram_resize_bar(xe);
++
+ err = xe_device_probe_early(xe);
+ /*
+ * In Boot Survivability mode, no drm card is exposed and driver
+diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
+index 83fe77ce62f761..f2a3d4ced068c5 100644
+--- a/drivers/gpu/drm/xe/xe_query.c
++++ b/drivers/gpu/drm/xe/xe_query.c
+@@ -27,6 +27,7 @@
+ #include "xe_oa.h"
+ #include "xe_pxp.h"
+ #include "xe_ttm_vram_mgr.h"
++#include "xe_vram_types.h"
+ #include "xe_wa.h"
+
+ static const u16 xe_to_user_engine_class[] = {
+@@ -334,7 +335,7 @@ static int query_config(struct xe_device *xe, struct drm_xe_device_query *query)
+ config->num_params = num_params;
+ config->info[DRM_XE_QUERY_CONFIG_REV_AND_DEVICE_ID] =
+ xe->info.devid | (xe->info.revid << 16);
+- if (xe_device_get_root_tile(xe)->mem.vram.usable_size)
++ if (xe->mem.vram)
+ config->info[DRM_XE_QUERY_CONFIG_FLAGS] |=
+ DRM_XE_QUERY_CONFIG_FLAG_HAS_VRAM;
+ if (xe->info.has_usm && IS_ENABLED(CONFIG_DRM_XE_GPUSVM))
+@@ -407,7 +408,7 @@ static int query_gt_list(struct xe_device *xe, struct drm_xe_device_query *query
+ gt_list->gt_list[iter].near_mem_regions = 0x1;
+ else
+ gt_list->gt_list[iter].near_mem_regions =
+- BIT(gt_to_tile(gt)->id) << 1;
++ BIT(gt_to_tile(gt)->mem.vram->id) << 1;
+ gt_list->gt_list[iter].far_mem_regions = xe->info.mem_region_mask ^
+ gt_list->gt_list[iter].near_mem_regions;
+
+diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
+index a7ff5975873f99..10c8a1bcb86e88 100644
+--- a/drivers/gpu/drm/xe/xe_svm.c
++++ b/drivers/gpu/drm/xe/xe_svm.c
+@@ -17,6 +17,7 @@
+ #include "xe_ttm_vram_mgr.h"
+ #include "xe_vm.h"
+ #include "xe_vm_types.h"
++#include "xe_vram_types.h"
+
+ static bool xe_svm_range_in_vram(struct xe_svm_range *range)
+ {
+@@ -306,21 +307,15 @@ static struct xe_vram_region *page_to_vr(struct page *page)
+ return container_of(page_pgmap(page), struct xe_vram_region, pagemap);
+ }
+
+-static struct xe_tile *vr_to_tile(struct xe_vram_region *vr)
+-{
+- return container_of(vr, struct xe_tile, mem.vram);
+-}
+-
+ static u64 xe_vram_region_page_to_dpa(struct xe_vram_region *vr,
+ struct page *page)
+ {
+ u64 dpa;
+- struct xe_tile *tile = vr_to_tile(vr);
+ u64 pfn = page_to_pfn(page);
+ u64 offset;
+
+- xe_tile_assert(tile, is_device_private_page(page));
+- xe_tile_assert(tile, (pfn << PAGE_SHIFT) >= vr->hpa_base);
++ xe_assert(vr->xe, is_device_private_page(page));
++ xe_assert(vr->xe, (pfn << PAGE_SHIFT) >= vr->hpa_base);
+
+ offset = (pfn << PAGE_SHIFT) - vr->hpa_base;
+ dpa = vr->dpa_base + offset;
+@@ -337,7 +332,7 @@ static int xe_svm_copy(struct page **pages, dma_addr_t *dma_addr,
+ unsigned long npages, const enum xe_svm_copy_dir dir)
+ {
+ struct xe_vram_region *vr = NULL;
+- struct xe_tile *tile;
++ struct xe_device *xe;
+ struct dma_fence *fence = NULL;
+ unsigned long i;
+ #define XE_VRAM_ADDR_INVALID ~0x0ull
+@@ -370,7 +365,7 @@ static int xe_svm_copy(struct page **pages, dma_addr_t *dma_addr,
+
+ if (!vr && spage) {
+ vr = page_to_vr(spage);
+- tile = vr_to_tile(vr);
++ xe = vr->xe;
+ }
+ XE_WARN_ON(spage && page_to_vr(spage) != vr);
+
+@@ -402,18 +397,18 @@ static int xe_svm_copy(struct page **pages, dma_addr_t *dma_addr,
+
+ if (vram_addr != XE_VRAM_ADDR_INVALID) {
+ if (sram) {
+- vm_dbg(&tile->xe->drm,
++ vm_dbg(&xe->drm,
+ "COPY TO SRAM - 0x%016llx -> 0x%016llx, NPAGES=%ld",
+ vram_addr, (u64)dma_addr[pos], i - pos + incr);
+- __fence = xe_migrate_from_vram(tile->migrate,
++ __fence = xe_migrate_from_vram(vr->migrate,
+ i - pos + incr,
+ vram_addr,
+ dma_addr + pos);
+ } else {
+- vm_dbg(&tile->xe->drm,
++ vm_dbg(&xe->drm,
+ "COPY TO VRAM - 0x%016llx -> 0x%016llx, NPAGES=%ld",
+ (u64)dma_addr[pos], vram_addr, i - pos + incr);
+- __fence = xe_migrate_to_vram(tile->migrate,
++ __fence = xe_migrate_to_vram(vr->migrate,
+ i - pos + incr,
+ dma_addr + pos,
+ vram_addr);
+@@ -438,17 +433,17 @@ static int xe_svm_copy(struct page **pages, dma_addr_t *dma_addr,
+ /* Extra mismatched device page, copy it */
+ if (!match && last && vram_addr != XE_VRAM_ADDR_INVALID) {
+ if (sram) {
+- vm_dbg(&tile->xe->drm,
++ vm_dbg(&xe->drm,
+ "COPY TO SRAM - 0x%016llx -> 0x%016llx, NPAGES=%d",
+ vram_addr, (u64)dma_addr[pos], 1);
+- __fence = xe_migrate_from_vram(tile->migrate, 1,
++ __fence = xe_migrate_from_vram(vr->migrate, 1,
+ vram_addr,
+ dma_addr + pos);
+ } else {
+- vm_dbg(&tile->xe->drm,
++ vm_dbg(&xe->drm,
+ "COPY TO VRAM - 0x%016llx -> 0x%016llx, NPAGES=%d",
+ (u64)dma_addr[pos], vram_addr, 1);
+- __fence = xe_migrate_to_vram(tile->migrate, 1,
++ __fence = xe_migrate_to_vram(vr->migrate, 1,
+ dma_addr + pos,
+ vram_addr);
+ }
+@@ -506,9 +501,9 @@ static u64 block_offset_to_pfn(struct xe_vram_region *vr, u64 offset)
+ return PHYS_PFN(offset + vr->hpa_base);
+ }
+
+-static struct drm_buddy *tile_to_buddy(struct xe_tile *tile)
++static struct drm_buddy *vram_to_buddy(struct xe_vram_region *vram)
+ {
+- return &tile->mem.vram.ttm.mm;
++ return &vram->ttm.mm;
+ }
+
+ static int xe_svm_populate_devmem_pfn(struct drm_pagemap_devmem *devmem_allocation,
+@@ -522,8 +517,7 @@ static int xe_svm_populate_devmem_pfn(struct drm_pagemap_devmem *devmem_allocati
+
+ list_for_each_entry(block, blocks, link) {
+ struct xe_vram_region *vr = block->private;
+- struct xe_tile *tile = vr_to_tile(vr);
+- struct drm_buddy *buddy = tile_to_buddy(tile);
++ struct drm_buddy *buddy = vram_to_buddy(vr);
+ u64 block_pfn = block_offset_to_pfn(vr, drm_buddy_block_offset(block));
+ int i;
+
+@@ -683,20 +677,14 @@ u64 xe_svm_find_vma_start(struct xe_vm *vm, u64 start, u64 end, struct xe_vma *v
+ }
+
+ #if IS_ENABLED(CONFIG_DRM_XE_PAGEMAP)
+-static struct xe_vram_region *tile_to_vr(struct xe_tile *tile)
+-{
+- return &tile->mem.vram;
+-}
+-
+ static int xe_drm_pagemap_populate_mm(struct drm_pagemap *dpagemap,
+ unsigned long start, unsigned long end,
+ struct mm_struct *mm,
+ unsigned long timeslice_ms)
+ {
+- struct xe_tile *tile = container_of(dpagemap, typeof(*tile), mem.vram.dpagemap);
+- struct xe_device *xe = tile_to_xe(tile);
++ struct xe_vram_region *vr = container_of(dpagemap, typeof(*vr), dpagemap);
++ struct xe_device *xe = vr->xe;
+ struct device *dev = xe->drm.dev;
+- struct xe_vram_region *vr = tile_to_vr(tile);
+ struct drm_buddy_block *block;
+ struct list_head *blocks;
+ struct xe_bo *bo;
+@@ -709,9 +697,9 @@ static int xe_drm_pagemap_populate_mm(struct drm_pagemap *dpagemap,
+ xe_pm_runtime_get(xe);
+
+ retry:
+- bo = xe_bo_create_locked(tile_to_xe(tile), NULL, NULL, end - start,
++ bo = xe_bo_create_locked(vr->xe, NULL, NULL, end - start,
+ ttm_bo_type_device,
+- XE_BO_FLAG_VRAM_IF_DGFX(tile) |
++ (IS_DGFX(xe) ? XE_BO_FLAG_VRAM(vr) : XE_BO_FLAG_SYSTEM) |
+ XE_BO_FLAG_CPU_ADDR_MIRROR);
+ if (IS_ERR(bo)) {
+ err = PTR_ERR(bo);
+@@ -721,9 +709,7 @@ static int xe_drm_pagemap_populate_mm(struct drm_pagemap *dpagemap,
+ }
+
+ drm_pagemap_devmem_init(&bo->devmem_allocation, dev, mm,
+- &dpagemap_devmem_ops,
+- &tile->mem.vram.dpagemap,
+- end - start);
++ &dpagemap_devmem_ops, dpagemap, end - start);
+
+ blocks = &to_xe_ttm_vram_mgr_resource(bo->ttm.resource)->blocks;
+ list_for_each_entry(block, blocks, link)
+@@ -999,6 +985,11 @@ int xe_svm_range_get_pages(struct xe_vm *vm, struct xe_svm_range *range,
+
+ #if IS_ENABLED(CONFIG_DRM_XE_PAGEMAP)
+
++static struct drm_pagemap *tile_local_pagemap(struct xe_tile *tile)
++{
++ return &tile->mem.vram->dpagemap;
++}
++
+ /**
+ * xe_svm_alloc_vram()- Allocate device memory pages for range,
+ * migrating existing data.
+@@ -1016,7 +1007,7 @@ int xe_svm_alloc_vram(struct xe_tile *tile, struct xe_svm_range *range,
+ xe_assert(tile_to_xe(tile), range->base.flags.migrate_devmem);
+ range_debug(range, "ALLOCATE VRAM");
+
+- dpagemap = xe_tile_local_pagemap(tile);
++ dpagemap = tile_local_pagemap(tile);
+ return drm_pagemap_populate_mm(dpagemap, xe_svm_range_start(range),
+ xe_svm_range_end(range),
+ range->base.gpusvm->mm,
+diff --git a/drivers/gpu/drm/xe/xe_tile.c b/drivers/gpu/drm/xe/xe_tile.c
+index 86e9811e60ba08..e34edff0eaa105 100644
+--- a/drivers/gpu/drm/xe/xe_tile.c
++++ b/drivers/gpu/drm/xe/xe_tile.c
+@@ -7,6 +7,7 @@
+
+ #include <drm/drm_managed.h>
+
++#include "xe_bo.h"
+ #include "xe_device.h"
+ #include "xe_ggtt.h"
+ #include "xe_gt.h"
+@@ -19,6 +20,8 @@
+ #include "xe_tile_sysfs.h"
+ #include "xe_ttm_vram_mgr.h"
+ #include "xe_wa.h"
++#include "xe_vram.h"
++#include "xe_vram_types.h"
+
+ /**
+ * DOC: Multi-tile Design
+@@ -95,6 +98,31 @@ static int xe_tile_alloc(struct xe_tile *tile)
+ return 0;
+ }
+
++/**
++ * xe_tile_alloc_vram - Perform per-tile VRAM structs allocation
++ * @tile: Tile to perform allocations for
++ *
++ * Allocates VRAM per-tile data structures using DRM-managed allocations.
++ * Does not touch the hardware.
++ *
++ * Returns -ENOMEM if allocations fail, otherwise 0.
++ */
++int xe_tile_alloc_vram(struct xe_tile *tile)
++{
++ struct xe_device *xe = tile_to_xe(tile);
++ struct xe_vram_region *vram;
++
++ if (!IS_DGFX(xe))
++ return 0;
++
++ vram = xe_vram_region_alloc(xe, tile->id, XE_PL_VRAM0 + tile->id);
++ if (!vram)
++ return -ENOMEM;
++ tile->mem.vram = vram;
++
++ return 0;
++}
++
+ /**
+ * xe_tile_init_early - Initialize the tile and primary GT
+ * @tile: Tile to initialize
+@@ -127,21 +155,6 @@ int xe_tile_init_early(struct xe_tile *tile, struct xe_device *xe, u8 id)
+ }
+ ALLOW_ERROR_INJECTION(xe_tile_init_early, ERRNO); /* See xe_pci_probe() */
+
+-static int tile_ttm_mgr_init(struct xe_tile *tile)
+-{
+- struct xe_device *xe = tile_to_xe(tile);
+- int err;
+-
+- if (tile->mem.vram.usable_size) {
+- err = xe_ttm_vram_mgr_init(tile, &tile->mem.vram.ttm);
+- if (err)
+- return err;
+- xe->info.mem_region_mask |= BIT(tile->id) << 1;
+- }
+-
+- return 0;
+-}
+-
+ /**
+ * xe_tile_init_noalloc - Init tile up to the point where allocations can happen.
+ * @tile: The tile to initialize.
+@@ -159,16 +172,19 @@ static int tile_ttm_mgr_init(struct xe_tile *tile)
+ int xe_tile_init_noalloc(struct xe_tile *tile)
+ {
+ struct xe_device *xe = tile_to_xe(tile);
+- int err;
+-
+- err = tile_ttm_mgr_init(tile);
+- if (err)
+- return err;
+
+ xe_wa_apply_tile_workarounds(tile);
+
+ if (xe->info.has_usm && IS_DGFX(xe))
+- xe_devm_add(tile, &tile->mem.vram);
++ xe_devm_add(tile, tile->mem.vram);
++
++ if (IS_DGFX(xe) && !ttm_resource_manager_used(&tile->mem.vram->ttm.manager)) {
++ int err = xe_ttm_vram_mgr_init(xe, tile->mem.vram);
++
++ if (err)
++ return err;
++ xe->info.mem_region_mask |= BIT(tile->mem.vram->id) << 1;
++ }
+
+ return xe_tile_sysfs_init(tile);
+ }
+diff --git a/drivers/gpu/drm/xe/xe_tile.h b/drivers/gpu/drm/xe/xe_tile.h
+index cc33e873398309..dceb6297aa01df 100644
+--- a/drivers/gpu/drm/xe/xe_tile.h
++++ b/drivers/gpu/drm/xe/xe_tile.h
+@@ -14,19 +14,9 @@ int xe_tile_init_early(struct xe_tile *tile, struct xe_device *xe, u8 id);
+ int xe_tile_init_noalloc(struct xe_tile *tile);
+ int xe_tile_init(struct xe_tile *tile);
+
+-void xe_tile_migrate_wait(struct xe_tile *tile);
++int xe_tile_alloc_vram(struct xe_tile *tile);
+
+-#if IS_ENABLED(CONFIG_DRM_XE_PAGEMAP)
+-static inline struct drm_pagemap *xe_tile_local_pagemap(struct xe_tile *tile)
+-{
+- return &tile->mem.vram.dpagemap;
+-}
+-#else
+-static inline struct drm_pagemap *xe_tile_local_pagemap(struct xe_tile *tile)
+-{
+- return NULL;
+-}
+-#endif
++void xe_tile_migrate_wait(struct xe_tile *tile);
+
+ static inline bool xe_tile_is_root(struct xe_tile *tile)
+ {
+diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
+index d9c9d2547aadf5..9a9733447230b6 100644
+--- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
++++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
+@@ -25,6 +25,7 @@
+ #include "xe_ttm_stolen_mgr.h"
+ #include "xe_ttm_vram_mgr.h"
+ #include "xe_wa.h"
++#include "xe_vram.h"
+
+ struct xe_ttm_stolen_mgr {
+ struct xe_ttm_vram_mgr base;
+@@ -82,15 +83,16 @@ static u32 get_wopcm_size(struct xe_device *xe)
+
+ static s64 detect_bar2_dgfx(struct xe_device *xe, struct xe_ttm_stolen_mgr *mgr)
+ {
+- struct xe_tile *tile = xe_device_get_root_tile(xe);
++ struct xe_vram_region *tile_vram = xe_device_get_root_tile(xe)->mem.vram;
++ resource_size_t tile_io_start = xe_vram_region_io_start(tile_vram);
+ struct xe_mmio *mmio = xe_root_tile_mmio(xe);
+ struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
+ u64 stolen_size, wopcm_size;
+ u64 tile_offset;
+ u64 tile_size;
+
+- tile_offset = tile->mem.vram.io_start - xe->mem.vram.io_start;
+- tile_size = tile->mem.vram.actual_physical_size;
++ tile_offset = tile_io_start - xe_vram_region_io_start(xe->mem.vram);
++ tile_size = xe_vram_region_actual_physical_size(tile_vram);
+
+ /* Use DSM base address instead for stolen memory */
+ mgr->stolen_base = (xe_mmio_read64_2x32(mmio, DSMBASE) & BDSM_MASK) - tile_offset;
+@@ -107,7 +109,7 @@ static s64 detect_bar2_dgfx(struct xe_device *xe, struct xe_ttm_stolen_mgr *mgr)
+
+ /* Verify usage fits in the actual resource available */
+ if (mgr->stolen_base + stolen_size <= pci_resource_len(pdev, LMEM_BAR))
+- mgr->io_base = tile->mem.vram.io_start + mgr->stolen_base;
++ mgr->io_base = tile_io_start + mgr->stolen_base;
+
+ /*
+ * There may be few KB of platform dependent reserved memory at the end
+diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+index 9e375a40aee90a..9175b4a2214b8c 100644
+--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
++++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+@@ -15,6 +15,7 @@
+ #include "xe_gt.h"
+ #include "xe_res_cursor.h"
+ #include "xe_ttm_vram_mgr.h"
++#include "xe_vram_types.h"
+
+ static inline struct drm_buddy_block *
+ xe_ttm_vram_mgr_first_block(struct list_head *list)
+@@ -337,13 +338,20 @@ int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
+ return drmm_add_action_or_reset(&xe->drm, ttm_vram_mgr_fini, mgr);
+ }
+
+-int xe_ttm_vram_mgr_init(struct xe_tile *tile, struct xe_ttm_vram_mgr *mgr)
++/**
++ * xe_ttm_vram_mgr_init - initialize TTM VRAM region
++ * @xe: pointer to Xe device
++ * @vram: pointer to xe_vram_region that contains the memory region attributes
++ *
++ * Initialize the Xe TTM for given @vram region using the given parameters.
++ *
++ * Returns 0 for success, negative error code otherwise.
++ */
++int xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_vram_region *vram)
+ {
+- struct xe_device *xe = tile_to_xe(tile);
+- struct xe_vram_region *vram = &tile->mem.vram;
+-
+- return __xe_ttm_vram_mgr_init(xe, mgr, XE_PL_VRAM0 + tile->id,
+- vram->usable_size, vram->io_size,
++ return __xe_ttm_vram_mgr_init(xe, &vram->ttm, vram->placement,
++ xe_vram_region_usable_size(vram),
++ xe_vram_region_io_size(vram),
+ PAGE_SIZE);
+ }
+
+@@ -392,7 +400,7 @@ int xe_ttm_vram_mgr_alloc_sgt(struct xe_device *xe,
+ */
+ xe_res_first(res, offset, length, &cursor);
+ for_each_sgtable_sg((*sgt), sg, i) {
+- phys_addr_t phys = cursor.start + tile->mem.vram.io_start;
++ phys_addr_t phys = cursor.start + xe_vram_region_io_start(tile->mem.vram);
+ size_t size = min_t(u64, cursor.size, SZ_2G);
+ dma_addr_t addr;
+
+diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
+index cc76050e376dd9..87b7fae5edba1a 100644
+--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
++++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
+@@ -11,11 +11,12 @@
+ enum dma_data_direction;
+ struct xe_device;
+ struct xe_tile;
++struct xe_vram_region;
+
+ int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
+ u32 mem_type, u64 size, u64 io_size,
+ u64 default_page_size);
+-int xe_ttm_vram_mgr_init(struct xe_tile *tile, struct xe_ttm_vram_mgr *mgr);
++int xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_vram_region *vram);
+ int xe_ttm_vram_mgr_alloc_sgt(struct xe_device *xe,
+ struct ttm_resource *res,
+ u64 offset, u64 length,
+diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
+index 5146999d27fa2d..bf44cd5bf49c0f 100644
+--- a/drivers/gpu/drm/xe/xe_vm.c
++++ b/drivers/gpu/drm/xe/xe_vm.c
+@@ -2894,7 +2894,7 @@ static void vm_bind_ioctl_ops_unwind(struct xe_vm *vm,
+ }
+
+ static int vma_lock_and_validate(struct drm_exec *exec, struct xe_vma *vma,
+- bool validate)
++ bool res_evict, bool validate)
+ {
+ struct xe_bo *bo = xe_vma_bo(vma);
+ struct xe_vm *vm = xe_vma_vm(vma);
+@@ -2905,7 +2905,8 @@ static int vma_lock_and_validate(struct drm_exec *exec, struct xe_vma *vma,
+ err = drm_exec_lock_obj(exec, &bo->ttm.base);
+ if (!err && validate)
+ err = xe_bo_validate(bo, vm,
+- !xe_vm_in_preempt_fence_mode(vm));
++ !xe_vm_in_preempt_fence_mode(vm) &&
++ res_evict);
+ }
+
+ return err;
+@@ -2978,14 +2979,23 @@ static int prefetch_ranges(struct xe_vm *vm, struct xe_vma_op *op)
+ }
+
+ static int op_lock_and_prep(struct drm_exec *exec, struct xe_vm *vm,
+- struct xe_vma_op *op)
++ struct xe_vma_ops *vops, struct xe_vma_op *op)
+ {
+ int err = 0;
++ bool res_evict;
++
++ /*
++ * We only allow evicting a BO within the VM if it is not part of an
++ * array of binds, as an array of binds can evict another BO within the
++ * bind.
++ */
++ res_evict = !(vops->flags & XE_VMA_OPS_ARRAY_OF_BINDS);
+
+ switch (op->base.op) {
+ case DRM_GPUVA_OP_MAP:
+ if (!op->map.invalidate_on_bind)
+ err = vma_lock_and_validate(exec, op->map.vma,
++ res_evict,
+ !xe_vm_in_fault_mode(vm) ||
+ op->map.immediate);
+ break;
+@@ -2996,11 +3006,13 @@ static int op_lock_and_prep(struct drm_exec *exec, struct xe_vm *vm,
+
+ err = vma_lock_and_validate(exec,
+ gpuva_to_vma(op->base.remap.unmap->va),
+- false);
++ res_evict, false);
+ if (!err && op->remap.prev)
+- err = vma_lock_and_validate(exec, op->remap.prev, true);
++ err = vma_lock_and_validate(exec, op->remap.prev,
++ res_evict, true);
+ if (!err && op->remap.next)
+- err = vma_lock_and_validate(exec, op->remap.next, true);
++ err = vma_lock_and_validate(exec, op->remap.next,
++ res_evict, true);
+ break;
+ case DRM_GPUVA_OP_UNMAP:
+ err = check_ufence(gpuva_to_vma(op->base.unmap.va));
+@@ -3009,7 +3021,7 @@ static int op_lock_and_prep(struct drm_exec *exec, struct xe_vm *vm,
+
+ err = vma_lock_and_validate(exec,
+ gpuva_to_vma(op->base.unmap.va),
+- false);
++ res_evict, false);
+ break;
+ case DRM_GPUVA_OP_PREFETCH:
+ {
+@@ -3025,7 +3037,7 @@ static int op_lock_and_prep(struct drm_exec *exec, struct xe_vm *vm,
+
+ err = vma_lock_and_validate(exec,
+ gpuva_to_vma(op->base.prefetch.va),
+- false);
++ res_evict, false);
+ if (!err && !xe_vma_has_no_bo(vma))
+ err = xe_bo_migrate(xe_vma_bo(vma),
+ region_to_mem_type[region]);
+@@ -3069,7 +3081,7 @@ static int vm_bind_ioctl_ops_lock_and_prep(struct drm_exec *exec,
+ return err;
+
+ list_for_each_entry(op, &vops->list, link) {
+- err = op_lock_and_prep(exec, vm, op);
++ err = op_lock_and_prep(exec, vm, vops, op);
+ if (err)
+ return err;
+ }
+@@ -3698,6 +3710,8 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
+ }
+
+ xe_vma_ops_init(&vops, vm, q, syncs, num_syncs);
++ if (args->num_binds > 1)
++ vops.flags |= XE_VMA_OPS_ARRAY_OF_BINDS;
+ for (i = 0; i < args->num_binds; ++i) {
+ u64 range = bind_ops[i].range;
+ u64 addr = bind_ops[i].addr;
+diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
+index 6058cf739388bc..f6616d595999ff 100644
+--- a/drivers/gpu/drm/xe/xe_vm_types.h
++++ b/drivers/gpu/drm/xe/xe_vm_types.h
+@@ -467,6 +467,8 @@ struct xe_vma_ops {
+ struct xe_vm_pgtable_update_ops pt_update_ops[XE_MAX_TILES_PER_DEVICE];
+ /** @flag: signify the properties within xe_vma_ops*/
+ #define XE_VMA_OPS_FLAG_HAS_SVM_PREFETCH BIT(0)
++#define XE_VMA_OPS_FLAG_MADVISE BIT(1)
++#define XE_VMA_OPS_ARRAY_OF_BINDS BIT(2)
+ u32 flags;
+ #ifdef TEST_VM_OPS_ERROR
+ /** @inject_error: inject error to test error handling */
+diff --git a/drivers/gpu/drm/xe/xe_vram.c b/drivers/gpu/drm/xe/xe_vram.c
+index e421a74fb87c66..652df7a5f4f65d 100644
+--- a/drivers/gpu/drm/xe/xe_vram.c
++++ b/drivers/gpu/drm/xe/xe_vram.c
+@@ -3,6 +3,7 @@
+ * Copyright © 2021-2024 Intel Corporation
+ */
+
++#include <kunit/visibility.h>
+ #include <linux/pci.h>
+
+ #include <drm/drm_managed.h>
+@@ -19,19 +20,41 @@
+ #include "xe_mmio.h"
+ #include "xe_module.h"
+ #include "xe_sriov.h"
++#include "xe_ttm_vram_mgr.h"
+ #include "xe_vram.h"
++#include "xe_vram_types.h"
+
+ #define BAR_SIZE_SHIFT 20
+
+-static void
+-_resize_bar(struct xe_device *xe, int resno, resource_size_t size)
++/*
++ * Release all the BARs that could influence/block LMEMBAR resizing, i.e.
++ * assigned IORESOURCE_MEM_64 BARs
++ */
++static void release_bars(struct pci_dev *pdev)
++{
++ struct resource *res;
++ int i;
++
++ pci_dev_for_each_resource(pdev, res, i) {
++ /* Resource already un-assigned, do not reset it */
++ if (!res->parent)
++ continue;
++
++ /* No need to release unrelated BARs */
++ if (!(res->flags & IORESOURCE_MEM_64))
++ continue;
++
++ pci_release_resource(pdev, i);
++ }
++}
++
++static void resize_bar(struct xe_device *xe, int resno, resource_size_t size)
+ {
+ struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
+ int bar_size = pci_rebar_bytes_to_size(size);
+ int ret;
+
+- if (pci_resource_len(pdev, resno))
+- pci_release_resource(pdev, resno);
++ release_bars(pdev);
+
+ ret = pci_resize_resource(pdev, resno, bar_size);
+ if (ret) {
+@@ -47,7 +70,7 @@ _resize_bar(struct xe_device *xe, int resno, resource_size_t size)
+ * if force_vram_bar_size is set, attempt to set to the requested size
+ * else set to maximum possible size
+ */
+-static void resize_vram_bar(struct xe_device *xe)
++void xe_vram_resize_bar(struct xe_device *xe)
+ {
+ int force_vram_bar_size = xe_modparam.force_vram_bar_size;
+ struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
+@@ -116,7 +139,7 @@ static void resize_vram_bar(struct xe_device *xe)
+ pci_read_config_dword(pdev, PCI_COMMAND, &pci_cmd);
+ pci_write_config_dword(pdev, PCI_COMMAND, pci_cmd & ~PCI_COMMAND_MEMORY);
+
+- _resize_bar(xe, LMEM_BAR, rebar_size);
++ resize_bar(xe, LMEM_BAR, rebar_size);
+
+ pci_assign_unassigned_bus_resources(pdev->bus);
+ pci_write_config_dword(pdev, PCI_COMMAND, pci_cmd);
+@@ -136,7 +159,7 @@ static bool resource_is_valid(struct pci_dev *pdev, int bar)
+ return true;
+ }
+
+-static int determine_lmem_bar_size(struct xe_device *xe)
++static int determine_lmem_bar_size(struct xe_device *xe, struct xe_vram_region *lmem_bar)
+ {
+ struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
+
+@@ -145,18 +168,16 @@ static int determine_lmem_bar_size(struct xe_device *xe)
+ return -ENXIO;
+ }
+
+- resize_vram_bar(xe);
+-
+- xe->mem.vram.io_start = pci_resource_start(pdev, LMEM_BAR);
+- xe->mem.vram.io_size = pci_resource_len(pdev, LMEM_BAR);
+- if (!xe->mem.vram.io_size)
++ lmem_bar->io_start = pci_resource_start(pdev, LMEM_BAR);
++ lmem_bar->io_size = pci_resource_len(pdev, LMEM_BAR);
++ if (!lmem_bar->io_size)
+ return -EIO;
+
+ /* XXX: Need to change when xe link code is ready */
+- xe->mem.vram.dpa_base = 0;
++ lmem_bar->dpa_base = 0;
+
+ /* set up a map to the total memory area. */
+- xe->mem.vram.mapping = ioremap_wc(xe->mem.vram.io_start, xe->mem.vram.io_size);
++ lmem_bar->mapping = devm_ioremap_wc(&pdev->dev, lmem_bar->io_start, lmem_bar->io_size);
+
+ return 0;
+ }
+@@ -278,13 +299,71 @@ static void vram_fini(void *arg)
+ struct xe_tile *tile;
+ int id;
+
+- if (xe->mem.vram.mapping)
+- iounmap(xe->mem.vram.mapping);
+-
+- xe->mem.vram.mapping = NULL;
++ xe->mem.vram->mapping = NULL;
+
+ for_each_tile(tile, xe, id)
+- tile->mem.vram.mapping = NULL;
++ tile->mem.vram->mapping = NULL;
++}
++
++struct xe_vram_region *xe_vram_region_alloc(struct xe_device *xe, u8 id, u32 placement)
++{
++ struct xe_vram_region *vram;
++ struct drm_device *drm = &xe->drm;
++
++ xe_assert(xe, id < xe->info.tile_count);
++
++ vram = drmm_kzalloc(drm, sizeof(*vram), GFP_KERNEL);
++ if (!vram)
++ return NULL;
++
++ vram->xe = xe;
++ vram->id = id;
++ vram->placement = placement;
++#if defined(CONFIG_DRM_XE_PAGEMAP)
++ vram->migrate = xe->tiles[id].migrate;
++#endif
++ return vram;
++}
++
++static void print_vram_region_info(struct xe_device *xe, struct xe_vram_region *vram)
++{
++ struct drm_device *drm = &xe->drm;
++
++ if (vram->io_size < vram->usable_size)
++ drm_info(drm, "Small BAR device\n");
++
++ drm_info(drm,
++ "VRAM[%u]: Actual physical size %pa, usable size exclude stolen %pa, CPU accessible size %pa\n",
++ vram->id, &vram->actual_physical_size, &vram->usable_size, &vram->io_size);
++ drm_info(drm, "VRAM[%u]: DPA range: [%pa-%llx], io range: [%pa-%llx]\n",
++ vram->id, &vram->dpa_base, vram->dpa_base + (u64)vram->actual_physical_size,
++ &vram->io_start, vram->io_start + (u64)vram->io_size);
++}
++
++static int vram_region_init(struct xe_device *xe, struct xe_vram_region *vram,
++ struct xe_vram_region *lmem_bar, u64 offset, u64 usable_size,
++ u64 region_size, resource_size_t remain_io_size)
++{
++ /* Check if VRAM region is already initialized */
++ if (vram->mapping)
++ return 0;
++
++ vram->actual_physical_size = region_size;
++ vram->io_start = lmem_bar->io_start + offset;
++ vram->io_size = min_t(u64, usable_size, remain_io_size);
++
++ if (!vram->io_size) {
++ drm_err(&xe->drm, "Tile without any CPU visible VRAM. Aborting.\n");
++ return -ENODEV;
++ }
++
++ vram->dpa_base = lmem_bar->dpa_base + offset;
++ vram->mapping = lmem_bar->mapping + offset;
++ vram->usable_size = usable_size;
++
++ print_vram_region_info(xe, vram);
++
++ return 0;
+ }
+
+ /**
+@@ -298,78 +377,108 @@ static void vram_fini(void *arg)
+ int xe_vram_probe(struct xe_device *xe)
+ {
+ struct xe_tile *tile;
+- resource_size_t io_size;
++ struct xe_vram_region lmem_bar;
++ resource_size_t remain_io_size;
+ u64 available_size = 0;
+ u64 total_size = 0;
+- u64 tile_offset;
+- u64 tile_size;
+- u64 vram_size;
+ int err;
+ u8 id;
+
+ if (!IS_DGFX(xe))
+ return 0;
+
+- /* Get the size of the root tile's vram for later accessibility comparison */
+- tile = xe_device_get_root_tile(xe);
+- err = tile_vram_size(tile, &vram_size, &tile_size, &tile_offset);
++ err = determine_lmem_bar_size(xe, &lmem_bar);
+ if (err)
+ return err;
++ drm_info(&xe->drm, "VISIBLE VRAM: %pa, %pa\n", &lmem_bar.io_start, &lmem_bar.io_size);
+
+- err = determine_lmem_bar_size(xe);
+- if (err)
+- return err;
++ remain_io_size = lmem_bar.io_size;
+
+- drm_info(&xe->drm, "VISIBLE VRAM: %pa, %pa\n", &xe->mem.vram.io_start,
+- &xe->mem.vram.io_size);
+-
+- io_size = xe->mem.vram.io_size;
+-
+- /* tile specific ranges */
+ for_each_tile(tile, xe, id) {
+- err = tile_vram_size(tile, &vram_size, &tile_size, &tile_offset);
++ u64 region_size;
++ u64 usable_size;
++ u64 tile_offset;
++
++		err = tile_vram_size(tile, &usable_size, &region_size, &tile_offset);
+ if (err)
+ return err;
+
+- tile->mem.vram.actual_physical_size = tile_size;
+- tile->mem.vram.io_start = xe->mem.vram.io_start + tile_offset;
+- tile->mem.vram.io_size = min_t(u64, vram_size, io_size);
++ total_size += region_size;
++ available_size += usable_size;
+
+- if (!tile->mem.vram.io_size) {
+- drm_err(&xe->drm, "Tile without any CPU visible VRAM. Aborting.\n");
+- return -ENODEV;
++ err = vram_region_init(xe, tile->mem.vram, &lmem_bar, tile_offset, usable_size,
++ region_size, remain_io_size);
++ if (err)
++ return err;
++
++ if (total_size > lmem_bar.io_size) {
++ drm_info(&xe->drm, "VRAM: %pa is larger than resource %pa\n",
++ &total_size, &lmem_bar.io_size);
+ }
+
+- tile->mem.vram.dpa_base = xe->mem.vram.dpa_base + tile_offset;
+- tile->mem.vram.usable_size = vram_size;
+- tile->mem.vram.mapping = xe->mem.vram.mapping + tile_offset;
++ remain_io_size -= min_t(u64, tile->mem.vram->actual_physical_size, remain_io_size);
++ }
+
+- if (tile->mem.vram.io_size < tile->mem.vram.usable_size)
+- drm_info(&xe->drm, "Small BAR device\n");
+- drm_info(&xe->drm, "VRAM[%u, %u]: Actual physical size %pa, usable size exclude stolen %pa, CPU accessible size %pa\n", id,
+- tile->id, &tile->mem.vram.actual_physical_size, &tile->mem.vram.usable_size, &tile->mem.vram.io_size);
+- drm_info(&xe->drm, "VRAM[%u, %u]: DPA range: [%pa-%llx], io range: [%pa-%llx]\n", id, tile->id,
+- &tile->mem.vram.dpa_base, tile->mem.vram.dpa_base + (u64)tile->mem.vram.actual_physical_size,
+- &tile->mem.vram.io_start, tile->mem.vram.io_start + (u64)tile->mem.vram.io_size);
++ err = vram_region_init(xe, xe->mem.vram, &lmem_bar, 0, available_size, total_size,
++ lmem_bar.io_size);
++ if (err)
++ return err;
+
+- /* calculate total size using tile size to get the correct HW sizing */
+- total_size += tile_size;
+- available_size += vram_size;
++ return devm_add_action_or_reset(xe->drm.dev, vram_fini, xe);
++}
+
+- if (total_size > xe->mem.vram.io_size) {
+- drm_info(&xe->drm, "VRAM: %pa is larger than resource %pa\n",
+- &total_size, &xe->mem.vram.io_size);
+- }
++/**
++ * xe_vram_region_io_start - Get the IO start of a VRAM region
++ * @vram: the VRAM region
++ *
++ * Return: the IO start of the VRAM region, or 0 if not valid
++ */
++resource_size_t xe_vram_region_io_start(const struct xe_vram_region *vram)
++{
++ return vram ? vram->io_start : 0;
++}
+
+- io_size -= min_t(u64, tile_size, io_size);
+- }
++/**
++ * xe_vram_region_io_size - Get the IO size of a VRAM region
++ * @vram: the VRAM region
++ *
++ * Return: the IO size of the VRAM region, or 0 if not valid
++ */
++resource_size_t xe_vram_region_io_size(const struct xe_vram_region *vram)
++{
++ return vram ? vram->io_size : 0;
++}
+
+- xe->mem.vram.actual_physical_size = total_size;
++/**
++ * xe_vram_region_dpa_base - Get the DPA base of a VRAM region
++ * @vram: the VRAM region
++ *
++ * Return: the DPA base of the VRAM region, or 0 if not valid
++ */
++resource_size_t xe_vram_region_dpa_base(const struct xe_vram_region *vram)
++{
++ return vram ? vram->dpa_base : 0;
++}
+
+- drm_info(&xe->drm, "Total VRAM: %pa, %pa\n", &xe->mem.vram.io_start,
+- &xe->mem.vram.actual_physical_size);
+- drm_info(&xe->drm, "Available VRAM: %pa, %pa\n", &xe->mem.vram.io_start,
+- &available_size);
++/**
++ * xe_vram_region_usable_size - Get the usable size of a VRAM region
++ * @vram: the VRAM region
++ *
++ * Return: the usable size of the VRAM region, or 0 if not valid
++ */
++resource_size_t xe_vram_region_usable_size(const struct xe_vram_region *vram)
++{
++ return vram ? vram->usable_size : 0;
++}
+
+- return devm_add_action_or_reset(xe->drm.dev, vram_fini, xe);
++/**
++ * xe_vram_region_actual_physical_size - Get the actual physical size of a VRAM region
++ * @vram: the VRAM region
++ *
++ * Return: the actual physical size of the VRAM region, or 0 if not valid
++ */
++resource_size_t xe_vram_region_actual_physical_size(const struct xe_vram_region *vram)
++{
++ return vram ? vram->actual_physical_size : 0;
+ }
++EXPORT_SYMBOL_IF_KUNIT(xe_vram_region_actual_physical_size);
+diff --git a/drivers/gpu/drm/xe/xe_vram.h b/drivers/gpu/drm/xe/xe_vram.h
+index e31cc04ec0db20..13505cfb184dc4 100644
+--- a/drivers/gpu/drm/xe/xe_vram.h
++++ b/drivers/gpu/drm/xe/xe_vram.h
+@@ -6,8 +6,20 @@
+ #ifndef _XE_VRAM_H_
+ #define _XE_VRAM_H_
+
++#include <linux/types.h>
++
+ struct xe_device;
++struct xe_vram_region;
+
++void xe_vram_resize_bar(struct xe_device *xe);
+ int xe_vram_probe(struct xe_device *xe);
+
++struct xe_vram_region *xe_vram_region_alloc(struct xe_device *xe, u8 id, u32 placement);
++
++resource_size_t xe_vram_region_io_start(const struct xe_vram_region *vram);
++resource_size_t xe_vram_region_io_size(const struct xe_vram_region *vram);
++resource_size_t xe_vram_region_dpa_base(const struct xe_vram_region *vram);
++resource_size_t xe_vram_region_usable_size(const struct xe_vram_region *vram);
++resource_size_t xe_vram_region_actual_physical_size(const struct xe_vram_region *vram);
++
+ #endif
+diff --git a/drivers/gpu/drm/xe/xe_vram_types.h b/drivers/gpu/drm/xe/xe_vram_types.h
+new file mode 100644
+index 00000000000000..83772dcbf1aff9
+--- /dev/null
++++ b/drivers/gpu/drm/xe/xe_vram_types.h
+@@ -0,0 +1,85 @@
++/* SPDX-License-Identifier: MIT */
++/*
++ * Copyright © 2025 Intel Corporation
++ */
++
++#ifndef _XE_VRAM_TYPES_H_
++#define _XE_VRAM_TYPES_H_
++
++#if IS_ENABLED(CONFIG_DRM_XE_PAGEMAP)
++#include <drm/drm_pagemap.h>
++#endif
++
++#include "xe_ttm_vram_mgr_types.h"
++
++struct xe_device;
++struct xe_migrate;
++
++/**
++ * struct xe_vram_region - memory region structure
++ * This is used to describe a memory region in xe
++ * device, such as HBM memory or CXL extension memory.
++ */
++struct xe_vram_region {
++ /** @xe: Back pointer to xe device */
++ struct xe_device *xe;
++ /**
++ * @id: VRAM region instance id
++ *
++ * The value should be unique for VRAM region.
++ */
++ u8 id;
++ /** @io_start: IO start address of this VRAM instance */
++ resource_size_t io_start;
++ /**
++ * @io_size: IO size of this VRAM instance
++ *
++ * This represents how much of this VRAM we can access
++ * via the CPU through the VRAM BAR. This can be smaller
++ * than @usable_size, in which case only part of VRAM is CPU
++ * accessible (typically the first 256M). This
++ * configuration is known as small-bar.
++ */
++ resource_size_t io_size;
++ /** @dpa_base: This memory regions's DPA (device physical address) base */
++ resource_size_t dpa_base;
++ /**
++ * @usable_size: usable size of VRAM
++ *
++ * Usable size of VRAM excluding reserved portions
++ * (e.g stolen mem)
++ */
++ resource_size_t usable_size;
++ /**
++ * @actual_physical_size: Actual VRAM size
++ *
++ * Actual VRAM size including reserved portions
++ * (e.g stolen mem)
++ */
++ resource_size_t actual_physical_size;
++ /** @mapping: pointer to VRAM mappable space */
++ void __iomem *mapping;
++ /** @ttm: VRAM TTM manager */
++ struct xe_ttm_vram_mgr ttm;
++ /** @placement: TTM placement dedicated for this region */
++ u32 placement;
++#if IS_ENABLED(CONFIG_DRM_XE_PAGEMAP)
++ /** @migrate: Back pointer to migrate */
++ struct xe_migrate *migrate;
++ /** @pagemap: Used to remap device memory as ZONE_DEVICE */
++ struct dev_pagemap pagemap;
++ /**
++ * @dpagemap: The struct drm_pagemap of the ZONE_DEVICE memory
++ * pages of this tile.
++ */
++ struct drm_pagemap dpagemap;
++ /**
++ * @hpa_base: base host physical address
++ *
++ * This is generated when remap device memory as ZONE_DEVICE
++ */
++ resource_size_t hpa_base;
++#endif
++};
++
++#endif
+diff --git a/drivers/hid/hid-input.c b/drivers/hid/hid-input.c
+index f45f856a127fe7..2c743e35c1d333 100644
+--- a/drivers/hid/hid-input.c
++++ b/drivers/hid/hid-input.c
+@@ -622,7 +622,10 @@ static void hidinput_update_battery(struct hid_device *dev, unsigned int usage,
+ return;
+ }
+
+- if (value == 0 || value < dev->battery_min || value > dev->battery_max)
++ if ((usage & HID_USAGE_PAGE) == HID_UP_DIGITIZER && value == 0)
++ return;
++
++ if (value < dev->battery_min || value > dev->battery_max)
+ return;
+
+ capacity = hidinput_scale_battery_capacity(dev, value);
+diff --git a/drivers/hid/hid-multitouch.c b/drivers/hid/hid-multitouch.c
+index 22c6314a88436b..a9ff84f0bd9bbe 100644
+--- a/drivers/hid/hid-multitouch.c
++++ b/drivers/hid/hid-multitouch.c
+@@ -92,9 +92,8 @@ enum report_mode {
+ TOUCHPAD_REPORT_ALL = TOUCHPAD_REPORT_BUTTONS | TOUCHPAD_REPORT_CONTACTS,
+ };
+
+-#define MT_IO_FLAGS_RUNNING 0
+-#define MT_IO_FLAGS_ACTIVE_SLOTS 1
+-#define MT_IO_FLAGS_PENDING_SLOTS 2
++#define MT_IO_SLOTS_MASK GENMASK(7, 0) /* reserve first 8 bits for slot tracking */
++#define MT_IO_FLAGS_RUNNING 32
+
+ static const bool mtrue = true; /* default for true */
+ static const bool mfalse; /* default for false */
+@@ -169,7 +168,11 @@ struct mt_device {
+ struct mt_class mtclass; /* our mt device class */
+ struct timer_list release_timer; /* to release sticky fingers */
+ struct hid_device *hdev; /* hid_device we're attached to */
+- unsigned long mt_io_flags; /* mt flags (MT_IO_FLAGS_*) */
++ unsigned long mt_io_flags; /* mt flags (MT_IO_FLAGS_RUNNING)
++ * first 8 bits are reserved for keeping the slot
++ * states, this is fine because we only support up
++ * to 250 slots (MT_MAX_MAXCONTACT)
++ */
+ __u8 inputmode_value; /* InputMode HID feature value */
+ __u8 maxcontacts;
+ bool is_buttonpad; /* is this device a button pad? */
+@@ -977,6 +980,7 @@ static void mt_release_pending_palms(struct mt_device *td,
+
+ for_each_set_bit(slotnum, app->pending_palm_slots, td->maxcontacts) {
+ clear_bit(slotnum, app->pending_palm_slots);
++ clear_bit(slotnum, &td->mt_io_flags);
+
+ input_mt_slot(input, slotnum);
+ input_mt_report_slot_inactive(input);
+@@ -1008,12 +1012,6 @@ static void mt_sync_frame(struct mt_device *td, struct mt_application *app,
+
+ app->num_received = 0;
+ app->left_button_state = 0;
+-
+- if (test_bit(MT_IO_FLAGS_ACTIVE_SLOTS, &td->mt_io_flags))
+- set_bit(MT_IO_FLAGS_PENDING_SLOTS, &td->mt_io_flags);
+- else
+- clear_bit(MT_IO_FLAGS_PENDING_SLOTS, &td->mt_io_flags);
+- clear_bit(MT_IO_FLAGS_ACTIVE_SLOTS, &td->mt_io_flags);
+ }
+
+ static int mt_compute_timestamp(struct mt_application *app, __s32 value)
+@@ -1188,7 +1186,9 @@ static int mt_process_slot(struct mt_device *td, struct input_dev *input,
+ input_event(input, EV_ABS, ABS_MT_TOUCH_MAJOR, major);
+ input_event(input, EV_ABS, ABS_MT_TOUCH_MINOR, minor);
+
+- set_bit(MT_IO_FLAGS_ACTIVE_SLOTS, &td->mt_io_flags);
++ set_bit(slotnum, &td->mt_io_flags);
++ } else {
++ clear_bit(slotnum, &td->mt_io_flags);
+ }
+
+ return 0;
+@@ -1323,7 +1323,7 @@ static void mt_touch_report(struct hid_device *hid,
+ * defect.
+ */
+ if (app->quirks & MT_QUIRK_STICKY_FINGERS) {
+- if (test_bit(MT_IO_FLAGS_PENDING_SLOTS, &td->mt_io_flags))
++ if (td->mt_io_flags & MT_IO_SLOTS_MASK)
+ mod_timer(&td->release_timer,
+ jiffies + msecs_to_jiffies(100));
+ else
+@@ -1711,6 +1711,7 @@ static int mt_input_configured(struct hid_device *hdev, struct hid_input *hi)
+ case HID_CP_CONSUMER_CONTROL:
+ case HID_GD_WIRELESS_RADIO_CTLS:
+ case HID_GD_SYSTEM_MULTIAXIS:
++ case HID_DG_PEN:
+ /* already handled by hid core */
+ break;
+ case HID_DG_TOUCHSCREEN:
+@@ -1782,6 +1783,7 @@ static void mt_release_contacts(struct hid_device *hid)
+ for (i = 0; i < mt->num_slots; i++) {
+ input_mt_slot(input_dev, i);
+ input_mt_report_slot_inactive(input_dev);
++ clear_bit(i, &td->mt_io_flags);
+ }
+ input_mt_sync_frame(input_dev);
+ input_sync(input_dev);
+@@ -1804,7 +1806,7 @@ static void mt_expired_timeout(struct timer_list *t)
+ */
+ if (test_and_set_bit_lock(MT_IO_FLAGS_RUNNING, &td->mt_io_flags))
+ return;
+- if (test_bit(MT_IO_FLAGS_PENDING_SLOTS, &td->mt_io_flags))
++ if (td->mt_io_flags & MT_IO_SLOTS_MASK)
+ mt_release_contacts(hdev);
+ clear_bit_unlock(MT_IO_FLAGS_RUNNING, &td->mt_io_flags);
+ }
+diff --git a/drivers/hid/intel-thc-hid/intel-quickspi/quickspi-protocol.c b/drivers/hid/intel-thc-hid/intel-quickspi/quickspi-protocol.c
+index e6ba2ddcc9cbc6..16f780bc879b12 100644
+--- a/drivers/hid/intel-thc-hid/intel-quickspi/quickspi-protocol.c
++++ b/drivers/hid/intel-thc-hid/intel-quickspi/quickspi-protocol.c
+@@ -280,8 +280,7 @@ int reset_tic(struct quickspi_device *qsdev)
+
+ qsdev->reset_ack = false;
+
+- /* First interrupt uses level trigger to avoid missing interrupt */
+- thc_int_trigger_type_select(qsdev->thc_hw, false);
++ thc_int_trigger_type_select(qsdev->thc_hw, true);
+
+ ret = acpi_tic_reset(qsdev);
+ if (ret)
+diff --git a/drivers/media/platform/nxp/imx8-isi/imx8-isi-m2m.c b/drivers/media/platform/nxp/imx8-isi/imx8-isi-m2m.c
+index 22e49d3a128732..2012dbd6a29202 100644
+--- a/drivers/media/platform/nxp/imx8-isi/imx8-isi-m2m.c
++++ b/drivers/media/platform/nxp/imx8-isi/imx8-isi-m2m.c
+@@ -43,7 +43,6 @@ struct mxc_isi_m2m_ctx_queue_data {
+ struct v4l2_pix_format_mplane format;
+ const struct mxc_isi_format_info *info;
+ u32 sequence;
+- bool streaming;
+ };
+
+ struct mxc_isi_m2m_ctx {
+@@ -236,6 +235,66 @@ static void mxc_isi_m2m_vb2_buffer_queue(struct vb2_buffer *vb2)
+ v4l2_m2m_buf_queue(ctx->fh.m2m_ctx, vbuf);
+ }
+
++static int mxc_isi_m2m_vb2_prepare_streaming(struct vb2_queue *q)
++{
++ struct mxc_isi_m2m_ctx *ctx = vb2_get_drv_priv(q);
++ const struct v4l2_pix_format_mplane *out_pix = &ctx->queues.out.format;
++ const struct v4l2_pix_format_mplane *cap_pix = &ctx->queues.cap.format;
++ const struct mxc_isi_format_info *cap_info = ctx->queues.cap.info;
++ const struct mxc_isi_format_info *out_info = ctx->queues.out.info;
++ struct mxc_isi_m2m *m2m = ctx->m2m;
++ bool bypass;
++ int ret;
++
++ guard(mutex)(&m2m->lock);
++
++ if (m2m->usage_count == INT_MAX)
++ return -EOVERFLOW;
++
++ bypass = cap_pix->width == out_pix->width &&
++ cap_pix->height == out_pix->height &&
++ cap_info->encoding == out_info->encoding;
++
++ /*
++ * Acquire the pipe and initialize the channel with the first user of
++ * the M2M device.
++ */
++ if (m2m->usage_count == 0) {
++ ret = mxc_isi_channel_acquire(m2m->pipe,
++ &mxc_isi_m2m_frame_write_done,
++ bypass);
++ if (ret)
++ return ret;
++
++ mxc_isi_channel_get(m2m->pipe);
++ }
++
++ m2m->usage_count++;
++
++ /*
++ * Allocate resources for the channel, counting how many users require
++ * buffer chaining.
++ */
++ if (!ctx->chained && out_pix->width > MXC_ISI_MAX_WIDTH_UNCHAINED) {
++ ret = mxc_isi_channel_chain(m2m->pipe, bypass);
++ if (ret)
++ goto err_deinit;
++
++ m2m->chained_count++;
++ ctx->chained = true;
++ }
++
++ return 0;
++
++err_deinit:
++ if (--m2m->usage_count == 0) {
++ mxc_isi_channel_put(m2m->pipe);
++ mxc_isi_channel_release(m2m->pipe);
++ }
++
++ return ret;
++}
++
+ static int mxc_isi_m2m_vb2_start_streaming(struct vb2_queue *q,
+ unsigned int count)
+ {
+@@ -265,13 +324,44 @@ static void mxc_isi_m2m_vb2_stop_streaming(struct vb2_queue *q)
+ }
+ }
+
++static void mxc_isi_m2m_vb2_unprepare_streaming(struct vb2_queue *q)
++{
++ struct mxc_isi_m2m_ctx *ctx = vb2_get_drv_priv(q);
++ struct mxc_isi_m2m *m2m = ctx->m2m;
++
++ guard(mutex)(&m2m->lock);
++
++ /*
++ * If the last context is this one, reset it to make sure the device
++ * will be reconfigured when streaming is restarted.
++ */
++ if (m2m->last_ctx == ctx)
++ m2m->last_ctx = NULL;
++
++ /* Free the channel resources if this is the last chained context. */
++ if (ctx->chained && --m2m->chained_count == 0)
++ mxc_isi_channel_unchain(m2m->pipe);
++ ctx->chained = false;
++
++ /* Turn off the light with the last user. */
++ if (--m2m->usage_count == 0) {
++ mxc_isi_channel_disable(m2m->pipe);
++ mxc_isi_channel_put(m2m->pipe);
++ mxc_isi_channel_release(m2m->pipe);
++ }
++
++ WARN_ON(m2m->usage_count < 0);
++}
++
+ static const struct vb2_ops mxc_isi_m2m_vb2_qops = {
+ .queue_setup = mxc_isi_m2m_vb2_queue_setup,
+ .buf_init = mxc_isi_m2m_vb2_buffer_init,
+ .buf_prepare = mxc_isi_m2m_vb2_buffer_prepare,
+ .buf_queue = mxc_isi_m2m_vb2_buffer_queue,
++ .prepare_streaming = mxc_isi_m2m_vb2_prepare_streaming,
+ .start_streaming = mxc_isi_m2m_vb2_start_streaming,
+ .stop_streaming = mxc_isi_m2m_vb2_stop_streaming,
++ .unprepare_streaming = mxc_isi_m2m_vb2_unprepare_streaming,
+ };
+
+ static int mxc_isi_m2m_queue_init(void *priv, struct vb2_queue *src_vq,
+@@ -481,136 +571,6 @@ static int mxc_isi_m2m_s_fmt_vid(struct file *file, void *fh,
+ return 0;
+ }
+
+-static int mxc_isi_m2m_streamon(struct file *file, void *fh,
+- enum v4l2_buf_type type)
+-{
+- struct mxc_isi_m2m_ctx *ctx = to_isi_m2m_ctx(fh);
+- struct mxc_isi_m2m_ctx_queue_data *q = mxc_isi_m2m_ctx_qdata(ctx, type);
+- const struct v4l2_pix_format_mplane *out_pix = &ctx->queues.out.format;
+- const struct v4l2_pix_format_mplane *cap_pix = &ctx->queues.cap.format;
+- const struct mxc_isi_format_info *cap_info = ctx->queues.cap.info;
+- const struct mxc_isi_format_info *out_info = ctx->queues.out.info;
+- struct mxc_isi_m2m *m2m = ctx->m2m;
+- bool bypass;
+- int ret;
+-
+- if (q->streaming)
+- return 0;
+-
+- mutex_lock(&m2m->lock);
+-
+- if (m2m->usage_count == INT_MAX) {
+- ret = -EOVERFLOW;
+- goto unlock;
+- }
+-
+- bypass = cap_pix->width == out_pix->width &&
+- cap_pix->height == out_pix->height &&
+- cap_info->encoding == out_info->encoding;
+-
+- /*
+- * Acquire the pipe and initialize the channel with the first user of
+- * the M2M device.
+- */
+- if (m2m->usage_count == 0) {
+- ret = mxc_isi_channel_acquire(m2m->pipe,
+- &mxc_isi_m2m_frame_write_done,
+- bypass);
+- if (ret)
+- goto unlock;
+-
+- mxc_isi_channel_get(m2m->pipe);
+- }
+-
+- m2m->usage_count++;
+-
+- /*
+- * Allocate resources for the channel, counting how many users require
+- * buffer chaining.
+- */
+- if (!ctx->chained && out_pix->width > MXC_ISI_MAX_WIDTH_UNCHAINED) {
+- ret = mxc_isi_channel_chain(m2m->pipe, bypass);
+- if (ret)
+- goto deinit;
+-
+- m2m->chained_count++;
+- ctx->chained = true;
+- }
+-
+- /*
+- * Drop the lock to start the stream, as the .device_run() operation
+- * needs to acquire it.
+- */
+- mutex_unlock(&m2m->lock);
+- ret = v4l2_m2m_ioctl_streamon(file, fh, type);
+- if (ret) {
+- /* Reacquire the lock for the cleanup path. */
+- mutex_lock(&m2m->lock);
+- goto unchain;
+- }
+-
+- q->streaming = true;
+-
+- return 0;
+-
+-unchain:
+- if (ctx->chained && --m2m->chained_count == 0)
+- mxc_isi_channel_unchain(m2m->pipe);
+- ctx->chained = false;
+-
+-deinit:
+- if (--m2m->usage_count == 0) {
+- mxc_isi_channel_put(m2m->pipe);
+- mxc_isi_channel_release(m2m->pipe);
+- }
+-
+-unlock:
+- mutex_unlock(&m2m->lock);
+- return ret;
+-}
+-
+-static int mxc_isi_m2m_streamoff(struct file *file, void *fh,
+- enum v4l2_buf_type type)
+-{
+- struct mxc_isi_m2m_ctx *ctx = to_isi_m2m_ctx(fh);
+- struct mxc_isi_m2m_ctx_queue_data *q = mxc_isi_m2m_ctx_qdata(ctx, type);
+- struct mxc_isi_m2m *m2m = ctx->m2m;
+-
+- v4l2_m2m_ioctl_streamoff(file, fh, type);
+-
+- if (!q->streaming)
+- return 0;
+-
+- mutex_lock(&m2m->lock);
+-
+- /*
+- * If the last context is this one, reset it to make sure the device
+- * will be reconfigured when streaming is restarted.
+- */
+- if (m2m->last_ctx == ctx)
+- m2m->last_ctx = NULL;
+-
+- /* Free the channel resources if this is the last chained context. */
+- if (ctx->chained && --m2m->chained_count == 0)
+- mxc_isi_channel_unchain(m2m->pipe);
+- ctx->chained = false;
+-
+- /* Turn off the light with the last user. */
+- if (--m2m->usage_count == 0) {
+- mxc_isi_channel_disable(m2m->pipe);
+- mxc_isi_channel_put(m2m->pipe);
+- mxc_isi_channel_release(m2m->pipe);
+- }
+-
+- WARN_ON(m2m->usage_count < 0);
+-
+- mutex_unlock(&m2m->lock);
+-
+- q->streaming = false;
+-
+- return 0;
+-}
+-
+ static const struct v4l2_ioctl_ops mxc_isi_m2m_ioctl_ops = {
+ .vidioc_querycap = mxc_isi_m2m_querycap,
+
+@@ -631,8 +591,8 @@ static const struct v4l2_ioctl_ops mxc_isi_m2m_ioctl_ops = {
+ .vidioc_prepare_buf = v4l2_m2m_ioctl_prepare_buf,
+ .vidioc_create_bufs = v4l2_m2m_ioctl_create_bufs,
+
+- .vidioc_streamon = mxc_isi_m2m_streamon,
+- .vidioc_streamoff = mxc_isi_m2m_streamoff,
++ .vidioc_streamon = v4l2_m2m_ioctl_streamon,
++ .vidioc_streamoff = v4l2_m2m_ioctl_streamoff,
+
+ .vidioc_subscribe_event = v4l2_ctrl_subscribe_event,
+ .vidioc_unsubscribe_event = v4l2_event_unsubscribe,
+diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
+index fe74dbd2c9663b..c82ea6043d4086 100644
+--- a/drivers/net/can/m_can/m_can.c
++++ b/drivers/net/can/m_can/m_can.c
+@@ -812,6 +812,9 @@ static int m_can_handle_state_change(struct net_device *dev,
+ u32 timestamp = 0;
+
+ switch (new_state) {
++ case CAN_STATE_ERROR_ACTIVE:
++ cdev->can.state = CAN_STATE_ERROR_ACTIVE;
++ break;
+ case CAN_STATE_ERROR_WARNING:
+ /* error warning state */
+ cdev->can.can_stats.error_warning++;
+@@ -841,6 +844,12 @@ static int m_can_handle_state_change(struct net_device *dev,
+ __m_can_get_berr_counter(dev, &bec);
+
+ switch (new_state) {
++ case CAN_STATE_ERROR_ACTIVE:
++ cf->can_id |= CAN_ERR_CRTL | CAN_ERR_CNT;
++ cf->data[1] = CAN_ERR_CRTL_ACTIVE;
++ cf->data[6] = bec.txerr;
++ cf->data[7] = bec.rxerr;
++ break;
+ case CAN_STATE_ERROR_WARNING:
+ /* error warning state */
+ cf->can_id |= CAN_ERR_CRTL | CAN_ERR_CNT;
+@@ -877,30 +886,33 @@ static int m_can_handle_state_change(struct net_device *dev,
+ return 1;
+ }
+
+-static int m_can_handle_state_errors(struct net_device *dev, u32 psr)
++static enum can_state
++m_can_state_get_by_psr(struct m_can_classdev *cdev)
+ {
+- struct m_can_classdev *cdev = netdev_priv(dev);
+- int work_done = 0;
++ u32 reg_psr;
+
+- if (psr & PSR_EW && cdev->can.state != CAN_STATE_ERROR_WARNING) {
+- netdev_dbg(dev, "entered error warning state\n");
+- work_done += m_can_handle_state_change(dev,
+- CAN_STATE_ERROR_WARNING);
+- }
++ reg_psr = m_can_read(cdev, M_CAN_PSR);
+
+- if (psr & PSR_EP && cdev->can.state != CAN_STATE_ERROR_PASSIVE) {
+- netdev_dbg(dev, "entered error passive state\n");
+- work_done += m_can_handle_state_change(dev,
+- CAN_STATE_ERROR_PASSIVE);
+- }
++ if (reg_psr & PSR_BO)
++ return CAN_STATE_BUS_OFF;
++ if (reg_psr & PSR_EP)
++ return CAN_STATE_ERROR_PASSIVE;
++ if (reg_psr & PSR_EW)
++ return CAN_STATE_ERROR_WARNING;
+
+- if (psr & PSR_BO && cdev->can.state != CAN_STATE_BUS_OFF) {
+- netdev_dbg(dev, "entered error bus off state\n");
+- work_done += m_can_handle_state_change(dev,
+- CAN_STATE_BUS_OFF);
+- }
++ return CAN_STATE_ERROR_ACTIVE;
++}
+
+- return work_done;
++static int m_can_handle_state_errors(struct net_device *dev)
++{
++ struct m_can_classdev *cdev = netdev_priv(dev);
++ enum can_state new_state;
++
++ new_state = m_can_state_get_by_psr(cdev);
++ if (new_state == cdev->can.state)
++ return 0;
++
++ return m_can_handle_state_change(dev, new_state);
+ }
+
+ static void m_can_handle_other_err(struct net_device *dev, u32 irqstatus)
+@@ -1031,8 +1043,7 @@ static int m_can_rx_handler(struct net_device *dev, int quota, u32 irqstatus)
+ }
+
+ if (irqstatus & IR_ERR_STATE)
+- work_done += m_can_handle_state_errors(dev,
+- m_can_read(cdev, M_CAN_PSR));
++ work_done += m_can_handle_state_errors(dev);
+
+ if (irqstatus & IR_ERR_BUS_30X)
+ work_done += m_can_handle_bus_errors(dev, irqstatus,
+@@ -1606,7 +1617,7 @@ static int m_can_start(struct net_device *dev)
+ netdev_queue_set_dql_min_limit(netdev_get_tx_queue(cdev->net, 0),
+ cdev->tx_max_coalesced_frames);
+
+- cdev->can.state = CAN_STATE_ERROR_ACTIVE;
++ cdev->can.state = m_can_state_get_by_psr(cdev);
+
+ m_can_enable_all_interrupts(cdev);
+
+@@ -2494,12 +2505,11 @@ int m_can_class_suspend(struct device *dev)
+ }
+
+ m_can_clk_stop(cdev);
++ cdev->can.state = CAN_STATE_SLEEPING;
+ }
+
+ pinctrl_pm_select_sleep_state(dev);
+
+- cdev->can.state = CAN_STATE_SLEEPING;
+-
+ return ret;
+ }
+ EXPORT_SYMBOL_GPL(m_can_class_suspend);
+@@ -2512,8 +2522,6 @@ int m_can_class_resume(struct device *dev)
+
+ pinctrl_pm_select_default_state(dev);
+
+- cdev->can.state = CAN_STATE_ERROR_ACTIVE;
+-
+ if (netif_running(ndev)) {
+ ret = m_can_clk_start(cdev);
+ if (ret)
+@@ -2531,6 +2539,8 @@ int m_can_class_resume(struct device *dev)
+ if (cdev->ops->init)
+ ret = cdev->ops->init(cdev);
+
++ cdev->can.state = m_can_state_get_by_psr(cdev);
++
+ m_can_write(cdev, M_CAN_IE, cdev->active_interrupts);
+ } else {
+ ret = m_can_start(ndev);
+diff --git a/drivers/net/can/m_can/m_can_platform.c b/drivers/net/can/m_can/m_can_platform.c
+index b832566efda042..057eaa7b8b4b29 100644
+--- a/drivers/net/can/m_can/m_can_platform.c
++++ b/drivers/net/can/m_can/m_can_platform.c
+@@ -180,7 +180,7 @@ static void m_can_plat_remove(struct platform_device *pdev)
+ struct m_can_classdev *mcan_class = &priv->cdev;
+
+ m_can_class_unregister(mcan_class);
+-
++ pm_runtime_disable(mcan_class->dev);
+ m_can_class_free_dev(mcan_class->net);
+ }
+
+diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c
+index c9482d6e947b0c..69b8d6da651bf4 100644
+--- a/drivers/net/can/usb/gs_usb.c
++++ b/drivers/net/can/usb/gs_usb.c
+@@ -289,11 +289,6 @@ struct gs_host_frame {
+ #define GS_MAX_RX_URBS 30
+ #define GS_NAPI_WEIGHT 32
+
+-/* Maximum number of interfaces the driver supports per device.
+- * Current hardware only supports 3 interfaces. The future may vary.
+- */
+-#define GS_MAX_INTF 3
+-
+ struct gs_tx_context {
+ struct gs_can *dev;
+ unsigned int echo_id;
+@@ -324,7 +319,6 @@ struct gs_can {
+
+ /* usb interface struct */
+ struct gs_usb {
+- struct gs_can *canch[GS_MAX_INTF];
+ struct usb_anchor rx_submitted;
+ struct usb_device *udev;
+
+@@ -336,9 +330,11 @@ struct gs_usb {
+
+ unsigned int hf_size_rx;
+ u8 active_channels;
++ u8 channel_cnt;
+
+ unsigned int pipe_in;
+ unsigned int pipe_out;
++ struct gs_can *canch[] __counted_by(channel_cnt);
+ };
+
+ /* 'allocate' a tx context.
+@@ -599,7 +595,7 @@ static void gs_usb_receive_bulk_callback(struct urb *urb)
+ }
+
+ /* device reports out of range channel id */
+- if (hf->channel >= GS_MAX_INTF)
++ if (hf->channel >= parent->channel_cnt)
+ goto device_detach;
+
+ dev = parent->canch[hf->channel];
+@@ -699,7 +695,7 @@ static void gs_usb_receive_bulk_callback(struct urb *urb)
+ /* USB failure take down all interfaces */
+ if (rc == -ENODEV) {
+ device_detach:
+- for (rc = 0; rc < GS_MAX_INTF; rc++) {
++ for (rc = 0; rc < parent->channel_cnt; rc++) {
+ if (parent->canch[rc])
+ netif_device_detach(parent->canch[rc]->netdev);
+ }
+@@ -1249,6 +1245,7 @@ static struct gs_can *gs_make_candev(unsigned int channel,
+
+ netdev->flags |= IFF_ECHO; /* we support full roundtrip echo */
+ netdev->dev_id = channel;
++ netdev->dev_port = channel;
+
+ /* dev setup */
+ strcpy(dev->bt_const.name, KBUILD_MODNAME);
+@@ -1460,17 +1457,19 @@ static int gs_usb_probe(struct usb_interface *intf,
+ icount = dconf.icount + 1;
+ dev_info(&intf->dev, "Configuring for %u interfaces\n", icount);
+
+- if (icount > GS_MAX_INTF) {
++ if (icount > type_max(parent->channel_cnt)) {
+ dev_err(&intf->dev,
+ "Driver cannot handle more that %u CAN interfaces\n",
+- GS_MAX_INTF);
++ type_max(parent->channel_cnt));
+ return -EINVAL;
+ }
+
+- parent = kzalloc(sizeof(*parent), GFP_KERNEL);
++ parent = kzalloc(struct_size(parent, canch, icount), GFP_KERNEL);
+ if (!parent)
+ return -ENOMEM;
+
++ parent->channel_cnt = icount;
++
+ init_usb_anchor(&parent->rx_submitted);
+
+ usb_set_intfdata(intf, parent);
+@@ -1531,7 +1530,7 @@ static void gs_usb_disconnect(struct usb_interface *intf)
+ return;
+ }
+
+- for (i = 0; i < GS_MAX_INTF; i++)
++ for (i = 0; i < parent->channel_cnt; i++)
+ if (parent->canch[i])
+ gs_destroy_candev(parent->canch[i]);
+
+diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
+index 6d23c5c049b9a7..ffb10c758c2934 100644
+--- a/drivers/net/ethernet/airoha/airoha_eth.c
++++ b/drivers/net/ethernet/airoha/airoha_eth.c
+@@ -1872,6 +1872,20 @@ static u32 airoha_get_dsa_tag(struct sk_buff *skb, struct net_device *dev)
+ #endif
+ }
+
++static bool airoha_dev_tx_queue_busy(struct airoha_queue *q, u32 nr_frags)
++{
++ u32 tail = q->tail <= q->head ? q->tail + q->ndesc : q->tail;
++ u32 index = q->head + nr_frags;
++
++ /* completion napi can free out-of-order tx descriptors if hw QoS is
++ * enabled and packets with different priorities are queued to the same
++ * DMA ring. Take into account possible out-of-order reports checking
++ * if the tx queue is full using circular buffer head/tail pointers
++ * instead of the number of queued packets.
++ */
++ return index >= tail;
++}
++
+ static netdev_tx_t airoha_dev_xmit(struct sk_buff *skb,
+ struct net_device *dev)
+ {
+@@ -1925,7 +1939,7 @@ static netdev_tx_t airoha_dev_xmit(struct sk_buff *skb,
+ txq = netdev_get_tx_queue(dev, qid);
+ nr_frags = 1 + skb_shinfo(skb)->nr_frags;
+
+- if (q->queued + nr_frags > q->ndesc) {
++ if (airoha_dev_tx_queue_busy(q, nr_frags)) {
+ /* not enough space in the queue */
+ netif_tx_stop_queue(txq);
+ spin_unlock_bh(&q->lock);
+diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
+index 2e9b95a94f89fb..2ad672c17eec61 100644
+--- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
++++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
+@@ -1065,7 +1065,6 @@ static void xgbe_free_rx_data(struct xgbe_prv_data *pdata)
+
+ static int xgbe_phy_reset(struct xgbe_prv_data *pdata)
+ {
+- pdata->phy_link = -1;
+ pdata->phy_speed = SPEED_UNKNOWN;
+
+ return pdata->phy_if.phy_reset(pdata);
+diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-mdio.c b/drivers/net/ethernet/amd/xgbe/xgbe-mdio.c
+index 1a37ec45e65020..7675bb98f02956 100644
+--- a/drivers/net/ethernet/amd/xgbe/xgbe-mdio.c
++++ b/drivers/net/ethernet/amd/xgbe/xgbe-mdio.c
+@@ -1555,6 +1555,7 @@ static int xgbe_phy_init(struct xgbe_prv_data *pdata)
+ pdata->phy.duplex = DUPLEX_FULL;
+ }
+
++ pdata->phy_link = 0;
+ pdata->phy.link = 0;
+
+ pdata->phy.pause_autoneg = pdata->pause_autoneg;
+diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
+index b4dc93a487184c..8b64e4667c21c1 100644
+--- a/drivers/net/ethernet/broadcom/tg3.c
++++ b/drivers/net/ethernet/broadcom/tg3.c
+@@ -5803,7 +5803,7 @@ static int tg3_setup_fiber_mii_phy(struct tg3 *tp, bool force_reset)
+ u32 current_speed = SPEED_UNKNOWN;
+ u8 current_duplex = DUPLEX_UNKNOWN;
+ bool current_link_up = false;
+- u32 local_adv, remote_adv, sgsr;
++ u32 local_adv = 0, remote_adv = 0, sgsr;
+
+ if ((tg3_asic_rev(tp) == ASIC_REV_5719 ||
+ tg3_asic_rev(tp) == ASIC_REV_5720) &&
+@@ -5944,9 +5944,6 @@ static int tg3_setup_fiber_mii_phy(struct tg3 *tp, bool force_reset)
+ else
+ current_duplex = DUPLEX_HALF;
+
+- local_adv = 0;
+- remote_adv = 0;
+-
+ if (bmcr & BMCR_ANENABLE) {
+ u32 common;
+
+diff --git a/drivers/net/ethernet/dlink/dl2k.c b/drivers/net/ethernet/dlink/dl2k.c
+index 1996d2e4e3e2c9..7077d705e471fb 100644
+--- a/drivers/net/ethernet/dlink/dl2k.c
++++ b/drivers/net/ethernet/dlink/dl2k.c
+@@ -508,25 +508,34 @@ static int alloc_list(struct net_device *dev)
+ for (i = 0; i < RX_RING_SIZE; i++) {
+ /* Allocated fixed size of skbuff */
+ struct sk_buff *skb;
++ dma_addr_t addr;
+
+ skb = netdev_alloc_skb_ip_align(dev, np->rx_buf_sz);
+ np->rx_skbuff[i] = skb;
+- if (!skb) {
+- free_list(dev);
+- return -ENOMEM;
+- }
++ if (!skb)
++ goto err_free_list;
++
++ addr = dma_map_single(&np->pdev->dev, skb->data,
++ np->rx_buf_sz, DMA_FROM_DEVICE);
++ if (dma_mapping_error(&np->pdev->dev, addr))
++ goto err_kfree_skb;
+
+ np->rx_ring[i].next_desc = cpu_to_le64(np->rx_ring_dma +
+ ((i + 1) % RX_RING_SIZE) *
+ sizeof(struct netdev_desc));
+ /* Rubicon now supports 40 bits of addressing space. */
+- np->rx_ring[i].fraginfo =
+- cpu_to_le64(dma_map_single(&np->pdev->dev, skb->data,
+- np->rx_buf_sz, DMA_FROM_DEVICE));
++ np->rx_ring[i].fraginfo = cpu_to_le64(addr);
+ np->rx_ring[i].fraginfo |= cpu_to_le64((u64)np->rx_buf_sz << 48);
+ }
+
+ return 0;
++
++err_kfree_skb:
++ dev_kfree_skb(np->rx_skbuff[i]);
++ np->rx_skbuff[i] = NULL;
++err_free_list:
++ free_list(dev);
++ return -ENOMEM;
+ }
+
+ static void rio_hw_init(struct net_device *dev)
+diff --git a/drivers/net/ethernet/google/gve/gve.h b/drivers/net/ethernet/google/gve/gve.h
+index bceaf9b05cb422..4cc6dcbfd367b8 100644
+--- a/drivers/net/ethernet/google/gve/gve.h
++++ b/drivers/net/ethernet/google/gve/gve.h
+@@ -100,6 +100,8 @@
+ */
+ #define GVE_DQO_QPL_ONDEMAND_ALLOC_THRESHOLD 96
+
++#define GVE_DQO_RX_HWTSTAMP_VALID 0x1
++
+ /* Each slot in the desc ring has a 1:1 mapping to a slot in the data ring */
+ struct gve_rx_desc_queue {
+ struct gve_rx_desc *desc_ring; /* the descriptor ring */
+diff --git a/drivers/net/ethernet/google/gve/gve_desc_dqo.h b/drivers/net/ethernet/google/gve/gve_desc_dqo.h
+index d17da841b5a031..f7786b03c74447 100644
+--- a/drivers/net/ethernet/google/gve/gve_desc_dqo.h
++++ b/drivers/net/ethernet/google/gve/gve_desc_dqo.h
+@@ -236,7 +236,8 @@ struct gve_rx_compl_desc_dqo {
+
+ u8 status_error1;
+
+- __le16 reserved5;
++ u8 reserved5;
++ u8 ts_sub_nsecs_low;
+ __le16 buf_id; /* Buffer ID which was sent on the buffer queue. */
+
+ union {
+diff --git a/drivers/net/ethernet/google/gve/gve_rx_dqo.c b/drivers/net/ethernet/google/gve/gve_rx_dqo.c
+index 7380c2b7a2d85a..02e25be8a50d7c 100644
+--- a/drivers/net/ethernet/google/gve/gve_rx_dqo.c
++++ b/drivers/net/ethernet/google/gve/gve_rx_dqo.c
+@@ -456,14 +456,20 @@ static void gve_rx_skb_hash(struct sk_buff *skb,
+ * Note that this means if the time delta between packet reception and the last
+ * clock read is greater than ~2 seconds, this will provide invalid results.
+ */
+-static void gve_rx_skb_hwtstamp(struct gve_rx_ring *rx, u32 hwts)
++static void gve_rx_skb_hwtstamp(struct gve_rx_ring *rx,
++ const struct gve_rx_compl_desc_dqo *desc)
+ {
+ u64 last_read = READ_ONCE(rx->gve->last_sync_nic_counter);
+ struct sk_buff *skb = rx->ctx.skb_head;
+- u32 low = (u32)last_read;
+- s32 diff = hwts - low;
+-
+- skb_hwtstamps(skb)->hwtstamp = ns_to_ktime(last_read + diff);
++ u32 ts, low;
++ s32 diff;
++
++ if (desc->ts_sub_nsecs_low & GVE_DQO_RX_HWTSTAMP_VALID) {
++ ts = le32_to_cpu(desc->ts);
++ low = (u32)last_read;
++ diff = ts - low;
++ skb_hwtstamps(skb)->hwtstamp = ns_to_ktime(last_read + diff);
++ }
+ }
+
+ static void gve_rx_free_skb(struct napi_struct *napi, struct gve_rx_ring *rx)
+@@ -919,7 +925,7 @@ static int gve_rx_complete_skb(struct gve_rx_ring *rx, struct napi_struct *napi,
+ gve_rx_skb_csum(rx->ctx.skb_head, desc, ptype);
+
+ if (rx->gve->ts_config.rx_filter == HWTSTAMP_FILTER_ALL)
+- gve_rx_skb_hwtstamp(rx, le32_to_cpu(desc->ts));
++ gve_rx_skb_hwtstamp(rx, desc);
+
+ /* RSC packets must set gso_size otherwise the TCP stack will complain
+ * that packets are larger than MTU.
+diff --git a/drivers/net/ethernet/intel/idpf/idpf_ptp.c b/drivers/net/ethernet/intel/idpf/idpf_ptp.c
+index ee21f2ff0cad98..63a41e688733bc 100644
+--- a/drivers/net/ethernet/intel/idpf/idpf_ptp.c
++++ b/drivers/net/ethernet/intel/idpf/idpf_ptp.c
+@@ -855,6 +855,9 @@ static void idpf_ptp_release_vport_tstamp(struct idpf_vport *vport)
+ head = &vport->tx_tstamp_caps->latches_in_use;
+ list_for_each_entry_safe(ptp_tx_tstamp, tmp, head, list_member) {
+ list_del(&ptp_tx_tstamp->list_member);
++ if (ptp_tx_tstamp->skb)
++ consume_skb(ptp_tx_tstamp->skb);
++
+ kfree(ptp_tx_tstamp);
+ }
+
+diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl_ptp.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl_ptp.c
+index 4f1fb0cefe516d..688a6f4e0acc81 100644
+--- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl_ptp.c
++++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl_ptp.c
+@@ -517,6 +517,7 @@ idpf_ptp_get_tstamp_value(struct idpf_vport *vport,
+ shhwtstamps.hwtstamp = ns_to_ktime(tstamp);
+ skb_tstamp_tx(ptp_tx_tstamp->skb, &shhwtstamps);
+ consume_skb(ptp_tx_tstamp->skb);
++ ptp_tx_tstamp->skb = NULL;
+
+ list_add(&ptp_tx_tstamp->list_member,
+ &tx_tstamp_caps->latches_free);
+diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+index 6218bdb7f941f6..86b9caece1042a 100644
+--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
++++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+@@ -12091,7 +12091,6 @@ static void ixgbe_remove(struct pci_dev *pdev)
+
+ devl_port_unregister(&adapter->devlink_port);
+ devl_unlock(adapter->devlink);
+- devlink_free(adapter->devlink);
+
+ ixgbe_stop_ipsec_offload(adapter);
+ ixgbe_clear_interrupt_scheme(adapter);
+@@ -12127,6 +12126,8 @@ static void ixgbe_remove(struct pci_dev *pdev)
+
+ if (disable_dev)
+ pci_disable_device(pdev);
++
++ devlink_free(adapter->devlink);
+ }
+
+ /**
+diff --git a/drivers/net/ethernet/intel/ixgbevf/defines.h b/drivers/net/ethernet/intel/ixgbevf/defines.h
+index a9bc96f6399dc0..e177d1d58696aa 100644
+--- a/drivers/net/ethernet/intel/ixgbevf/defines.h
++++ b/drivers/net/ethernet/intel/ixgbevf/defines.h
+@@ -28,6 +28,7 @@
+
+ /* Link speed */
+ typedef u32 ixgbe_link_speed;
++#define IXGBE_LINK_SPEED_UNKNOWN 0
+ #define IXGBE_LINK_SPEED_1GB_FULL 0x0020
+ #define IXGBE_LINK_SPEED_10GB_FULL 0x0080
+ #define IXGBE_LINK_SPEED_100_FULL 0x0008
+diff --git a/drivers/net/ethernet/intel/ixgbevf/ipsec.c b/drivers/net/ethernet/intel/ixgbevf/ipsec.c
+index 65580b9cb06f21..fce35924ff8b51 100644
+--- a/drivers/net/ethernet/intel/ixgbevf/ipsec.c
++++ b/drivers/net/ethernet/intel/ixgbevf/ipsec.c
+@@ -273,6 +273,9 @@ static int ixgbevf_ipsec_add_sa(struct net_device *dev,
+ adapter = netdev_priv(dev);
+ ipsec = adapter->ipsec;
+
++ if (!(adapter->pf_features & IXGBEVF_PF_SUP_IPSEC))
++ return -EOPNOTSUPP;
++
+ if (xs->id.proto != IPPROTO_ESP && xs->id.proto != IPPROTO_AH) {
+ NL_SET_ERR_MSG_MOD(extack, "Unsupported protocol for IPsec offload");
+ return -EINVAL;
+@@ -405,6 +408,9 @@ static void ixgbevf_ipsec_del_sa(struct net_device *dev,
+ adapter = netdev_priv(dev);
+ ipsec = adapter->ipsec;
+
++ if (!(adapter->pf_features & IXGBEVF_PF_SUP_IPSEC))
++ return;
++
+ if (xs->xso.dir == XFRM_DEV_OFFLOAD_IN) {
+ sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_RX_INDEX;
+
+@@ -612,6 +618,10 @@ void ixgbevf_init_ipsec_offload(struct ixgbevf_adapter *adapter)
+ size_t size;
+
+ switch (adapter->hw.api_version) {
++ case ixgbe_mbox_api_17:
++ if (!(adapter->pf_features & IXGBEVF_PF_SUP_IPSEC))
++ return;
++ break;
+ case ixgbe_mbox_api_14:
+ break;
+ default:
+diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
+index 3a379e6a3a2ab2..039187607e98f1 100644
+--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
++++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
+@@ -363,6 +363,13 @@ struct ixgbevf_adapter {
+ struct ixgbe_hw hw;
+ u16 msg_enable;
+
++ u32 pf_features;
++#define IXGBEVF_PF_SUP_IPSEC BIT(0)
++#define IXGBEVF_PF_SUP_ESX_MBX BIT(1)
++
++#define IXGBEVF_SUPPORTED_FEATURES (IXGBEVF_PF_SUP_IPSEC | \
++ IXGBEVF_PF_SUP_ESX_MBX)
++
+ struct ixgbevf_hw_stats stats;
+
+ unsigned long state;
+diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+index 535d0f71f52149..1ecfbbb952103d 100644
+--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
++++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+@@ -2271,10 +2271,36 @@ static void ixgbevf_init_last_counter_stats(struct ixgbevf_adapter *adapter)
+ adapter->stats.base_vfmprc = adapter->stats.last_vfmprc;
+ }
+
++/**
++ * ixgbevf_set_features - Set features supported by PF
++ * @adapter: pointer to the adapter struct
++ *
++ * Negotiate with PF supported features and then set pf_features accordingly.
++ */
++static void ixgbevf_set_features(struct ixgbevf_adapter *adapter)
++{
++ u32 *pf_features = &adapter->pf_features;
++ struct ixgbe_hw *hw = &adapter->hw;
++ int err;
++
++ err = hw->mac.ops.negotiate_features(hw, pf_features);
++ if (err && err != -EOPNOTSUPP)
++ netdev_dbg(adapter->netdev,
++ "PF feature negotiation failed.\n");
++
++ /* Address also pre API 1.7 cases */
++ if (hw->api_version == ixgbe_mbox_api_14)
++ *pf_features |= IXGBEVF_PF_SUP_IPSEC;
++ else if (hw->api_version == ixgbe_mbox_api_15)
++ *pf_features |= IXGBEVF_PF_SUP_ESX_MBX;
++}
++
+ static void ixgbevf_negotiate_api(struct ixgbevf_adapter *adapter)
+ {
+ struct ixgbe_hw *hw = &adapter->hw;
+ static const int api[] = {
++ ixgbe_mbox_api_17,
++ ixgbe_mbox_api_16,
+ ixgbe_mbox_api_15,
+ ixgbe_mbox_api_14,
+ ixgbe_mbox_api_13,
+@@ -2294,7 +2320,9 @@ static void ixgbevf_negotiate_api(struct ixgbevf_adapter *adapter)
+ idx++;
+ }
+
+- if (hw->api_version >= ixgbe_mbox_api_15) {
++ ixgbevf_set_features(adapter);
++
++ if (adapter->pf_features & IXGBEVF_PF_SUP_ESX_MBX) {
+ hw->mbx.ops.init_params(hw);
+ memcpy(&hw->mbx.ops, &ixgbevf_mbx_ops,
+ sizeof(struct ixgbe_mbx_operations));
+@@ -2651,6 +2679,8 @@ static void ixgbevf_set_num_queues(struct ixgbevf_adapter *adapter)
+ case ixgbe_mbox_api_13:
+ case ixgbe_mbox_api_14:
+ case ixgbe_mbox_api_15:
++ case ixgbe_mbox_api_16:
++ case ixgbe_mbox_api_17:
+ if (adapter->xdp_prog &&
+ hw->mac.max_tx_queues == rss)
+ rss = rss > 3 ? 2 : 1;
+@@ -4645,6 +4675,8 @@ static int ixgbevf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
+ case ixgbe_mbox_api_13:
+ case ixgbe_mbox_api_14:
+ case ixgbe_mbox_api_15:
++ case ixgbe_mbox_api_16:
++ case ixgbe_mbox_api_17:
+ netdev->max_mtu = IXGBE_MAX_JUMBO_FRAME_SIZE -
+ (ETH_HLEN + ETH_FCS_LEN);
+ break;
+diff --git a/drivers/net/ethernet/intel/ixgbevf/mbx.h b/drivers/net/ethernet/intel/ixgbevf/mbx.h
+index 835bbcc5cc8e63..a8ed23ee66aa84 100644
+--- a/drivers/net/ethernet/intel/ixgbevf/mbx.h
++++ b/drivers/net/ethernet/intel/ixgbevf/mbx.h
+@@ -66,6 +66,8 @@ enum ixgbe_pfvf_api_rev {
+ ixgbe_mbox_api_13, /* API version 1.3, linux/freebsd VF driver */
+ ixgbe_mbox_api_14, /* API version 1.4, linux/freebsd VF driver */
+ ixgbe_mbox_api_15, /* API version 1.5, linux/freebsd VF driver */
++ ixgbe_mbox_api_16, /* API version 1.6, linux/freebsd VF driver */
++ ixgbe_mbox_api_17, /* API version 1.7, linux/freebsd VF driver */
+ /* This value should always be last */
+ ixgbe_mbox_api_unknown, /* indicates that API version is not known */
+ };
+@@ -102,6 +104,12 @@ enum ixgbe_pfvf_api_rev {
+
+ #define IXGBE_VF_GET_LINK_STATE 0x10 /* get vf link state */
+
++/* mailbox API, version 1.6 VF requests */
++#define IXGBE_VF_GET_PF_LINK_STATE 0x11 /* request PF to send link info */
++
++/* mailbox API, version 1.7 VF requests */
++#define IXGBE_VF_FEATURES_NEGOTIATE 0x12 /* get features supported by PF*/
++
+ /* length of permanent address message returned from PF */
+ #define IXGBE_VF_PERMADDR_MSG_LEN 4
+ /* word in permanent address message with the current multicast type */
+diff --git a/drivers/net/ethernet/intel/ixgbevf/vf.c b/drivers/net/ethernet/intel/ixgbevf/vf.c
+index dcaef34b88b64d..74d320879513c0 100644
+--- a/drivers/net/ethernet/intel/ixgbevf/vf.c
++++ b/drivers/net/ethernet/intel/ixgbevf/vf.c
+@@ -313,6 +313,8 @@ int ixgbevf_get_reta_locked(struct ixgbe_hw *hw, u32 *reta, int num_rx_queues)
+ * is not supported for this device type.
+ */
+ switch (hw->api_version) {
++ case ixgbe_mbox_api_17:
++ case ixgbe_mbox_api_16:
+ case ixgbe_mbox_api_15:
+ case ixgbe_mbox_api_14:
+ case ixgbe_mbox_api_13:
+@@ -382,6 +384,8 @@ int ixgbevf_get_rss_key_locked(struct ixgbe_hw *hw, u8 *rss_key)
+ * or if the operation is not supported for this device type.
+ */
+ switch (hw->api_version) {
++ case ixgbe_mbox_api_17:
++ case ixgbe_mbox_api_16:
+ case ixgbe_mbox_api_15:
+ case ixgbe_mbox_api_14:
+ case ixgbe_mbox_api_13:
+@@ -552,6 +556,8 @@ static s32 ixgbevf_update_xcast_mode(struct ixgbe_hw *hw, int xcast_mode)
+ case ixgbe_mbox_api_13:
+ case ixgbe_mbox_api_14:
+ case ixgbe_mbox_api_15:
++ case ixgbe_mbox_api_16:
++ case ixgbe_mbox_api_17:
+ break;
+ default:
+ return -EOPNOTSUPP;
+@@ -624,6 +630,85 @@ static s32 ixgbevf_hv_get_link_state_vf(struct ixgbe_hw *hw, bool *link_state)
+ return -EOPNOTSUPP;
+ }
+
++/**
++ * ixgbevf_get_pf_link_state - Get PF's link status
++ * @hw: pointer to the HW structure
++ * @speed: link speed
++ * @link_up: indicate if link is up/down
++ *
++ * Ask PF to provide link_up state and speed of the link.
++ *
++ * Return: IXGBE_ERR_MBX in the case of mailbox error,
++ * -EOPNOTSUPP if the op is not supported or 0 on success.
++ */
++static int ixgbevf_get_pf_link_state(struct ixgbe_hw *hw, ixgbe_link_speed *speed,
++ bool *link_up)
++{
++ u32 msgbuf[3] = {};
++ int err;
++
++ switch (hw->api_version) {
++ case ixgbe_mbox_api_16:
++ case ixgbe_mbox_api_17:
++ break;
++ default:
++ return -EOPNOTSUPP;
++ }
++
++ msgbuf[0] = IXGBE_VF_GET_PF_LINK_STATE;
++
++ err = ixgbevf_write_msg_read_ack(hw, msgbuf, msgbuf,
++ ARRAY_SIZE(msgbuf));
++ if (err || (msgbuf[0] & IXGBE_VT_MSGTYPE_FAILURE)) {
++ err = IXGBE_ERR_MBX;
++ *speed = IXGBE_LINK_SPEED_UNKNOWN;
++ /* No need to set @link_up to false as it will be done by
++ * ixgbe_check_mac_link_vf().
++ */
++ } else {
++ *speed = msgbuf[1];
++ *link_up = msgbuf[2];
++ }
++
++ return err;
++}
++
++/**
++ * ixgbevf_negotiate_features_vf - negotiate supported features with PF driver
++ * @hw: pointer to the HW structure
++ * @pf_features: bitmask of features supported by PF
++ *
++ * Return: IXGBE_ERR_MBX in the case of mailbox error,
++ * -EOPNOTSUPP if the op is not supported or 0 on success.
++ */
++static int ixgbevf_negotiate_features_vf(struct ixgbe_hw *hw, u32 *pf_features)
++{
++ u32 msgbuf[2] = {};
++ int err;
++
++ switch (hw->api_version) {
++ case ixgbe_mbox_api_17:
++ break;
++ default:
++ return -EOPNOTSUPP;
++ }
++
++ msgbuf[0] = IXGBE_VF_FEATURES_NEGOTIATE;
++ msgbuf[1] = IXGBEVF_SUPPORTED_FEATURES;
++
++ err = ixgbevf_write_msg_read_ack(hw, msgbuf, msgbuf,
++ ARRAY_SIZE(msgbuf));
++
++ if (err || (msgbuf[0] & IXGBE_VT_MSGTYPE_FAILURE)) {
++ err = IXGBE_ERR_MBX;
++ *pf_features = 0x0;
++ } else {
++ *pf_features = msgbuf[1];
++ }
++
++ return err;
++}
++
+ /**
+ * ixgbevf_set_vfta_vf - Set/Unset VLAN filter table address
+ * @hw: pointer to the HW structure
+@@ -658,6 +743,58 @@ static s32 ixgbevf_set_vfta_vf(struct ixgbe_hw *hw, u32 vlan, u32 vind,
+ return err;
+ }
+
++/**
++ * ixgbe_read_vflinks - Read VFLINKS register
++ * @hw: pointer to the HW structure
++ * @speed: link speed
++ * @link_up: indicate if link is up/down
++ *
++ * Get linkup status and link speed from the VFLINKS register.
++ */
++static void ixgbe_read_vflinks(struct ixgbe_hw *hw, ixgbe_link_speed *speed,
++ bool *link_up)
++{
++ u32 vflinks = IXGBE_READ_REG(hw, IXGBE_VFLINKS);
++
++ /* if link status is down no point in checking to see if PF is up */
++ if (!(vflinks & IXGBE_LINKS_UP)) {
++ *link_up = false;
++ return;
++ }
++
++ /* for SFP+ modules and DA cables on 82599 it can take up to 500usecs
++ * before the link status is correct
++ */
++ if (hw->mac.type == ixgbe_mac_82599_vf) {
++ for (int i = 0; i < 5; i++) {
++ udelay(100);
++ vflinks = IXGBE_READ_REG(hw, IXGBE_VFLINKS);
++
++ if (!(vflinks & IXGBE_LINKS_UP)) {
++ *link_up = false;
++ return;
++ }
++ }
++ }
++
++ /* We reached this point so there's link */
++ *link_up = true;
++
++ switch (vflinks & IXGBE_LINKS_SPEED_82599) {
++ case IXGBE_LINKS_SPEED_10G_82599:
++ *speed = IXGBE_LINK_SPEED_10GB_FULL;
++ break;
++ case IXGBE_LINKS_SPEED_1G_82599:
++ *speed = IXGBE_LINK_SPEED_1GB_FULL;
++ break;
++ case IXGBE_LINKS_SPEED_100_82599:
++ *speed = IXGBE_LINK_SPEED_100_FULL;
++ break;
++ default:
++ *speed = IXGBE_LINK_SPEED_UNKNOWN;
++ }
++}
++
+ /**
+ * ixgbevf_hv_set_vfta_vf - * Hyper-V variant - just a stub.
+ * @hw: unused
+@@ -702,10 +839,10 @@ static s32 ixgbevf_check_mac_link_vf(struct ixgbe_hw *hw,
+ bool *link_up,
+ bool autoneg_wait_to_complete)
+ {
++ struct ixgbevf_adapter *adapter = hw->back;
+ struct ixgbe_mbx_info *mbx = &hw->mbx;
+ struct ixgbe_mac_info *mac = &hw->mac;
+ s32 ret_val = 0;
+- u32 links_reg;
+ u32 in_msg = 0;
+
+ /* If we were hit with a reset drop the link */
+@@ -715,43 +852,21 @@ static s32 ixgbevf_check_mac_link_vf(struct ixgbe_hw *hw,
+ if (!mac->get_link_status)
+ goto out;
+
+- /* if link status is down no point in checking to see if pf is up */
+- links_reg = IXGBE_READ_REG(hw, IXGBE_VFLINKS);
+- if (!(links_reg & IXGBE_LINKS_UP))
+- goto out;
+-
+- /* for SFP+ modules and DA cables on 82599 it can take up to 500usecs
+- * before the link status is correct
+- */
+- if (mac->type == ixgbe_mac_82599_vf) {
+- int i;
+-
+- for (i = 0; i < 5; i++) {
+- udelay(100);
+- links_reg = IXGBE_READ_REG(hw, IXGBE_VFLINKS);
+-
+- if (!(links_reg & IXGBE_LINKS_UP))
+- goto out;
+- }
+- }
+-
+- switch (links_reg & IXGBE_LINKS_SPEED_82599) {
+- case IXGBE_LINKS_SPEED_10G_82599:
+- *speed = IXGBE_LINK_SPEED_10GB_FULL;
+- break;
+- case IXGBE_LINKS_SPEED_1G_82599:
+- *speed = IXGBE_LINK_SPEED_1GB_FULL;
+- break;
+- case IXGBE_LINKS_SPEED_100_82599:
+- *speed = IXGBE_LINK_SPEED_100_FULL;
+- break;
++ if (hw->mac.type == ixgbe_mac_e610_vf) {
++ ret_val = ixgbevf_get_pf_link_state(hw, speed, link_up);
++ if (ret_val)
++ goto out;
++ } else {
++ ixgbe_read_vflinks(hw, speed, link_up);
++ if (*link_up == false)
++ goto out;
+ }
+
+ /* if the read failed it could just be a mailbox collision, best wait
+ * until we are called again and don't report an error
+ */
+ if (mbx->ops.read(hw, &in_msg, 1)) {
+- if (hw->api_version >= ixgbe_mbox_api_15)
++ if (adapter->pf_features & IXGBEVF_PF_SUP_ESX_MBX)
+ mac->get_link_status = false;
+ goto out;
+ }
+@@ -951,6 +1066,8 @@ int ixgbevf_get_queues(struct ixgbe_hw *hw, unsigned int *num_tcs,
+ case ixgbe_mbox_api_13:
+ case ixgbe_mbox_api_14:
+ case ixgbe_mbox_api_15:
++ case ixgbe_mbox_api_16:
++ case ixgbe_mbox_api_17:
+ break;
+ default:
+ return 0;
+@@ -1005,6 +1122,7 @@ static const struct ixgbe_mac_operations ixgbevf_mac_ops = {
+ .setup_link = ixgbevf_setup_mac_link_vf,
+ .check_link = ixgbevf_check_mac_link_vf,
+ .negotiate_api_version = ixgbevf_negotiate_api_version_vf,
++ .negotiate_features = ixgbevf_negotiate_features_vf,
+ .set_rar = ixgbevf_set_rar_vf,
+ .update_mc_addr_list = ixgbevf_update_mc_addr_list_vf,
+ .update_xcast_mode = ixgbevf_update_xcast_mode,
+diff --git a/drivers/net/ethernet/intel/ixgbevf/vf.h b/drivers/net/ethernet/intel/ixgbevf/vf.h
+index 2d791bc26ae4e7..4f19b8900c29a3 100644
+--- a/drivers/net/ethernet/intel/ixgbevf/vf.h
++++ b/drivers/net/ethernet/intel/ixgbevf/vf.h
+@@ -26,6 +26,7 @@ struct ixgbe_mac_operations {
+ s32 (*stop_adapter)(struct ixgbe_hw *);
+ s32 (*get_bus_info)(struct ixgbe_hw *);
+ s32 (*negotiate_api_version)(struct ixgbe_hw *hw, int api);
++ int (*negotiate_features)(struct ixgbe_hw *hw, u32 *pf_features);
+
+ /* Link */
+ s32 (*setup_link)(struct ixgbe_hw *, ixgbe_link_speed, bool, bool);
+diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cgx.c b/drivers/net/ethernet/marvell/octeontx2/af/cgx.c
+index 69324ae093973e..31310018c3cac9 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/af/cgx.c
++++ b/drivers/net/ethernet/marvell/octeontx2/af/cgx.c
+@@ -1981,6 +1981,7 @@ static int cgx_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+ !is_cgx_mapped_to_nix(pdev->subsystem_device, cgx->cgx_id)) {
+ dev_notice(dev, "CGX %d not mapped to NIX, skipping probe\n",
+ cgx->cgx_id);
++ err = -ENODEV;
+ goto err_release_regions;
+ }
+
+diff --git a/drivers/net/ethernet/mediatek/mtk_wed.c b/drivers/net/ethernet/mediatek/mtk_wed.c
+index 0a80d8f8cff7f4..16aa7e4138d36a 100644
+--- a/drivers/net/ethernet/mediatek/mtk_wed.c
++++ b/drivers/net/ethernet/mediatek/mtk_wed.c
+@@ -670,7 +670,7 @@ mtk_wed_tx_buffer_alloc(struct mtk_wed_device *dev)
+ void *buf;
+ int s;
+
+- page = __dev_alloc_page(GFP_KERNEL);
++ page = __dev_alloc_page(GFP_KERNEL | GFP_DMA32);
+ if (!page)
+ return -ENOMEM;
+
+@@ -793,7 +793,7 @@ mtk_wed_hwrro_buffer_alloc(struct mtk_wed_device *dev)
+ struct page *page;
+ int s;
+
+- page = __dev_alloc_page(GFP_KERNEL);
++ page = __dev_alloc_page(GFP_KERNEL | GFP_DMA32);
+ if (!page)
+ return -ENOMEM;
+
+@@ -2405,6 +2405,10 @@ mtk_wed_attach(struct mtk_wed_device *dev)
+ dev->version = hw->version;
+ dev->hw->pcie_base = mtk_wed_get_pcie_base(dev);
+
++ ret = dma_set_mask_and_coherent(hw->dev, DMA_BIT_MASK(32));
++ if (ret)
++ goto out;
++
+ if (hw->eth->dma_dev == hw->eth->dev &&
+ of_dma_is_coherent(hw->eth->dev->of_node))
+ mtk_eth_set_dma_device(hw->eth, hw->dev);
+diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
+index 9c601f271c02b9..4b0ac73565ea97 100644
+--- a/drivers/net/ethernet/realtek/r8169_main.c
++++ b/drivers/net/ethernet/realtek/r8169_main.c
+@@ -4994,8 +4994,9 @@ static int rtl8169_resume(struct device *device)
+ if (!device_may_wakeup(tp_to_dev(tp)))
+ clk_prepare_enable(tp->clk);
+
+- /* Reportedly at least Asus X453MA truncates packets otherwise */
+- if (tp->mac_version == RTL_GIGA_MAC_VER_37)
++ /* Some chip versions may truncate packets without this initialization */
++ if (tp->mac_version == RTL_GIGA_MAC_VER_37 ||
++ tp->mac_version == RTL_GIGA_MAC_VER_46)
+ rtl_init_rxcfg(tp);
+
+ return rtl8169_runtime_resume(device);
+diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
+index 0178219f0db538..d7938e11f24de1 100644
+--- a/drivers/net/netdevsim/netdev.c
++++ b/drivers/net/netdevsim/netdev.c
+@@ -528,6 +528,7 @@ static void nsim_enable_napi(struct netdevsim *ns)
+ static int nsim_open(struct net_device *dev)
+ {
+ struct netdevsim *ns = netdev_priv(dev);
++ struct netdevsim *peer;
+ int err;
+
+ netdev_assert_locked(dev);
+@@ -538,6 +539,12 @@ static int nsim_open(struct net_device *dev)
+
+ nsim_enable_napi(ns);
+
++ peer = rtnl_dereference(ns->peer);
++ if (peer && netif_running(peer->netdev)) {
++ netif_carrier_on(dev);
++ netif_carrier_on(peer->netdev);
++ }
++
+ return 0;
+ }
+
+diff --git a/drivers/net/phy/broadcom.c b/drivers/net/phy/broadcom.c
+index a60e58ef90c4e1..6884eaccc3e1da 100644
+--- a/drivers/net/phy/broadcom.c
++++ b/drivers/net/phy/broadcom.c
+@@ -407,7 +407,7 @@ static int bcm5481x_set_brrmode(struct phy_device *phydev, bool on)
+ static int bcm54811_config_init(struct phy_device *phydev)
+ {
+ struct bcm54xx_phy_priv *priv = phydev->priv;
+- int err, reg, exp_sync_ethernet;
++ int err, reg, exp_sync_ethernet, aux_rgmii_en;
+
+ /* Enable CLK125 MUX on LED4 if ref clock is enabled. */
+ if (!(phydev->dev_flags & PHY_BRCM_RX_REFCLK_UNUSED)) {
+@@ -436,6 +436,24 @@ static int bcm54811_config_init(struct phy_device *phydev)
+ if (err < 0)
+ return err;
+
++ /* Enable RGMII if configured */
++ if (phy_interface_is_rgmii(phydev))
++ aux_rgmii_en = MII_BCM54XX_AUXCTL_SHDWSEL_MISC_RGMII_EN |
++ MII_BCM54XX_AUXCTL_SHDWSEL_MISC_RGMII_SKEW_EN;
++ else
++ aux_rgmii_en = 0;
++
++ /* Also writing Reserved bits 6:5 because the documentation requires
++ * them to be written to 0b11
++ */
++ err = bcm54xx_auxctl_write(phydev,
++ MII_BCM54XX_AUXCTL_SHDWSEL_MISC,
++ MII_BCM54XX_AUXCTL_MISC_WREN |
++ aux_rgmii_en |
++ MII_BCM54XX_AUXCTL_SHDWSEL_MISC_RSVD);
++ if (err < 0)
++ return err;
++
+ return bcm5481x_set_brrmode(phydev, priv->brr_mode);
+ }
+
+diff --git a/drivers/net/phy/realtek/realtek_main.c b/drivers/net/phy/realtek/realtek_main.c
+index dd0d675149ad7f..64af3b96f02885 100644
+--- a/drivers/net/phy/realtek/realtek_main.c
++++ b/drivers/net/phy/realtek/realtek_main.c
+@@ -589,26 +589,25 @@ static int rtl8211f_config_init(struct phy_device *phydev)
+ str_enabled_disabled(val_rxdly));
+ }
+
++ if (!priv->has_phycr2)
++ return 0;
++
+ /* Disable PHY-mode EEE so LPI is passed to the MAC */
+ ret = phy_modify_paged(phydev, RTL8211F_PHYCR_PAGE, RTL8211F_PHYCR2,
+ RTL8211F_PHYCR2_PHY_EEE_ENABLE, 0);
+ if (ret)
+ return ret;
+
+- if (priv->has_phycr2) {
+- ret = phy_modify_paged(phydev, RTL8211F_PHYCR_PAGE,
+- RTL8211F_PHYCR2, RTL8211F_CLKOUT_EN,
+- priv->phycr2);
+- if (ret < 0) {
+- dev_err(dev, "clkout configuration failed: %pe\n",
+- ERR_PTR(ret));
+- return ret;
+- }
+-
+- return genphy_soft_reset(phydev);
++ ret = phy_modify_paged(phydev, RTL8211F_PHYCR_PAGE,
++ RTL8211F_PHYCR2, RTL8211F_CLKOUT_EN,
++ priv->phycr2);
++ if (ret < 0) {
++ dev_err(dev, "clkout configuration failed: %pe\n",
++ ERR_PTR(ret));
++ return ret;
+ }
+
+- return 0;
++ return genphy_soft_reset(phydev);
+ }
+
+ static int rtl821x_suspend(struct phy_device *phydev)
+diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
+index d75502ebbc0d92..e0c425779e67fa 100644
+--- a/drivers/net/usb/lan78xx.c
++++ b/drivers/net/usb/lan78xx.c
+@@ -1174,10 +1174,13 @@ static int lan78xx_write_raw_eeprom(struct lan78xx_net *dev, u32 offset,
+ }
+
+ write_raw_eeprom_done:
+- if (dev->chipid == ID_REV_CHIP_ID_7800_)
+- return lan78xx_write_reg(dev, HW_CFG, saved);
+-
+- return 0;
++ if (dev->chipid == ID_REV_CHIP_ID_7800_) {
++ int rc = lan78xx_write_reg(dev, HW_CFG, saved);
++ /* If USB fails, there is nothing to do */
++ if (rc < 0)
++ return rc;
++ }
++ return ret;
+ }
+
+ static int lan78xx_read_raw_otp(struct lan78xx_net *dev, u32 offset,
+@@ -3241,10 +3244,6 @@ static int lan78xx_reset(struct lan78xx_net *dev)
+ }
+ } while (buf & HW_CFG_LRST_);
+
+- ret = lan78xx_init_mac_address(dev);
+- if (ret < 0)
+- return ret;
+-
+ /* save DEVID for later usage */
+ ret = lan78xx_read_reg(dev, ID_REV, &buf);
+ if (ret < 0)
+@@ -3253,6 +3252,10 @@ static int lan78xx_reset(struct lan78xx_net *dev)
+ dev->chipid = (buf & ID_REV_CHIP_ID_MASK_) >> 16;
+ dev->chiprev = buf & ID_REV_CHIP_REV_MASK_;
+
++ ret = lan78xx_init_mac_address(dev);
++ if (ret < 0)
++ return ret;
++
+ /* Respond to the IN token with a NAK */
+ ret = lan78xx_read_reg(dev, USB_CFG0, &buf);
+ if (ret < 0)
+diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
+index 44cba7acfe7d9b..a22d4bb2cf3b58 100644
+--- a/drivers/net/usb/r8152.c
++++ b/drivers/net/usb/r8152.c
+@@ -10122,7 +10122,12 @@ static int __init rtl8152_driver_init(void)
+ ret = usb_register_device_driver(&rtl8152_cfgselector_driver, THIS_MODULE);
+ if (ret)
+ return ret;
+- return usb_register(&rtl8152_driver);
++
++ ret = usb_register(&rtl8152_driver);
++ if (ret)
++ usb_deregister_device_driver(&rtl8152_cfgselector_driver);
++
++ return ret;
+ }
+
+ static void __exit rtl8152_driver_exit(void)
+diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
+index 511c4154cf742b..bf01f272853184 100644
+--- a/drivers/net/usb/usbnet.c
++++ b/drivers/net/usb/usbnet.c
+@@ -702,6 +702,7 @@ void usbnet_resume_rx(struct usbnet *dev)
+ struct sk_buff *skb;
+ int num = 0;
+
++ local_bh_disable();
+ clear_bit(EVENT_RX_PAUSED, &dev->flags);
+
+ while ((skb = skb_dequeue(&dev->rxq_pause)) != NULL) {
+@@ -710,6 +711,7 @@ void usbnet_resume_rx(struct usbnet *dev)
+ }
+
+ queue_work(system_bh_wq, &dev->bh_work);
++ local_bh_enable();
+
+ netif_dbg(dev, rx_status, dev->net,
+ "paused rx queue disabled, %d skbs requeued\n", num);
+diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
+index 012fcfc79a73b1..a01178caf15bb5 100644
+--- a/drivers/nvme/host/auth.c
++++ b/drivers/nvme/host/auth.c
+@@ -36,6 +36,7 @@ struct nvme_dhchap_queue_context {
+ u8 status;
+ u8 dhgroup_id;
+ u8 hash_id;
++ u8 sc_c;
+ size_t hash_len;
+ u8 c1[64];
+ u8 c2[64];
+@@ -154,6 +155,8 @@ static int nvme_auth_set_dhchap_negotiate_data(struct nvme_ctrl *ctrl,
+ data->auth_protocol[0].dhchap.idlist[34] = NVME_AUTH_DHGROUP_6144;
+ data->auth_protocol[0].dhchap.idlist[35] = NVME_AUTH_DHGROUP_8192;
+
++ chap->sc_c = data->sc_c;
++
+ return size;
+ }
+
+@@ -489,7 +492,7 @@ static int nvme_auth_dhchap_setup_host_response(struct nvme_ctrl *ctrl,
+ ret = crypto_shash_update(shash, buf, 2);
+ if (ret)
+ goto out;
+- memset(buf, 0, sizeof(buf));
++ *buf = chap->sc_c;
+ ret = crypto_shash_update(shash, buf, 1);
+ if (ret)
+ goto out;
+@@ -500,6 +503,7 @@ static int nvme_auth_dhchap_setup_host_response(struct nvme_ctrl *ctrl,
+ strlen(ctrl->opts->host->nqn));
+ if (ret)
+ goto out;
++ memset(buf, 0, sizeof(buf));
+ ret = crypto_shash_update(shash, buf, 1);
+ if (ret)
+ goto out;
+diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
+index 3da980dc60d911..543e17aead12ba 100644
+--- a/drivers/nvme/host/multipath.c
++++ b/drivers/nvme/host/multipath.c
+@@ -182,12 +182,14 @@ void nvme_mpath_start_request(struct request *rq)
+ struct nvme_ns *ns = rq->q->queuedata;
+ struct gendisk *disk = ns->head->disk;
+
+- if (READ_ONCE(ns->head->subsys->iopolicy) == NVME_IOPOLICY_QD) {
++ if ((READ_ONCE(ns->head->subsys->iopolicy) == NVME_IOPOLICY_QD) &&
++ !(nvme_req(rq)->flags & NVME_MPATH_CNT_ACTIVE)) {
+ atomic_inc(&ns->ctrl->nr_active);
+ nvme_req(rq)->flags |= NVME_MPATH_CNT_ACTIVE;
+ }
+
+- if (!blk_queue_io_stat(disk->queue) || blk_rq_is_passthrough(rq))
++ if (!blk_queue_io_stat(disk->queue) || blk_rq_is_passthrough(rq) ||
++ (nvme_req(rq)->flags & NVME_MPATH_IO_STATS))
+ return;
+
+ nvme_req(rq)->flags |= NVME_MPATH_IO_STATS;
+diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
+index 1413788ca7d523..9a96df1a511c02 100644
+--- a/drivers/nvme/host/tcp.c
++++ b/drivers/nvme/host/tcp.c
+@@ -1081,6 +1081,9 @@ static void nvme_tcp_write_space(struct sock *sk)
+ queue = sk->sk_user_data;
+ if (likely(queue && sk_stream_is_writeable(sk))) {
+ clear_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
++ /* Ensure pending TLS partial records are retried */
++ if (nvme_tcp_queue_tls(queue))
++ queue->write_space(sk);
+ queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);
+ }
+ read_unlock_bh(&sk->sk_callback_lock);
+diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
+index 1bd5bf4a609793..b4b62b9ccc45a0 100644
+--- a/drivers/pci/controller/vmd.c
++++ b/drivers/pci/controller/vmd.c
+@@ -192,6 +192,12 @@ static void vmd_pci_msi_enable(struct irq_data *data)
+ data->chip->irq_unmask(data);
+ }
+
++static unsigned int vmd_pci_msi_startup(struct irq_data *data)
++{
++ vmd_pci_msi_enable(data);
++ return 0;
++}
++
+ static void vmd_irq_disable(struct irq_data *data)
+ {
+ struct vmd_irq *vmdirq = data->chip_data;
+@@ -210,6 +216,11 @@ static void vmd_pci_msi_disable(struct irq_data *data)
+ vmd_irq_disable(data->parent_data);
+ }
+
++static void vmd_pci_msi_shutdown(struct irq_data *data)
++{
++ vmd_pci_msi_disable(data);
++}
++
+ static struct irq_chip vmd_msi_controller = {
+ .name = "VMD-MSI",
+ .irq_compose_msi_msg = vmd_compose_msi_msg,
+@@ -309,6 +320,8 @@ static bool vmd_init_dev_msi_info(struct device *dev, struct irq_domain *domain,
+ if (!msi_lib_init_dev_msi_info(dev, domain, real_parent, info))
+ return false;
+
++ info->chip->irq_startup = vmd_pci_msi_startup;
++ info->chip->irq_shutdown = vmd_pci_msi_shutdown;
+ info->chip->irq_enable = vmd_pci_msi_enable;
+ info->chip->irq_disable = vmd_pci_msi_disable;
+ return true;
+diff --git a/drivers/phy/cadence/cdns-dphy.c b/drivers/phy/cadence/cdns-dphy.c
+index ed87a3970f8343..de5389374d79d0 100644
+--- a/drivers/phy/cadence/cdns-dphy.c
++++ b/drivers/phy/cadence/cdns-dphy.c
+@@ -30,6 +30,7 @@
+
+ #define DPHY_CMN_SSM DPHY_PMA_CMN(0x20)
+ #define DPHY_CMN_SSM_EN BIT(0)
++#define DPHY_CMN_SSM_CAL_WAIT_TIME GENMASK(8, 1)
+ #define DPHY_CMN_TX_MODE_EN BIT(9)
+
+ #define DPHY_CMN_PWM DPHY_PMA_CMN(0x40)
+@@ -79,6 +80,7 @@ struct cdns_dphy_cfg {
+ u8 pll_ipdiv;
+ u8 pll_opdiv;
+ u16 pll_fbdiv;
++ u32 hs_clk_rate;
+ unsigned int nlanes;
+ };
+
+@@ -99,6 +101,8 @@ struct cdns_dphy_ops {
+ void (*set_pll_cfg)(struct cdns_dphy *dphy,
+ const struct cdns_dphy_cfg *cfg);
+ unsigned long (*get_wakeup_time_ns)(struct cdns_dphy *dphy);
++ int (*wait_for_pll_lock)(struct cdns_dphy *dphy);
++ int (*wait_for_cmn_ready)(struct cdns_dphy *dphy);
+ };
+
+ struct cdns_dphy {
+@@ -108,6 +112,8 @@ struct cdns_dphy {
+ struct clk *pll_ref_clk;
+ const struct cdns_dphy_ops *ops;
+ struct phy *phy;
++ bool is_configured;
++ bool is_powered;
+ };
+
+ /* Order of bands is important since the index is the band number. */
+@@ -154,6 +160,9 @@ static int cdns_dsi_get_dphy_pll_cfg(struct cdns_dphy *dphy,
+ cfg->pll_ipdiv,
+ pll_ref_hz);
+
++ cfg->hs_clk_rate = div_u64((u64)pll_ref_hz * cfg->pll_fbdiv,
++ 2 * cfg->pll_opdiv * cfg->pll_ipdiv);
++
+ return 0;
+ }
+
+@@ -191,6 +200,16 @@ static unsigned long cdns_dphy_get_wakeup_time_ns(struct cdns_dphy *dphy)
+ return dphy->ops->get_wakeup_time_ns(dphy);
+ }
+
++static int cdns_dphy_wait_for_pll_lock(struct cdns_dphy *dphy)
++{
++ return dphy->ops->wait_for_pll_lock ? dphy->ops->wait_for_pll_lock(dphy) : 0;
++}
++
++static int cdns_dphy_wait_for_cmn_ready(struct cdns_dphy *dphy)
++{
++ return dphy->ops->wait_for_cmn_ready ? dphy->ops->wait_for_cmn_ready(dphy) : 0;
++}
++
+ static unsigned long cdns_dphy_ref_get_wakeup_time_ns(struct cdns_dphy *dphy)
+ {
+ /* Default wakeup time is 800 ns (in a simulated environment). */
+@@ -232,7 +251,6 @@ static unsigned long cdns_dphy_j721e_get_wakeup_time_ns(struct cdns_dphy *dphy)
+ static void cdns_dphy_j721e_set_pll_cfg(struct cdns_dphy *dphy,
+ const struct cdns_dphy_cfg *cfg)
+ {
+- u32 status;
+
+ /*
+ * set the PWM and PLL Byteclk divider settings to recommended values
+@@ -249,13 +267,6 @@ static void cdns_dphy_j721e_set_pll_cfg(struct cdns_dphy *dphy,
+
+ writel(DPHY_TX_J721E_WIZ_LANE_RSTB,
+ dphy->regs + DPHY_TX_J721E_WIZ_RST_CTRL);
+-
+- readl_poll_timeout(dphy->regs + DPHY_TX_J721E_WIZ_PLL_CTRL, status,
+- (status & DPHY_TX_WIZ_PLL_LOCK), 0, POLL_TIMEOUT_US);
+-
+- readl_poll_timeout(dphy->regs + DPHY_TX_J721E_WIZ_STATUS, status,
+- (status & DPHY_TX_WIZ_O_CMN_READY), 0,
+- POLL_TIMEOUT_US);
+ }
+
+ static void cdns_dphy_j721e_set_psm_div(struct cdns_dphy *dphy, u8 div)
+@@ -263,6 +274,23 @@ static void cdns_dphy_j721e_set_psm_div(struct cdns_dphy *dphy, u8 div)
+ writel(div, dphy->regs + DPHY_TX_J721E_WIZ_PSM_FREQ);
+ }
+
++static int cdns_dphy_j721e_wait_for_pll_lock(struct cdns_dphy *dphy)
++{
++ u32 status;
++
++ return readl_poll_timeout(dphy->regs + DPHY_TX_J721E_WIZ_PLL_CTRL, status,
++ status & DPHY_TX_WIZ_PLL_LOCK, 0, POLL_TIMEOUT_US);
++}
++
++static int cdns_dphy_j721e_wait_for_cmn_ready(struct cdns_dphy *dphy)
++{
++ u32 status;
++
++ return readl_poll_timeout(dphy->regs + DPHY_TX_J721E_WIZ_STATUS, status,
++ status & DPHY_TX_WIZ_O_CMN_READY, 0,
++ POLL_TIMEOUT_US);
++}
++
+ /*
+ * This is the reference implementation of DPHY hooks. Specific integration of
+ * this IP may have to re-implement some of them depending on how they decided
+@@ -278,6 +306,8 @@ static const struct cdns_dphy_ops j721e_dphy_ops = {
+ .get_wakeup_time_ns = cdns_dphy_j721e_get_wakeup_time_ns,
+ .set_pll_cfg = cdns_dphy_j721e_set_pll_cfg,
+ .set_psm_div = cdns_dphy_j721e_set_psm_div,
++ .wait_for_pll_lock = cdns_dphy_j721e_wait_for_pll_lock,
++ .wait_for_cmn_ready = cdns_dphy_j721e_wait_for_cmn_ready,
+ };
+
+ static int cdns_dphy_config_from_opts(struct phy *phy,
+@@ -297,6 +327,7 @@ static int cdns_dphy_config_from_opts(struct phy *phy,
+ if (ret)
+ return ret;
+
++ opts->hs_clk_rate = cfg->hs_clk_rate;
+ opts->wakeup = cdns_dphy_get_wakeup_time_ns(dphy) / 1000;
+
+ return 0;
+@@ -334,21 +365,36 @@ static int cdns_dphy_validate(struct phy *phy, enum phy_mode mode, int submode,
+ static int cdns_dphy_configure(struct phy *phy, union phy_configure_opts *opts)
+ {
+ struct cdns_dphy *dphy = phy_get_drvdata(phy);
+- struct cdns_dphy_cfg cfg = { 0 };
+- int ret, band_ctrl;
+- unsigned int reg;
++ int ret;
+
+- ret = cdns_dphy_config_from_opts(phy, &opts->mipi_dphy, &cfg);
+- if (ret)
+- return ret;
++ ret = cdns_dphy_config_from_opts(phy, &opts->mipi_dphy, &dphy->cfg);
++ if (!ret)
++ dphy->is_configured = true;
++
++ return ret;
++}
++
++static int cdns_dphy_power_on(struct phy *phy)
++{
++ struct cdns_dphy *dphy = phy_get_drvdata(phy);
++ int ret;
++ u32 reg;
++
++ if (!dphy->is_configured || dphy->is_powered)
++ return -EINVAL;
++
++ clk_prepare_enable(dphy->psm_clk);
++ clk_prepare_enable(dphy->pll_ref_clk);
+
+ /*
+ * Configure the internal PSM clk divider so that the DPHY has a
+ * 1MHz clk (or something close).
+ */
+ ret = cdns_dphy_setup_psm(dphy);
+- if (ret)
+- return ret;
++ if (ret) {
++ dev_err(&dphy->phy->dev, "Failed to setup PSM with error %d\n", ret);
++ goto err_power_on;
++ }
+
+ /*
+ * Configure attach clk lanes to data lanes: the DPHY has 2 clk lanes
+@@ -363,40 +409,61 @@ static int cdns_dphy_configure(struct phy *phy, union phy_configure_opts *opts)
+ * Configure the DPHY PLL that will be used to generate the TX byte
+ * clk.
+ */
+- cdns_dphy_set_pll_cfg(dphy, &cfg);
++ cdns_dphy_set_pll_cfg(dphy, &dphy->cfg);
+
+- band_ctrl = cdns_dphy_tx_get_band_ctrl(opts->mipi_dphy.hs_clk_rate);
+- if (band_ctrl < 0)
+- return band_ctrl;
++ ret = cdns_dphy_tx_get_band_ctrl(dphy->cfg.hs_clk_rate);
++ if (ret < 0) {
++ dev_err(&dphy->phy->dev, "Failed to get band control value with error %d\n", ret);
++ goto err_power_on;
++ }
+
+- reg = FIELD_PREP(DPHY_BAND_CFG_LEFT_BAND, band_ctrl) |
+- FIELD_PREP(DPHY_BAND_CFG_RIGHT_BAND, band_ctrl);
++ reg = FIELD_PREP(DPHY_BAND_CFG_LEFT_BAND, ret) |
++ FIELD_PREP(DPHY_BAND_CFG_RIGHT_BAND, ret);
+ writel(reg, dphy->regs + DPHY_BAND_CFG);
+
+- return 0;
+-}
++ /* Start TX state machine. */
++ reg = readl(dphy->regs + DPHY_CMN_SSM);
++ writel((reg & DPHY_CMN_SSM_CAL_WAIT_TIME) | DPHY_CMN_SSM_EN | DPHY_CMN_TX_MODE_EN,
++ dphy->regs + DPHY_CMN_SSM);
+
+-static int cdns_dphy_power_on(struct phy *phy)
+-{
+- struct cdns_dphy *dphy = phy_get_drvdata(phy);
++ ret = cdns_dphy_wait_for_pll_lock(dphy);
++ if (ret) {
++ dev_err(&dphy->phy->dev, "Failed to lock PLL with error %d\n", ret);
++ goto err_power_on;
++ }
+
+- clk_prepare_enable(dphy->psm_clk);
+- clk_prepare_enable(dphy->pll_ref_clk);
++ ret = cdns_dphy_wait_for_cmn_ready(dphy);
++ if (ret) {
++ dev_err(&dphy->phy->dev, "O_CMN_READY signal failed to assert with error %d\n",
++ ret);
++ goto err_power_on;
++ }
+
+- /* Start TX state machine. */
+- writel(DPHY_CMN_SSM_EN | DPHY_CMN_TX_MODE_EN,
+- dphy->regs + DPHY_CMN_SSM);
++ dphy->is_powered = true;
+
+ return 0;
++
++err_power_on:
++ clk_disable_unprepare(dphy->pll_ref_clk);
++ clk_disable_unprepare(dphy->psm_clk);
++
++ return ret;
+ }
+
+ static int cdns_dphy_power_off(struct phy *phy)
+ {
+ struct cdns_dphy *dphy = phy_get_drvdata(phy);
++ u32 reg;
+
+ clk_disable_unprepare(dphy->pll_ref_clk);
+ clk_disable_unprepare(dphy->psm_clk);
+
++ /* Stop TX state machine. */
++ reg = readl(dphy->regs + DPHY_CMN_SSM);
++ writel(reg & ~DPHY_CMN_SSM_EN, dphy->regs + DPHY_CMN_SSM);
++
++ dphy->is_powered = false;
++
+ return 0;
+ }
+
+diff --git a/drivers/usb/gadget/function/f_acm.c b/drivers/usb/gadget/function/f_acm.c
+index 7061720b9732e4..106046e17c4e11 100644
+--- a/drivers/usb/gadget/function/f_acm.c
++++ b/drivers/usb/gadget/function/f_acm.c
+@@ -11,12 +11,15 @@
+
+ /* #define VERBOSE_DEBUG */
+
++#include <linux/cleanup.h>
+ #include <linux/slab.h>
+ #include <linux/kernel.h>
+ #include <linux/module.h>
+ #include <linux/device.h>
+ #include <linux/err.h>
+
++#include <linux/usb/gadget.h>
++
+ #include "u_serial.h"
+
+
+@@ -613,6 +616,7 @@ acm_bind(struct usb_configuration *c, struct usb_function *f)
+ struct usb_string *us;
+ int status;
+ struct usb_ep *ep;
++ struct usb_request *request __free(free_usb_request) = NULL;
+
+ /* REVISIT might want instance-specific strings to help
+ * distinguish instances ...
+@@ -630,7 +634,7 @@ acm_bind(struct usb_configuration *c, struct usb_function *f)
+ /* allocate instance-specific interface IDs, and patch descriptors */
+ status = usb_interface_id(c, f);
+ if (status < 0)
+- goto fail;
++ return status;
+ acm->ctrl_id = status;
+ acm_iad_descriptor.bFirstInterface = status;
+
+@@ -639,43 +643,41 @@ acm_bind(struct usb_configuration *c, struct usb_function *f)
+
+ status = usb_interface_id(c, f);
+ if (status < 0)
+- goto fail;
++ return status;
+ acm->data_id = status;
+
+ acm_data_interface_desc.bInterfaceNumber = status;
+ acm_union_desc.bSlaveInterface0 = status;
+ acm_call_mgmt_descriptor.bDataInterface = status;
+
+- status = -ENODEV;
+-
+ /* allocate instance-specific endpoints */
+ ep = usb_ep_autoconfig(cdev->gadget, &acm_fs_in_desc);
+ if (!ep)
+- goto fail;
++ return -ENODEV;
+ acm->port.in = ep;
+
+ ep = usb_ep_autoconfig(cdev->gadget, &acm_fs_out_desc);
+ if (!ep)
+- goto fail;
++ return -ENODEV;
+ acm->port.out = ep;
+
+ ep = usb_ep_autoconfig(cdev->gadget, &acm_fs_notify_desc);
+ if (!ep)
+- goto fail;
++ return -ENODEV;
+ acm->notify = ep;
+
+ acm_iad_descriptor.bFunctionProtocol = acm->bInterfaceProtocol;
+ acm_control_interface_desc.bInterfaceProtocol = acm->bInterfaceProtocol;
+
+ /* allocate notification */
+- acm->notify_req = gs_alloc_req(ep,
+- sizeof(struct usb_cdc_notification) + 2,
+- GFP_KERNEL);
+- if (!acm->notify_req)
+- goto fail;
++ request = gs_alloc_req(ep,
++ sizeof(struct usb_cdc_notification) + 2,
++ GFP_KERNEL);
++ if (!request)
++ return -ENODEV;
+
+- acm->notify_req->complete = acm_cdc_notify_complete;
+- acm->notify_req->context = acm;
++ request->complete = acm_cdc_notify_complete;
++ request->context = acm;
+
+ /* support all relevant hardware speeds... we expect that when
+ * hardware is dual speed, all bulk-capable endpoints work at
+@@ -692,7 +694,9 @@ acm_bind(struct usb_configuration *c, struct usb_function *f)
+ status = usb_assign_descriptors(f, acm_fs_function, acm_hs_function,
+ acm_ss_function, acm_ss_function);
+ if (status)
+- goto fail;
++ return status;
++
++ acm->notify_req = no_free_ptr(request);
+
+ dev_dbg(&cdev->gadget->dev,
+ "acm ttyGS%d: IN/%s OUT/%s NOTIFY/%s\n",
+@@ -700,14 +704,6 @@ acm_bind(struct usb_configuration *c, struct usb_function *f)
+ acm->port.in->name, acm->port.out->name,
+ acm->notify->name);
+ return 0;
+-
+-fail:
+- if (acm->notify_req)
+- gs_free_req(acm->notify, acm->notify_req);
+-
+- ERROR(cdev, "%s/%p: can't bind, err %d\n", f->name, f, status);
+-
+- return status;
+ }
+
+ static void acm_unbind(struct usb_configuration *c, struct usb_function *f)
+diff --git a/drivers/usb/gadget/function/f_ecm.c b/drivers/usb/gadget/function/f_ecm.c
+index 027226325039f0..675d2bc538a457 100644
+--- a/drivers/usb/gadget/function/f_ecm.c
++++ b/drivers/usb/gadget/function/f_ecm.c
+@@ -8,6 +8,7 @@
+
+ /* #define VERBOSE_DEBUG */
+
++#include <linux/cleanup.h>
+ #include <linux/slab.h>
+ #include <linux/kernel.h>
+ #include <linux/module.h>
+@@ -15,6 +16,8 @@
+ #include <linux/etherdevice.h>
+ #include <linux/string_choices.h>
+
++#include <linux/usb/gadget.h>
++
+ #include "u_ether.h"
+ #include "u_ether_configfs.h"
+ #include "u_ecm.h"
+@@ -678,6 +681,7 @@ ecm_bind(struct usb_configuration *c, struct usb_function *f)
+ struct usb_ep *ep;
+
+ struct f_ecm_opts *ecm_opts;
++ struct usb_request *request __free(free_usb_request) = NULL;
+
+ if (!can_support_ecm(cdev->gadget))
+ return -EINVAL;
+@@ -711,7 +715,7 @@ ecm_bind(struct usb_configuration *c, struct usb_function *f)
+ /* allocate instance-specific interface IDs */
+ status = usb_interface_id(c, f);
+ if (status < 0)
+- goto fail;
++ return status;
+ ecm->ctrl_id = status;
+ ecm_iad_descriptor.bFirstInterface = status;
+
+@@ -720,24 +724,22 @@ ecm_bind(struct usb_configuration *c, struct usb_function *f)
+
+ status = usb_interface_id(c, f);
+ if (status < 0)
+- goto fail;
++ return status;
+ ecm->data_id = status;
+
+ ecm_data_nop_intf.bInterfaceNumber = status;
+ ecm_data_intf.bInterfaceNumber = status;
+ ecm_union_desc.bSlaveInterface0 = status;
+
+- status = -ENODEV;
+-
+ /* allocate instance-specific endpoints */
+ ep = usb_ep_autoconfig(cdev->gadget, &fs_ecm_in_desc);
+ if (!ep)
+- goto fail;
++ return -ENODEV;
+ ecm->port.in_ep = ep;
+
+ ep = usb_ep_autoconfig(cdev->gadget, &fs_ecm_out_desc);
+ if (!ep)
+- goto fail;
++ return -ENODEV;
+ ecm->port.out_ep = ep;
+
+ /* NOTE: a status/notification endpoint is *OPTIONAL* but we
+@@ -746,20 +748,18 @@ ecm_bind(struct usb_configuration *c, struct usb_function *f)
+ */
+ ep = usb_ep_autoconfig(cdev->gadget, &fs_ecm_notify_desc);
+ if (!ep)
+- goto fail;
++ return -ENODEV;
+ ecm->notify = ep;
+
+- status = -ENOMEM;
+-
+ /* allocate notification request and buffer */
+- ecm->notify_req = usb_ep_alloc_request(ep, GFP_KERNEL);
+- if (!ecm->notify_req)
+- goto fail;
+- ecm->notify_req->buf = kmalloc(ECM_STATUS_BYTECOUNT, GFP_KERNEL);
+- if (!ecm->notify_req->buf)
+- goto fail;
+- ecm->notify_req->context = ecm;
+- ecm->notify_req->complete = ecm_notify_complete;
++ request = usb_ep_alloc_request(ep, GFP_KERNEL);
++ if (!request)
++ return -ENOMEM;
++ request->buf = kmalloc(ECM_STATUS_BYTECOUNT, GFP_KERNEL);
++ if (!request->buf)
++ return -ENOMEM;
++ request->context = ecm;
++ request->complete = ecm_notify_complete;
+
+ /* support all relevant hardware speeds... we expect that when
+ * hardware is dual speed, all bulk-capable endpoints work at
+@@ -778,7 +778,7 @@ ecm_bind(struct usb_configuration *c, struct usb_function *f)
+ status = usb_assign_descriptors(f, ecm_fs_function, ecm_hs_function,
+ ecm_ss_function, ecm_ss_function);
+ if (status)
+- goto fail;
++ return status;
+
+ /* NOTE: all that is done without knowing or caring about
+ * the network link ... which is unavailable to this code
+@@ -788,20 +788,12 @@ ecm_bind(struct usb_configuration *c, struct usb_function *f)
+ ecm->port.open = ecm_open;
+ ecm->port.close = ecm_close;
+
++ ecm->notify_req = no_free_ptr(request);
++
+ DBG(cdev, "CDC Ethernet: IN/%s OUT/%s NOTIFY/%s\n",
+ ecm->port.in_ep->name, ecm->port.out_ep->name,
+ ecm->notify->name);
+ return 0;
+-
+-fail:
+- if (ecm->notify_req) {
+- kfree(ecm->notify_req->buf);
+- usb_ep_free_request(ecm->notify, ecm->notify_req);
+- }
+-
+- ERROR(cdev, "%s: can't bind, err %d\n", f->name, status);
+-
+- return status;
+ }
+
+ static inline struct f_ecm_opts *to_f_ecm_opts(struct config_item *item)
+diff --git a/drivers/usb/gadget/function/f_ncm.c b/drivers/usb/gadget/function/f_ncm.c
+index 58b0dd575af32a..0148d60926dcf7 100644
+--- a/drivers/usb/gadget/function/f_ncm.c
++++ b/drivers/usb/gadget/function/f_ncm.c
+@@ -11,6 +11,7 @@
+ * Copyright (C) 2008 Nokia Corporation
+ */
+
++#include <linux/cleanup.h>
+ #include <linux/kernel.h>
+ #include <linux/interrupt.h>
+ #include <linux/module.h>
+@@ -20,6 +21,7 @@
+ #include <linux/string_choices.h>
+
+ #include <linux/usb/cdc.h>
++#include <linux/usb/gadget.h>
+
+ #include "u_ether.h"
+ #include "u_ether_configfs.h"
+@@ -1436,18 +1438,18 @@ static int ncm_bind(struct usb_configuration *c, struct usb_function *f)
+ struct usb_ep *ep;
+ struct f_ncm_opts *ncm_opts;
+
++ struct usb_os_desc_table *os_desc_table __free(kfree) = NULL;
++ struct usb_request *request __free(free_usb_request) = NULL;
++
+ if (!can_support_ecm(cdev->gadget))
+ return -EINVAL;
+
+ ncm_opts = container_of(f->fi, struct f_ncm_opts, func_inst);
+
+ if (cdev->use_os_string) {
+- f->os_desc_table = kzalloc(sizeof(*f->os_desc_table),
+- GFP_KERNEL);
+- if (!f->os_desc_table)
++ os_desc_table = kzalloc(sizeof(*os_desc_table), GFP_KERNEL);
++ if (!os_desc_table)
+ return -ENOMEM;
+- f->os_desc_n = 1;
+- f->os_desc_table[0].os_desc = &ncm_opts->ncm_os_desc;
+ }
+
+ mutex_lock(&ncm_opts->lock);
+@@ -1459,16 +1461,15 @@ static int ncm_bind(struct usb_configuration *c, struct usb_function *f)
+ mutex_unlock(&ncm_opts->lock);
+
+ if (status)
+- goto fail;
++ return status;
+
+ ncm_opts->bound = true;
+
+ us = usb_gstrings_attach(cdev, ncm_strings,
+ ARRAY_SIZE(ncm_string_defs));
+- if (IS_ERR(us)) {
+- status = PTR_ERR(us);
+- goto fail;
+- }
++ if (IS_ERR(us))
++ return PTR_ERR(us);
++
+ ncm_control_intf.iInterface = us[STRING_CTRL_IDX].id;
+ ncm_data_nop_intf.iInterface = us[STRING_DATA_IDX].id;
+ ncm_data_intf.iInterface = us[STRING_DATA_IDX].id;
+@@ -1478,20 +1479,16 @@ static int ncm_bind(struct usb_configuration *c, struct usb_function *f)
+ /* allocate instance-specific interface IDs */
+ status = usb_interface_id(c, f);
+ if (status < 0)
+- goto fail;
++ return status;
+ ncm->ctrl_id = status;
+ ncm_iad_desc.bFirstInterface = status;
+
+ ncm_control_intf.bInterfaceNumber = status;
+ ncm_union_desc.bMasterInterface0 = status;
+
+- if (cdev->use_os_string)
+- f->os_desc_table[0].if_id =
+- ncm_iad_desc.bFirstInterface;
+-
+ status = usb_interface_id(c, f);
+ if (status < 0)
+- goto fail;
++ return status;
+ ncm->data_id = status;
+
+ ncm_data_nop_intf.bInterfaceNumber = status;
+@@ -1500,35 +1497,31 @@ static int ncm_bind(struct usb_configuration *c, struct usb_function *f)
+
+ ecm_desc.wMaxSegmentSize = cpu_to_le16(ncm_opts->max_segment_size);
+
+- status = -ENODEV;
+-
+ /* allocate instance-specific endpoints */
+ ep = usb_ep_autoconfig(cdev->gadget, &fs_ncm_in_desc);
+ if (!ep)
+- goto fail;
++ return -ENODEV;
+ ncm->port.in_ep = ep;
+
+ ep = usb_ep_autoconfig(cdev->gadget, &fs_ncm_out_desc);
+ if (!ep)
+- goto fail;
++ return -ENODEV;
+ ncm->port.out_ep = ep;
+
+ ep = usb_ep_autoconfig(cdev->gadget, &fs_ncm_notify_desc);
+ if (!ep)
+- goto fail;
++ return -ENODEV;
+ ncm->notify = ep;
+
+- status = -ENOMEM;
+-
+ /* allocate notification request and buffer */
+- ncm->notify_req = usb_ep_alloc_request(ep, GFP_KERNEL);
+- if (!ncm->notify_req)
+- goto fail;
+- ncm->notify_req->buf = kmalloc(NCM_STATUS_BYTECOUNT, GFP_KERNEL);
+- if (!ncm->notify_req->buf)
+- goto fail;
+- ncm->notify_req->context = ncm;
+- ncm->notify_req->complete = ncm_notify_complete;
++ request = usb_ep_alloc_request(ep, GFP_KERNEL);
++ if (!request)
++ return -ENOMEM;
++ request->buf = kmalloc(NCM_STATUS_BYTECOUNT, GFP_KERNEL);
++ if (!request->buf)
++ return -ENOMEM;
++ request->context = ncm;
++ request->complete = ncm_notify_complete;
+
+ /*
+ * support all relevant hardware speeds... we expect that when
+@@ -1548,7 +1541,7 @@ static int ncm_bind(struct usb_configuration *c, struct usb_function *f)
+ status = usb_assign_descriptors(f, ncm_fs_function, ncm_hs_function,
+ ncm_ss_function, ncm_ss_function);
+ if (status)
+- goto fail;
++ return status;
+
+ /*
+ * NOTE: all that is done without knowing or caring about
+@@ -1561,23 +1554,18 @@ static int ncm_bind(struct usb_configuration *c, struct usb_function *f)
+
+ hrtimer_setup(&ncm->task_timer, ncm_tx_timeout, CLOCK_MONOTONIC, HRTIMER_MODE_REL_SOFT);
+
++ if (cdev->use_os_string) {
++ os_desc_table[0].os_desc = &ncm_opts->ncm_os_desc;
++ os_desc_table[0].if_id = ncm_iad_desc.bFirstInterface;
++ f->os_desc_table = no_free_ptr(os_desc_table);
++ f->os_desc_n = 1;
++ }
++ ncm->notify_req = no_free_ptr(request);
++
+ DBG(cdev, "CDC Network: IN/%s OUT/%s NOTIFY/%s\n",
+ ncm->port.in_ep->name, ncm->port.out_ep->name,
+ ncm->notify->name);
+ return 0;
+-
+-fail:
+- kfree(f->os_desc_table);
+- f->os_desc_n = 0;
+-
+- if (ncm->notify_req) {
+- kfree(ncm->notify_req->buf);
+- usb_ep_free_request(ncm->notify, ncm->notify_req);
+- }
+-
+- ERROR(cdev, "%s: can't bind, err %d\n", f->name, status);
+-
+- return status;
+ }
+
+ static inline struct f_ncm_opts *to_f_ncm_opts(struct config_item *item)
+diff --git a/drivers/usb/gadget/function/f_rndis.c b/drivers/usb/gadget/function/f_rndis.c
+index 7cec19d65fb534..7451e7cb7a8523 100644
+--- a/drivers/usb/gadget/function/f_rndis.c
++++ b/drivers/usb/gadget/function/f_rndis.c
+@@ -19,6 +19,8 @@
+
+ #include <linux/atomic.h>
+
++#include <linux/usb/gadget.h>
++
+ #include "u_ether.h"
+ #include "u_ether_configfs.h"
+ #include "u_rndis.h"
+@@ -662,6 +664,8 @@ rndis_bind(struct usb_configuration *c, struct usb_function *f)
+ struct usb_ep *ep;
+
+ struct f_rndis_opts *rndis_opts;
++ struct usb_os_desc_table *os_desc_table __free(kfree) = NULL;
++ struct usb_request *request __free(free_usb_request) = NULL;
+
+ if (!can_support_rndis(c))
+ return -EINVAL;
+@@ -669,12 +673,9 @@ rndis_bind(struct usb_configuration *c, struct usb_function *f)
+ rndis_opts = container_of(f->fi, struct f_rndis_opts, func_inst);
+
+ if (cdev->use_os_string) {
+- f->os_desc_table = kzalloc(sizeof(*f->os_desc_table),
+- GFP_KERNEL);
+- if (!f->os_desc_table)
++ os_desc_table = kzalloc(sizeof(*os_desc_table), GFP_KERNEL);
++ if (!os_desc_table)
+ return -ENOMEM;
+- f->os_desc_n = 1;
+- f->os_desc_table[0].os_desc = &rndis_opts->rndis_os_desc;
+ }
+
+ rndis_iad_descriptor.bFunctionClass = rndis_opts->class;
+@@ -692,16 +693,14 @@ rndis_bind(struct usb_configuration *c, struct usb_function *f)
+ gether_set_gadget(rndis_opts->net, cdev->gadget);
+ status = gether_register_netdev(rndis_opts->net);
+ if (status)
+- goto fail;
++ return status;
+ rndis_opts->bound = true;
+ }
+
+ us = usb_gstrings_attach(cdev, rndis_strings,
+ ARRAY_SIZE(rndis_string_defs));
+- if (IS_ERR(us)) {
+- status = PTR_ERR(us);
+- goto fail;
+- }
++ if (IS_ERR(us))
++ return PTR_ERR(us);
+ rndis_control_intf.iInterface = us[0].id;
+ rndis_data_intf.iInterface = us[1].id;
+ rndis_iad_descriptor.iFunction = us[2].id;
+@@ -709,36 +708,30 @@ rndis_bind(struct usb_configuration *c, struct usb_function *f)
+ /* allocate instance-specific interface IDs */
+ status = usb_interface_id(c, f);
+ if (status < 0)
+- goto fail;
++ return status;
+ rndis->ctrl_id = status;
+ rndis_iad_descriptor.bFirstInterface = status;
+
+ rndis_control_intf.bInterfaceNumber = status;
+ rndis_union_desc.bMasterInterface0 = status;
+
+- if (cdev->use_os_string)
+- f->os_desc_table[0].if_id =
+- rndis_iad_descriptor.bFirstInterface;
+-
+ status = usb_interface_id(c, f);
+ if (status < 0)
+- goto fail;
++ return status;
+ rndis->data_id = status;
+
+ rndis_data_intf.bInterfaceNumber = status;
+ rndis_union_desc.bSlaveInterface0 = status;
+
+- status = -ENODEV;
+-
+ /* allocate instance-specific endpoints */
+ ep = usb_ep_autoconfig(cdev->gadget, &fs_in_desc);
+ if (!ep)
+- goto fail;
++ return -ENODEV;
+ rndis->port.in_ep = ep;
+
+ ep = usb_ep_autoconfig(cdev->gadget, &fs_out_desc);
+ if (!ep)
+- goto fail;
++ return -ENODEV;
+ rndis->port.out_ep = ep;
+
+ /* NOTE: a status/notification endpoint is, strictly speaking,
+@@ -747,21 +740,19 @@ rndis_bind(struct usb_configuration *c, struct usb_function *f)
+ */
+ ep = usb_ep_autoconfig(cdev->gadget, &fs_notify_desc);
+ if (!ep)
+- goto fail;
++ return -ENODEV;
+ rndis->notify = ep;
+
+- status = -ENOMEM;
+-
+ /* allocate notification request and buffer */
+- rndis->notify_req = usb_ep_alloc_request(ep, GFP_KERNEL);
+- if (!rndis->notify_req)
+- goto fail;
+- rndis->notify_req->buf = kmalloc(STATUS_BYTECOUNT, GFP_KERNEL);
+- if (!rndis->notify_req->buf)
+- goto fail;
+- rndis->notify_req->length = STATUS_BYTECOUNT;
+- rndis->notify_req->context = rndis;
+- rndis->notify_req->complete = rndis_response_complete;
++ request = usb_ep_alloc_request(ep, GFP_KERNEL);
++ if (!request)
++ return -ENOMEM;
++ request->buf = kmalloc(STATUS_BYTECOUNT, GFP_KERNEL);
++ if (!request->buf)
++ return -ENOMEM;
++ request->length = STATUS_BYTECOUNT;
++ request->context = rndis;
++ request->complete = rndis_response_complete;
+
+ /* support all relevant hardware speeds... we expect that when
+ * hardware is dual speed, all bulk-capable endpoints work at
+@@ -778,7 +769,7 @@ rndis_bind(struct usb_configuration *c, struct usb_function *f)
+ status = usb_assign_descriptors(f, eth_fs_function, eth_hs_function,
+ eth_ss_function, eth_ss_function);
+ if (status)
+- goto fail;
++ return status;
+
+ rndis->port.open = rndis_open;
+ rndis->port.close = rndis_close;
+@@ -789,9 +780,18 @@ rndis_bind(struct usb_configuration *c, struct usb_function *f)
+ if (rndis->manufacturer && rndis->vendorID &&
+ rndis_set_param_vendor(rndis->params, rndis->vendorID,
+ rndis->manufacturer)) {
+- status = -EINVAL;
+- goto fail_free_descs;
++ usb_free_all_descriptors(f);
++ return -EINVAL;
++ }
++
++ if (cdev->use_os_string) {
++ os_desc_table[0].os_desc = &rndis_opts->rndis_os_desc;
++ os_desc_table[0].if_id = rndis_iad_descriptor.bFirstInterface;
++ f->os_desc_table = no_free_ptr(os_desc_table);
++ f->os_desc_n = 1;
++
+ }
++ rndis->notify_req = no_free_ptr(request);
+
+ /* NOTE: all that is done without knowing or caring about
+ * the network link ... which is unavailable to this code
+@@ -802,21 +802,6 @@ rndis_bind(struct usb_configuration *c, struct usb_function *f)
+ rndis->port.in_ep->name, rndis->port.out_ep->name,
+ rndis->notify->name);
+ return 0;
+-
+-fail_free_descs:
+- usb_free_all_descriptors(f);
+-fail:
+- kfree(f->os_desc_table);
+- f->os_desc_n = 0;
+-
+- if (rndis->notify_req) {
+- kfree(rndis->notify_req->buf);
+- usb_ep_free_request(rndis->notify, rndis->notify_req);
+- }
+-
+- ERROR(cdev, "%s: can't bind, err %d\n", f->name, status);
+-
+- return status;
+ }
+
+ void rndis_borrow_net(struct usb_function_instance *f, struct net_device *net)
+diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c
+index d709e24c1fd422..e3d63b8fa0f4c1 100644
+--- a/drivers/usb/gadget/udc/core.c
++++ b/drivers/usb/gadget/udc/core.c
+@@ -194,6 +194,9 @@ struct usb_request *usb_ep_alloc_request(struct usb_ep *ep,
+
+ req = ep->ops->alloc_request(ep, gfp_flags);
+
++ if (req)
++ req->ep = ep;
++
+ trace_usb_ep_alloc_request(ep, req, req ? 0 : -ENOMEM);
+
+ return req;
+diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
+index e6d2557ac37b09..a1566df45be917 100644
+--- a/fs/btrfs/extent_io.c
++++ b/fs/btrfs/extent_io.c
+@@ -914,7 +914,7 @@ static void btrfs_readahead_expand(struct readahead_control *ractl,
+ {
+ const u64 ra_pos = readahead_pos(ractl);
+ const u64 ra_end = ra_pos + readahead_length(ractl);
+- const u64 em_end = em->start + em->ram_bytes;
++ const u64 em_end = em->start + em->len;
+
+ /* No expansion for holes and inline extents. */
+ if (em->disk_bytenr > EXTENT_MAP_LAST_BYTE)
+diff --git a/fs/btrfs/free-space-tree.c b/fs/btrfs/free-space-tree.c
+index eba7f22ae49c67..a29c2ac60aef6b 100644
+--- a/fs/btrfs/free-space-tree.c
++++ b/fs/btrfs/free-space-tree.c
+@@ -1106,14 +1106,15 @@ static int populate_free_space_tree(struct btrfs_trans_handle *trans,
+ * If ret is 1 (no key found), it means this is an empty block group,
+ * without any extents allocated from it and there's no block group
+ * item (key BTRFS_BLOCK_GROUP_ITEM_KEY) located in the extent tree
+- * because we are using the block group tree feature, so block group
+- * items are stored in the block group tree. It also means there are no
+- * extents allocated for block groups with a start offset beyond this
+- * block group's end offset (this is the last, highest, block group).
++ * because we are using the block group tree feature (so block group
++ * items are stored in the block group tree) or this is a new block
++ * group created in the current transaction and its block group item
++ * was not yet inserted in the extent tree (that happens in
++ * btrfs_create_pending_block_groups() -> insert_block_group_item()).
++ * It also means there are no extents allocated for block groups with a
++ * start offset beyond this block group's end offset (this is the last,
++ * highest, block group).
+ */
+- if (!btrfs_fs_compat_ro(trans->fs_info, BLOCK_GROUP_TREE))
+- ASSERT(ret == 0);
+-
+ start = block_group->start;
+ end = block_group->start + block_group->length;
+ while (ret == 0) {
+diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
+index 7e13de2bdcbfab..155e11d2faa806 100644
+--- a/fs/btrfs/ioctl.c
++++ b/fs/btrfs/ioctl.c
+@@ -3740,7 +3740,7 @@ static long btrfs_ioctl_qgroup_assign(struct file *file, void __user *arg)
+ prealloc = kzalloc(sizeof(*prealloc), GFP_KERNEL);
+ if (!prealloc) {
+ ret = -ENOMEM;
+- goto drop_write;
++ goto out;
+ }
+ }
+
+diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
+index 7256f6748c8f92..63baae5383e1d0 100644
+--- a/fs/btrfs/relocation.c
++++ b/fs/btrfs/relocation.c
+@@ -3795,6 +3795,7 @@ static noinline_for_stack struct inode *create_reloc_inode(
+ /*
+ * Mark start of chunk relocation that is cancellable. Check if the cancellation
+ * has been requested meanwhile and don't start in that case.
++ * NOTE: if this returns an error, reloc_chunk_end() must not be called.
+ *
+ * Return:
+ * 0 success
+@@ -3811,10 +3812,8 @@ static int reloc_chunk_start(struct btrfs_fs_info *fs_info)
+
+ if (atomic_read(&fs_info->reloc_cancel_req) > 0) {
+ btrfs_info(fs_info, "chunk relocation canceled on start");
+- /*
+- * On cancel, clear all requests but let the caller mark
+- * the end after cleanup operations.
+- */
++ /* On cancel, clear all requests. */
++ clear_and_wake_up_bit(BTRFS_FS_RELOC_RUNNING, &fs_info->flags);
+ atomic_set(&fs_info->reloc_cancel_req, 0);
+ return -ECANCELED;
+ }
+@@ -3823,9 +3822,11 @@ static int reloc_chunk_start(struct btrfs_fs_info *fs_info)
+
+ /*
+ * Mark end of chunk relocation that is cancellable and wake any waiters.
++ * NOTE: call only if a previous call to reloc_chunk_start() succeeded.
+ */
+ static void reloc_chunk_end(struct btrfs_fs_info *fs_info)
+ {
++ ASSERT(test_bit(BTRFS_FS_RELOC_RUNNING, &fs_info->flags));
+ /* Requested after start, clear bit first so any waiters can continue */
+ if (atomic_read(&fs_info->reloc_cancel_req) > 0)
+ btrfs_info(fs_info, "chunk relocation canceled during operation");
+@@ -4038,9 +4039,9 @@ int btrfs_relocate_block_group(struct btrfs_fs_info *fs_info, u64 group_start,
+ if (err && rw)
+ btrfs_dec_block_group_ro(rc->block_group);
+ iput(rc->data_inode);
++ reloc_chunk_end(fs_info);
+ out_put_bg:
+ btrfs_put_block_group(bg);
+- reloc_chunk_end(fs_info);
+ free_reloc_control(rc);
+ return err;
+ }
+@@ -4223,8 +4224,8 @@ int btrfs_recover_relocation(struct btrfs_fs_info *fs_info)
+ ret = ret2;
+ out_unset:
+ unset_reloc_control(rc);
+-out_end:
+ reloc_chunk_end(fs_info);
++out_end:
+ free_reloc_control(rc);
+ out:
+ free_reloc_roots(&reloc_roots);
+diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
+index b06b8f32553781..fcc7ecbb4945f8 100644
+--- a/fs/btrfs/super.c
++++ b/fs/btrfs/super.c
+@@ -1902,8 +1902,6 @@ static int btrfs_get_tree_super(struct fs_context *fc)
+ return PTR_ERR(sb);
+ }
+
+- set_device_specific_options(fs_info);
+-
+ if (sb->s_root) {
+ /*
+ * Not the first mount of the fs thus got an existing super block.
+@@ -1948,6 +1946,7 @@ static int btrfs_get_tree_super(struct fs_context *fc)
+ deactivate_locked_super(sb);
+ return -EACCES;
+ }
++ set_device_specific_options(fs_info);
+ bdev = fs_devices->latest_dev->bdev;
+ snprintf(sb->s_id, sizeof(sb->s_id), "%pg", bdev);
+ shrinker_debugfs_rename(sb->s_shrink, "sb-btrfs:%s", sb->s_id);
+diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
+index f426276e2b6bfe..87c5dd3ad016e4 100644
+--- a/fs/btrfs/zoned.c
++++ b/fs/btrfs/zoned.c
+@@ -1753,7 +1753,7 @@ int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new)
+ !fs_info->stripe_root) {
+ btrfs_err(fs_info, "zoned: data %s needs raid-stripe-tree",
+ btrfs_bg_type_to_raid_name(map->type));
+- return -EINVAL;
++ ret = -EINVAL;
+ }
+
+ if (cache->alloc_offset > cache->zone_capacity) {
+diff --git a/fs/coredump.c b/fs/coredump.c
+index 60bc9685e14985..c5e9a855502dd0 100644
+--- a/fs/coredump.c
++++ b/fs/coredump.c
+@@ -1466,7 +1466,7 @@ static int proc_dostring_coredump(const struct ctl_table *table, int write,
+ ssize_t retval;
+ char old_core_pattern[CORENAME_MAX_SIZE];
+
+- if (write)
++ if (!write)
+ return proc_dostring(table, write, buffer, lenp, ppos);
+
+ retval = strscpy(old_core_pattern, core_pattern, CORENAME_MAX_SIZE);
+diff --git a/fs/dax.c b/fs/dax.c
+index 20ecf652c129d1..260e063e3bc2d8 100644
+--- a/fs/dax.c
++++ b/fs/dax.c
+@@ -1752,7 +1752,7 @@ dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
+ if (iov_iter_rw(iter) == WRITE) {
+ lockdep_assert_held_write(&iomi.inode->i_rwsem);
+ iomi.flags |= IOMAP_WRITE;
+- } else {
++ } else if (!sb_rdonly(iomi.inode->i_sb)) {
+ lockdep_assert_held(&iomi.inode->i_rwsem);
+ }
+
+diff --git a/fs/dcache.c b/fs/dcache.c
+index 60046ae23d5148..c11d87810fba12 100644
+--- a/fs/dcache.c
++++ b/fs/dcache.c
+@@ -2557,6 +2557,8 @@ struct dentry *d_alloc_parallel(struct dentry *parent,
+ spin_lock(&parent->d_lock);
+ new->d_parent = dget_dlock(parent);
+ hlist_add_head(&new->d_sib, &parent->d_children);
++ if (parent->d_flags & DCACHE_DISCONNECTED)
++ new->d_flags |= DCACHE_DISCONNECTED;
+ spin_unlock(&parent->d_lock);
+
+ retry:
+diff --git a/fs/exec.c b/fs/exec.c
+index e861a4b7ffda92..a69a2673f63113 100644
+--- a/fs/exec.c
++++ b/fs/exec.c
+@@ -2048,7 +2048,7 @@ static int proc_dointvec_minmax_coredump(const struct ctl_table *table, int writ
+ {
+ int error = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+
+- if (!error && !write)
++ if (!error && write)
+ validate_coredump_safety();
+ return error;
+ }
+diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
+index b3e9b7bd797879..a0e66bc1009308 100644
+--- a/fs/ext4/ext4_jbd2.c
++++ b/fs/ext4/ext4_jbd2.c
+@@ -280,9 +280,16 @@ int __ext4_forget(const char *where, unsigned int line, handle_t *handle,
+ bh, is_metadata, inode->i_mode,
+ test_opt(inode->i_sb, DATA_FLAGS));
+
+- /* In the no journal case, we can just do a bforget and return */
++ /*
++ * In the no journal case, we should wait for any ongoing buffer I/O
++ * to complete and then do a forget.
++ */
+ if (!ext4_handle_valid(handle)) {
+- bforget(bh);
++ if (bh) {
++ clear_buffer_dirty(bh);
++ wait_on_buffer(bh);
++ __bforget(bh);
++ }
+ return 0;
+ }
+
+diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
+index f9e4ac87211ec1..e99306a8f47ce7 100644
+--- a/fs/ext4/inode.c
++++ b/fs/ext4/inode.c
+@@ -5319,6 +5319,14 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
+ }
+ ei->i_flags = le32_to_cpu(raw_inode->i_flags);
+ ext4_set_inode_flags(inode, true);
++ /* Detect invalid flag combination - can't have both inline data and extents */
++ if (ext4_test_inode_flag(inode, EXT4_INODE_INLINE_DATA) &&
++ ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)) {
++ ext4_error_inode(inode, function, line, 0,
++ "inode has both inline data and extents flags");
++ ret = -EFSCORRUPTED;
++ goto bad_inode;
++ }
+ inode->i_blocks = ext4_inode_blocks(raw_inode, ei);
+ ei->i_file_acl = le32_to_cpu(raw_inode->i_file_acl_lo);
+ if (ext4_has_feature_64bit(sb))
+diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
+index 50c90bd0392357..7b891b8f0a8dca 100644
+--- a/fs/f2fs/data.c
++++ b/fs/f2fs/data.c
+@@ -1504,8 +1504,8 @@ static bool f2fs_map_blocks_cached(struct inode *inode,
+ struct f2fs_dev_info *dev = &sbi->devs[bidx];
+
+ map->m_bdev = dev->bdev;
+- map->m_pblk -= dev->start_blk;
+ map->m_len = min(map->m_len, dev->end_blk + 1 - map->m_pblk);
++ map->m_pblk -= dev->start_blk;
+ } else {
+ map->m_bdev = inode->i_sb->s_bdev;
+ }
+diff --git a/fs/file_attr.c b/fs/file_attr.c
+index 12424d4945d0a3..460b2dd21a8528 100644
+--- a/fs/file_attr.c
++++ b/fs/file_attr.c
+@@ -84,7 +84,7 @@ int vfs_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
+ int error;
+
+ if (!inode->i_op->fileattr_get)
+- return -EOPNOTSUPP;
++ return -ENOIOCTLCMD;
+
+ error = security_inode_file_getattr(dentry, fa);
+ if (error)
+@@ -270,7 +270,7 @@ int vfs_fileattr_set(struct mnt_idmap *idmap, struct dentry *dentry,
+ int err;
+
+ if (!inode->i_op->fileattr_set)
+- return -EOPNOTSUPP;
++ return -ENOIOCTLCMD;
+
+ if (!inode_owner_or_capable(idmap, inode))
+ return -EPERM;
+@@ -312,8 +312,6 @@ int ioctl_getflags(struct file *file, unsigned int __user *argp)
+ int err;
+
+ err = vfs_fileattr_get(file->f_path.dentry, &fa);
+- if (err == -EOPNOTSUPP)
+- err = -ENOIOCTLCMD;
+ if (!err)
+ err = put_user(fa.flags, argp);
+ return err;
+@@ -335,8 +333,6 @@ int ioctl_setflags(struct file *file, unsigned int __user *argp)
+ fileattr_fill_flags(&fa, flags);
+ err = vfs_fileattr_set(idmap, dentry, &fa);
+ mnt_drop_write_file(file);
+- if (err == -EOPNOTSUPP)
+- err = -ENOIOCTLCMD;
+ }
+ }
+ return err;
+@@ -349,8 +345,6 @@ int ioctl_fsgetxattr(struct file *file, void __user *argp)
+ int err;
+
+ err = vfs_fileattr_get(file->f_path.dentry, &fa);
+- if (err == -EOPNOTSUPP)
+- err = -ENOIOCTLCMD;
+ if (!err)
+ err = copy_fsxattr_to_user(&fa, argp);
+
+@@ -371,8 +365,6 @@ int ioctl_fssetxattr(struct file *file, void __user *argp)
+ if (!err) {
+ err = vfs_fileattr_set(idmap, dentry, &fa);
+ mnt_drop_write_file(file);
+- if (err == -EOPNOTSUPP)
+- err = -ENOIOCTLCMD;
+ }
+ }
+ return err;
+diff --git a/fs/fuse/ioctl.c b/fs/fuse/ioctl.c
+index 57032eadca6c27..fdc175e93f7474 100644
+--- a/fs/fuse/ioctl.c
++++ b/fs/fuse/ioctl.c
+@@ -536,8 +536,6 @@ int fuse_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
+ cleanup:
+ fuse_priv_ioctl_cleanup(inode, ff);
+
+- if (err == -ENOTTY)
+- err = -EOPNOTSUPP;
+ return err;
+ }
+
+@@ -574,7 +572,5 @@ int fuse_fileattr_set(struct mnt_idmap *idmap,
+ cleanup:
+ fuse_priv_ioctl_cleanup(inode, ff);
+
+- if (err == -ENOTTY)
+- err = -EOPNOTSUPP;
+ return err;
+ }
+diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
+index 862ba27f1628a8..11e08a4a18b295 100644
+--- a/fs/hfsplus/unicode.c
++++ b/fs/hfsplus/unicode.c
+@@ -40,6 +40,18 @@ int hfsplus_strcasecmp(const struct hfsplus_unistr *s1,
+ p1 = s1->unicode;
+ p2 = s2->unicode;
+
++ if (len1 > HFSPLUS_MAX_STRLEN) {
++ len1 = HFSPLUS_MAX_STRLEN;
++ pr_err("invalid length %u has been corrected to %d\n",
++ be16_to_cpu(s1->length), len1);
++ }
++
++ if (len2 > HFSPLUS_MAX_STRLEN) {
++ len2 = HFSPLUS_MAX_STRLEN;
++ pr_err("invalid length %u has been corrected to %d\n",
++ be16_to_cpu(s2->length), len2);
++ }
++
+ while (1) {
+ c1 = c2 = 0;
+
+@@ -74,6 +86,18 @@ int hfsplus_strcmp(const struct hfsplus_unistr *s1,
+ p1 = s1->unicode;
+ p2 = s2->unicode;
+
++ if (len1 > HFSPLUS_MAX_STRLEN) {
++ len1 = HFSPLUS_MAX_STRLEN;
++ pr_err("invalid length %u has been corrected to %d\n",
++ be16_to_cpu(s1->length), len1);
++ }
++
++ if (len2 > HFSPLUS_MAX_STRLEN) {
++ len2 = HFSPLUS_MAX_STRLEN;
++ pr_err("invalid length %u has been corrected to %d\n",
++ be16_to_cpu(s2->length), len2);
++ }
++
+ for (len = min(len1, len2); len > 0; len--) {
+ c1 = be16_to_cpu(*p1);
+ c2 = be16_to_cpu(*p2);
+diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
+index c7867139af69dd..3e510564de6ee8 100644
+--- a/fs/jbd2/transaction.c
++++ b/fs/jbd2/transaction.c
+@@ -1659,6 +1659,7 @@ int jbd2_journal_forget(handle_t *handle, struct buffer_head *bh)
+ int drop_reserve = 0;
+ int err = 0;
+ int was_modified = 0;
++ int wait_for_writeback = 0;
+
+ if (is_handle_aborted(handle))
+ return -EROFS;
+@@ -1782,18 +1783,22 @@ int jbd2_journal_forget(handle_t *handle, struct buffer_head *bh)
+ }
+
+ /*
+- * The buffer is still not written to disk, we should
+- * attach this buffer to current transaction so that the
+- * buffer can be checkpointed only after the current
+- * transaction commits.
++ * The buffer has not yet been written to disk. We should
++ * either clear the buffer or ensure that the ongoing I/O
++ * is completed, and attach this buffer to the current
++ * transaction so that the buffer can be checkpointed only
++ * after the current transaction commits.
+ */
+ clear_buffer_dirty(bh);
++ wait_for_writeback = 1;
+ __jbd2_journal_file_buffer(jh, transaction, BJ_Forget);
+ spin_unlock(&journal->j_list_lock);
+ }
+ drop:
+ __brelse(bh);
+ spin_unlock(&jh->b_state_lock);
++ if (wait_for_writeback)
++ wait_on_buffer(bh);
+ jbd2_journal_put_journal_head(jh);
+ if (drop_reserve) {
+ /* no need to reserve log space for this block -bzzz */
+diff --git a/fs/nfsd/blocklayout.c b/fs/nfsd/blocklayout.c
+index 19078a043e85c5..0822d8a119c6fa 100644
+--- a/fs/nfsd/blocklayout.c
++++ b/fs/nfsd/blocklayout.c
+@@ -118,7 +118,6 @@ nfsd4_block_commit_blocks(struct inode *inode, struct nfsd4_layoutcommit *lcp,
+ struct iomap *iomaps, int nr_iomaps)
+ {
+ struct timespec64 mtime = inode_get_mtime(inode);
+- loff_t new_size = lcp->lc_last_wr + 1;
+ struct iattr iattr = { .ia_valid = 0 };
+ int error;
+
+@@ -128,9 +127,9 @@ nfsd4_block_commit_blocks(struct inode *inode, struct nfsd4_layoutcommit *lcp,
+ iattr.ia_valid |= ATTR_ATIME | ATTR_CTIME | ATTR_MTIME;
+ iattr.ia_atime = iattr.ia_ctime = iattr.ia_mtime = lcp->lc_mtime;
+
+- if (new_size > i_size_read(inode)) {
++ if (lcp->lc_size_chg) {
+ iattr.ia_valid |= ATTR_SIZE;
+- iattr.ia_size = new_size;
++ iattr.ia_size = lcp->lc_newsize;
+ }
+
+ error = inode->i_sb->s_export_op->commit_blocks(inode, iomaps,
+@@ -173,16 +172,18 @@ nfsd4_block_proc_getdeviceinfo(struct super_block *sb,
+ }
+
+ static __be32
+-nfsd4_block_proc_layoutcommit(struct inode *inode,
++nfsd4_block_proc_layoutcommit(struct inode *inode, struct svc_rqst *rqstp,
+ struct nfsd4_layoutcommit *lcp)
+ {
+ struct iomap *iomaps;
+ int nr_iomaps;
+ __be32 nfserr;
+
+- nfserr = nfsd4_block_decode_layoutupdate(lcp->lc_up_layout,
+- lcp->lc_up_len, &iomaps, &nr_iomaps,
+- i_blocksize(inode));
++ rqstp->rq_arg = lcp->lc_up_layout;
++ svcxdr_init_decode(rqstp);
++
++ nfserr = nfsd4_block_decode_layoutupdate(&rqstp->rq_arg_stream,
++ &iomaps, &nr_iomaps, i_blocksize(inode));
+ if (nfserr != nfs_ok)
+ return nfserr;
+
+@@ -313,16 +314,18 @@ nfsd4_scsi_proc_getdeviceinfo(struct super_block *sb,
+ return nfserrno(nfsd4_block_get_device_info_scsi(sb, clp, gdp));
+ }
+ static __be32
+-nfsd4_scsi_proc_layoutcommit(struct inode *inode,
++nfsd4_scsi_proc_layoutcommit(struct inode *inode, struct svc_rqst *rqstp,
+ struct nfsd4_layoutcommit *lcp)
+ {
+ struct iomap *iomaps;
+ int nr_iomaps;
+ __be32 nfserr;
+
+- nfserr = nfsd4_scsi_decode_layoutupdate(lcp->lc_up_layout,
+- lcp->lc_up_len, &iomaps, &nr_iomaps,
+- i_blocksize(inode));
++ rqstp->rq_arg = lcp->lc_up_layout;
++ svcxdr_init_decode(rqstp);
++
++ nfserr = nfsd4_scsi_decode_layoutupdate(&rqstp->rq_arg_stream,
++ &iomaps, &nr_iomaps, i_blocksize(inode));
+ if (nfserr != nfs_ok)
+ return nfserr;
+
+diff --git a/fs/nfsd/blocklayoutxdr.c b/fs/nfsd/blocklayoutxdr.c
+index bcf21fde912077..e50afe34073719 100644
+--- a/fs/nfsd/blocklayoutxdr.c
++++ b/fs/nfsd/blocklayoutxdr.c
+@@ -29,8 +29,7 @@ nfsd4_block_encode_layoutget(struct xdr_stream *xdr,
+ *p++ = cpu_to_be32(len);
+ *p++ = cpu_to_be32(1); /* we always return a single extent */
+
+- p = xdr_encode_opaque_fixed(p, &b->vol_id,
+- sizeof(struct nfsd4_deviceid));
++ p = svcxdr_encode_deviceid4(p, &b->vol_id);
+ p = xdr_encode_hyper(p, b->foff);
+ p = xdr_encode_hyper(p, b->len);
+ p = xdr_encode_hyper(p, b->soff);
+@@ -114,8 +113,7 @@ nfsd4_block_encode_getdeviceinfo(struct xdr_stream *xdr,
+
+ /**
+ * nfsd4_block_decode_layoutupdate - decode the block layout extent array
+- * @p: pointer to the xdr data
+- * @len: number of bytes to decode
++ * @xdr: subbuf set to the encoded array
+ * @iomapp: pointer to store the decoded extent array
+ * @nr_iomapsp: pointer to store the number of extents
+ * @block_size: alignment of extent offset and length
+@@ -128,25 +126,24 @@ nfsd4_block_encode_getdeviceinfo(struct xdr_stream *xdr,
+ *
+ * Return values:
+ * %nfs_ok: Successful decoding, @iomapp and @nr_iomapsp are valid
+- * %nfserr_bad_xdr: The encoded array in @p is invalid
++ * %nfserr_bad_xdr: The encoded array in @xdr is invalid
+ * %nfserr_inval: An unaligned extent found
+ * %nfserr_delay: Failed to allocate memory for @iomapp
+ */
+ __be32
+-nfsd4_block_decode_layoutupdate(__be32 *p, u32 len, struct iomap **iomapp,
++nfsd4_block_decode_layoutupdate(struct xdr_stream *xdr, struct iomap **iomapp,
+ int *nr_iomapsp, u32 block_size)
+ {
+ struct iomap *iomaps;
+- u32 nr_iomaps, i;
++ u32 nr_iomaps, expected, len, i;
++ __be32 nfserr;
+
+- if (len < sizeof(u32))
+- return nfserr_bad_xdr;
+- len -= sizeof(u32);
+- if (len % PNFS_BLOCK_EXTENT_SIZE)
++ if (xdr_stream_decode_u32(xdr, &nr_iomaps))
+ return nfserr_bad_xdr;
+
+- nr_iomaps = be32_to_cpup(p++);
+- if (nr_iomaps != len / PNFS_BLOCK_EXTENT_SIZE)
++ len = sizeof(__be32) + xdr_stream_remaining(xdr);
++ expected = sizeof(__be32) + nr_iomaps * PNFS_BLOCK_EXTENT_SIZE;
++ if (len != expected)
+ return nfserr_bad_xdr;
+
+ iomaps = kcalloc(nr_iomaps, sizeof(*iomaps), GFP_KERNEL);
+@@ -156,23 +153,44 @@ nfsd4_block_decode_layoutupdate(__be32 *p, u32 len, struct iomap **iomapp,
+ for (i = 0; i < nr_iomaps; i++) {
+ struct pnfs_block_extent bex;
+
+- memcpy(&bex.vol_id, p, sizeof(struct nfsd4_deviceid));
+- p += XDR_QUADLEN(sizeof(struct nfsd4_deviceid));
++ if (nfsd4_decode_deviceid4(xdr, &bex.vol_id)) {
++ nfserr = nfserr_bad_xdr;
++ goto fail;
++ }
+
+- p = xdr_decode_hyper(p, &bex.foff);
++ if (xdr_stream_decode_u64(xdr, &bex.foff)) {
++ nfserr = nfserr_bad_xdr;
++ goto fail;
++ }
+ if (bex.foff & (block_size - 1)) {
++ nfserr = nfserr_inval;
++ goto fail;
++ }
++
++ if (xdr_stream_decode_u64(xdr, &bex.len)) {
++ nfserr = nfserr_bad_xdr;
+ goto fail;
+ }
+- p = xdr_decode_hyper(p, &bex.len);
+ if (bex.len & (block_size - 1)) {
++ nfserr = nfserr_inval;
++ goto fail;
++ }
++
++ if (xdr_stream_decode_u64(xdr, &bex.soff)) {
++ nfserr = nfserr_bad_xdr;
+ goto fail;
+ }
+- p = xdr_decode_hyper(p, &bex.soff);
+ if (bex.soff & (block_size - 1)) {
++ nfserr = nfserr_inval;
++ goto fail;
++ }
++
++ if (xdr_stream_decode_u32(xdr, &bex.es)) {
++ nfserr = nfserr_bad_xdr;
+ goto fail;
+ }
+- bex.es = be32_to_cpup(p++);
+ if (bex.es != PNFS_BLOCK_READWRITE_DATA) {
++ nfserr = nfserr_inval;
+ goto fail;
+ }
+
+@@ -185,13 +203,12 @@ nfsd4_block_decode_layoutupdate(__be32 *p, u32 len, struct iomap **iomapp,
+ return nfs_ok;
+ fail:
+ kfree(iomaps);
+- return nfserr_inval;
++ return nfserr;
+ }
+
+ /**
+ * nfsd4_scsi_decode_layoutupdate - decode the scsi layout extent array
+- * @p: pointer to the xdr data
+- * @len: number of bytes to decode
++ * @xdr: subbuf set to the encoded array
+ * @iomapp: pointer to store the decoded extent array
+ * @nr_iomapsp: pointer to store the number of extents
+ * @block_size: alignment of extent offset and length
+@@ -203,21 +220,22 @@ nfsd4_block_decode_layoutupdate(__be32 *p, u32 len, struct iomap **iomapp,
+ *
+ * Return values:
+ * %nfs_ok: Successful decoding, @iomapp and @nr_iomapsp are valid
+- * %nfserr_bad_xdr: The encoded array in @p is invalid
++ * %nfserr_bad_xdr: The encoded array in @xdr is invalid
+ * %nfserr_inval: An unaligned extent found
+ * %nfserr_delay: Failed to allocate memory for @iomapp
+ */
+ __be32
+-nfsd4_scsi_decode_layoutupdate(__be32 *p, u32 len, struct iomap **iomapp,
++nfsd4_scsi_decode_layoutupdate(struct xdr_stream *xdr, struct iomap **iomapp,
+ int *nr_iomapsp, u32 block_size)
+ {
+ struct iomap *iomaps;
+- u32 nr_iomaps, expected, i;
++ u32 nr_iomaps, expected, len, i;
++ __be32 nfserr;
+
+- if (len < sizeof(u32))
++ if (xdr_stream_decode_u32(xdr, &nr_iomaps))
+ return nfserr_bad_xdr;
+
+- nr_iomaps = be32_to_cpup(p++);
++ len = sizeof(__be32) + xdr_stream_remaining(xdr);
+ expected = sizeof(__be32) + nr_iomaps * PNFS_SCSI_RANGE_SIZE;
+ if (len != expected)
+ return nfserr_bad_xdr;
+@@ -229,14 +247,22 @@ nfsd4_scsi_decode_layoutupdate(__be32 *p, u32 len, struct iomap **iomapp,
+ for (i = 0; i < nr_iomaps; i++) {
+ u64 val;
+
+- p = xdr_decode_hyper(p, &val);
++ if (xdr_stream_decode_u64(xdr, &val)) {
++ nfserr = nfserr_bad_xdr;
++ goto fail;
++ }
+ if (val & (block_size - 1)) {
++ nfserr = nfserr_inval;
+ goto fail;
+ }
+ iomaps[i].offset = val;
+
+- p = xdr_decode_hyper(p, &val);
++ if (xdr_stream_decode_u64(xdr, &val)) {
++ nfserr = nfserr_bad_xdr;
++ goto fail;
++ }
+ if (val & (block_size - 1)) {
++ nfserr = nfserr_inval;
+ goto fail;
+ }
+ iomaps[i].length = val;
+@@ -247,5 +273,5 @@ nfsd4_scsi_decode_layoutupdate(__be32 *p, u32 len, struct iomap **iomapp,
+ return nfs_ok;
+ fail:
+ kfree(iomaps);
+- return nfserr_inval;
++ return nfserr;
+ }
+diff --git a/fs/nfsd/blocklayoutxdr.h b/fs/nfsd/blocklayoutxdr.h
+index 15b3569f3d9ad3..7d25ef689671f7 100644
+--- a/fs/nfsd/blocklayoutxdr.h
++++ b/fs/nfsd/blocklayoutxdr.h
+@@ -54,9 +54,9 @@ __be32 nfsd4_block_encode_getdeviceinfo(struct xdr_stream *xdr,
+ const struct nfsd4_getdeviceinfo *gdp);
+ __be32 nfsd4_block_encode_layoutget(struct xdr_stream *xdr,
+ const struct nfsd4_layoutget *lgp);
+-__be32 nfsd4_block_decode_layoutupdate(__be32 *p, u32 len,
++__be32 nfsd4_block_decode_layoutupdate(struct xdr_stream *xdr,
+ struct iomap **iomapp, int *nr_iomapsp, u32 block_size);
+-__be32 nfsd4_scsi_decode_layoutupdate(__be32 *p, u32 len,
++__be32 nfsd4_scsi_decode_layoutupdate(struct xdr_stream *xdr,
+ struct iomap **iomapp, int *nr_iomapsp, u32 block_size);
+
+ #endif /* _NFSD_BLOCKLAYOUTXDR_H */
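The block and SCSI layoutupdate decoders above now pull the opaque layout data out of an xdr_stream and validate the advertised entry count against the bytes actually present before allocating anything. A minimal sketch of that pattern, using the generic sunrpc xdr_stream helpers with an invented struct and function name (illustration only, not the nfsd code itself):

#include <linux/slab.h>
#include <linux/sunrpc/xdr.h>

/* Invented container for the decoded entries (illustration only). */
struct demo_extent {
	u64 offset;
	u64 length;
};

/* Decode a count followed by "count" (offset, length) pairs. */
static int demo_decode_extents(struct xdr_stream *xdr,
			       struct demo_extent **extentsp, u32 *nrp)
{
	struct demo_extent *extents;
	u32 nr, i;

	if (xdr_stream_decode_u32(xdr, &nr))
		return -EINVAL;
	/* Every advertised entry needs two 64-bit words in the stream. */
	if ((u64)xdr_stream_remaining(xdr) != (u64)nr * 2 * sizeof(__be64))
		return -EINVAL;

	extents = kcalloc(nr, sizeof(*extents), GFP_KERNEL);
	if (!extents)
		return -ENOMEM;

	for (i = 0; i < nr; i++) {
		if (xdr_stream_decode_u64(xdr, &extents[i].offset) ||
		    xdr_stream_decode_u64(xdr, &extents[i].length)) {
			kfree(extents);
			return -EINVAL;
		}
	}

	*extentsp = extents;
	*nrp = nr;
	return 0;
}
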
+diff --git a/fs/nfsd/flexfilelayout.c b/fs/nfsd/flexfilelayout.c
+index 3ca5304440ff0a..3c4419da5e24c2 100644
+--- a/fs/nfsd/flexfilelayout.c
++++ b/fs/nfsd/flexfilelayout.c
+@@ -125,6 +125,13 @@ nfsd4_ff_proc_getdeviceinfo(struct super_block *sb, struct svc_rqst *rqstp,
+ return 0;
+ }
+
++static __be32
++nfsd4_ff_proc_layoutcommit(struct inode *inode, struct svc_rqst *rqstp,
++ struct nfsd4_layoutcommit *lcp)
++{
++ return nfs_ok;
++}
++
+ const struct nfsd4_layout_ops ff_layout_ops = {
+ .notify_types =
+ NOTIFY_DEVICEID4_DELETE | NOTIFY_DEVICEID4_CHANGE,
+@@ -133,4 +140,5 @@ const struct nfsd4_layout_ops ff_layout_ops = {
+ .encode_getdeviceinfo = nfsd4_ff_encode_getdeviceinfo,
+ .proc_layoutget = nfsd4_ff_proc_layoutget,
+ .encode_layoutget = nfsd4_ff_encode_layoutget,
++ .proc_layoutcommit = nfsd4_ff_proc_layoutcommit,
+ };
+diff --git a/fs/nfsd/flexfilelayoutxdr.c b/fs/nfsd/flexfilelayoutxdr.c
+index aeb71c10ff1b96..f9f7e38cba13fb 100644
+--- a/fs/nfsd/flexfilelayoutxdr.c
++++ b/fs/nfsd/flexfilelayoutxdr.c
+@@ -54,8 +54,7 @@ nfsd4_ff_encode_layoutget(struct xdr_stream *xdr,
+ *p++ = cpu_to_be32(1); /* single mirror */
+ *p++ = cpu_to_be32(1); /* single data server */
+
+- p = xdr_encode_opaque_fixed(p, &fl->deviceid,
+- sizeof(struct nfsd4_deviceid));
++ p = svcxdr_encode_deviceid4(p, &fl->deviceid);
+
+ *p++ = cpu_to_be32(1); /* efficiency */
+
+diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c
+index aea905fcaf87aa..683bd1130afe29 100644
+--- a/fs/nfsd/nfs4layouts.c
++++ b/fs/nfsd/nfs4layouts.c
+@@ -120,7 +120,6 @@ nfsd4_set_deviceid(struct nfsd4_deviceid *id, const struct svc_fh *fhp,
+
+ id->fsid_idx = fhp->fh_export->ex_devid_map->idx;
+ id->generation = device_generation;
+- id->pad = 0;
+ return 0;
+ }
+
+diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
+index 75abdd7c6ef84b..7ae8e885d7530c 100644
+--- a/fs/nfsd/nfs4proc.c
++++ b/fs/nfsd/nfs4proc.c
+@@ -2504,7 +2504,6 @@ nfsd4_layoutcommit(struct svc_rqst *rqstp,
+ const struct nfsd4_layout_seg *seg = &lcp->lc_seg;
+ struct svc_fh *current_fh = &cstate->current_fh;
+ const struct nfsd4_layout_ops *ops;
+- loff_t new_size = lcp->lc_last_wr + 1;
+ struct inode *inode;
+ struct nfs4_layout_stateid *ls;
+ __be32 nfserr;
+@@ -2520,18 +2519,20 @@ nfsd4_layoutcommit(struct svc_rqst *rqstp,
+ goto out;
+ inode = d_inode(current_fh->fh_dentry);
+
+- nfserr = nfserr_inval;
+- if (new_size <= seg->offset) {
+- dprintk("pnfsd: last write before layout segment\n");
+- goto out;
+- }
+- if (new_size > seg->offset + seg->length) {
+- dprintk("pnfsd: last write beyond layout segment\n");
+- goto out;
+- }
+- if (!lcp->lc_newoffset && new_size > i_size_read(inode)) {
+- dprintk("pnfsd: layoutcommit beyond EOF\n");
+- goto out;
++ lcp->lc_size_chg = false;
++ if (lcp->lc_newoffset) {
++ loff_t new_size = lcp->lc_last_wr + 1;
++
++ nfserr = nfserr_inval;
++ if (new_size <= seg->offset)
++ goto out;
++ if (new_size > seg->offset + seg->length)
++ goto out;
++
++ if (new_size > i_size_read(inode)) {
++ lcp->lc_size_chg = true;
++ lcp->lc_newsize = new_size;
++ }
+ }
+
+ nfserr = nfsd4_preprocess_layout_stateid(rqstp, cstate, &lcp->lc_sid,
+@@ -2548,14 +2549,7 @@ nfsd4_layoutcommit(struct svc_rqst *rqstp,
+ /* LAYOUTCOMMIT does not require any serialization */
+ mutex_unlock(&ls->ls_mutex);
+
+- if (new_size > i_size_read(inode)) {
+- lcp->lc_size_chg = true;
+- lcp->lc_newsize = new_size;
+- } else {
+- lcp->lc_size_chg = false;
+- }
+-
+- nfserr = ops->proc_layoutcommit(inode, lcp);
++ nfserr = ops->proc_layoutcommit(inode, rqstp, lcp);
+ nfs4_put_stid(&ls->ls_stid);
+ out:
+ return nfserr;
+diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
+index a00300b2877541..89cc970effbcea 100644
+--- a/fs/nfsd/nfs4xdr.c
++++ b/fs/nfsd/nfs4xdr.c
+@@ -588,23 +588,13 @@ nfsd4_decode_state_owner4(struct nfsd4_compoundargs *argp,
+ }
+
+ #ifdef CONFIG_NFSD_PNFS
+-static __be32
+-nfsd4_decode_deviceid4(struct nfsd4_compoundargs *argp,
+- struct nfsd4_deviceid *devid)
+-{
+- __be32 *p;
+-
+- p = xdr_inline_decode(argp->xdr, NFS4_DEVICEID4_SIZE);
+- if (!p)
+- return nfserr_bad_xdr;
+- memcpy(devid, p, sizeof(*devid));
+- return nfs_ok;
+-}
+
+ static __be32
+ nfsd4_decode_layoutupdate4(struct nfsd4_compoundargs *argp,
+ struct nfsd4_layoutcommit *lcp)
+ {
++ u32 len;
++
+ if (xdr_stream_decode_u32(argp->xdr, &lcp->lc_layout_type) < 0)
+ return nfserr_bad_xdr;
+ if (lcp->lc_layout_type < LAYOUT_NFSV4_1_FILES)
+@@ -612,13 +602,10 @@ nfsd4_decode_layoutupdate4(struct nfsd4_compoundargs *argp,
+ if (lcp->lc_layout_type >= LAYOUT_TYPE_MAX)
+ return nfserr_bad_xdr;
+
+- if (xdr_stream_decode_u32(argp->xdr, &lcp->lc_up_len) < 0)
++ if (xdr_stream_decode_u32(argp->xdr, &len) < 0)
++ return nfserr_bad_xdr;
++ if (!xdr_stream_subsegment(argp->xdr, &lcp->lc_up_layout, len))
+ return nfserr_bad_xdr;
+- if (lcp->lc_up_len > 0) {
+- lcp->lc_up_layout = xdr_inline_decode(argp->xdr, lcp->lc_up_len);
+- if (!lcp->lc_up_layout)
+- return nfserr_bad_xdr;
+- }
+
+ return nfs_ok;
+ }
+@@ -1784,7 +1771,7 @@ nfsd4_decode_getdeviceinfo(struct nfsd4_compoundargs *argp,
+ __be32 status;
+
+ memset(gdev, 0, sizeof(*gdev));
+- status = nfsd4_decode_deviceid4(argp, &gdev->gd_devid);
++ status = nfsd4_decode_deviceid4(argp->xdr, &gdev->gd_devid);
+ if (status)
+ return status;
+ if (xdr_stream_decode_u32(argp->xdr, &gdev->gd_layout_type) < 0)
+diff --git a/fs/nfsd/pnfs.h b/fs/nfsd/pnfs.h
+index 925817f669176c..dfd411d1f363fd 100644
+--- a/fs/nfsd/pnfs.h
++++ b/fs/nfsd/pnfs.h
+@@ -35,6 +35,7 @@ struct nfsd4_layout_ops {
+ const struct nfsd4_layoutget *lgp);
+
+ __be32 (*proc_layoutcommit)(struct inode *inode,
++ struct svc_rqst *rqstp,
+ struct nfsd4_layoutcommit *lcp);
+
+ void (*fence_client)(struct nfs4_layout_stateid *ls,
+diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
+index a23bc56051caf5..d4b48602b2b0c3 100644
+--- a/fs/nfsd/xdr4.h
++++ b/fs/nfsd/xdr4.h
+@@ -595,9 +595,43 @@ struct nfsd4_reclaim_complete {
+ struct nfsd4_deviceid {
+ u64 fsid_idx;
+ u32 generation;
+- u32 pad;
+ };
+
++static inline __be32 *
++svcxdr_encode_deviceid4(__be32 *p, const struct nfsd4_deviceid *devid)
++{
++ __be64 *q = (__be64 *)p;
++
++ *q = (__force __be64)devid->fsid_idx;
++ p += 2;
++ *p++ = (__force __be32)devid->generation;
++ *p++ = xdr_zero;
++ return p;
++}
++
++static inline __be32 *
++svcxdr_decode_deviceid4(__be32 *p, struct nfsd4_deviceid *devid)
++{
++ __be64 *q = (__be64 *)p;
++
++ devid->fsid_idx = (__force u64)(*q);
++ p += 2;
++ devid->generation = (__force u32)(*p++);
++ p++; /* NFSD does not use the remaining octets */
++ return p;
++}
++
++static inline __be32
++nfsd4_decode_deviceid4(struct xdr_stream *xdr, struct nfsd4_deviceid *devid)
++{
++ __be32 *p = xdr_inline_decode(xdr, NFS4_DEVICEID4_SIZE);
++
++ if (unlikely(!p))
++ return nfserr_bad_xdr;
++ svcxdr_decode_deviceid4(p, devid);
++ return nfs_ok;
++}
++
+ struct nfsd4_layout_seg {
+ u32 iomode;
+ u64 offset;
+@@ -630,8 +664,7 @@ struct nfsd4_layoutcommit {
+ u64 lc_last_wr; /* request */
+ struct timespec64 lc_mtime; /* request */
+ u32 lc_layout_type; /* request */
+- u32 lc_up_len; /* layout length */
+- void *lc_up_layout; /* decoded by callback */
++ struct xdr_buf lc_up_layout; /* decoded by callback */
+ bool lc_size_chg; /* response */
+ u64 lc_newsize; /* response */
+ };
+diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
+index 27396fe63f6d5b..20c92ea580937e 100644
+--- a/fs/overlayfs/copy_up.c
++++ b/fs/overlayfs/copy_up.c
+@@ -178,7 +178,7 @@ static int ovl_copy_fileattr(struct inode *inode, const struct path *old,
+ err = ovl_real_fileattr_get(old, &oldfa);
+ if (err) {
+ /* Ntfs-3g returns -EINVAL for "no fileattr support" */
+- if (err == -EOPNOTSUPP || err == -EINVAL)
++ if (err == -ENOTTY || err == -EINVAL)
+ return 0;
+ pr_warn("failed to retrieve lower fileattr (%pd2, err=%i)\n",
+ old->dentry, err);
+diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
+index ecb9f2019395ec..d4722e1b83bc68 100644
+--- a/fs/overlayfs/inode.c
++++ b/fs/overlayfs/inode.c
+@@ -720,7 +720,10 @@ int ovl_real_fileattr_get(const struct path *realpath, struct file_kattr *fa)
+ if (err)
+ return err;
+
+- return vfs_fileattr_get(realpath->dentry, fa);
++ err = vfs_fileattr_get(realpath->dentry, fa);
++ if (err == -ENOIOCTLCMD)
++ err = -ENOTTY;
++ return err;
+ }
+
+ int ovl_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
+diff --git a/fs/smb/client/inode.c b/fs/smb/client/inode.c
+index 0f0d2dae6283ad..6b41360631f96e 100644
+--- a/fs/smb/client/inode.c
++++ b/fs/smb/client/inode.c
+@@ -2431,8 +2431,10 @@ cifs_do_rename(const unsigned int xid, struct dentry *from_dentry,
+ tcon = tlink_tcon(tlink);
+ server = tcon->ses->server;
+
+- if (!server->ops->rename)
+- return -ENOSYS;
++ if (!server->ops->rename) {
++ rc = -ENOSYS;
++ goto do_rename_exit;
++ }
+
+ /* try path-based rename first */
+ rc = server->ops->rename(xid, tcon, from_dentry,
+diff --git a/fs/smb/client/misc.c b/fs/smb/client/misc.c
+index dda6dece802ad2..e10123d8cd7d93 100644
+--- a/fs/smb/client/misc.c
++++ b/fs/smb/client/misc.c
+@@ -916,6 +916,14 @@ parse_dfs_referrals(struct get_dfs_referral_rsp *rsp, u32 rsp_size,
+ char *data_end;
+ struct dfs_referral_level_3 *ref;
+
++ if (rsp_size < sizeof(*rsp)) {
++ cifs_dbg(VFS | ONCE,
++ "%s: header is malformed (size is %u, must be %zu)\n",
++ __func__, rsp_size, sizeof(*rsp));
++ rc = -EINVAL;
++ goto parse_DFS_referrals_exit;
++ }
++
+ *num_of_nodes = le16_to_cpu(rsp->NumberOfReferrals);
+
+ if (*num_of_nodes < 1) {
+@@ -925,6 +933,15 @@ parse_dfs_referrals(struct get_dfs_referral_rsp *rsp, u32 rsp_size,
+ goto parse_DFS_referrals_exit;
+ }
+
++ if (sizeof(*rsp) + *num_of_nodes * sizeof(REFERRAL3) > rsp_size) {
++ cifs_dbg(VFS | ONCE,
++ "%s: malformed buffer (size is %u, must be at least %zu)\n",
++ __func__, rsp_size,
++ sizeof(*rsp) + *num_of_nodes * sizeof(REFERRAL3));
++ rc = -EINVAL;
++ goto parse_DFS_referrals_exit;
++ }
++
+ ref = (struct dfs_referral_level_3 *) &(rsp->referrals);
+ if (ref->VersionNumber != cpu_to_le16(3)) {
+ cifs_dbg(VFS, "Referrals of V%d version are not supported, should be V3\n",
+diff --git a/fs/smb/client/smb2ops.c b/fs/smb/client/smb2ops.c
+index 328fdeecae29a6..7e86d0ef4b35a3 100644
+--- a/fs/smb/client/smb2ops.c
++++ b/fs/smb/client/smb2ops.c
+@@ -3129,8 +3129,7 @@ get_smb2_acl_by_path(struct cifs_sb_info *cifs_sb,
+ utf16_path = cifs_convert_path_to_utf16(path, cifs_sb);
+ if (!utf16_path) {
+ rc = -ENOMEM;
+- free_xid(xid);
+- return ERR_PTR(rc);
++ goto put_tlink;
+ }
+
+ oparms = (struct cifs_open_parms) {
+@@ -3162,6 +3161,7 @@ get_smb2_acl_by_path(struct cifs_sb_info *cifs_sb,
+ SMB2_close(xid, tcon, fid.persistent_fid, fid.volatile_fid);
+ }
+
++put_tlink:
+ cifs_put_tlink(tlink);
+ free_xid(xid);
+
+@@ -3202,8 +3202,7 @@ set_smb2_acl(struct smb_ntsd *pnntsd, __u32 acllen,
+ utf16_path = cifs_convert_path_to_utf16(path, cifs_sb);
+ if (!utf16_path) {
+ rc = -ENOMEM;
+- free_xid(xid);
+- return rc;
++ goto put_tlink;
+ }
+
+ oparms = (struct cifs_open_parms) {
+@@ -3224,6 +3223,7 @@ set_smb2_acl(struct smb_ntsd *pnntsd, __u32 acllen,
+ SMB2_close(xid, tcon, fid.persistent_fid, fid.volatile_fid);
+ }
+
++put_tlink:
+ cifs_put_tlink(tlink);
+ free_xid(xid);
+ return rc;
+diff --git a/fs/smb/server/mgmt/user_session.c b/fs/smb/server/mgmt/user_session.c
+index b36d0676dbe584..00805aed0b07d9 100644
+--- a/fs/smb/server/mgmt/user_session.c
++++ b/fs/smb/server/mgmt/user_session.c
+@@ -147,14 +147,11 @@ void ksmbd_session_rpc_close(struct ksmbd_session *sess, int id)
+ int ksmbd_session_rpc_method(struct ksmbd_session *sess, int id)
+ {
+ struct ksmbd_session_rpc *entry;
+- int method;
+
+- down_read(&sess->rpc_lock);
++ lockdep_assert_held(&sess->rpc_lock);
+ entry = xa_load(&sess->rpc_handle_list, id);
+- method = entry ? entry->method : 0;
+- up_read(&sess->rpc_lock);
+
+- return method;
++ return entry ? entry->method : 0;
+ }
+
+ void ksmbd_session_destroy(struct ksmbd_session *sess)
+diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c
+index a1db006ab6e924..287200d7c07644 100644
+--- a/fs/smb/server/smb2pdu.c
++++ b/fs/smb/server/smb2pdu.c
+@@ -4624,8 +4624,15 @@ static int smb2_get_info_file_pipe(struct ksmbd_session *sess,
+ * pipe without opening it, checking error condition here
+ */
+ id = req->VolatileFileId;
+- if (!ksmbd_session_rpc_method(sess, id))
++
++ lockdep_assert_not_held(&sess->rpc_lock);
++
++ down_read(&sess->rpc_lock);
++ if (!ksmbd_session_rpc_method(sess, id)) {
++ up_read(&sess->rpc_lock);
+ return -ENOENT;
++ }
++ up_read(&sess->rpc_lock);
+
+ ksmbd_debug(SMB, "FileInfoClass %u, FileId 0x%llx\n",
+ req->FileInfoClass, req->VolatileFileId);
+diff --git a/fs/smb/server/transport_ipc.c b/fs/smb/server/transport_ipc.c
+index 2aa1b29bea0804..46f87fd1ce1cd8 100644
+--- a/fs/smb/server/transport_ipc.c
++++ b/fs/smb/server/transport_ipc.c
+@@ -825,6 +825,9 @@ struct ksmbd_rpc_command *ksmbd_rpc_write(struct ksmbd_session *sess, int handle
+ if (!msg)
+ return NULL;
+
++ lockdep_assert_not_held(&sess->rpc_lock);
++
++ down_read(&sess->rpc_lock);
+ msg->type = KSMBD_EVENT_RPC_REQUEST;
+ req = (struct ksmbd_rpc_command *)msg->payload;
+ req->handle = handle;
+@@ -833,6 +836,7 @@ struct ksmbd_rpc_command *ksmbd_rpc_write(struct ksmbd_session *sess, int handle
+ req->flags |= KSMBD_RPC_WRITE_METHOD;
+ req->payload_sz = payload_sz;
+ memcpy(req->payload, payload, payload_sz);
++ up_read(&sess->rpc_lock);
+
+ resp = ipc_msg_send_request(msg, req->handle);
+ ipc_msg_free(msg);
+@@ -849,6 +853,9 @@ struct ksmbd_rpc_command *ksmbd_rpc_read(struct ksmbd_session *sess, int handle)
+ if (!msg)
+ return NULL;
+
++ lockdep_assert_not_held(&sess->rpc_lock);
++
++ down_read(&sess->rpc_lock);
+ msg->type = KSMBD_EVENT_RPC_REQUEST;
+ req = (struct ksmbd_rpc_command *)msg->payload;
+ req->handle = handle;
+@@ -856,6 +863,7 @@ struct ksmbd_rpc_command *ksmbd_rpc_read(struct ksmbd_session *sess, int handle)
+ req->flags |= rpc_context_flags(sess);
+ req->flags |= KSMBD_RPC_READ_METHOD;
+ req->payload_sz = 0;
++ up_read(&sess->rpc_lock);
+
+ resp = ipc_msg_send_request(msg, req->handle);
+ ipc_msg_free(msg);
+@@ -876,6 +884,9 @@ struct ksmbd_rpc_command *ksmbd_rpc_ioctl(struct ksmbd_session *sess, int handle
+ if (!msg)
+ return NULL;
+
++ lockdep_assert_not_held(&sess->rpc_lock);
++
++ down_read(&sess->rpc_lock);
+ msg->type = KSMBD_EVENT_RPC_REQUEST;
+ req = (struct ksmbd_rpc_command *)msg->payload;
+ req->handle = handle;
+@@ -884,6 +895,7 @@ struct ksmbd_rpc_command *ksmbd_rpc_ioctl(struct ksmbd_session *sess, int handle
+ req->flags |= KSMBD_RPC_IOCTL_METHOD;
+ req->payload_sz = payload_sz;
+ memcpy(req->payload, payload, payload_sz);
++ up_read(&sess->rpc_lock);
+
+ resp = ipc_msg_send_request(msg, req->handle);
+ ipc_msg_free(msg);
+diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h
+index 0d637c276db053..942c490f23e4d7 100644
+--- a/fs/xfs/libxfs/xfs_log_format.h
++++ b/fs/xfs/libxfs/xfs_log_format.h
+@@ -174,12 +174,40 @@ typedef struct xlog_rec_header {
+ __be32 h_prev_block; /* block number to previous LR : 4 */
+ __be32 h_num_logops; /* number of log operations in this LR : 4 */
+ __be32 h_cycle_data[XLOG_HEADER_CYCLE_SIZE / BBSIZE];
+- /* new fields */
++
++ /* fields added by the Linux port: */
+ __be32 h_fmt; /* format of log record : 4 */
+ uuid_t h_fs_uuid; /* uuid of FS : 16 */
++
++ /* fields added for log v2: */
+ __be32 h_size; /* iclog size : 4 */
++
++ /*
++ * When h_size added for log v2 support, it caused structure to have
++ * When h_size was added for log v2 support, it caused the structure to
++ * have a different size on i386 vs all other architectures because the
++ * sum of the member sizes is not a multiple of the alignment of the largest
++ *
++ * Due to the way the log headers are placed out on-disk that alone is
++ * Due to the way the log headers are laid out on disk, that alone is
++ * not a problem because the xlog_rec_header always sits alone in a
++ * BBSIZE-sized area, and the rest of that area is padded with zeroes.
++ * size, and thus gives different checksums for i386 vs the rest.
++ * We now do two checksum validation passes for both sizes to allow
++ * moving v5 file systems with unclean logs between i386 and other
++ * (little-endian) architectures.
++ */
++ __u32 h_pad0;
+ } xlog_rec_header_t;
+
++#ifdef __i386__
++#define XLOG_REC_SIZE offsetofend(struct xlog_rec_header, h_size)
++#define XLOG_REC_SIZE_OTHER sizeof(struct xlog_rec_header)
++#else
++#define XLOG_REC_SIZE sizeof(struct xlog_rec_header)
++#define XLOG_REC_SIZE_OTHER offsetofend(struct xlog_rec_header, h_size)
++#endif /* __i386__ */
++
+ typedef struct xlog_rec_ext_header {
+ __be32 xh_cycle; /* write cycle of log : 4 */
+ __be32 xh_cycle_data[XLOG_HEADER_CYCLE_SIZE / BBSIZE]; /* : 256 */
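To make the padding problem described in the comment above concrete: the members of struct xlog_rec_header up to and including h_size add up to 324 bytes. i386 aligns 64-bit members to 4 bytes, so the structure used to be 324 bytes there, while every other architecture pads it to an 8-byte boundary, giving 328 bytes; adding h_pad0 makes it 328 everywhere, which is what the XFS_CHECK_STRUCT_SIZE added below asserts. A small stand-alone C program (illustrative only, with a made-up struct rather than the real header) shows the same ABI effect:

#include <stdint.h>
#include <stdio.h>

/*
 * Made-up struct: one 64-bit member forces 8-byte struct alignment on
 * most ABIs, but only 4-byte alignment on i386, so the trailing padding
 * differs between the two builds.
 */
struct demo {
	uint64_t big;	/* 8 bytes */
	uint32_t tail;	/* 4 bytes, followed by padding except on i386 */
};

int main(void)
{
	/* Prints 16 on x86-64 (and most 64-bit ABIs), 12 when built with -m32. */
	printf("sizeof(struct demo) = %zu\n", sizeof(struct demo));
	return 0;
}
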
+diff --git a/fs/xfs/libxfs/xfs_ondisk.h b/fs/xfs/libxfs/xfs_ondisk.h
+index 5ed44fdf749105..7bfa3242e2c536 100644
+--- a/fs/xfs/libxfs/xfs_ondisk.h
++++ b/fs/xfs/libxfs/xfs_ondisk.h
+@@ -174,6 +174,8 @@ xfs_check_ondisk_structs(void)
+ XFS_CHECK_STRUCT_SIZE(struct xfs_rud_log_format, 16);
+ XFS_CHECK_STRUCT_SIZE(struct xfs_map_extent, 32);
+ XFS_CHECK_STRUCT_SIZE(struct xfs_phys_extent, 16);
++ XFS_CHECK_STRUCT_SIZE(struct xlog_rec_header, 328);
++ XFS_CHECK_STRUCT_SIZE(struct xlog_rec_ext_header, 260);
+
+ XFS_CHECK_OFFSET(struct xfs_bui_log_format, bui_extents, 16);
+ XFS_CHECK_OFFSET(struct xfs_cui_log_format, cui_extents, 16);
+diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
+index c8a57e21a1d3e0..69703dc3ef9499 100644
+--- a/fs/xfs/xfs_log.c
++++ b/fs/xfs/xfs_log.c
+@@ -1568,13 +1568,13 @@ xlog_cksum(
+ struct xlog *log,
+ struct xlog_rec_header *rhead,
+ char *dp,
+- int size)
++ unsigned int hdrsize,
++ unsigned int size)
+ {
+ uint32_t crc;
+
+ /* first generate the crc for the record header ... */
+- crc = xfs_start_cksum_update((char *)rhead,
+- sizeof(struct xlog_rec_header),
++ crc = xfs_start_cksum_update((char *)rhead, hdrsize,
+ offsetof(struct xlog_rec_header, h_crc));
+
+ /* ... then for additional cycle data for v2 logs ... */
+@@ -1818,7 +1818,7 @@ xlog_sync(
+
+ /* calculate the checksum */
+ iclog->ic_header.h_crc = xlog_cksum(log, &iclog->ic_header,
+- iclog->ic_datap, size);
++ iclog->ic_datap, XLOG_REC_SIZE, size);
+ /*
+ * Intentionally corrupt the log record CRC based on the error injection
+ * frequency, if defined. This facilitates testing log recovery in the
+diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
+index a9a7a271c15bb7..0cfc654d8e872b 100644
+--- a/fs/xfs/xfs_log_priv.h
++++ b/fs/xfs/xfs_log_priv.h
+@@ -499,8 +499,8 @@ xlog_recover_finish(
+ extern void
+ xlog_recover_cancel(struct xlog *);
+
+-extern __le32 xlog_cksum(struct xlog *log, struct xlog_rec_header *rhead,
+- char *dp, int size);
++__le32 xlog_cksum(struct xlog *log, struct xlog_rec_header *rhead,
++ char *dp, unsigned int hdrsize, unsigned int size);
+
+ extern struct kmem_cache *xfs_log_ticket_cache;
+ struct xlog_ticket *xlog_ticket_alloc(struct xlog *log, int unit_bytes,
+diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
+index e6ed9e09c02710..549d60959aee5b 100644
+--- a/fs/xfs/xfs_log_recover.c
++++ b/fs/xfs/xfs_log_recover.c
+@@ -2894,20 +2894,34 @@ xlog_recover_process(
+ int pass,
+ struct list_head *buffer_list)
+ {
+- __le32 old_crc = rhead->h_crc;
+- __le32 crc;
++ __le32 expected_crc = rhead->h_crc, crc, other_crc;
+
+- crc = xlog_cksum(log, rhead, dp, be32_to_cpu(rhead->h_len));
++ crc = xlog_cksum(log, rhead, dp, XLOG_REC_SIZE,
++ be32_to_cpu(rhead->h_len));
++
++ /*
++ * Look at the end of the struct xlog_rec_header definition in
++ * xfs_log_format.h for the gory details.
++ */
++ if (expected_crc && crc != expected_crc) {
++ other_crc = xlog_cksum(log, rhead, dp, XLOG_REC_SIZE_OTHER,
++ be32_to_cpu(rhead->h_len));
++ if (other_crc == expected_crc) {
++ xfs_notice_once(log->l_mp,
++ "Fixing up incorrect CRC due to padding.");
++ crc = other_crc;
++ }
++ }
+
+ /*
+ * Nothing else to do if this is a CRC verification pass. Just return
+ * if this a record with a non-zero crc. Unfortunately, mkfs always
+- * sets old_crc to 0 so we must consider this valid even on v5 supers.
+- * Otherwise, return EFSBADCRC on failure so the callers up the stack
+- * know precisely what failed.
++ * sets expected_crc to 0 so we must consider this valid even on v5
++ * supers. Otherwise, return EFSBADCRC on failure so the callers up the
++ * stack know precisely what failed.
+ */
+ if (pass == XLOG_RECOVER_CRCPASS) {
+- if (old_crc && crc != old_crc)
++ if (expected_crc && crc != expected_crc)
+ return -EFSBADCRC;
+ return 0;
+ }
+@@ -2918,11 +2932,11 @@ xlog_recover_process(
+ * zero CRC check prevents warnings from being emitted when upgrading
+ * the kernel from one that does not add CRCs by default.
+ */
+- if (crc != old_crc) {
+- if (old_crc || xfs_has_crc(log->l_mp)) {
++ if (crc != expected_crc) {
++ if (expected_crc || xfs_has_crc(log->l_mp)) {
+ xfs_alert(log->l_mp,
+ "log record CRC mismatch: found 0x%x, expected 0x%x.",
+- le32_to_cpu(old_crc),
++ le32_to_cpu(expected_crc),
+ le32_to_cpu(crc));
+ xfs_hex_dump(dp, 32);
+ }
+diff --git a/include/linux/brcmphy.h b/include/linux/brcmphy.h
+index 15c35655f48262..115a964f300696 100644
+--- a/include/linux/brcmphy.h
++++ b/include/linux/brcmphy.h
+@@ -137,6 +137,7 @@
+
+ #define MII_BCM54XX_AUXCTL_SHDWSEL_MISC 0x07
+ #define MII_BCM54XX_AUXCTL_SHDWSEL_MISC_WIRESPEED_EN 0x0010
++#define MII_BCM54XX_AUXCTL_SHDWSEL_MISC_RSVD 0x0060
+ #define MII_BCM54XX_AUXCTL_SHDWSEL_MISC_RGMII_EN 0x0080
+ #define MII_BCM54XX_AUXCTL_SHDWSEL_MISC_RGMII_SKEW_EN 0x0100
+ #define MII_BCM54XX_AUXCTL_MISC_FORCE_AMDIX 0x0200
+diff --git a/include/linux/libata.h b/include/linux/libata.h
+index 0620dd67369f33..87a0d956f0dba6 100644
+--- a/include/linux/libata.h
++++ b/include/linux/libata.h
+@@ -1594,6 +1594,12 @@ do { \
+ #define ata_dev_dbg(dev, fmt, ...) \
+ ata_dev_printk(debug, dev, fmt, ##__VA_ARGS__)
+
++#define ata_dev_warn_once(dev, fmt, ...) \
++ pr_warn_once("ata%u.%02u: " fmt, \
++ (dev)->link->ap->print_id, \
++ (dev)->link->pmp + (dev)->devno, \
++ ##__VA_ARGS__)
++
+ static inline void ata_print_version_once(const struct device *dev,
+ const char *version)
+ {
+diff --git a/include/linux/suspend.h b/include/linux/suspend.h
+index 317ae31e89b374..b02876f1ae38ac 100644
+--- a/include/linux/suspend.h
++++ b/include/linux/suspend.h
+@@ -418,6 +418,12 @@ static inline int hibernate_quiet_exec(int (*func)(void *data), void *data) {
+ }
+ #endif /* CONFIG_HIBERNATION */
+
++#if defined(CONFIG_HIBERNATION) && defined(CONFIG_SUSPEND)
++bool pm_hibernation_mode_is_suspend(void);
++#else
++static inline bool pm_hibernation_mode_is_suspend(void) { return false; }
++#endif
++
+ int arch_resume_nosmt(void);
+
+ #ifdef CONFIG_HIBERNATION_SNAPSHOT_DEV
+diff --git a/include/linux/usb/gadget.h b/include/linux/usb/gadget.h
+index 0f28c5512fcb6c..3aaf19e775580b 100644
+--- a/include/linux/usb/gadget.h
++++ b/include/linux/usb/gadget.h
+@@ -15,6 +15,7 @@
+ #ifndef __LINUX_USB_GADGET_H
+ #define __LINUX_USB_GADGET_H
+
++#include <linux/cleanup.h>
+ #include <linux/configfs.h>
+ #include <linux/device.h>
+ #include <linux/errno.h>
+@@ -32,6 +33,7 @@ struct usb_ep;
+
+ /**
+ * struct usb_request - describes one i/o request
++ * @ep: The associated endpoint set by usb_ep_alloc_request().
+ * @buf: Buffer used for data. Always provide this; some controllers
+ * only use PIO, or don't use DMA for some endpoints.
+ * @dma: DMA address corresponding to 'buf'. If you don't set this
+@@ -98,6 +100,7 @@ struct usb_ep;
+ */
+
+ struct usb_request {
++ struct usb_ep *ep;
+ void *buf;
+ unsigned length;
+ dma_addr_t dma;
+@@ -291,6 +294,28 @@ static inline void usb_ep_fifo_flush(struct usb_ep *ep)
+
+ /*-------------------------------------------------------------------------*/
+
++/**
++ * free_usb_request - frees a usb_request object and its buffer
++ * @req: the request being freed
++ *
++ * This helper function frees both the request's buffer and the request object
++ * itself by calling usb_ep_free_request(). Its signature is designed to be used
++ * with DEFINE_FREE() to enable automatic, scope-based cleanup for usb_request
++ * pointers.
++ */
++static inline void free_usb_request(struct usb_request *req)
++{
++ if (!req)
++ return;
++
++ kfree(req->buf);
++ usb_ep_free_request(req->ep, req);
++}
++
++DEFINE_FREE(free_usb_request, struct usb_request *, free_usb_request(_T))
++
++/*-------------------------------------------------------------------------*/
++
+ struct usb_dcd_config_params {
+ __u8 bU1devExitLat; /* U1 Device exit Latency */
+ #define USB_DEFAULT_U1_DEV_EXIT_LAT 0x01 /* Less then 1 microsec */
+diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
+index 8cf1380f36562b..63154c8faecc3e 100644
+--- a/include/net/ip_tunnels.h
++++ b/include/net/ip_tunnels.h
+@@ -609,6 +609,21 @@ struct metadata_dst *iptunnel_metadata_reply(struct metadata_dst *md,
+ int skb_tunnel_check_pmtu(struct sk_buff *skb, struct dst_entry *encap_dst,
+ int headroom, bool reply);
+
++static inline void ip_tunnel_adj_headroom(struct net_device *dev,
++ unsigned int headroom)
++{
++ /* we must cap headroom to some upperlimit, else pskb_expand_head
++ * will overflow header offsets in skb_headers_offset_update().
++ */
++ const unsigned int max_allowed = 512;
++
++ if (headroom > max_allowed)
++ headroom = max_allowed;
++
++ if (headroom > READ_ONCE(dev->needed_headroom))
++ WRITE_ONCE(dev->needed_headroom, headroom);
++}
++
+ int iptunnel_handle_offloads(struct sk_buff *skb, int gso_type_mask);
+
+ static inline int iptunnel_pull_offloads(struct sk_buff *skb)
+diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
+index bdedbaccf776db..0b3827cd6f4a38 100644
+--- a/include/uapi/drm/amdgpu_drm.h
++++ b/include/uapi/drm/amdgpu_drm.h
+@@ -1497,27 +1497,6 @@ struct drm_amdgpu_info_hw_ip {
+ __u32 userq_num_slots;
+ };
+
+-/* GFX metadata BO sizes and alignment info (in bytes) */
+-struct drm_amdgpu_info_uq_fw_areas_gfx {
+- /* shadow area size */
+- __u32 shadow_size;
+- /* shadow area base virtual mem alignment */
+- __u32 shadow_alignment;
+- /* context save area size */
+- __u32 csa_size;
+- /* context save area base virtual mem alignment */
+- __u32 csa_alignment;
+-};
+-
+-/* IP specific fw related information used in the
+- * subquery AMDGPU_INFO_UQ_FW_AREAS
+- */
+-struct drm_amdgpu_info_uq_fw_areas {
+- union {
+- struct drm_amdgpu_info_uq_fw_areas_gfx gfx;
+- };
+-};
+-
+ struct drm_amdgpu_info_num_handles {
+ /** Max handles as supported by firmware for UVD */
+ __u32 uvd_max_handles;
+diff --git a/io_uring/register.c b/io_uring/register.c
+index a59589249fce7a..b1772a470bf6e5 100644
+--- a/io_uring/register.c
++++ b/io_uring/register.c
+@@ -618,6 +618,7 @@ static int io_register_mem_region(struct io_ring_ctx *ctx, void __user *uarg)
+ if (ret)
+ return ret;
+ if (copy_to_user(rd_uptr, &rd, sizeof(rd))) {
++ guard(mutex)(&ctx->mmap_lock);
+ io_free_region(ctx, &ctx->param_region);
+ return -EFAULT;
+ }
+diff --git a/io_uring/rw.c b/io_uring/rw.c
+index af5a54b5db1233..b998d945410bf1 100644
+--- a/io_uring/rw.c
++++ b/io_uring/rw.c
+@@ -540,7 +540,7 @@ static void __io_complete_rw_common(struct io_kiocb *req, long res)
+ {
+ if (res == req->cqe.res)
+ return;
+- if (res == -EAGAIN && io_rw_should_reissue(req)) {
++ if ((res == -EOPNOTSUPP || res == -EAGAIN) && io_rw_should_reissue(req)) {
+ req->flags |= REQ_F_REISSUE | REQ_F_BL_NO_RECYCLE;
+ } else {
+ req_set_fail(req);
+diff --git a/kernel/events/core.c b/kernel/events/core.c
+index 820127536e62b7..6e9427c4aaff70 100644
+--- a/kernel/events/core.c
++++ b/kernel/events/core.c
+@@ -9390,7 +9390,7 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)
+ flags |= MAP_HUGETLB;
+
+ if (file) {
+- struct inode *inode;
++ const struct inode *inode;
+ dev_t dev;
+
+ buf = kmalloc(PATH_MAX, GFP_KERNEL);
+@@ -9403,12 +9403,12 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)
+ * need to add enough zero bytes after the string to handle
+ * the 64bit alignment we do later.
+ */
+- name = file_path(file, buf, PATH_MAX - sizeof(u64));
++ name = d_path(file_user_path(file), buf, PATH_MAX - sizeof(u64));
+ if (IS_ERR(name)) {
+ name = "//toolong";
+ goto cpy_name;
+ }
+- inode = file_inode(vma->vm_file);
++ inode = file_user_inode(vma->vm_file);
+ dev = inode->i_sb->s_dev;
+ ino = inode->i_ino;
+ gen = inode->i_generation;
+@@ -9479,7 +9479,7 @@ static bool perf_addr_filter_match(struct perf_addr_filter *filter,
+ if (!filter->path.dentry)
+ return false;
+
+- if (d_inode(filter->path.dentry) != file_inode(file))
++ if (d_inode(filter->path.dentry) != file_user_inode(file))
+ return false;
+
+ if (filter->offset > offset + size)
+diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
+index 26e0e662e8f2a7..728328c51b6499 100644
+--- a/kernel/power/hibernate.c
++++ b/kernel/power/hibernate.c
+@@ -80,6 +80,17 @@ static const struct platform_hibernation_ops *hibernation_ops;
+
+ static atomic_t hibernate_atomic = ATOMIC_INIT(1);
+
++#ifdef CONFIG_SUSPEND
++/**
++ * pm_hibernation_mode_is_suspend - Check if hibernation has been set to suspend
++ */
++bool pm_hibernation_mode_is_suspend(void)
++{
++ return hibernation_mode == HIBERNATION_SUSPEND;
++}
++EXPORT_SYMBOL_GPL(pm_hibernation_mode_is_suspend);
++#endif
++
+ bool hibernate_acquire(void)
+ {
+ return atomic_add_unless(&hibernate_atomic, -1, 0);
+diff --git a/kernel/sched/core.c b/kernel/sched/core.c
+index ccba6fc3c3fed2..8575d67cbf7385 100644
+--- a/kernel/sched/core.c
++++ b/kernel/sched/core.c
+@@ -8603,10 +8603,12 @@ int sched_cpu_dying(unsigned int cpu)
+ sched_tick_stop(cpu);
+
+ rq_lock_irqsave(rq, &rf);
++ update_rq_clock(rq);
+ if (rq->nr_running != 1 || rq_has_pinned_tasks(rq)) {
+ WARN(true, "Dying CPU not properly vacated!");
+ dump_rq_tasks(rq, KERN_WARNING);
+ }
++ dl_server_stop(&rq->fair_server);
+ rq_unlock_irqrestore(rq, &rf);
+
+ calc_load_migrate(rq);
+diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
+index 615411a0a8813d..7b7671060bf9ed 100644
+--- a/kernel/sched/deadline.c
++++ b/kernel/sched/deadline.c
+@@ -1582,6 +1582,9 @@ void dl_server_start(struct sched_dl_entity *dl_se)
+ if (!dl_server(dl_se) || dl_se->dl_server_active)
+ return;
+
++ if (WARN_ON_ONCE(!cpu_online(cpu_of(rq))))
++ return;
++
+ dl_se->dl_server_active = 1;
+ enqueue_dl_entity(dl_se, ENQUEUE_WAKEUP);
+ if (!dl_task(dl_se->rq->curr) || dl_entity_preempt(dl_se, &rq->curr->dl))
+diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
+index 8ce56a8d507f98..8f0b1acace0ad0 100644
+--- a/kernel/sched/fair.c
++++ b/kernel/sched/fair.c
+@@ -8829,21 +8829,21 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf
+ return p;
+
+ idle:
+- if (!rf)
+- return NULL;
+-
+- new_tasks = sched_balance_newidle(rq, rf);
++ if (rf) {
++ new_tasks = sched_balance_newidle(rq, rf);
+
+- /*
+- * Because sched_balance_newidle() releases (and re-acquires) rq->lock, it is
+- * possible for any higher priority task to appear. In that case we
+- * must re-start the pick_next_entity() loop.
+- */
+- if (new_tasks < 0)
+- return RETRY_TASK;
++ /*
++ * Because sched_balance_newidle() releases (and re-acquires)
++ * rq->lock, it is possible for any higher priority task to
++ * appear. In that case we must re-start the pick_next_entity()
++ * loop.
++ */
++ if (new_tasks < 0)
++ return RETRY_TASK;
+
+- if (new_tasks > 0)
+- goto again;
++ if (new_tasks > 0)
++ goto again;
++ }
+
+ /*
+ * rq is about to be idle, check if we need to update the
+diff --git a/mm/slub.c b/mm/slub.c
+index 9bdadf9909e066..16b5e221c94d85 100644
+--- a/mm/slub.c
++++ b/mm/slub.c
+@@ -2073,8 +2073,15 @@ static inline void free_slab_obj_exts(struct slab *slab)
+ struct slabobj_ext *obj_exts;
+
+ obj_exts = slab_obj_exts(slab);
+- if (!obj_exts)
++ if (!obj_exts) {
++ /*
++ * If obj_exts allocation failed, slab->obj_exts is set to
++ * OBJEXTS_ALLOC_FAIL. In this case, we end up here and should
++ * clear the flag.
++ */
++ slab->obj_exts = 0;
+ return;
++ }
+
+ /*
+ * obj_exts was created with __GFP_NO_OBJ_EXT flag, therefore its
+diff --git a/net/can/j1939/main.c b/net/can/j1939/main.c
+index 3706a872ecafdb..a93af55df5fd50 100644
+--- a/net/can/j1939/main.c
++++ b/net/can/j1939/main.c
+@@ -378,6 +378,8 @@ static int j1939_netdev_notify(struct notifier_block *nb,
+ j1939_ecu_unmap_all(priv);
+ break;
+ case NETDEV_UNREGISTER:
++ j1939_cancel_active_session(priv, NULL);
++ j1939_sk_netdev_event_netdown(priv);
+ j1939_sk_netdev_event_unregister(priv);
+ break;
+ }
+diff --git a/net/core/dev.c b/net/core/dev.c
+index 8d49b2198d072f..5194b70769cc52 100644
+--- a/net/core/dev.c
++++ b/net/core/dev.c
+@@ -12088,6 +12088,35 @@ static void dev_memory_provider_uninstall(struct net_device *dev)
+ }
+ }
+
++/* devices must be UP and netdev_lock()'d */
++static void netif_close_many_and_unlock(struct list_head *close_head)
++{
++ struct net_device *dev, *tmp;
++
++ netif_close_many(close_head, false);
++
++ /* ... now unlock them */
++ list_for_each_entry_safe(dev, tmp, close_head, close_list) {
++ netdev_unlock(dev);
++ list_del_init(&dev->close_list);
++ }
++}
++
++static void netif_close_many_and_unlock_cond(struct list_head *close_head)
++{
++#ifdef CONFIG_LOCKDEP
++ /* We can only track up to MAX_LOCK_DEPTH locks per task.
++ *
++ * Reserve half the available slots for additional locks possibly
++ * taken by notifiers and (soft)irqs.
++ */
++ unsigned int limit = MAX_LOCK_DEPTH / 2;
++
++ if (lockdep_depth(current) > limit)
++ netif_close_many_and_unlock(close_head);
++#endif
++}
++
+ void unregister_netdevice_many_notify(struct list_head *head,
+ u32 portid, const struct nlmsghdr *nlh)
+ {
+@@ -12120,17 +12149,18 @@ void unregister_netdevice_many_notify(struct list_head *head,
+
+ /* If device is running, close it first. Start with ops locked... */
+ list_for_each_entry(dev, head, unreg_list) {
++ if (!(dev->flags & IFF_UP))
++ continue;
+ if (netdev_need_ops_lock(dev)) {
+ list_add_tail(&dev->close_list, &close_head);
+ netdev_lock(dev);
+ }
++ netif_close_many_and_unlock_cond(&close_head);
+ }
+- netif_close_many(&close_head, true);
+- /* ... now unlock them and go over the rest. */
++ netif_close_many_and_unlock(&close_head);
++ /* ... now go over the rest. */
+ list_for_each_entry(dev, head, unreg_list) {
+- if (netdev_need_ops_lock(dev))
+- netdev_unlock(dev);
+- else
++ if (!netdev_need_ops_lock(dev))
+ list_add_tail(&dev->close_list, &close_head);
+ }
+ netif_close_many(&close_head, true);
+diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
+index aaeb5d16f0c9a4..158a30ae7c5f2f 100644
+--- a/net/ipv4/ip_tunnel.c
++++ b/net/ipv4/ip_tunnel.c
+@@ -568,20 +568,6 @@ static int tnl_update_pmtu(struct net_device *dev, struct sk_buff *skb,
+ return 0;
+ }
+
+-static void ip_tunnel_adj_headroom(struct net_device *dev, unsigned int headroom)
+-{
+- /* we must cap headroom to some upperlimit, else pskb_expand_head
+- * will overflow header offsets in skb_headers_offset_update().
+- */
+- static const unsigned int max_allowed = 512;
+-
+- if (headroom > max_allowed)
+- headroom = max_allowed;
+-
+- if (headroom > READ_ONCE(dev->needed_headroom))
+- WRITE_ONCE(dev->needed_headroom, headroom);
+-}
+-
+ void ip_md_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
+ u8 proto, int tunnel_hlen)
+ {
+diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
+index caf11920a87861..16251d8e1b592b 100644
+--- a/net/ipv4/tcp_output.c
++++ b/net/ipv4/tcp_output.c
+@@ -2219,7 +2219,8 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb,
+ u32 max_segs)
+ {
+ const struct inet_connection_sock *icsk = inet_csk(sk);
+- u32 send_win, cong_win, limit, in_flight;
++ u32 send_win, cong_win, limit, in_flight, threshold;
++ u64 srtt_in_ns, expected_ack, how_far_is_the_ack;
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct sk_buff *head;
+ int win_divisor;
+@@ -2281,9 +2282,19 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb,
+ head = tcp_rtx_queue_head(sk);
+ if (!head)
+ goto send_now;
+- delta = tp->tcp_clock_cache - head->tstamp;
+- /* If next ACK is likely to come too late (half srtt), do not defer */
+- if ((s64)(delta - (u64)NSEC_PER_USEC * (tp->srtt_us >> 4)) < 0)
++
++ srtt_in_ns = (u64)(NSEC_PER_USEC >> 3) * tp->srtt_us;
++ /* When is the ACK expected ? */
++ expected_ack = head->tstamp + srtt_in_ns;
++ /* How far from now is the ACK expected ? */
++ how_far_is_the_ack = expected_ack - tp->tcp_clock_cache;
++
++ /* If next ACK is likely to come too late,
++ * ie in more than min(1ms, half srtt), do not defer.
++ */
++ threshold = min(srtt_in_ns >> 1, NSEC_PER_MSEC);
++
++ if ((s64)(how_far_is_the_ack - threshold) > 0)
+ goto send_now;
+
+ /* Ok, it looks like it is advisable to defer.
+diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
+index 3262e81223dfc8..6405072050e0ef 100644
+--- a/net/ipv6/ip6_tunnel.c
++++ b/net/ipv6/ip6_tunnel.c
+@@ -1257,8 +1257,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
+ */
+ max_headroom = LL_RESERVED_SPACE(tdev) + sizeof(struct ipv6hdr)
+ + dst->header_len + t->hlen;
+- if (max_headroom > READ_ONCE(dev->needed_headroom))
+- WRITE_ONCE(dev->needed_headroom, max_headroom);
++ ip_tunnel_adj_headroom(dev, max_headroom);
+
+ err = ip6_tnl_encap(skb, t, &proto, fl6);
+ if (err)
+diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
+index a3ccb3135e51ac..39a2ab47fe7204 100644
+--- a/net/tls/tls_main.c
++++ b/net/tls/tls_main.c
+@@ -255,12 +255,9 @@ int tls_process_cmsg(struct sock *sk, struct msghdr *msg,
+ if (msg->msg_flags & MSG_MORE)
+ return -EINVAL;
+
+- rc = tls_handle_open_record(sk, msg->msg_flags);
+- if (rc)
+- return rc;
+-
+ *record_type = *(unsigned char *)CMSG_DATA(cmsg);
+- rc = 0;
++
++ rc = tls_handle_open_record(sk, msg->msg_flags);
+ break;
+ default:
+ return -EINVAL;
+diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
+index daac9fd4be7eb5..d171353699800e 100644
+--- a/net/tls/tls_sw.c
++++ b/net/tls/tls_sw.c
+@@ -1054,7 +1054,7 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
+ if (ret == -EINPROGRESS)
+ num_async++;
+ else if (ret != -EAGAIN)
+- goto send_end;
++ goto end;
+ }
+ }
+
+@@ -1112,8 +1112,11 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
+ goto send_end;
+ tls_ctx->pending_open_record_frags = true;
+
+- if (sk_msg_full(msg_pl))
++ if (sk_msg_full(msg_pl)) {
+ full_record = true;
++ sk_msg_trim(sk, msg_en,
++ msg_pl->sg.size + prot->overhead_size);
++ }
+
+ if (full_record || eor)
+ goto copied;
+@@ -1149,6 +1152,13 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
+ } else if (ret != -EAGAIN)
+ goto send_end;
+ }
++
++ /* Transmit if any encryptions have completed */
++ if (test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask)) {
++ cancel_delayed_work(&ctx->tx_work.work);
++ tls_tx_records(sk, msg->msg_flags);
++ }
++
+ continue;
+ rollback_iter:
+ copied -= try_to_copy;
+@@ -1204,6 +1214,12 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
+ goto send_end;
+ }
+ }
++
++ /* Transmit if any encryptions have completed */
++ if (test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask)) {
++ cancel_delayed_work(&ctx->tx_work.work);
++ tls_tx_records(sk, msg->msg_flags);
++ }
+ }
+
+ continue;
+@@ -1223,8 +1239,9 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
+ goto alloc_encrypted;
+ }
+
++send_end:
+ if (!num_async) {
+- goto send_end;
++ goto end;
+ } else if (num_zc || eor) {
+ int err;
+
+@@ -1242,7 +1259,7 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
+ tls_tx_records(sk, msg->msg_flags);
+ }
+
+-send_end:
++end:
+ ret = sk_stream_error(sk, msg->msg_flags, ret);
+ return copied > 0 ? copied : ret;
+ }
+@@ -1637,8 +1654,10 @@ static int tls_decrypt_sg(struct sock *sk, struct iov_iter *out_iov,
+
+ if (unlikely(darg->async)) {
+ err = tls_strp_msg_hold(&ctx->strp, &ctx->async_hold);
+- if (err)
+- __skb_queue_tail(&ctx->async_hold, darg->skb);
++ if (err) {
++ err = tls_decrypt_async_wait(ctx);
++ darg->async = false;
++ }
+ return err;
+ }
+
+diff --git a/rust/kernel/cpufreq.rs b/rust/kernel/cpufreq.rs
+index b762ecdc22b00b..cb15f612028ed7 100644
+--- a/rust/kernel/cpufreq.rs
++++ b/rust/kernel/cpufreq.rs
+@@ -39,8 +39,7 @@
+ const CPUFREQ_NAME_LEN: usize = bindings::CPUFREQ_NAME_LEN as usize;
+
+ /// Default transition latency value in nanoseconds.
+-pub const DEFAULT_TRANSITION_LATENCY_NS: u32 =
+- bindings::CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
++pub const DEFAULT_TRANSITION_LATENCY_NS: u32 = bindings::CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
+
+ /// CPU frequency driver flags.
+ pub mod flags {
+diff --git a/sound/firewire/amdtp-stream.h b/sound/firewire/amdtp-stream.h
+index 775db3fc4959f5..ec10270c2cce3d 100644
+--- a/sound/firewire/amdtp-stream.h
++++ b/sound/firewire/amdtp-stream.h
+@@ -32,7 +32,7 @@
+ * allows 5 times as large as IEC 61883-6 defines.
+ * @CIP_HEADER_WITHOUT_EOH: Only for in-stream. CIP Header doesn't include
+ * valid EOH.
+- * @CIP_NO_HEADERS: a lack of headers in packets
++ * @CIP_NO_HEADER: a lack of headers in packets
+ * @CIP_UNALIGHED_DBC: Only for in-stream. The value of dbc is not alighed to
+ * the value of current SYT_INTERVAL; e.g. initial value is not zero.
+ * @CIP_UNAWARE_SYT: For outgoing packet, the value in SYT field of CIP is 0xffff.
+diff --git a/sound/hda/codecs/realtek/alc269.c b/sound/hda/codecs/realtek/alc269.c
+index 07ea76efa5de8f..8fb1a5c6ff6df6 100644
+--- a/sound/hda/codecs/realtek/alc269.c
++++ b/sound/hda/codecs/realtek/alc269.c
+@@ -6390,6 +6390,7 @@ static const struct hda_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x103c, 0x854a, "HP EliteBook 830 G6", ALC285_FIXUP_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x85c6, "HP Pavilion x360 Convertible 14-dy1xxx", ALC295_FIXUP_HP_MUTE_LED_COEFBIT11),
+ SND_PCI_QUIRK(0x103c, 0x85de, "HP Envy x360 13-ar0xxx", ALC285_FIXUP_HP_ENVY_X360),
++ SND_PCI_QUIRK(0x103c, 0x860c, "HP ZBook 17 G6", ALC285_FIXUP_HP_GPIO_AMP_INIT),
+ SND_PCI_QUIRK(0x103c, 0x860f, "HP ZBook 15 G6", ALC285_FIXUP_HP_GPIO_AMP_INIT),
+ SND_PCI_QUIRK(0x103c, 0x861f, "HP Elite Dragonfly G1", ALC285_FIXUP_HP_GPIO_AMP_INIT),
+ SND_PCI_QUIRK(0x103c, 0x869d, "HP", ALC236_FIXUP_HP_MUTE_LED),
+diff --git a/sound/hda/codecs/side-codecs/cs35l41_hda.c b/sound/hda/codecs/side-codecs/cs35l41_hda.c
+index 37f2cdc8ce8243..0ef77fae040227 100644
+--- a/sound/hda/codecs/side-codecs/cs35l41_hda.c
++++ b/sound/hda/codecs/side-codecs/cs35l41_hda.c
+@@ -1426,6 +1426,8 @@ static int cs35l41_get_acpi_mute_state(struct cs35l41_hda *cs35l41, acpi_handle
+
+ if (cs35l41_dsm_supported(handle, CS35L41_DSM_GET_MUTE)) {
+ ret = acpi_evaluate_dsm(handle, &guid, 0, CS35L41_DSM_GET_MUTE, NULL);
++ if (!ret)
++ return -EINVAL;
+ mute = *ret->buffer.pointer;
+ dev_dbg(cs35l41->dev, "CS35L41_DSM_GET_MUTE: %d\n", mute);
+ }
+diff --git a/sound/hda/codecs/side-codecs/hda_component.c b/sound/hda/codecs/side-codecs/hda_component.c
+index 71860e2d637716..dd96994b1cf8a0 100644
+--- a/sound/hda/codecs/side-codecs/hda_component.c
++++ b/sound/hda/codecs/side-codecs/hda_component.c
+@@ -181,6 +181,10 @@ int hda_component_manager_init(struct hda_codec *cdc,
+ sm->match_str = match_str;
+ sm->index = i;
+ component_match_add(dev, &match, hda_comp_match_dev_name, sm);
++ if (IS_ERR(match)) {
++ codec_err(cdc, "Fail to add component %ld\n", PTR_ERR(match));
++ return PTR_ERR(match);
++ }
+ }
+
+ ret = component_master_add_with_match(dev, ops, match);
+diff --git a/sound/hda/controllers/intel.c b/sound/hda/controllers/intel.c
+index 1bb3ff55b1151d..9e37586e3e0a7b 100644
+--- a/sound/hda/controllers/intel.c
++++ b/sound/hda/controllers/intel.c
+@@ -2077,6 +2077,7 @@ static const struct pci_device_id driver_denylist[] = {
+ { PCI_DEVICE_SUB(0x1022, 0x1487, 0x1043, 0x874f) }, /* ASUS ROG Zenith II / Strix */
+ { PCI_DEVICE_SUB(0x1022, 0x1487, 0x1462, 0xcb59) }, /* MSI TRX40 Creator */
+ { PCI_DEVICE_SUB(0x1022, 0x1487, 0x1462, 0xcb60) }, /* MSI TRX40 */
++ { PCI_DEVICE_SUB(0x1022, 0x15e3, 0x1462, 0xee59) }, /* MSI X870E Tomahawk WiFi */
+ {}
+ };
+
+diff --git a/sound/soc/amd/acp/acp-sdw-sof-mach.c b/sound/soc/amd/acp/acp-sdw-sof-mach.c
+index 91d72d4bb9a26c..d055582a3bf1ad 100644
+--- a/sound/soc/amd/acp/acp-sdw-sof-mach.c
++++ b/sound/soc/amd/acp/acp-sdw-sof-mach.c
+@@ -176,9 +176,9 @@ static int create_sdw_dailink(struct snd_soc_card *card,
+ cpus->dai_name = devm_kasprintf(dev, GFP_KERNEL,
+ "SDW%d Pin%d",
+ link_num, cpu_pin_id);
+- dev_dbg(dev, "cpu->dai_name:%s\n", cpus->dai_name);
+ if (!cpus->dai_name)
+ return -ENOMEM;
++ dev_dbg(dev, "cpu->dai_name:%s\n", cpus->dai_name);
+
+ codec_maps[j].cpu = 0;
+ codec_maps[j].codec = j;
+diff --git a/sound/soc/codecs/idt821034.c b/sound/soc/codecs/idt821034.c
+index a03d4e5e7d1441..cab2f2eecdfba2 100644
+--- a/sound/soc/codecs/idt821034.c
++++ b/sound/soc/codecs/idt821034.c
+@@ -548,14 +548,14 @@ static int idt821034_kctrl_mute_put(struct snd_kcontrol *kcontrol,
+ return ret;
+ }
+
+-static const DECLARE_TLV_DB_LINEAR(idt821034_gain_in, -6520, 1306);
+-#define IDT821034_GAIN_IN_MIN_RAW 1 /* -65.20 dB -> 10^(-65.2/20.0) * 1820 = 1 */
+-#define IDT821034_GAIN_IN_MAX_RAW 8191 /* 13.06 dB -> 10^(13.06/20.0) * 1820 = 8191 */
++static const DECLARE_TLV_DB_LINEAR(idt821034_gain_in, -300, 1300);
++#define IDT821034_GAIN_IN_MIN_RAW 1288 /* -3.0 dB -> 10^(-3.0/20.0) * 1820 = 1288 */
++#define IDT821034_GAIN_IN_MAX_RAW 8130 /* 13.0 dB -> 10^(13.0/20.0) * 1820 = 8130 */
+ #define IDT821034_GAIN_IN_INIT_RAW 1820 /* 0dB -> 10^(0/20) * 1820 = 1820 */
+
+-static const DECLARE_TLV_DB_LINEAR(idt821034_gain_out, -6798, 1029);
+-#define IDT821034_GAIN_OUT_MIN_RAW 1 /* -67.98 dB -> 10^(-67.98/20.0) * 2506 = 1*/
+-#define IDT821034_GAIN_OUT_MAX_RAW 8191 /* 10.29 dB -> 10^(10.29/20.0) * 2506 = 8191 */
++static const DECLARE_TLV_DB_LINEAR(idt821034_gain_out, -1300, 300);
++#define IDT821034_GAIN_OUT_MIN_RAW 561 /* -13.0 dB -> 10^(-13.0/20.0) * 2506 = 561 */
++#define IDT821034_GAIN_OUT_MAX_RAW 3540 /* 3.0 dB -> 10^(3.0/20.0) * 2506 = 3540 */
+ #define IDT821034_GAIN_OUT_INIT_RAW 2506 /* 0dB -> 10^(0/20) * 2506 = 2506 */
+
+ static const struct snd_kcontrol_new idt821034_controls[] = {
+diff --git a/sound/soc/codecs/nau8821.c b/sound/soc/codecs/nau8821.c
+index edb95f869a4a6b..a8ff2ce70be9a9 100644
+--- a/sound/soc/codecs/nau8821.c
++++ b/sound/soc/codecs/nau8821.c
+@@ -26,7 +26,8 @@
+ #include <sound/tlv.h>
+ #include "nau8821.h"
+
+-#define NAU8821_JD_ACTIVE_HIGH BIT(0)
++#define NAU8821_QUIRK_JD_ACTIVE_HIGH BIT(0)
++#define NAU8821_QUIRK_JD_DB_BYPASS BIT(1)
+
+ static int nau8821_quirk;
+ static int quirk_override = -1;
+@@ -1021,12 +1022,17 @@ static bool nau8821_is_jack_inserted(struct regmap *regmap)
+ return active_high == is_high;
+ }
+
+-static void nau8821_int_status_clear_all(struct regmap *regmap)
++static void nau8821_irq_status_clear(struct regmap *regmap, int active_irq)
+ {
+- int active_irq, clear_irq, i;
++ int clear_irq, i;
+
+- /* Reset the intrruption status from rightmost bit if the corres-
+- * ponding irq event occurs.
++ if (active_irq) {
++ regmap_write(regmap, NAU8821_R11_INT_CLR_KEY_STATUS, active_irq);
++ return;
++ }
++
++ /* Reset the interruption status from rightmost bit if the
++ * corresponding irq event occurs.
+ */
+ regmap_read(regmap, NAU8821_R10_IRQ_STATUS, &active_irq);
+ for (i = 0; i < NAU8821_REG_DATA_LEN; i++) {
+@@ -1052,20 +1058,24 @@ static void nau8821_eject_jack(struct nau8821 *nau8821)
+ snd_soc_component_disable_pin(component, "MICBIAS");
+ snd_soc_dapm_sync(dapm);
+
++ /* Disable & mask both insertion & ejection IRQs */
++ regmap_update_bits(regmap, NAU8821_R12_INTERRUPT_DIS_CTRL,
++ NAU8821_IRQ_INSERT_DIS | NAU8821_IRQ_EJECT_DIS,
++ NAU8821_IRQ_INSERT_DIS | NAU8821_IRQ_EJECT_DIS);
++ regmap_update_bits(regmap, NAU8821_R0F_INTERRUPT_MASK,
++ NAU8821_IRQ_INSERT_EN | NAU8821_IRQ_EJECT_EN,
++ NAU8821_IRQ_INSERT_EN | NAU8821_IRQ_EJECT_EN);
++
+ /* Clear all interruption status */
+- nau8821_int_status_clear_all(regmap);
++ nau8821_irq_status_clear(regmap, 0);
+
+- /* Enable the insertion interruption, disable the ejection inter-
+- * ruption, and then bypass de-bounce circuit.
+- */
++ /* Enable & unmask the insertion IRQ */
+ regmap_update_bits(regmap, NAU8821_R12_INTERRUPT_DIS_CTRL,
+- NAU8821_IRQ_EJECT_DIS | NAU8821_IRQ_INSERT_DIS,
+- NAU8821_IRQ_EJECT_DIS);
+- /* Mask unneeded IRQs: 1 - disable, 0 - enable */
++ NAU8821_IRQ_INSERT_DIS, 0);
+ regmap_update_bits(regmap, NAU8821_R0F_INTERRUPT_MASK,
+- NAU8821_IRQ_EJECT_EN | NAU8821_IRQ_INSERT_EN,
+- NAU8821_IRQ_EJECT_EN);
++ NAU8821_IRQ_INSERT_EN, 0);
+
++ /* Bypass de-bounce circuit */
+ regmap_update_bits(regmap, NAU8821_R0D_JACK_DET_CTRL,
+ NAU8821_JACK_DET_DB_BYPASS, NAU8821_JACK_DET_DB_BYPASS);
+
+@@ -1089,7 +1099,6 @@ static void nau8821_eject_jack(struct nau8821 *nau8821)
+ NAU8821_IRQ_KEY_RELEASE_DIS |
+ NAU8821_IRQ_KEY_PRESS_DIS);
+ }
+-
+ }
+
+ static void nau8821_jdet_work(struct work_struct *work)
+@@ -1146,6 +1155,15 @@ static void nau8821_setup_inserted_irq(struct nau8821 *nau8821)
+ {
+ struct regmap *regmap = nau8821->regmap;
+
++ /* Disable & mask insertion IRQ */
++ regmap_update_bits(regmap, NAU8821_R12_INTERRUPT_DIS_CTRL,
++ NAU8821_IRQ_INSERT_DIS, NAU8821_IRQ_INSERT_DIS);
++ regmap_update_bits(regmap, NAU8821_R0F_INTERRUPT_MASK,
++ NAU8821_IRQ_INSERT_EN, NAU8821_IRQ_INSERT_EN);
++
++ /* Clear insert IRQ status */
++ nau8821_irq_status_clear(regmap, NAU8821_JACK_INSERT_DETECTED);
++
+ /* Enable internal VCO needed for interruptions */
+ if (nau8821->dapm->bias_level < SND_SOC_BIAS_PREPARE)
+ nau8821_configure_sysclk(nau8821, NAU8821_CLK_INTERNAL, 0);
+@@ -1160,21 +1178,23 @@ static void nau8821_setup_inserted_irq(struct nau8821 *nau8821)
+ regmap_update_bits(regmap, NAU8821_R1D_I2S_PCM_CTRL2,
+ NAU8821_I2S_MS_MASK, NAU8821_I2S_MS_SLAVE);
+
+- /* Not bypass de-bounce circuit */
+- regmap_update_bits(regmap, NAU8821_R0D_JACK_DET_CTRL,
+- NAU8821_JACK_DET_DB_BYPASS, 0);
++ /* Do not bypass de-bounce circuit */
++ if (!(nau8821_quirk & NAU8821_QUIRK_JD_DB_BYPASS))
++ regmap_update_bits(regmap, NAU8821_R0D_JACK_DET_CTRL,
++ NAU8821_JACK_DET_DB_BYPASS, 0);
+
++ /* Unmask & enable the ejection IRQs */
+ regmap_update_bits(regmap, NAU8821_R0F_INTERRUPT_MASK,
+- NAU8821_IRQ_EJECT_EN, 0);
++ NAU8821_IRQ_EJECT_EN, 0);
+ regmap_update_bits(regmap, NAU8821_R12_INTERRUPT_DIS_CTRL,
+- NAU8821_IRQ_EJECT_DIS, 0);
++ NAU8821_IRQ_EJECT_DIS, 0);
+ }
+
+ static irqreturn_t nau8821_interrupt(int irq, void *data)
+ {
+ struct nau8821 *nau8821 = (struct nau8821 *)data;
+ struct regmap *regmap = nau8821->regmap;
+- int active_irq, clear_irq = 0, event = 0, event_mask = 0;
++ int active_irq, event = 0, event_mask = 0;
+
+ if (regmap_read(regmap, NAU8821_R10_IRQ_STATUS, &active_irq)) {
+ dev_err(nau8821->dev, "failed to read irq status\n");
+@@ -1185,48 +1205,38 @@ static irqreturn_t nau8821_interrupt(int irq, void *data)
+
+ if ((active_irq & NAU8821_JACK_EJECT_IRQ_MASK) ==
+ NAU8821_JACK_EJECT_DETECTED) {
++ cancel_work_sync(&nau8821->jdet_work);
+ regmap_update_bits(regmap, NAU8821_R71_ANALOG_ADC_1,
+ NAU8821_MICDET_MASK, NAU8821_MICDET_DIS);
+ nau8821_eject_jack(nau8821);
+ event_mask |= SND_JACK_HEADSET;
+- clear_irq = NAU8821_JACK_EJECT_IRQ_MASK;
+ } else if (active_irq & NAU8821_KEY_SHORT_PRESS_IRQ) {
+ event |= NAU8821_BUTTON;
+ event_mask |= NAU8821_BUTTON;
+- clear_irq = NAU8821_KEY_SHORT_PRESS_IRQ;
++ nau8821_irq_status_clear(regmap, NAU8821_KEY_SHORT_PRESS_IRQ);
+ } else if (active_irq & NAU8821_KEY_RELEASE_IRQ) {
+ event_mask = NAU8821_BUTTON;
+- clear_irq = NAU8821_KEY_RELEASE_IRQ;
++ nau8821_irq_status_clear(regmap, NAU8821_KEY_RELEASE_IRQ);
+ } else if ((active_irq & NAU8821_JACK_INSERT_IRQ_MASK) ==
+ NAU8821_JACK_INSERT_DETECTED) {
++ cancel_work_sync(&nau8821->jdet_work);
+ regmap_update_bits(regmap, NAU8821_R71_ANALOG_ADC_1,
+ NAU8821_MICDET_MASK, NAU8821_MICDET_EN);
+ if (nau8821_is_jack_inserted(regmap)) {
+ /* detect microphone and jack type */
+- cancel_work_sync(&nau8821->jdet_work);
+ schedule_work(&nau8821->jdet_work);
+ /* Turn off insertion interruption at manual mode */
+- regmap_update_bits(regmap,
+- NAU8821_R12_INTERRUPT_DIS_CTRL,
+- NAU8821_IRQ_INSERT_DIS,
+- NAU8821_IRQ_INSERT_DIS);
+- regmap_update_bits(regmap,
+- NAU8821_R0F_INTERRUPT_MASK,
+- NAU8821_IRQ_INSERT_EN,
+- NAU8821_IRQ_INSERT_EN);
+ nau8821_setup_inserted_irq(nau8821);
+ } else {
+ dev_warn(nau8821->dev,
+ "Inserted IRQ fired but not connected\n");
+ nau8821_eject_jack(nau8821);
+ }
++ } else {
++ /* Clear the rightmost interrupt */
++ nau8821_irq_status_clear(regmap, active_irq);
+ }
+
+- if (!clear_irq)
+- clear_irq = active_irq;
+- /* clears the rightmost interruption */
+- regmap_write(regmap, NAU8821_R11_INT_CLR_KEY_STATUS, clear_irq);
+-
+ if (event_mask)
+ snd_soc_jack_report(nau8821->jack, event, event_mask);
+
+@@ -1521,7 +1531,7 @@ static int nau8821_resume_setup(struct nau8821 *nau8821)
+ nau8821_configure_sysclk(nau8821, NAU8821_CLK_DIS, 0);
+ if (nau8821->irq) {
+ /* Clear all interruption status */
+- nau8821_int_status_clear_all(regmap);
++ nau8821_irq_status_clear(regmap, 0);
+
+ /* Enable both insertion and ejection interruptions, and then
+ * bypass de-bounce circuit.
+@@ -1856,7 +1866,23 @@ static const struct dmi_system_id nau8821_quirk_table[] = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Positivo Tecnologia SA"),
+ DMI_MATCH(DMI_BOARD_NAME, "CW14Q01P-V2"),
+ },
+- .driver_data = (void *)(NAU8821_JD_ACTIVE_HIGH),
++ .driver_data = (void *)(NAU8821_QUIRK_JD_ACTIVE_HIGH),
++ },
++ {
++ /* Valve Steam Deck LCD */
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Valve"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "Jupiter"),
++ },
++ .driver_data = (void *)(NAU8821_QUIRK_JD_DB_BYPASS),
++ },
++ {
++ /* Valve Steam Deck OLED */
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Valve"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "Galileo"),
++ },
++ .driver_data = (void *)(NAU8821_QUIRK_JD_DB_BYPASS),
+ },
+ {}
+ };
+@@ -1898,9 +1924,12 @@ static int nau8821_i2c_probe(struct i2c_client *i2c)
+
+ nau8821_check_quirks();
+
+- if (nau8821_quirk & NAU8821_JD_ACTIVE_HIGH)
++ if (nau8821_quirk & NAU8821_QUIRK_JD_ACTIVE_HIGH)
+ nau8821->jkdet_polarity = 0;
+
++ if (nau8821_quirk & NAU8821_QUIRK_JD_DB_BYPASS)
++ dev_dbg(dev, "Force bypassing jack detection debounce circuit\n");
++
+ nau8821_print_device_properties(nau8821);
+
+ nau8821_reset_chip(nau8821->regmap);
+diff --git a/sound/usb/card.c b/sound/usb/card.c
+index 10d9b728559709..557f53d10ecfb0 100644
+--- a/sound/usb/card.c
++++ b/sound/usb/card.c
+@@ -850,10 +850,16 @@ get_alias_quirk(struct usb_device *dev, unsigned int id)
+ */
+ static int try_to_register_card(struct snd_usb_audio *chip, int ifnum)
+ {
++ struct usb_interface *iface;
++
+ if (check_delayed_register_option(chip) == ifnum ||
+- chip->last_iface == ifnum ||
+- usb_interface_claimed(usb_ifnum_to_if(chip->dev, chip->last_iface)))
++ chip->last_iface == ifnum)
++ return snd_card_register(chip->card);
++
++ iface = usb_ifnum_to_if(chip->dev, chip->last_iface);
++ if (iface && usb_interface_claimed(iface))
+ return snd_card_register(chip->card);
++
+ return 0;
+ }
+
+diff --git a/tools/testing/selftests/bpf/prog_tests/arg_parsing.c b/tools/testing/selftests/bpf/prog_tests/arg_parsing.c
+index bb143de68875cc..e27d66b75fb1fc 100644
+--- a/tools/testing/selftests/bpf/prog_tests/arg_parsing.c
++++ b/tools/testing/selftests/bpf/prog_tests/arg_parsing.c
+@@ -144,11 +144,17 @@ static void test_parse_test_list_file(void)
+ if (!ASSERT_OK(ferror(fp), "prepare tmp"))
+ goto out_fclose;
+
++ if (!ASSERT_OK(fsync(fileno(fp)), "fsync tmp"))
++ goto out_fclose;
++
+ init_test_filter_set(&set);
+
+- ASSERT_OK(parse_test_list_file(tmpfile, &set, true), "parse file");
++ if (!ASSERT_OK(parse_test_list_file(tmpfile, &set, true), "parse file"))
++ goto out_fclose;
++
++ if (!ASSERT_EQ(set.cnt, 4, "test count"))
++ goto out_free_set;
+
+- ASSERT_EQ(set.cnt, 4, "test count");
+ ASSERT_OK(strcmp("test_with_spaces", set.tests[0].name), "test 0 name");
+ ASSERT_EQ(set.tests[0].subtest_cnt, 0, "test 0 subtest count");
+ ASSERT_OK(strcmp("testA", set.tests[1].name), "test 1 name");
+@@ -158,8 +164,8 @@ static void test_parse_test_list_file(void)
+ ASSERT_OK(strcmp("testB", set.tests[2].name), "test 2 name");
+ ASSERT_OK(strcmp("testC_no_eof_newline", set.tests[3].name), "test 3 name");
+
++out_free_set:
+ free_test_filter_set(&set);
+-
+ out_fclose:
+ fclose(fp);
+ out_remove:
+diff --git a/tools/testing/selftests/net/rtnetlink.sh b/tools/testing/selftests/net/rtnetlink.sh
+index d6c00efeb66423..281758e4078880 100755
+--- a/tools/testing/selftests/net/rtnetlink.sh
++++ b/tools/testing/selftests/net/rtnetlink.sh
+@@ -1453,6 +1453,8 @@ usage: ${0##*/} OPTS
+ EOF
+ }
+
++require_command jq
++
+ #check for needed privileges
+ if [ "$(id -u)" -ne 0 ];then
+ end_test "SKIP: Need root privileges"
+diff --git a/tools/testing/selftests/net/vlan_bridge_binding.sh b/tools/testing/selftests/net/vlan_bridge_binding.sh
+index e7cb8c678bdeea..fe5472d844243a 100755
+--- a/tools/testing/selftests/net/vlan_bridge_binding.sh
++++ b/tools/testing/selftests/net/vlan_bridge_binding.sh
+@@ -249,6 +249,8 @@ test_binding_toggle_off_when_upper_down()
+ do_test_binding_off : "on->off when upper down"
+ }
+
++require_command jq
++
+ trap defer_scopes_cleanup EXIT
+ setup_prepare
+ tests_run
* [gentoo-commits] proj/linux-patches:6.17 commit in: /
@ 2025-10-27 14:40 Arisu Tachibana
0 siblings, 0 replies; 20+ messages in thread
From: Arisu Tachibana @ 2025-10-27 14:40 UTC (permalink / raw
To: gentoo-commits
commit: 635270aa5e057c2179a7edc698854d621b0eeade
Author: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 27 14:25:50 2025 +0000
Commit: Arisu Tachibana <alicef <AT> gentoo <DOT> org>
CommitDate: Mon Oct 27 14:39:58 2025 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=635270aa
Add patch 1710 disable sse4a
Signed-off-by: Arisu Tachibana <alicef <AT> gentoo.org>
0000_README | 4 ++++
1710_disable_sse4a.patch | 38 ++++++++++++++++++++++++++++++++++++++
2 files changed, 42 insertions(+)
diff --git a/0000_README b/0000_README
index f15fc4f0..6ad0c146 100644
--- a/0000_README
+++ b/0000_README
@@ -71,6 +71,10 @@ Patch: 1700_sparc-address-warray-bound-warnings.patch
From: https://github.com/KSPP/linux/issues/109
Desc: Address -Warray-bounds warnings
+Patch: 1710_disable_sse4a.patch
+From: https://lore.kernel.org/all/20251027124049.GAaP9oUaUtzzHUK4j4@fat_crate.local/
+Desc: Disable SSE4A
+
Patch: 1730_parisc-Disable-prctl.patch
From: https://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux.git
Desc: prctl: Temporarily disable prctl(PR_SET_MDWE) on parisc
diff --git a/1710_disable_sse4a.patch b/1710_disable_sse4a.patch
new file mode 100644
index 00000000..55283380
--- /dev/null
+++ b/1710_disable_sse4a.patch
@@ -0,0 +1,38 @@
+Subject: x86: Disable SSE4A
+Sent: October 27, 2025 11:40:59 AM UTC
+From: Peter Zijlstra <peterz@infradead.org>
+To: x86@kernel.org, Leyvi Rose <leyvirose@gmail.com>
+Cc: Samuel Holland <samuel.holland@sifive.com>, "Christian König" <christian.koenig@amd.com>, Masami Hiramatsu <mhiramat@kernel.org>
+
+Hi,
+
+Leyvi Rose reported that his X86_NATIVE_CPU=y build is failing because
+our instruction decoder doesn't support SSE4A and the AMDGPU code seems
+to be generating those with his compiler of choice (CLANG+LTO).
+
+Now, our normal build flags disable SSE MMX SSE2 3DNOW AVX, but then
+CC_FLAGS_FPU re-enable SSE SSE2.
+
+Since nothing mentions SSE3 or SSE4, I'm assuming that -msse (or its
+negative) control all SSE variants -- but why then explicitly enumerate
+SSE2 ?
+
+Anyway, until the instruction decoder gets fixed, explicitly disallow
+SSE4A (an AMD specific SSE4 extension).
+
+Fixes: ea1dcca1de12 ("x86/kbuild/64: Add the CONFIG_X86_NATIVE_CPU option to locally optimize the kernel with '-march=native'")
+Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
+---
+
+diff --git a/arch/x86/Makefile b/arch/x86/Makefile
+index 4db7e4bf69f5..8fbff3106c56 100644
+--- a/arch/x86/Makefile
++++ b/arch/x86/Makefile
+@@ -75,7 +75,7 @@ export BITS
+ #
+ # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53383
+ #
+-KBUILD_CFLAGS += -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx
++KBUILD_CFLAGS += -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -mno-sse4a
+ KBUILD_RUSTFLAGS += --target=$(objtree)/scripts/target.json
+ KBUILD_RUSTFLAGS += -Ctarget-feature=-sse,-sse2,-sse3,-ssse3,-sse4.1,-sse4.2,-avx,-avx2