From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Alice Ferrazzi" To: gentoo-commits@lists.gentoo.org Content-Transfer-Encoding: 8bit Content-type: text/plain; charset=UTF-8 Reply-To: gentoo-dev@lists.gentoo.org, "Alice Ferrazzi" Message-ID: <1516180455.d3b3c9bf79f9eaafdddc4026c380a64e302506f2.alicef@gentoo> Subject: [gentoo-commits] proj/linux-patches:4.14 commit in: / X-VCS-Repository: proj/linux-patches X-VCS-Files: 0000_README 1013_linux-4.14.14.patch X-VCS-Directories: / X-VCS-Committer: alicef X-VCS-Committer-Name: Alice Ferrazzi X-VCS-Revision: d3b3c9bf79f9eaafdddc4026c380a64e302506f2 X-VCS-Branch: 4.14 Date: Wed, 17 Jan 2018 09:14:42 +0000 (UTC) Precedence: bulk List-Id: Gentoo Linux mail X-BeenThere: gentoo-commits@lists.gentoo.org 
X-Archives-Salt: 5077cbf4-f95c-4a12-a35f-ab5efc001f86 X-Archives-Hash: 3982b75a9276d78eeff8c9a54e24e8aa commit: d3b3c9bf79f9eaafdddc4026c380a64e302506f2 Author: Alice Ferrazzi gentoo org> AuthorDate: Wed Jan 17 09:14:15 2018 +0000 Commit: Alice Ferrazzi gentoo org> CommitDate: Wed Jan 17 09:14:15 2018 +0000 URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=d3b3c9bf added 4.14.14 patches 0000_README | 4 + 1013_linux-4.14.14.patch | 5912 ++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 5916 insertions(+) diff --git a/0000_README b/0000_README index a90b52d..84f3396 100644 --- a/0000_README +++ b/0000_README @@ -95,6 +95,10 @@ Patch: 1012_linux-4.14.13.patch From: http://www.kernel.org Desc: Linux 4.14.13 +Patch: 1013_linux-4.14.14.patch +From: http://www.kernel.org +Desc: Linux 4.14.14 + Patch: 1500_XATTR_USER_PREFIX.patch From: https://bugs.gentoo.org/show_bug.cgi?id=470644 Desc: Support for namespace user.pax.* on tmpfs. diff --git a/1013_linux-4.14.14.patch b/1013_linux-4.14.14.patch new file mode 100644 index 0000000..2fb5589 --- /dev/null +++ b/1013_linux-4.14.14.patch @@ -0,0 +1,5912 @@ +diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu +index f3d5817c4ef0..258902db14bf 100644 +--- a/Documentation/ABI/testing/sysfs-devices-system-cpu ++++ b/Documentation/ABI/testing/sysfs-devices-system-cpu +@@ -373,3 +373,19 @@ Contact: Linux kernel mailing list + Description: information about CPUs heterogeneity. + + cpu_capacity: capacity of cpu#. ++ ++What: /sys/devices/system/cpu/vulnerabilities ++ /sys/devices/system/cpu/vulnerabilities/meltdown ++ /sys/devices/system/cpu/vulnerabilities/spectre_v1 ++ /sys/devices/system/cpu/vulnerabilities/spectre_v2 ++Date: January 2018 ++Contact: Linux kernel mailing list ++Description: Information about CPU vulnerabilities ++ ++ The files are named after the code names of CPU ++ vulnerabilities. 
The output of those files reflects the ++ state of the CPUs in the system. Possible output values: ++ ++ "Not affected" CPU is not affected by the vulnerability ++ "Vulnerable" CPU is affected and no mitigation in effect ++ "Mitigation: $M" CPU is affected and mitigation $M is in effect +diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt +index 520fdec15bbb..8122b5f98ea1 100644 +--- a/Documentation/admin-guide/kernel-parameters.txt ++++ b/Documentation/admin-guide/kernel-parameters.txt +@@ -2599,6 +2599,11 @@ + nosmt [KNL,S390] Disable symmetric multithreading (SMT). + Equivalent to smt=1. + ++ nospectre_v2 [X86] Disable all mitigations for the Spectre variant 2 ++ (indirect branch prediction) vulnerability. System may ++ allow data leaks with this option, which is equivalent ++ to spectre_v2=off. ++ + noxsave [BUGS=X86] Disables x86 extended register state save + and restore using xsave. The kernel will fallback to + enabling legacy floating-point and sse state. +@@ -2685,8 +2690,6 @@ + steal time is computed, but won't influence scheduler + behaviour + +- nopti [X86-64] Disable kernel page table isolation +- + nolapic [X86-32,APIC] Do not enable or use the local APIC. + + nolapic_timer [X86-32,APIC] Do not use the local APIC timer. +@@ -3255,11 +3258,20 @@ + pt. [PARIDE] + See Documentation/blockdev/paride.txt. + +- pti= [X86_64] +- Control user/kernel address space isolation: +- on - enable +- off - disable +- auto - default setting ++ pti= [X86_64] Control Page Table Isolation of user and ++ kernel address spaces. Disabling this feature ++ removes hardening, but improves performance of ++ system calls and interrupts. ++ ++ on - unconditionally enable ++ off - unconditionally disable ++ auto - kernel detects whether your CPU model is ++ vulnerable to issues that PTI mitigates ++ ++ Not specifying this option is equivalent to pti=auto. 
++ ++ nopti [X86_64] ++ Equivalent to pti=off + + pty.legacy_count= + [KNL] Number of legacy pty's. Overwrites compiled-in +@@ -3901,6 +3913,29 @@ + sonypi.*= [HW] Sony Programmable I/O Control Device driver + See Documentation/laptops/sonypi.txt + ++ spectre_v2= [X86] Control mitigation of Spectre variant 2 ++ (indirect branch speculation) vulnerability. ++ ++ on - unconditionally enable ++ off - unconditionally disable ++ auto - kernel detects whether your CPU model is ++ vulnerable ++ ++ Selecting 'on' will, and 'auto' may, choose a ++ mitigation method at run time according to the ++ CPU, the available microcode, the setting of the ++ CONFIG_RETPOLINE configuration option, and the ++ compiler with which the kernel was built. ++ ++ Specific mitigations can also be selected manually: ++ ++ retpoline - replace indirect branches ++ retpoline,generic - google's original retpoline ++ retpoline,amd - AMD-specific minimal thunk ++ ++ Not specifying this option is equivalent to ++ spectre_v2=auto. ++ + spia_io_base= [HW,MTD] + spia_fio_base= + spia_pedr= +diff --git a/Documentation/x86/pti.txt b/Documentation/x86/pti.txt +new file mode 100644 +index 000000000000..d11eff61fc9a +--- /dev/null ++++ b/Documentation/x86/pti.txt +@@ -0,0 +1,186 @@ ++Overview ++======== ++ ++Page Table Isolation (pti, previously known as KAISER[1]) is a ++countermeasure against attacks on the shared user/kernel address ++space such as the "Meltdown" approach[2]. ++ ++To mitigate this class of attacks, we create an independent set of ++page tables for use only when running userspace applications. When ++the kernel is entered via syscalls, interrupts or exceptions, the ++page tables are switched to the full "kernel" copy. When the system ++switches back to user mode, the user copy is used again. 
++ ++The userspace page tables contain only a minimal amount of kernel ++data: only what is needed to enter/exit the kernel such as the ++entry/exit functions themselves and the interrupt descriptor table ++(IDT). There are a few strictly unnecessary things that get mapped ++such as the first C function when entering an interrupt (see ++comments in pti.c). ++ ++This approach helps to ensure that side-channel attacks leveraging ++the paging structures do not function when PTI is enabled. It can be ++enabled by setting CONFIG_PAGE_TABLE_ISOLATION=y at compile time. ++Once enabled at compile-time, it can be disabled at boot with the ++'nopti' or 'pti=' kernel parameters (see kernel-parameters.txt). ++ ++Page Table Management ++===================== ++ ++When PTI is enabled, the kernel manages two sets of page tables. ++The first set is very similar to the single set which is present in ++kernels without PTI. This includes a complete mapping of userspace ++that the kernel can use for things like copy_to_user(). ++ ++Although _complete_, the user portion of the kernel page tables is ++crippled by setting the NX bit in the top level. This ensures ++that any missed kernel->user CR3 switch will immediately crash ++userspace upon executing its first instruction. ++ ++The userspace page tables map only the kernel data needed to enter ++and exit the kernel. This data is entirely contained in the 'struct ++cpu_entry_area' structure which is placed in the fixmap which gives ++each CPU's copy of the area a compile-time-fixed virtual address. ++ ++For new userspace mappings, the kernel makes the entries in its ++page tables like normal. The only difference is when the kernel ++makes entries in the top (PGD) level. In addition to setting the ++entry in the main kernel PGD, a copy of the entry is made in the ++userspace page tables' PGD. ++ ++This sharing at the PGD level also inherently shares all the lower ++layers of the page tables. 
This leaves a single, shared set of ++userspace page tables to manage. One PTE to lock, one set of ++accessed bits, dirty bits, etc... ++ ++Overhead ++======== ++ ++Protection against side-channel attacks is important. But, ++this protection comes at a cost: ++ ++1. Increased Memory Use ++ a. Each process now needs an order-1 PGD instead of order-0. ++ (Consumes an additional 4k per process). ++ b. The 'cpu_entry_area' structure must be 2MB in size and 2MB ++ aligned so that it can be mapped by setting a single PMD ++ entry. This consumes nearly 2MB of RAM once the kernel ++ is decompressed, but no space in the kernel image itself. ++ ++2. Runtime Cost ++ a. CR3 manipulation to switch between the page table copies ++ must be done at interrupt, syscall, and exception entry ++ and exit (it can be skipped when the kernel is interrupted, ++ though.) Moves to CR3 are on the order of a hundred ++ cycles, and are required at every entry and exit. ++ b. A "trampoline" must be used for SYSCALL entry. This ++ trampoline depends on a smaller set of resources than the ++ non-PTI SYSCALL entry code, so requires mapping fewer ++ things into the userspace page tables. The downside is ++ that stacks must be switched at entry time. ++ c. Global pages are disabled for all kernel structures not ++ mapped into both kernel and userspace page tables. This ++ feature of the MMU allows different processes to share TLB ++ entries mapping the kernel. Losing the feature means more ++ TLB misses after a context switch. The actual loss of ++ performance is very small, however, never exceeding 1%. ++ d. Process Context IDentifiers (PCID) is a CPU feature that ++ allows us to skip flushing the entire TLB when switching page ++ tables by setting a special bit in CR3 when the page tables ++ are changed. This makes switching the page tables (at context ++ switch, or kernel entry/exit) cheaper. 
But, on systems with ++ PCID support, the context switch code must flush both the user ++ and kernel entries out of the TLB. The user PCID TLB flush is ++ deferred until the exit to userspace, minimizing the cost. ++ See intel.com/sdm for the gory PCID/INVPCID details. ++ e. The userspace page tables must be populated for each new ++ process. Even without PTI, the shared kernel mappings ++ are created by copying top-level (PGD) entries into each ++ new process. But, with PTI, there are now *two* kernel ++ mappings: one in the kernel page tables that maps everything ++ and one for the entry/exit structures. At fork(), we need to ++ copy both. ++ f. In addition to the fork()-time copying, there must also ++ be an update to the userspace PGD any time a set_pgd() is done ++ on a PGD used to map userspace. This ensures that the kernel ++ and userspace copies always map the same userspace ++ memory. ++ g. On systems without PCID support, each CR3 write flushes ++ the entire TLB. That means that each syscall, interrupt ++ or exception flushes the TLB. ++ h. INVPCID is a TLB-flushing instruction which allows flushing ++ of TLB entries for non-current PCIDs. Some systems support ++ PCIDs, but do not support INVPCID. On these systems, addresses ++ can only be flushed from the TLB for the current PCID. When ++ flushing a kernel address, we need to flush all PCIDs, so a ++ single kernel address flush will require a TLB-flushing CR3 ++ write upon the next use of every PCID. ++ ++Possible Future Work ++==================== ++1. We can be more careful about not actually writing to CR3 ++ unless its value is actually changed. ++2. Allow PTI to be enabled/disabled at runtime in addition to the ++ boot-time switching. ++ ++Testing ++======== ++ ++To test stability of PTI, the following test procedure is recommended, ++ideally doing all of these in parallel: ++ ++1. Set CONFIG_DEBUG_ENTRY=y ++2. 
Run several copies of all of the tools/testing/selftests/x86/ tests ++ (excluding MPX and protection_keys) in a loop on multiple CPUs for ++ several minutes. These tests frequently uncover corner cases in the ++ kernel entry code. In general, old kernels might cause these tests ++ themselves to crash, but they should never crash the kernel. ++3. Run the 'perf' tool in a mode (top or record) that generates many ++ frequent performance monitoring non-maskable interrupts (see "NMI" ++ in /proc/interrupts). This exercises the NMI entry/exit code which ++ is known to trigger bugs in code paths that did not expect to be ++ interrupted, including nested NMIs. Using "-c" boosts the rate of ++ NMIs, and using two -c with separate counters encourages nested NMIs ++ and less deterministic behavior. ++ ++ while true; do perf record -c 10000 -e instructions,cycles -a sleep 10; done ++ ++4. Launch a KVM virtual machine. ++5. Run 32-bit binaries on systems supporting the SYSCALL instruction. ++ This has been a lightly-tested code path and needs extra scrutiny. ++ ++Debugging ++========= ++ ++Bugs in PTI cause a few different signatures of crashes ++that are worth noting here. ++ ++ * Failures of the selftests/x86 code. Usually a bug in one of the ++ more obscure corners of entry_64.S ++ * Crashes in early boot, especially around CPU bringup. Bugs ++ in the trampoline code or mappings cause these. ++ * Crashes at the first interrupt. Caused by bugs in entry_64.S, ++ like screwing up a page table switch. Also caused by ++ incorrectly mapping the IRQ handler entry code. ++ * Crashes at the first NMI. The NMI code is separate from main ++ interrupt handlers and can have bugs that do not affect ++ normal interrupts. Also caused by incorrectly mapping NMI ++ code. NMIs that interrupt the entry code must be very ++ careful and can be the cause of crashes that show up when ++ running perf. ++ * Kernel crashes at the first exit to userspace. 
entry_64.S ++ bugs, or failing to map some of the exit code. ++ * Crashes at first interrupt that interrupts userspace. The paths ++ in entry_64.S that return to userspace are sometimes separate ++ from the ones that return to the kernel. ++ * Double faults: overflowing the kernel stack because of page ++ faults upon page faults. Caused by touching non-pti-mapped ++ data in the entry code, or forgetting to switch to kernel ++ CR3 before calling into C functions which are not pti-mapped. ++ * Userspace segfaults early in boot, sometimes manifesting ++ as mount(8) failing to mount the rootfs. These have ++ tended to be TLB invalidation issues. Usually invalidating ++ the wrong PCID, or otherwise missing an invalidation. ++ ++1. https://gruss.cc/files/kaiser.pdf ++2. https://meltdownattack.com/meltdown.pdf +diff --git a/Makefile b/Makefile +index a67c5179052a..4951305eb867 100644 +--- a/Makefile ++++ b/Makefile +@@ -1,7 +1,7 @@ + # SPDX-License-Identifier: GPL-2.0 + VERSION = 4 + PATCHLEVEL = 14 +-SUBLEVEL = 13 ++SUBLEVEL = 14 + EXTRAVERSION = + NAME = Petit Gorille + +diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c +index c5ff6bfe2825..2f2d176396aa 100644 +--- a/arch/mips/kernel/process.c ++++ b/arch/mips/kernel/process.c +@@ -705,6 +705,18 @@ int mips_set_process_fp_mode(struct task_struct *task, unsigned int value) + struct task_struct *t; + int max_users; + ++ /* If nothing to change, return right away, successfully. */ ++ if (value == mips_get_process_fp_mode(task)) ++ return 0; ++ ++ /* Only accept a mode change if 64-bit FP enabled for o32. */ ++ if (!IS_ENABLED(CONFIG_MIPS_O32_FP64_SUPPORT)) ++ return -EOPNOTSUPP; ++ ++ /* And only for o32 tasks. 
*/ ++ if (IS_ENABLED(CONFIG_64BIT) && !test_thread_flag(TIF_32BIT_REGS)) ++ return -EOPNOTSUPP; ++ + /* Check the value is valid */ + if (value & ~known_bits) + return -EOPNOTSUPP; +diff --git a/arch/mips/kernel/ptrace.c b/arch/mips/kernel/ptrace.c +index 5a09c2901a76..c552c20237d4 100644 +--- a/arch/mips/kernel/ptrace.c ++++ b/arch/mips/kernel/ptrace.c +@@ -410,63 +410,160 @@ static int gpr64_set(struct task_struct *target, + + #endif /* CONFIG_64BIT */ + ++/* ++ * Copy the floating-point context to the supplied NT_PRFPREG buffer, ++ * !CONFIG_CPU_HAS_MSA variant. FP context's general register slots ++ * correspond 1:1 to buffer slots. Only general registers are copied. ++ */ ++static int fpr_get_fpa(struct task_struct *target, ++ unsigned int *pos, unsigned int *count, ++ void **kbuf, void __user **ubuf) ++{ ++ return user_regset_copyout(pos, count, kbuf, ubuf, ++ &target->thread.fpu, ++ 0, NUM_FPU_REGS * sizeof(elf_fpreg_t)); ++} ++ ++/* ++ * Copy the floating-point context to the supplied NT_PRFPREG buffer, ++ * CONFIG_CPU_HAS_MSA variant. Only lower 64 bits of FP context's ++ * general register slots are copied to buffer slots. Only general ++ * registers are copied. ++ */ ++static int fpr_get_msa(struct task_struct *target, ++ unsigned int *pos, unsigned int *count, ++ void **kbuf, void __user **ubuf) ++{ ++ unsigned int i; ++ u64 fpr_val; ++ int err; ++ ++ BUILD_BUG_ON(sizeof(fpr_val) != sizeof(elf_fpreg_t)); ++ for (i = 0; i < NUM_FPU_REGS; i++) { ++ fpr_val = get_fpr64(&target->thread.fpu.fpr[i], 0); ++ err = user_regset_copyout(pos, count, kbuf, ubuf, ++ &fpr_val, i * sizeof(elf_fpreg_t), ++ (i + 1) * sizeof(elf_fpreg_t)); ++ if (err) ++ return err; ++ } ++ ++ return 0; ++} ++ ++/* ++ * Copy the floating-point context to the supplied NT_PRFPREG buffer. ++ * Choose the appropriate helper for general registers, and then copy ++ * the FCSR register separately. 
++ */ + static int fpr_get(struct task_struct *target, + const struct user_regset *regset, + unsigned int pos, unsigned int count, + void *kbuf, void __user *ubuf) + { +- unsigned i; ++ const int fcr31_pos = NUM_FPU_REGS * sizeof(elf_fpreg_t); + int err; +- u64 fpr_val; + +- /* XXX fcr31 */ ++ if (sizeof(target->thread.fpu.fpr[0]) == sizeof(elf_fpreg_t)) ++ err = fpr_get_fpa(target, &pos, &count, &kbuf, &ubuf); ++ else ++ err = fpr_get_msa(target, &pos, &count, &kbuf, &ubuf); ++ if (err) ++ return err; + +- if (sizeof(target->thread.fpu.fpr[i]) == sizeof(elf_fpreg_t)) +- return user_regset_copyout(&pos, &count, &kbuf, &ubuf, +- &target->thread.fpu, +- 0, sizeof(elf_fpregset_t)); ++ err = user_regset_copyout(&pos, &count, &kbuf, &ubuf, ++ &target->thread.fpu.fcr31, ++ fcr31_pos, fcr31_pos + sizeof(u32)); + +- for (i = 0; i < NUM_FPU_REGS; i++) { +- fpr_val = get_fpr64(&target->thread.fpu.fpr[i], 0); +- err = user_regset_copyout(&pos, &count, &kbuf, &ubuf, +- &fpr_val, i * sizeof(elf_fpreg_t), +- (i + 1) * sizeof(elf_fpreg_t)); ++ return err; ++} ++ ++/* ++ * Copy the supplied NT_PRFPREG buffer to the floating-point context, ++ * !CONFIG_CPU_HAS_MSA variant. Buffer slots correspond 1:1 to FP ++ * context's general register slots. Only general registers are copied. ++ */ ++static int fpr_set_fpa(struct task_struct *target, ++ unsigned int *pos, unsigned int *count, ++ const void **kbuf, const void __user **ubuf) ++{ ++ return user_regset_copyin(pos, count, kbuf, ubuf, ++ &target->thread.fpu, ++ 0, NUM_FPU_REGS * sizeof(elf_fpreg_t)); ++} ++ ++/* ++ * Copy the supplied NT_PRFPREG buffer to the floating-point context, ++ * CONFIG_CPU_HAS_MSA variant. Buffer slots are copied to lower 64 ++ * bits only of FP context's general register slots. Only general ++ * registers are copied. 
++ */ ++static int fpr_set_msa(struct task_struct *target, ++ unsigned int *pos, unsigned int *count, ++ const void **kbuf, const void __user **ubuf) ++{ ++ unsigned int i; ++ u64 fpr_val; ++ int err; ++ ++ BUILD_BUG_ON(sizeof(fpr_val) != sizeof(elf_fpreg_t)); ++ for (i = 0; i < NUM_FPU_REGS && *count > 0; i++) { ++ err = user_regset_copyin(pos, count, kbuf, ubuf, ++ &fpr_val, i * sizeof(elf_fpreg_t), ++ (i + 1) * sizeof(elf_fpreg_t)); + if (err) + return err; ++ set_fpr64(&target->thread.fpu.fpr[i], 0, fpr_val); + } + + return 0; + } + ++/* ++ * Copy the supplied NT_PRFPREG buffer to the floating-point context. ++ * Choose the appropriate helper for general registers, and then copy ++ * the FCSR register separately. ++ * ++ * We optimize for the case where `count % sizeof(elf_fpreg_t) == 0', ++ * which is supposed to have been guaranteed by the kernel before ++ * calling us, e.g. in `ptrace_regset'. We enforce that requirement, ++ * so that we can safely avoid preinitializing temporaries for ++ * partial register writes. 
++ */ + static int fpr_set(struct task_struct *target, + const struct user_regset *regset, + unsigned int pos, unsigned int count, + const void *kbuf, const void __user *ubuf) + { +- unsigned i; ++ const int fcr31_pos = NUM_FPU_REGS * sizeof(elf_fpreg_t); ++ u32 fcr31; + int err; +- u64 fpr_val; + +- /* XXX fcr31 */ ++ BUG_ON(count % sizeof(elf_fpreg_t)); ++ ++ if (pos + count > sizeof(elf_fpregset_t)) ++ return -EIO; + + init_fp_ctx(target); + +- if (sizeof(target->thread.fpu.fpr[i]) == sizeof(elf_fpreg_t)) +- return user_regset_copyin(&pos, &count, &kbuf, &ubuf, +- &target->thread.fpu, +- 0, sizeof(elf_fpregset_t)); ++ if (sizeof(target->thread.fpu.fpr[0]) == sizeof(elf_fpreg_t)) ++ err = fpr_set_fpa(target, &pos, &count, &kbuf, &ubuf); ++ else ++ err = fpr_set_msa(target, &pos, &count, &kbuf, &ubuf); ++ if (err) ++ return err; + +- BUILD_BUG_ON(sizeof(fpr_val) != sizeof(elf_fpreg_t)); +- for (i = 0; i < NUM_FPU_REGS && count >= sizeof(elf_fpreg_t); i++) { ++ if (count > 0) { + err = user_regset_copyin(&pos, &count, &kbuf, &ubuf, +- &fpr_val, i * sizeof(elf_fpreg_t), +- (i + 1) * sizeof(elf_fpreg_t)); ++ &fcr31, ++ fcr31_pos, fcr31_pos + sizeof(u32)); + if (err) + return err; +- set_fpr64(&target->thread.fpu.fpr[i], 0, fpr_val); ++ ++ ptrace_setfcr31(target, fcr31); + } + +- return 0; ++ return err; + } + + enum mips_regset { +diff --git a/arch/powerpc/kvm/book3s_64_mmu.c b/arch/powerpc/kvm/book3s_64_mmu.c +index 29ebe2fd5867..a93d719edc90 100644 +--- a/arch/powerpc/kvm/book3s_64_mmu.c ++++ b/arch/powerpc/kvm/book3s_64_mmu.c +@@ -235,6 +235,7 @@ static int kvmppc_mmu_book3s_64_xlate(struct kvm_vcpu *vcpu, gva_t eaddr, + gpte->may_read = true; + gpte->may_write = true; + gpte->page_size = MMU_PAGE_4K; ++ gpte->wimg = HPTE_R_M; + + return 0; + } +diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c +index 59247af5fd45..2645d484e945 100644 +--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c ++++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c +@@ 
-65,11 +65,17 @@ struct kvm_resize_hpt { + u32 order; + + /* These fields protected by kvm->lock */ ++ ++ /* Possible values and their usage: ++ * <0 an error occurred during allocation, ++ * -EBUSY allocation is in progress, ++ * 0 allocation made successfully. ++ */ + int error; +- bool prepare_done; + +- /* Private to the work thread, until prepare_done is true, +- * then protected by kvm->resize_hpt_sem */ ++ /* Private to the work thread, until error != -EBUSY, ++ * then protected by kvm->lock. ++ */ + struct kvm_hpt_info hpt; + }; + +@@ -159,8 +165,6 @@ long kvmppc_alloc_reset_hpt(struct kvm *kvm, int order) + * Reset all the reverse-mapping chains for all memslots + */ + kvmppc_rmap_reset(kvm); +- /* Ensure that each vcpu will flush its TLB on next entry. */ +- cpumask_setall(&kvm->arch.need_tlb_flush); + err = 0; + goto out; + } +@@ -176,6 +180,10 @@ long kvmppc_alloc_reset_hpt(struct kvm *kvm, int order) + kvmppc_set_hpt(kvm, &info); + + out: ++ if (err == 0) ++ /* Ensure that each vcpu will flush its TLB on next entry. 
*/ ++ cpumask_setall(&kvm->arch.need_tlb_flush); ++ + mutex_unlock(&kvm->lock); + return err; + } +@@ -1424,16 +1432,20 @@ static void resize_hpt_pivot(struct kvm_resize_hpt *resize) + + static void resize_hpt_release(struct kvm *kvm, struct kvm_resize_hpt *resize) + { +- BUG_ON(kvm->arch.resize_hpt != resize); ++ if (WARN_ON(!mutex_is_locked(&kvm->lock))) ++ return; + + if (!resize) + return; + +- if (resize->hpt.virt) +- kvmppc_free_hpt(&resize->hpt); ++ if (resize->error != -EBUSY) { ++ if (resize->hpt.virt) ++ kvmppc_free_hpt(&resize->hpt); ++ kfree(resize); ++ } + +- kvm->arch.resize_hpt = NULL; +- kfree(resize); ++ if (kvm->arch.resize_hpt == resize) ++ kvm->arch.resize_hpt = NULL; + } + + static void resize_hpt_prepare_work(struct work_struct *work) +@@ -1442,17 +1454,41 @@ static void resize_hpt_prepare_work(struct work_struct *work) + struct kvm_resize_hpt, + work); + struct kvm *kvm = resize->kvm; +- int err; ++ int err = 0; + +- resize_hpt_debug(resize, "resize_hpt_prepare_work(): order = %d\n", +- resize->order); +- +- err = resize_hpt_allocate(resize); ++ if (WARN_ON(resize->error != -EBUSY)) ++ return; + + mutex_lock(&kvm->lock); + ++ /* Request is still current? */ ++ if (kvm->arch.resize_hpt == resize) { ++ /* We may request large allocations here: ++ * do not sleep with kvm->lock held for a while. ++ */ ++ mutex_unlock(&kvm->lock); ++ ++ resize_hpt_debug(resize, "resize_hpt_prepare_work(): order = %d\n", ++ resize->order); ++ ++ err = resize_hpt_allocate(resize); ++ ++ /* We have strict assumption about -EBUSY ++ * when preparing for HPT resize. ++ */ ++ if (WARN_ON(err == -EBUSY)) ++ err = -EINPROGRESS; ++ ++ mutex_lock(&kvm->lock); ++ /* It is possible that kvm->arch.resize_hpt != resize ++ * after we grab kvm->lock again. 
++ */ ++ } ++ + resize->error = err; +- resize->prepare_done = true; ++ ++ if (kvm->arch.resize_hpt != resize) ++ resize_hpt_release(kvm, resize); + + mutex_unlock(&kvm->lock); + } +@@ -1477,14 +1513,12 @@ long kvm_vm_ioctl_resize_hpt_prepare(struct kvm *kvm, + + if (resize) { + if (resize->order == shift) { +- /* Suitable resize in progress */ +- if (resize->prepare_done) { +- ret = resize->error; +- if (ret != 0) +- resize_hpt_release(kvm, resize); +- } else { ++ /* Suitable resize in progress? */ ++ ret = resize->error; ++ if (ret == -EBUSY) + ret = 100; /* estimated time in ms */ +- } ++ else if (ret) ++ resize_hpt_release(kvm, resize); + + goto out; + } +@@ -1504,6 +1538,8 @@ long kvm_vm_ioctl_resize_hpt_prepare(struct kvm *kvm, + ret = -ENOMEM; + goto out; + } ++ ++ resize->error = -EBUSY; + resize->order = shift; + resize->kvm = kvm; + INIT_WORK(&resize->work, resize_hpt_prepare_work); +@@ -1558,16 +1594,12 @@ long kvm_vm_ioctl_resize_hpt_commit(struct kvm *kvm, + if (!resize || (resize->order != shift)) + goto out; + +- ret = -EBUSY; +- if (!resize->prepare_done) +- goto out; +- + ret = resize->error; +- if (ret != 0) ++ if (ret) + goto out; + + ret = resize_hpt_rehash(resize); +- if (ret != 0) ++ if (ret) + goto out; + + resize_hpt_pivot(resize); +diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c +index 69a09444d46e..e2ef16198456 100644 +--- a/arch/powerpc/kvm/book3s_pr.c ++++ b/arch/powerpc/kvm/book3s_pr.c +@@ -60,6 +60,7 @@ static void kvmppc_giveup_fac(struct kvm_vcpu *vcpu, ulong fac); + #define MSR_USER32 MSR_USER + #define MSR_USER64 MSR_USER + #define HW_PAGE_SIZE PAGE_SIZE ++#define HPTE_R_M _PAGE_COHERENT + #endif + + static bool kvmppc_is_split_real(struct kvm_vcpu *vcpu) +@@ -557,6 +558,7 @@ int kvmppc_handle_pagefault(struct kvm_run *run, struct kvm_vcpu *vcpu, + pte.eaddr = eaddr; + pte.vpage = eaddr >> 12; + pte.page_size = MMU_PAGE_64K; ++ pte.wimg = HPTE_R_M; + } + + switch (kvmppc_get_msr(vcpu) & (MSR_DR|MSR_IR)) { 
+diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig +index 592c974d4558..17de6acc0eab 100644 +--- a/arch/x86/Kconfig ++++ b/arch/x86/Kconfig +@@ -89,6 +89,7 @@ config X86 + select GENERIC_CLOCKEVENTS_MIN_ADJUST + select GENERIC_CMOS_UPDATE + select GENERIC_CPU_AUTOPROBE ++ select GENERIC_CPU_VULNERABILITIES + select GENERIC_EARLY_IOREMAP + select GENERIC_FIND_FIRST_BIT + select GENERIC_IOMAP +@@ -428,6 +429,19 @@ config GOLDFISH + def_bool y + depends on X86_GOLDFISH + ++config RETPOLINE ++ bool "Avoid speculative indirect branches in kernel" ++ default y ++ help ++ Compile kernel with the retpoline compiler options to guard against ++ kernel-to-user data leaks by avoiding speculative indirect ++ branches. Requires a compiler with -mindirect-branch=thunk-extern ++ support for full protection. The kernel may run slower. ++ ++ Without compiler support, at least indirect branches in assembler ++ code are eliminated. Since this includes the syscall entry path, ++ it is not entirely pointless. 
++
+ config INTEL_RDT
+ bool "Intel Resource Director Technology support"
+ default n
+diff --git a/arch/x86/Makefile b/arch/x86/Makefile
+index a20eacd9c7e9..504b1a4535ac 100644
+--- a/arch/x86/Makefile
++++ b/arch/x86/Makefile
+@@ -235,6 +235,14 @@ KBUILD_CFLAGS += -Wno-sign-compare
+ #
+ KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
+
++# Avoid indirect branches in kernel to deal with Spectre
++ifdef CONFIG_RETPOLINE
++ RETPOLINE_CFLAGS += $(call cc-option,-mindirect-branch=thunk-extern -mindirect-branch-register)
++ ifneq ($(RETPOLINE_CFLAGS),)
++ KBUILD_CFLAGS += $(RETPOLINE_CFLAGS) -DRETPOLINE
++ endif
++endif
++
+ archscripts: scripts_basic
+ $(Q)$(MAKE) $(build)=arch/x86/tools relocs
+
+diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
+index 16627fec80b2..3d09e3aca18d 100644
+--- a/arch/x86/crypto/aesni-intel_asm.S
++++ b/arch/x86/crypto/aesni-intel_asm.S
+@@ -32,6 +32,7 @@
+ #include
+ #include
+ #include
++#include
+
+ /*
+ * The following macros are used to move an (un)aligned 16 byte value to/from
+@@ -2884,7 +2885,7 @@ ENTRY(aesni_xts_crypt8)
+ pxor INC, STATE4
+ movdqu IV, 0x30(OUTP)
+
+- call *%r11
++ CALL_NOSPEC %r11
+
+ movdqu 0x00(OUTP), INC
+ pxor INC, STATE1
+@@ -2929,7 +2930,7 @@ ENTRY(aesni_xts_crypt8)
+ _aesni_gf128mul_x_ble()
+ movups IV, (IVP)
+
+- call *%r11
++ CALL_NOSPEC %r11
+
+ movdqu 0x40(OUTP), INC
+ pxor INC, STATE1
+diff --git a/arch/x86/crypto/camellia-aesni-avx-asm_64.S b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
+index f7c495e2863c..a14af6eb09cb 100644
+--- a/arch/x86/crypto/camellia-aesni-avx-asm_64.S
++++ b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
+@@ -17,6 +17,7 @@
+
+ #include
+ #include
++#include
+
+ #define CAMELLIA_TABLE_BYTE_LEN 272
+
+@@ -1227,7 +1228,7 @@ camellia_xts_crypt_16way:
+ vpxor 14 * 16(%rax), %xmm15, %xmm14;
+ vpxor 15 * 16(%rax), %xmm15, %xmm15;
+
+- call *%r9;
++ CALL_NOSPEC %r9;
+
+ addq $(16 * 16), %rsp;
+
+diff --git a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
+index eee5b3982cfd..b66bbfa62f50 100644
+--- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
++++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
+@@ -12,6 +12,7 @@
+
+ #include
+ #include
++#include
+
+ #define CAMELLIA_TABLE_BYTE_LEN 272
+
+@@ -1343,7 +1344,7 @@ camellia_xts_crypt_32way:
+ vpxor 14 * 32(%rax), %ymm15, %ymm14;
+ vpxor 15 * 32(%rax), %ymm15, %ymm15;
+
+- call *%r9;
++ CALL_NOSPEC %r9;
+
+ addq $(16 * 32), %rsp;
+
+diff --git a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
+index 7a7de27c6f41..d9b734d0c8cc 100644
+--- a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
++++ b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
+@@ -45,6 +45,7 @@
+
+ #include
+ #include
++#include
+
+ ## ISCSI CRC 32 Implementation with crc32 and pclmulqdq Instruction
+
+@@ -172,7 +173,7 @@ continue_block:
+ movzxw (bufp, %rax, 2), len
+ lea crc_array(%rip), bufp
+ lea (bufp, len, 1), bufp
+- jmp *bufp
++ JMP_NOSPEC bufp
+
+ ################################################################
+ ## 2a) PROCESS FULL BLOCKS:
+diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
+index 45a63e00a6af..3f48f695d5e6 100644
+--- a/arch/x86/entry/calling.h
++++ b/arch/x86/entry/calling.h
+@@ -198,8 +198,11 @@ For 32-bit we have the following conventions - kernel is built with
+ * PAGE_TABLE_ISOLATION PGDs are 8k. Flip bit 12 to switch between the two
+ * halves:
+ */
+-#define PTI_SWITCH_PGTABLES_MASK (1<
+ #include
+ #include
++#include
+
+ .section .entry.text, "ax"
+
+@@ -290,7 +291,7 @@ ENTRY(ret_from_fork)
+
+ /* kernel thread */
+ 1: movl %edi, %eax
+- call *%ebx
++ CALL_NOSPEC %ebx
+ /*
+ * A kernel thread is allowed to return here after successfully
+ * calling do_execve(). Exit to userspace to complete the execve()
+@@ -919,7 +920,7 @@ common_exception:
+ movl %ecx, %es
+ TRACE_IRQS_OFF
+ movl %esp, %eax # pt_regs pointer
+- call *%edi
++ CALL_NOSPEC %edi
+ jmp ret_from_exception
+ END(common_exception)
+
+diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
+index dd696b966e58..f5fda5f26e34 100644
+--- a/arch/x86/entry/entry_64.S
++++ b/arch/x86/entry/entry_64.S
+@@ -37,6 +37,7 @@
+ #include
+ #include
+ #include
++#include
+ #include
+
+ #include "calling.h"
+@@ -187,7 +188,7 @@ ENTRY(entry_SYSCALL_64_trampoline)
+ */
+ pushq %rdi
+ movq $entry_SYSCALL_64_stage2, %rdi
+- jmp *%rdi
++ JMP_NOSPEC %rdi
+ END(entry_SYSCALL_64_trampoline)
+
+ .popsection
+@@ -266,7 +267,12 @@ entry_SYSCALL_64_fastpath:
+ * It might end up jumping to the slow path. If it jumps, RAX
+ * and all argument registers are clobbered.
+ */
++#ifdef CONFIG_RETPOLINE
++ movq sys_call_table(, %rax, 8), %rax
++ call __x86_indirect_thunk_rax
++#else
+ call *sys_call_table(, %rax, 8)
++#endif
+ .Lentry_SYSCALL_64_after_fastpath_call:
+
+ movq %rax, RAX(%rsp)
+@@ -438,7 +444,7 @@ ENTRY(stub_ptregs_64)
+ jmp entry_SYSCALL64_slow_path
+
+ 1:
+- jmp *%rax /* Called from C */
++ JMP_NOSPEC %rax /* Called from C */
+ END(stub_ptregs_64)
+
+ .macro ptregs_stub func
+@@ -517,7 +523,7 @@ ENTRY(ret_from_fork)
+ 1:
+ /* kernel thread */
+ movq %r12, %rdi
+- call *%rbx
++ CALL_NOSPEC %rbx
+ /*
+ * A kernel thread is allowed to return here after successfully
+ * calling do_execve(). Exit to userspace to complete the execve()
+diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
+index 141e07b06216..24ffa1e88cf9 100644
+--- a/arch/x86/events/intel/bts.c
++++ b/arch/x86/events/intel/bts.c
+@@ -582,6 +582,24 @@ static __init int bts_init(void)
+ if (!boot_cpu_has(X86_FEATURE_DTES64) || !x86_pmu.bts)
+ return -ENODEV;
+
++ if (boot_cpu_has(X86_FEATURE_PTI)) {
++ /*
++ * BTS hardware writes through a virtual memory map we must
++ * either use the kernel physical map, or the user mapping of
++ * the AUX buffer.
++ *
++ * However, since this driver supports per-CPU and per-task inherit
++ * we cannot use the user mapping since it will not be availble
++ * if we're not running the owning process.
++ *
++ * With PTI we can't use the kernal map either, because its not
++ * there when we run userspace.
++ *
++ * For now, disable this driver when using PTI.
++ */
++ return -ENODEV;
++ }
++
+ bts_pmu.capabilities = PERF_PMU_CAP_AUX_NO_SG | PERF_PMU_CAP_ITRACE |
+ PERF_PMU_CAP_EXCLUSIVE;
+ bts_pmu.task_ctx_nr = perf_sw_context;
+diff --git a/arch/x86/include/asm/asm-prototypes.h b/arch/x86/include/asm/asm-prototypes.h
+index ff700d81e91e..0927cdc4f946 100644
+--- a/arch/x86/include/asm/asm-prototypes.h
++++ b/arch/x86/include/asm/asm-prototypes.h
+@@ -11,7 +11,32 @@
+ #include
+ #include
+ #include
++#include
+
+ #ifndef CONFIG_X86_CMPXCHG64
+ extern void cmpxchg8b_emu(void);
+ #endif
++
++#ifdef CONFIG_RETPOLINE
++#ifdef CONFIG_X86_32
++#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_e ## reg(void);
++#else
++#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_r ## reg(void);
++INDIRECT_THUNK(8)
++INDIRECT_THUNK(9)
++INDIRECT_THUNK(10)
++INDIRECT_THUNK(11)
++INDIRECT_THUNK(12)
++INDIRECT_THUNK(13)
++INDIRECT_THUNK(14)
++INDIRECT_THUNK(15)
++#endif
++INDIRECT_THUNK(ax)
++INDIRECT_THUNK(bx)
++INDIRECT_THUNK(cx)
++INDIRECT_THUNK(dx)
++INDIRECT_THUNK(si)
++INDIRECT_THUNK(di)
++INDIRECT_THUNK(bp)
++INDIRECT_THUNK(sp)
++#endif /* CONFIG_RETPOLINE */
+diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
+index 21ac898df2d8..f275447862f4 100644
+--- a/arch/x86/include/asm/cpufeatures.h
++++ b/arch/x86/include/asm/cpufeatures.h
+@@ -203,6 +203,8 @@
+ #define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */
+ #define X86_FEATURE_SME ( 7*32+10) /* AMD Secure Memory Encryption */
+ #define X86_FEATURE_PTI ( 7*32+11) /* Kernel Page Table Isolation enabled */
++#define X86_FEATURE_RETPOLINE ( 7*32+12) /* Generic Retpoline mitigation for Spectre variant 2 */
++#define X86_FEATURE_RETPOLINE_AMD ( 7*32+13) /* AMD Retpoline mitigation for Spectre variant 2 */
+ #define X86_FEATURE_INTEL_PPIN ( 7*32+14) /* Intel Processor Inventory Number */
+ #define X86_FEATURE_INTEL_PT ( 7*32+15) /* Intel Processor Trace */
+ #define X86_FEATURE_AVX512_4VNNIW ( 7*32+16) /* AVX-512 Neural Network Instructions */
+@@ -342,5 +344,7 @@
+ #define X86_BUG_MONITOR X86_BUG(12) /* IPI required to wake up remote CPU */
+ #define X86_BUG_AMD_E400 X86_BUG(13) /* CPU is among the affected by Erratum 400 */
+ #define X86_BUG_CPU_MELTDOWN X86_BUG(14) /* CPU is affected by meltdown attack and needs kernel page table isolation */
++#define X86_BUG_SPECTRE_V1 X86_BUG(15) /* CPU is affected by Spectre variant 1 attack with conditional branches */
++#define X86_BUG_SPECTRE_V2 X86_BUG(16) /* CPU is affected by Spectre variant 2 attack with indirect branches */
+
+ #endif /* _ASM_X86_CPUFEATURES_H */
+diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
+index 581bb54dd464..5119e4b555cc 100644
+--- a/arch/x86/include/asm/mshyperv.h
++++ b/arch/x86/include/asm/mshyperv.h
+@@ -7,6 +7,7 @@
+ #include
+ #include
+ #include
++#include
+
+ /*
+ * The below CPUID leaves are present if VersionAndFeatures.HypervisorPresent
+@@ -186,10 +187,11 @@ static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
+ return U64_MAX;
+
+ __asm__ __volatile__("mov %4, %%r8\n"
+- "call *%5"
++ CALL_NOSPEC
+ : "=a" (hv_status), ASM_CALL_CONSTRAINT,
+ "+c" (control), "+d" (input_address)
+- : "r" (output_address), "m" (hv_hypercall_pg)
++ : "r" (output_address),
++ THUNK_TARGET(hv_hypercall_pg)
+ : "cc", "memory", "r8", "r9", "r10", "r11");
+ #else
+ u32 input_address_hi = upper_32_bits(input_address);
+@@ -200,13 +202,13 @@ static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
+ if (!hv_hypercall_pg)
+ return U64_MAX;
+
+- __asm__ __volatile__("call *%7"
++ __asm__ __volatile__(CALL_NOSPEC
+ : "=A" (hv_status),
+ "+c" (input_address_lo), ASM_CALL_CONSTRAINT
+ : "A" (control),
+ "b" (input_address_hi),
+ "D"(output_address_hi), "S"(output_address_lo),
+- "m" (hv_hypercall_pg)
++ THUNK_TARGET(hv_hypercall_pg)
+ : "cc", "memory");
+ #endif /* !x86_64 */
+ return hv_status;
+@@ -227,10 +229,10 @@ static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1)
+
+ #ifdef CONFIG_X86_64
+ {
+- __asm__ __volatile__("call *%4"
++ __asm__ __volatile__(CALL_NOSPEC
+ : "=a" (hv_status), ASM_CALL_CONSTRAINT,
+ "+c" (control), "+d" (input1)
+- : "m" (hv_hypercall_pg)
++ : THUNK_TARGET(hv_hypercall_pg)
+ : "cc", "r8", "r9", "r10", "r11");
+ }
+ #else
+@@ -238,13 +240,13 @@ static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1)
+ u32 input1_hi = upper_32_bits(input1);
+ u32 input1_lo = lower_32_bits(input1);
+
+- __asm__ __volatile__ ("call *%5"
++ __asm__ __volatile__ (CALL_NOSPEC
+ : "=A"(hv_status),
+ "+c"(input1_lo),
+ ASM_CALL_CONSTRAINT
+ : "A" (control),
+ "b" (input1_hi),
+- "m" (hv_hypercall_pg)
++ THUNK_TARGET(hv_hypercall_pg)
+ : "cc", "edi", "esi");
+ }
+ #endif
+diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
+index ab022618a50a..fa11fb1fa570 100644
+--- a/arch/x86/include/asm/msr-index.h
++++ b/arch/x86/include/asm/msr-index.h
+@@ -352,6 +352,9 @@
+ #define FAM10H_MMIO_CONF_BASE_MASK 0xfffffffULL
+ #define FAM10H_MMIO_CONF_BASE_SHIFT 20
+ #define MSR_FAM10H_NODE_ID 0xc001100c
++#define MSR_F10H_DECFG 0xc0011029
++#define MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT 1
++#define MSR_F10H_DECFG_LFENCE_SERIALIZE BIT_ULL(MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT)
+
+ /* K8 MSRs */
+ #define MSR_K8_TOP_MEM1 0xc001001a
+diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
+new file mode 100644
+index 000000000000..402a11c803c3
+--- /dev/null
++++ b/arch/x86/include/asm/nospec-branch.h
+@@ -0,0 +1,214 @@
++/* SPDX-License-Identifier: GPL-2.0 */
++
++#ifndef __NOSPEC_BRANCH_H__
++#define __NOSPEC_BRANCH_H__
++
++#include
++#include
++#include
++
++/*
++ * Fill the CPU return stack buffer.
++ *
++ * Each entry in the RSB, if used for a speculative 'ret', contains an
++ * infinite 'pause; jmp' loop to capture speculative execution.
++ *
++ * This is required in various cases for retpoline and IBRS-based
++ * mitigations for the Spectre variant 2 vulnerability. Sometimes to
++ * eliminate potentially bogus entries from the RSB, and sometimes
++ * purely to ensure that it doesn't get empty, which on some CPUs would
++ * allow predictions from other (unwanted!) sources to be used.
++ *
++ * We define a CPP macro such that it can be used from both .S files and
++ * inline assembly. It's possible to do a .macro and then include that
++ * from C via asm(".include ") but let's not go there.
++ */
++
++#define RSB_CLEAR_LOOPS 32 /* To forcibly overwrite all entries */
++#define RSB_FILL_LOOPS 16 /* To avoid underflow */
++
++/*
++ * Google experimented with loop-unrolling and this turned out to be
++ * the optimal version — two calls, each with their own speculation
++ * trap should their return address end up getting used, in a loop.
++ */
++#define __FILL_RETURN_BUFFER(reg, nr, sp) \
++ mov $(nr/2), reg; \
++771: \
++ call 772f; \
++773: /* speculation trap */ \
++ pause; \
++ jmp 773b; \
++772: \
++ call 774f; \
++775: /* speculation trap */ \
++ pause; \
++ jmp 775b; \
++774: \
++ dec reg; \
++ jnz 771b; \
++ add $(BITS_PER_LONG/8) * nr, sp;
++
++#ifdef __ASSEMBLY__
++
++/*
++ * This should be used immediately before a retpoline alternative. It tells
++ * objtool where the retpolines are so that it can make sense of the control
++ * flow by just reading the original instruction(s) and ignoring the
++ * alternatives.
++ */
++.macro ANNOTATE_NOSPEC_ALTERNATIVE
++ .Lannotate_\@:
++ .pushsection .discard.nospec
++ .long .Lannotate_\@ - .
++ .popsection
++.endm
++
++/*
++ * These are the bare retpoline primitives for indirect jmp and call.
++ * Do not use these directly; they only exist to make the ALTERNATIVE
++ * invocation below less ugly.
++ */
++.macro RETPOLINE_JMP reg:req
++ call .Ldo_rop_\@
++.Lspec_trap_\@:
++ pause
++ jmp .Lspec_trap_\@
++.Ldo_rop_\@:
++ mov \reg, (%_ASM_SP)
++ ret
++.endm
++
++/*
++ * This is a wrapper around RETPOLINE_JMP so the called function in reg
++ * returns to the instruction after the macro.
++ */
++.macro RETPOLINE_CALL reg:req
++ jmp .Ldo_call_\@
++.Ldo_retpoline_jmp_\@:
++ RETPOLINE_JMP \reg
++.Ldo_call_\@:
++ call .Ldo_retpoline_jmp_\@
++.endm
++
++/*
++ * JMP_NOSPEC and CALL_NOSPEC macros can be used instead of a simple
++ * indirect jmp/call which may be susceptible to the Spectre variant 2
++ * attack.
++ */
++.macro JMP_NOSPEC reg:req
++#ifdef CONFIG_RETPOLINE
++ ANNOTATE_NOSPEC_ALTERNATIVE
++ ALTERNATIVE_2 __stringify(jmp *\reg), \
++ __stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE, \
++ __stringify(lfence; jmp *\reg), X86_FEATURE_RETPOLINE_AMD
++#else
++ jmp *\reg
++#endif
++.endm
++
++.macro CALL_NOSPEC reg:req
++#ifdef CONFIG_RETPOLINE
++ ANNOTATE_NOSPEC_ALTERNATIVE
++ ALTERNATIVE_2 __stringify(call *\reg), \
++ __stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\
++ __stringify(lfence; call *\reg), X86_FEATURE_RETPOLINE_AMD
++#else
++ call *\reg
++#endif
++.endm
++
++ /*
++ * A simpler FILL_RETURN_BUFFER macro. Don't make people use the CPP
++ * monstrosity above, manually.
++ */
++.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req
++#ifdef CONFIG_RETPOLINE
++ ANNOTATE_NOSPEC_ALTERNATIVE
++ ALTERNATIVE "jmp .Lskip_rsb_\@", \
++ __stringify(__FILL_RETURN_BUFFER(\reg,\nr,%_ASM_SP)) \
++ \ftr
++.Lskip_rsb_\@:
++#endif
++.endm
++
++#else /* __ASSEMBLY__ */
++
++#define ANNOTATE_NOSPEC_ALTERNATIVE \
++ "999:\n\t" \
++ ".pushsection .discard.nospec\n\t" \
++ ".long 999b - .\n\t" \
++ ".popsection\n\t"
++
++#if defined(CONFIG_X86_64) && defined(RETPOLINE)
++
++/*
++ * Since the inline asm uses the %V modifier which is only in newer GCC,
++ * the 64-bit one is dependent on RETPOLINE not CONFIG_RETPOLINE.
++ */
++# define CALL_NOSPEC \
++ ANNOTATE_NOSPEC_ALTERNATIVE \
++ ALTERNATIVE( \
++ "call *%[thunk_target]\n", \
++ "call __x86_indirect_thunk_%V[thunk_target]\n", \
++ X86_FEATURE_RETPOLINE)
++# define THUNK_TARGET(addr) [thunk_target] "r" (addr)
++
++#elif defined(CONFIG_X86_32) && defined(CONFIG_RETPOLINE)
++/*
++ * For i386 we use the original ret-equivalent retpoline, because
++ * otherwise we'll run out of registers. We don't care about CET
++ * here, anyway.
++ */
++# define CALL_NOSPEC ALTERNATIVE("call *%[thunk_target]\n", \
++ " jmp 904f;\n" \
++ " .align 16\n" \
++ "901: call 903f;\n" \
++ "902: pause;\n" \
++ " jmp 902b;\n" \
++ " .align 16\n" \
++ "903: addl $4, %%esp;\n" \
++ " pushl %[thunk_target];\n" \
++ " ret;\n" \
++ " .align 16\n" \
++ "904: call 901b;\n", \
++ X86_FEATURE_RETPOLINE)
++
++# define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
++#else /* No retpoline for C / inline asm */
++# define CALL_NOSPEC "call *%[thunk_target]\n"
++# define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
++#endif
++
++/* The Spectre V2 mitigation variants */
++enum spectre_v2_mitigation {
++ SPECTRE_V2_NONE,
++ SPECTRE_V2_RETPOLINE_MINIMAL,
++ SPECTRE_V2_RETPOLINE_MINIMAL_AMD,
++ SPECTRE_V2_RETPOLINE_GENERIC,
++ SPECTRE_V2_RETPOLINE_AMD,
++ SPECTRE_V2_IBRS,
++};
++
++/*
++ * On VMEXIT we must ensure that no RSB predictions learned in the guest
++ * can be followed in the host, by overwriting the RSB completely. Both
++ * retpoline and IBRS mitigations for Spectre v2 need this; only on future
++ * CPUs with IBRS_ATT *might* it be avoided.
++ */
++static inline void vmexit_fill_RSB(void)
++{
++#ifdef CONFIG_RETPOLINE
++ unsigned long loops = RSB_CLEAR_LOOPS / 2;
++
++ asm volatile (ANNOTATE_NOSPEC_ALTERNATIVE
++ ALTERNATIVE("jmp 910f",
++ __stringify(__FILL_RETURN_BUFFER(%0, RSB_CLEAR_LOOPS, %1)),
++ X86_FEATURE_RETPOLINE)
++ "910:"
++ : "=&r" (loops), ASM_CALL_CONSTRAINT
++ : "r" (loops) : "memory" );
++#endif
++}
++#endif /* __ASSEMBLY__ */
++#endif /* __NOSPEC_BRANCH_H__ */
+diff --git a/arch/x86/include/asm/processor-flags.h b/arch/x86/include/asm/processor-flags.h
+index 6a60fea90b9d..625a52a5594f 100644
+--- a/arch/x86/include/asm/processor-flags.h
++++ b/arch/x86/include/asm/processor-flags.h
+@@ -40,7 +40,7 @@
+ #define CR3_NOFLUSH BIT_ULL(63)
+
+ #ifdef CONFIG_PAGE_TABLE_ISOLATION
+-# define X86_CR3_PTI_SWITCH_BIT 11
++# define X86_CR3_PTI_PCID_USER_BIT 11
+ #endif
+
+ #else
+diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
+index f9b48ce152eb..3effd3c994af 100644
+--- a/arch/x86/include/asm/tlbflush.h
++++ b/arch/x86/include/asm/tlbflush.h
+@@ -81,13 +81,13 @@ static inline u16 kern_pcid(u16 asid)
+ * Make sure that the dynamic ASID space does not confict with the
+ * bit we are using to switch between user and kernel ASIDs.
+ */
+- BUILD_BUG_ON(TLB_NR_DYN_ASIDS >= (1 << X86_CR3_PTI_SWITCH_BIT));
++ BUILD_BUG_ON(TLB_NR_DYN_ASIDS >= (1 << X86_CR3_PTI_PCID_USER_BIT));
+
+ /*
+ * The ASID being passed in here should have respected the
+ * MAX_ASID_AVAILABLE and thus never have the switch bit set.
+ */
+- VM_WARN_ON_ONCE(asid & (1 << X86_CR3_PTI_SWITCH_BIT));
++ VM_WARN_ON_ONCE(asid & (1 << X86_CR3_PTI_PCID_USER_BIT));
+ #endif
+ /*
+ * The dynamically-assigned ASIDs that get passed in are small
+@@ -112,7 +112,7 @@ static inline u16 user_pcid(u16 asid)
+ {
+ u16 ret = kern_pcid(asid);
+ #ifdef CONFIG_PAGE_TABLE_ISOLATION
+- ret |= 1 << X86_CR3_PTI_SWITCH_BIT;
++ ret |= 1 << X86_CR3_PTI_PCID_USER_BIT;
+ #endif
+ return ret;
+ }
+diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h
+index 7cb282e9e587..bfd882617613 100644
+--- a/arch/x86/include/asm/xen/hypercall.h
++++ b/arch/x86/include/asm/xen/hypercall.h
+@@ -44,6 +44,7 @@
+ #include
+ #include
+ #include
++#include
+
+ #include
+ #include
+@@ -217,9 +218,9 @@ privcmd_call(unsigned call,
+ __HYPERCALL_5ARG(a1, a2, a3, a4, a5);
+
+ stac();
+- asm volatile("call *%[call]"
++ asm volatile(CALL_NOSPEC
+ : __HYPERCALL_5PARAM
+- : [call] "a" (&hypercall_page[call])
++ : [thunk_target] "a" (&hypercall_page[call])
+ : __HYPERCALL_CLOBBER5);
+ clac();
+
+diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
+index 079535e53e2a..9c2a002d9297 100644
+--- a/arch/x86/kernel/acpi/boot.c
++++ b/arch/x86/kernel/acpi/boot.c
+@@ -342,13 +342,12 @@ acpi_parse_lapic_nmi(struct acpi_subtable_header * header, const unsigned long e
+ #ifdef CONFIG_X86_IO_APIC
+ #define MP_ISA_BUS 0
+
++static int __init mp_register_ioapic_irq(u8 bus_irq, u8 polarity,
++ u8 trigger, u32 gsi);
++
+ static void __init mp_override_legacy_irq(u8 bus_irq, u8 polarity, u8 trigger,
+ u32 gsi)
+ {
+- int ioapic;
+- int pin;
+- struct mpc_intsrc mp_irq;
+-
+ /*
+ * Check bus_irq boundary.
+ */
+@@ -357,14 +356,6 @@ static void __init mp_override_legacy_irq(u8 bus_irq, u8 polarity, u8 trigger,
+ return;
+ }
+
+- /*
+- * Convert 'gsi' to 'ioapic.pin'.
+- */
+- ioapic = mp_find_ioapic(gsi);
+- if (ioapic < 0)
+- return;
+- pin = mp_find_ioapic_pin(ioapic, gsi);
+-
+ /*
+ * TBD: This check is for faulty timer entries, where the override
+ * erroneously sets the trigger to level, resulting in a HUGE
+@@ -373,16 +364,8 @@ static void __init mp_override_legacy_irq(u8 bus_irq, u8 polarity, u8 trigger,
+ if ((bus_irq == 0) && (trigger == 3))
+ trigger = 1;
+
+- mp_irq.type = MP_INTSRC;
+- mp_irq.irqtype = mp_INT;
+- mp_irq.irqflag = (trigger << 2) | polarity;
+- mp_irq.srcbus = MP_ISA_BUS;
+- mp_irq.srcbusirq = bus_irq; /* IRQ */
+- mp_irq.dstapic = mpc_ioapic_id(ioapic); /* APIC ID */
+- mp_irq.dstirq = pin; /* INTIN# */
+-
+- mp_save_irq(&mp_irq);
+-
++ if (mp_register_ioapic_irq(bus_irq, polarity, trigger, gsi) < 0)
++ return;
+ /*
+ * Reset default identity mapping if gsi is also an legacy IRQ,
+ * otherwise there will be more than one entry with the same GSI
+@@ -429,6 +412,34 @@ static int mp_config_acpi_gsi(struct device *dev, u32 gsi, int trigger,
+ return 0;
+ }
+
++static int __init mp_register_ioapic_irq(u8 bus_irq, u8 polarity,
++ u8 trigger, u32 gsi)
++{
++ struct mpc_intsrc mp_irq;
++ int ioapic, pin;
++
++ /* Convert 'gsi' to 'ioapic.pin'(INTIN#) */
++ ioapic = mp_find_ioapic(gsi);
++ if (ioapic < 0) {
++ pr_warn("Failed to find ioapic for gsi : %u\n", gsi);
++ return ioapic;
++ }
++
++ pin = mp_find_ioapic_pin(ioapic, gsi);
++
++ mp_irq.type = MP_INTSRC;
++ mp_irq.irqtype = mp_INT;
++ mp_irq.irqflag = (trigger << 2) | polarity;
++ mp_irq.srcbus = MP_ISA_BUS;
++ mp_irq.srcbusirq = bus_irq;
++ mp_irq.dstapic = mpc_ioapic_id(ioapic);
++ mp_irq.dstirq = pin;
++
++ mp_save_irq(&mp_irq);
++
++ return 0;
++}
++
+ static int __init
+ acpi_parse_ioapic(struct acpi_subtable_header * header, const unsigned long end)
+ {
+@@ -473,7 +484,11 @@ static void __init acpi_sci_ioapic_setup(u8 bus_irq, u16 polarity, u16 trigger,
+ if (acpi_sci_flags & ACPI_MADT_POLARITY_MASK)
+ polarity = acpi_sci_flags & ACPI_MADT_POLARITY_MASK;
+
+- mp_override_legacy_irq(bus_irq, polarity, trigger, gsi);
++ if (bus_irq < NR_IRQS_LEGACY)
++ mp_override_legacy_irq(bus_irq, polarity, trigger, gsi);
++ else
++ mp_register_ioapic_irq(bus_irq, polarity, trigger, gsi);
++
+ acpi_penalize_sci_irq(bus_irq, trigger, polarity);
+
+ /*
+diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
+index 3344d3382e91..e0b97e4d1db5 100644
+--- a/arch/x86/kernel/alternative.c
++++ b/arch/x86/kernel/alternative.c
+@@ -344,9 +344,12 @@ recompute_jump(struct alt_instr *a, u8 *orig_insn, u8 *repl_insn, u8 *insnbuf)
+ static void __init_or_module noinline optimize_nops(struct alt_instr *a, u8 *instr)
+ {
+ unsigned long flags;
++ int i;
+
+- if (instr[0] != 0x90)
+- return;
++ for (i = 0; i < a->padlen; i++) {
++ if (instr[i] != 0x90)
++ return;
++ }
+
+ local_irq_save(flags);
+ add_nops(instr + (a->instrlen - a->padlen), a->padlen);
+diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
+index bcb75dc97d44..ea831c858195 100644
+--- a/arch/x86/kernel/cpu/amd.c
++++ b/arch/x86/kernel/cpu/amd.c
+@@ -829,8 +829,32 @@ static void init_amd(struct cpuinfo_x86 *c)
+ set_cpu_cap(c, X86_FEATURE_K8);
+
+ if (cpu_has(c, X86_FEATURE_XMM2)) {
+- /* MFENCE stops RDTSC speculation */
+- set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC);
++ unsigned long long val;
++ int ret;
++
++ /*
++ * A serializing LFENCE has less overhead than MFENCE, so
++ * use it for execution serialization. On families which
++ * don't have that MSR, LFENCE is already serializing.
++ * msr_set_bit() uses the safe accessors, too, even if the MSR
++ * is not present.
++ */
++ msr_set_bit(MSR_F10H_DECFG,
++ MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT);
++
++ /*
++ * Verify that the MSR write was successful (could be running
++ * under a hypervisor) and only then assume that LFENCE is
++ * serializing.
++ */
++ ret = rdmsrl_safe(MSR_F10H_DECFG, &val);
++ if (!ret && (val & MSR_F10H_DECFG_LFENCE_SERIALIZE)) {
++ /* A serializing LFENCE stops RDTSC speculation */
++ set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC);
++ } else {
++ /* MFENCE stops RDTSC speculation */
++ set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC);
++ }
+ }
+
+ /*
+diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
+index ba0b2424c9b0..e4dc26185aa7 100644
+--- a/arch/x86/kernel/cpu/bugs.c
++++ b/arch/x86/kernel/cpu/bugs.c
+@@ -10,6 +10,10 @@
+ */
+ #include
+ #include
++#include
++
++#include
++#include
+ #include
+ #include
+ #include
+@@ -20,6 +24,8 @@
+ #include
+ #include
+
++static void __init spectre_v2_select_mitigation(void);
++
+ void __init check_bugs(void)
+ {
+ identify_boot_cpu();
+@@ -29,6 +35,9 @@ void __init check_bugs(void)
+ print_cpu_info(&boot_cpu_data);
+ }
+
++ /* Select the proper spectre mitigation before patching alternatives */
++ spectre_v2_select_mitigation();
++
+ #ifdef CONFIG_X86_32
+ /*
+ * Check whether we are able to run this kernel safely on SMP.
+@@ -60,3 +69,179 @@ void __init check_bugs(void)
+ set_memory_4k((unsigned long)__va(0), 1);
+ #endif
+ }
++
++/* The kernel command line selection */
++enum spectre_v2_mitigation_cmd {
++ SPECTRE_V2_CMD_NONE,
++ SPECTRE_V2_CMD_AUTO,
++ SPECTRE_V2_CMD_FORCE,
++ SPECTRE_V2_CMD_RETPOLINE,
++ SPECTRE_V2_CMD_RETPOLINE_GENERIC,
++ SPECTRE_V2_CMD_RETPOLINE_AMD,
++};
++
++static const char *spectre_v2_strings[] = {
++ [SPECTRE_V2_NONE] = "Vulnerable",
++ [SPECTRE_V2_RETPOLINE_MINIMAL] = "Vulnerable: Minimal generic ASM retpoline",
++ [SPECTRE_V2_RETPOLINE_MINIMAL_AMD] = "Vulnerable: Minimal AMD ASM retpoline",
++ [SPECTRE_V2_RETPOLINE_GENERIC] = "Mitigation: Full generic retpoline",
++ [SPECTRE_V2_RETPOLINE_AMD] = "Mitigation: Full AMD retpoline",
++};
++
++#undef pr_fmt
++#define pr_fmt(fmt) "Spectre V2 mitigation: " fmt
++
++static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE;
++
++static void __init spec2_print_if_insecure(const char *reason)
++{
++ if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
++ pr_info("%s\n", reason);
++}
++
++static void __init spec2_print_if_secure(const char *reason)
++{
++ if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
++ pr_info("%s\n", reason);
++}
++
++static inline bool retp_compiler(void)
++{
++ return __is_defined(RETPOLINE);
++}
++
++static inline bool match_option(const char *arg, int arglen, const char *opt)
++{
++ int len = strlen(opt);
++
++ return len == arglen && !strncmp(arg, opt, len);
++}
++
++static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
++{
++ char arg[20];
++ int ret;
++
++ ret = cmdline_find_option(boot_command_line, "spectre_v2", arg,
++ sizeof(arg));
++ if (ret > 0) {
++ if (match_option(arg, ret, "off")) {
++ goto disable;
++ } else if (match_option(arg, ret, "on")) {
++ spec2_print_if_secure("force enabled on command line.");
++ return SPECTRE_V2_CMD_FORCE;
++ } else if (match_option(arg, ret, "retpoline")) {
++ spec2_print_if_insecure("retpoline selected on command line.");
++ return SPECTRE_V2_CMD_RETPOLINE;
++ } else if (match_option(arg, ret, "retpoline,amd")) {
++ if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD) {
++ pr_err("retpoline,amd selected but CPU is not AMD. Switching to AUTO select\n");
++ return SPECTRE_V2_CMD_AUTO;
++ }
++ spec2_print_if_insecure("AMD retpoline selected on command line.");
++ return SPECTRE_V2_CMD_RETPOLINE_AMD;
++ } else if (match_option(arg, ret, "retpoline,generic")) {
++ spec2_print_if_insecure("generic retpoline selected on command line.");
++ return SPECTRE_V2_CMD_RETPOLINE_GENERIC;
++ } else if (match_option(arg, ret, "auto")) {
++ return SPECTRE_V2_CMD_AUTO;
++ }
++ }
++
++ if (!cmdline_find_option_bool(boot_command_line, "nospectre_v2"))
++ return SPECTRE_V2_CMD_AUTO;
++disable:
++ spec2_print_if_insecure("disabled on command line.");
++ return SPECTRE_V2_CMD_NONE;
++}
++
++static void __init spectre_v2_select_mitigation(void)
++{
++ enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
++ enum spectre_v2_mitigation mode = SPECTRE_V2_NONE;
++
++ /*
++ * If the CPU is not affected and the command line mode is NONE or AUTO
++ * then nothing to do.
++ */
++ if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2) &&
++ (cmd == SPECTRE_V2_CMD_NONE || cmd == SPECTRE_V2_CMD_AUTO))
++ return;
++
++ switch (cmd) {
++ case SPECTRE_V2_CMD_NONE:
++ return;
++
++ case SPECTRE_V2_CMD_FORCE:
++ /* FALLTRHU */
++ case SPECTRE_V2_CMD_AUTO:
++ goto retpoline_auto;
++
++ case SPECTRE_V2_CMD_RETPOLINE_AMD:
++ if (IS_ENABLED(CONFIG_RETPOLINE))
++ goto retpoline_amd;
++ break;
++ case SPECTRE_V2_CMD_RETPOLINE_GENERIC:
++ if (IS_ENABLED(CONFIG_RETPOLINE))
++ goto retpoline_generic;
++ break;
++ case SPECTRE_V2_CMD_RETPOLINE:
++ if (IS_ENABLED(CONFIG_RETPOLINE))
++ goto retpoline_auto;
++ break;
++ }
++ pr_err("kernel not compiled with retpoline; no mitigation available!");
++ return;
++
++retpoline_auto:
++ if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) {
++ retpoline_amd:
++ if (!boot_cpu_has(X86_FEATURE_LFENCE_RDTSC)) {
++ pr_err("LFENCE not serializing. Switching to generic retpoline\n");
++ goto retpoline_generic;
++ }
++ mode = retp_compiler() ? SPECTRE_V2_RETPOLINE_AMD :
++ SPECTRE_V2_RETPOLINE_MINIMAL_AMD;
++ setup_force_cpu_cap(X86_FEATURE_RETPOLINE_AMD);
++ setup_force_cpu_cap(X86_FEATURE_RETPOLINE);
++ } else {
++ retpoline_generic:
++ mode = retp_compiler() ? SPECTRE_V2_RETPOLINE_GENERIC :
++ SPECTRE_V2_RETPOLINE_MINIMAL;
++ setup_force_cpu_cap(X86_FEATURE_RETPOLINE);
++ }
++
++ spectre_v2_enabled = mode;
++ pr_info("%s\n", spectre_v2_strings[mode]);
++}
++
++#undef pr_fmt
++
++#ifdef CONFIG_SYSFS
++ssize_t cpu_show_meltdown(struct device *dev,
++ struct device_attribute *attr, char *buf)
++{
++ if (!boot_cpu_has_bug(X86_BUG_CPU_MELTDOWN))
++ return sprintf(buf, "Not affected\n");
++ if (boot_cpu_has(X86_FEATURE_PTI))
++ return sprintf(buf, "Mitigation: PTI\n");
++ return sprintf(buf, "Vulnerable\n");
++}
++
++ssize_t cpu_show_spectre_v1(struct device *dev,
++ struct device_attribute *attr, char *buf)
++{
++ if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V1))
++ return sprintf(buf, "Not affected\n");
++ return sprintf(buf, "Vulnerable\n");
++}
++
++ssize_t cpu_show_spectre_v2(struct device *dev,
++ struct device_attribute *attr, char *buf)
++{
++ if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
++ return sprintf(buf, "Not affected\n");
++
++ return sprintf(buf, "%s\n", spectre_v2_strings[spectre_v2_enabled]);
++}
++#endif
+diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
+index 2d3bd2215e5b..372ba3fb400f 100644
+--- a/arch/x86/kernel/cpu/common.c
++++ b/arch/x86/kernel/cpu/common.c
+@@ -902,6 +902,9 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
+ if (c->x86_vendor != X86_VENDOR_AMD)
+ setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN);
+
++ setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
++ setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
++
+ fpu__init_system(c);
+
+ #ifdef CONFIG_X86_32
+diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
+index 8ccdca6d3f9e..d9e460fc7a3b 100644
+--- a/arch/x86/kernel/cpu/microcode/intel.c
++++ b/arch/x86/kernel/cpu/microcode/intel.c
+@@ -910,8 +910,17 @@ static bool is_blacklisted(unsigned int cpu)
+ {
+ struct cpuinfo_x86 *c = &cpu_data(cpu);
+
+- if (c->x86 == 6 && c->x86_model == INTEL_FAM6_BROADWELL_X) {
+- pr_err_once("late loading on model 79 is disabled.\n");
++ /*
++ * Late loading on model 79 with microcode revision less than 0x0b000021
++ * may result in a system hang. This behavior is documented in item
++ * BDF90, #334165 (Intel Xeon Processor E7-8800/4800 v4 Product Family).
++ */
++ if (c->x86 == 6 &&
++ c->x86_model == INTEL_FAM6_BROADWELL_X &&
++ c->x86_mask == 0x01 &&
++ c->microcode < 0x0b000021) {
++ pr_err_once("Erratum BDF90: late loading with revision < 0x0b000021 (0x%x) disabled.\n", c->microcode);
++ pr_err_once("Please consider either early loading through initrd/built-in or a potential BIOS update.\n");
+ return true;
+ }
+
+diff --git a/arch/x86/kernel/ftrace_32.S b/arch/x86/kernel/ftrace_32.S
+index b6c6468e10bc..4c8440de3355 100644
+--- a/arch/x86/kernel/ftrace_32.S
++++ b/arch/x86/kernel/ftrace_32.S
+@@ -8,6 +8,7 @@
+ #include
+ #include
+ #include
++#include
+
+ #ifdef CC_USING_FENTRY
+ # define function_hook __fentry__
+@@ -197,7 +198,8 @@ ftrace_stub:
+ movl 0x4(%ebp), %edx
+ subl $MCOUNT_INSN_SIZE, %eax
+
+- call *ftrace_trace_function
++ movl ftrace_trace_function, %ecx
++ CALL_NOSPEC %ecx
+
+ popl %edx
+ popl %ecx
+@@ -241,5 +243,5 @@ return_to_handler:
+ movl %eax, %ecx
+ popl %edx
+ popl %eax
+- jmp *%ecx
++ JMP_NOSPEC %ecx
+ #endif
+diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
+index c832291d948a..7cb8ba08beb9 100644
+--- a/arch/x86/kernel/ftrace_64.S
++++ b/arch/x86/kernel/ftrace_64.S
+@@ -7,7 +7,7 @@
+ #include
+ #include
+ #include
+-
++#include
+
+ .code64
+ .section .entry.text, "ax"
+@@ -286,8 +286,8 @@ trace:
+ * ip and parent ip are used and the list function is called when
+ * function tracing is enabled.
+ */
+- call *ftrace_trace_function
+-
++ movq ftrace_trace_function, %r8
++ CALL_NOSPEC %r8
+ restore_mcount_regs
+
+ jmp fgraph_trace
+@@ -329,5 +329,5 @@ GLOBAL(return_to_handler)
+ movq 8(%rsp), %rdx
+ movq (%rsp), %rax
+ addq $24, %rsp
+- jmp *%rdi
++ JMP_NOSPEC %rdi
+ #endif
+diff --git a/arch/x86/kernel/irq_32.c b/arch/x86/kernel/irq_32.c
+index a83b3346a0e1..c1bdbd3d3232 100644
+--- a/arch/x86/kernel/irq_32.c
++++ b/arch/x86/kernel/irq_32.c
+@@ -20,6 +20,7 @@
+ #include
+
+ #include
++#include
+
+ #ifdef CONFIG_DEBUG_STACKOVERFLOW
+
+@@ -55,11 +56,11 @@ DEFINE_PER_CPU(struct irq_stack *, softirq_stack);
+ static void call_on_stack(void *func, void *stack)
+ {
+ asm volatile("xchgl %%ebx,%%esp \n"
+- "call *%%edi \n"
++ CALL_NOSPEC
+ "movl %%ebx,%%esp \n"
+ : "=b" (stack)
+ : "0" (stack),
+- "D"(func)
++ [thunk_target] "D"(func)
+ : "memory", "cc", "edx", "ecx", "eax");
+ }
+
+@@ -95,11 +96,11 @@ static inline int execute_on_irq_stack(int overflow, struct irq_desc *desc)
+ call_on_stack(print_stack_overflow, isp);
+
+ asm volatile("xchgl %%ebx,%%esp \n"
+- "call *%%edi \n"
++ CALL_NOSPEC
+ "movl %%ebx,%%esp \n"
+ : "=a" (arg1), "=b" (isp)
+ : "0" (desc), "1" (isp),
+- "D" (desc->handle_irq)
++ [thunk_target] "D" (desc->handle_irq)
+ : "memory", "cc", "ecx");
+ return 1;
+ }
+diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
+index a4eb27918ceb..a2486f444073 100644
+--- a/arch/x86/kernel/tboot.c
++++ b/arch/x86/kernel/tboot.c
+@@ -138,6 +138,17 @@ static int map_tboot_page(unsigned long vaddr, unsigned long pfn,
+ return -1;
+ set_pte_at(&tboot_mm, vaddr, pte, pfn_pte(pfn, prot));
+ pte_unmap(pte);
++
++ /*
++ * PTI poisons low addresses in the kernel page tables in the
++ * name of making them unusable for userspace. To execute
++ * code at such a low address, the poison must be cleared.
++ *
++ * Note: 'pgd' actually gets set in p4d_alloc() _or_
++ * pud_alloc() depending on 4/5-level paging.
++ */
++ pgd->pgd &= ~_PAGE_NX;
++
+ return 0;
+ }
+
+diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
+index 17fb6c6d939a..6a8284f72328 100644
+--- a/arch/x86/kvm/svm.c
++++ b/arch/x86/kvm/svm.c
+@@ -45,6 +45,7 @@
+ #include
+ #include
+ #include
++#include
+
+ #include
+ #include "trace.h"
+@@ -4964,6 +4965,25 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
+ "mov %%r13, %c[r13](%[svm]) \n\t"
+ "mov %%r14, %c[r14](%[svm]) \n\t"
+ "mov %%r15, %c[r15](%[svm]) \n\t"
++#endif
++ /*
++ * Clear host registers marked as clobbered to prevent
++ * speculative use.
++ */
++ "xor %%" _ASM_BX ", %%" _ASM_BX " \n\t"
++ "xor %%" _ASM_CX ", %%" _ASM_CX " \n\t"
++ "xor %%" _ASM_DX ", %%" _ASM_DX " \n\t"
++ "xor %%" _ASM_SI ", %%" _ASM_SI " \n\t"
++ "xor %%" _ASM_DI ", %%" _ASM_DI " \n\t"
++#ifdef CONFIG_X86_64
++ "xor %%r8, %%r8 \n\t"
++ "xor %%r9, %%r9 \n\t"
++ "xor %%r10, %%r10 \n\t"
++ "xor %%r11, %%r11 \n\t"
++ "xor %%r12, %%r12 \n\t"
++ "xor %%r13, %%r13 \n\t"
++ "xor %%r14, %%r14 \n\t"
++ "xor %%r15, %%r15 \n\t"
+ #endif
+ "pop %%" _ASM_BP
+ :
+@@ -4994,6 +5014,9 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
+ #endif
+ );
+
++ /* Eliminate branch target predictions from guest mode */
++ vmexit_fill_RSB();
++
+ #ifdef CONFIG_X86_64
+ wrmsrl(MSR_GS_BASE, svm->host.gs_base);
+ #else
+diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
+index 47d9432756f3..ef16cf0f7cfd 100644
+--- a/arch/x86/kvm/vmx.c
++++ b/arch/x86/kvm/vmx.c
+@@ -50,6 +50,7 @@
+ #include
+ #include
+ #include
++#include
+
+ #include "trace.h"
+ #include "pmu.h"
+@@ -888,8 +889,16 @@ static inline short vmcs_field_to_offset(unsigned long field)
+ {
+ BUILD_BUG_ON(ARRAY_SIZE(vmcs_field_to_offset_table) > SHRT_MAX);
+
+- if (field >= ARRAY_SIZE(vmcs_field_to_offset_table) ||
+- vmcs_field_to_offset_table[field] == 0)
++ if (field >= ARRAY_SIZE(vmcs_field_to_offset_table))
++ return -ENOENT;
++
++ /*
++ * FIXME: Mitigation for CVE-2017-5753. To be replaced with a
++ * generic mechanism.
++ */ ++ asm("lfence"); ++ ++ if (vmcs_field_to_offset_table[field] == 0) + return -ENOENT; + + return vmcs_field_to_offset_table[field]; +@@ -9405,6 +9414,7 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu) + /* Save guest registers, load host registers, keep flags */ + "mov %0, %c[wordsize](%%" _ASM_SP ") \n\t" + "pop %0 \n\t" ++ "setbe %c[fail](%0)\n\t" + "mov %%" _ASM_AX ", %c[rax](%0) \n\t" + "mov %%" _ASM_BX ", %c[rbx](%0) \n\t" + __ASM_SIZE(pop) " %c[rcx](%0) \n\t" +@@ -9421,12 +9431,23 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu) + "mov %%r13, %c[r13](%0) \n\t" + "mov %%r14, %c[r14](%0) \n\t" + "mov %%r15, %c[r15](%0) \n\t" ++ "xor %%r8d, %%r8d \n\t" ++ "xor %%r9d, %%r9d \n\t" ++ "xor %%r10d, %%r10d \n\t" ++ "xor %%r11d, %%r11d \n\t" ++ "xor %%r12d, %%r12d \n\t" ++ "xor %%r13d, %%r13d \n\t" ++ "xor %%r14d, %%r14d \n\t" ++ "xor %%r15d, %%r15d \n\t" + #endif + "mov %%cr2, %%" _ASM_AX " \n\t" + "mov %%" _ASM_AX ", %c[cr2](%0) \n\t" + ++ "xor %%eax, %%eax \n\t" ++ "xor %%ebx, %%ebx \n\t" ++ "xor %%esi, %%esi \n\t" ++ "xor %%edi, %%edi \n\t" + "pop %%" _ASM_BP "; pop %%" _ASM_DX " \n\t" +- "setbe %c[fail](%0) \n\t" + ".pushsection .rodata \n\t" + ".global vmx_return \n\t" + "vmx_return: " _ASM_PTR " 2b \n\t" +@@ -9463,6 +9484,9 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu) + #endif + ); + ++ /* Eliminate branch target predictions from guest mode */ ++ vmexit_fill_RSB(); ++ + /* MSR_IA32_DEBUGCTLMSR is zeroed on vmexit. 
Restore it if needed */ + if (debugctlmsr) + update_debugctlmsr(debugctlmsr); +diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c +index 075619a92ce7..575c8953cc9a 100644 +--- a/arch/x86/kvm/x86.c ++++ b/arch/x86/kvm/x86.c +@@ -4362,7 +4362,7 @@ static int vcpu_mmio_read(struct kvm_vcpu *vcpu, gpa_t addr, int len, void *v) + addr, n, v)) + && kvm_io_bus_read(vcpu, KVM_MMIO_BUS, addr, n, v)) + break; +- trace_kvm_mmio(KVM_TRACE_MMIO_READ, n, addr, *(u64 *)v); ++ trace_kvm_mmio(KVM_TRACE_MMIO_READ, n, addr, v); + handled += n; + addr += n; + len -= n; +@@ -4621,7 +4621,7 @@ static int read_prepare(struct kvm_vcpu *vcpu, void *val, int bytes) + { + if (vcpu->mmio_read_completed) { + trace_kvm_mmio(KVM_TRACE_MMIO_READ, bytes, +- vcpu->mmio_fragments[0].gpa, *(u64 *)val); ++ vcpu->mmio_fragments[0].gpa, val); + vcpu->mmio_read_completed = 0; + return 1; + } +@@ -4643,14 +4643,14 @@ static int write_emulate(struct kvm_vcpu *vcpu, gpa_t gpa, + + static int write_mmio(struct kvm_vcpu *vcpu, gpa_t gpa, int bytes, void *val) + { +- trace_kvm_mmio(KVM_TRACE_MMIO_WRITE, bytes, gpa, *(u64 *)val); ++ trace_kvm_mmio(KVM_TRACE_MMIO_WRITE, bytes, gpa, val); + return vcpu_mmio_write(vcpu, gpa, bytes, val); + } + + static int read_exit_mmio(struct kvm_vcpu *vcpu, gpa_t gpa, + void *val, int bytes) + { +- trace_kvm_mmio(KVM_TRACE_MMIO_READ_UNSATISFIED, bytes, gpa, 0); ++ trace_kvm_mmio(KVM_TRACE_MMIO_READ_UNSATISFIED, bytes, gpa, NULL); + return X86EMUL_IO_NEEDED; + } + +diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile +index 457f681ef379..d435c89875c1 100644 +--- a/arch/x86/lib/Makefile ++++ b/arch/x86/lib/Makefile +@@ -26,6 +26,7 @@ lib-y += memcpy_$(BITS).o + lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o + lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o + lib-$(CONFIG_RANDOMIZE_BASE) += kaslr.o ++lib-$(CONFIG_RETPOLINE) += retpoline.o + + obj-y += msr.o msr-reg.o msr-reg-export.o hweight.o + +diff --git a/arch/x86/lib/checksum_32.S b/arch/x86/lib/checksum_32.S 
+index 4d34bb548b41..46e71a74e612 100644 +--- a/arch/x86/lib/checksum_32.S ++++ b/arch/x86/lib/checksum_32.S +@@ -29,7 +29,8 @@ + #include + #include + #include +- ++#include ++ + /* + * computes a partial checksum, e.g. for TCP/UDP fragments + */ +@@ -156,7 +157,7 @@ ENTRY(csum_partial) + negl %ebx + lea 45f(%ebx,%ebx,2), %ebx + testl %esi, %esi +- jmp *%ebx ++ JMP_NOSPEC %ebx + + # Handle 2-byte-aligned regions + 20: addw (%esi), %ax +@@ -439,7 +440,7 @@ ENTRY(csum_partial_copy_generic) + andl $-32,%edx + lea 3f(%ebx,%ebx), %ebx + testl %esi, %esi +- jmp *%ebx ++ JMP_NOSPEC %ebx + 1: addl $64,%esi + addl $64,%edi + SRC(movb -32(%edx),%bl) ; SRC(movb (%edx),%bl) +diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S +new file mode 100644 +index 000000000000..cb45c6cb465f +--- /dev/null ++++ b/arch/x86/lib/retpoline.S +@@ -0,0 +1,48 @@ ++/* SPDX-License-Identifier: GPL-2.0 */ ++ ++#include ++#include ++#include ++#include ++#include ++#include ++#include ++ ++.macro THUNK reg ++ .section .text.__x86.indirect_thunk.\reg ++ ++ENTRY(__x86_indirect_thunk_\reg) ++ CFI_STARTPROC ++ JMP_NOSPEC %\reg ++ CFI_ENDPROC ++ENDPROC(__x86_indirect_thunk_\reg) ++.endm ++ ++/* ++ * Despite being an assembler file we can't just use .irp here ++ * because __KSYM_DEPS__ only uses the C preprocessor and would ++ * only see one instance of "__x86_indirect_thunk_\reg" rather ++ * than one per register with the correct names. So we do it ++ * the simple and nasty way... 
++ */ ++#define EXPORT_THUNK(reg) EXPORT_SYMBOL(__x86_indirect_thunk_ ## reg) ++#define GENERATE_THUNK(reg) THUNK reg ; EXPORT_THUNK(reg) ++ ++GENERATE_THUNK(_ASM_AX) ++GENERATE_THUNK(_ASM_BX) ++GENERATE_THUNK(_ASM_CX) ++GENERATE_THUNK(_ASM_DX) ++GENERATE_THUNK(_ASM_SI) ++GENERATE_THUNK(_ASM_DI) ++GENERATE_THUNK(_ASM_BP) ++GENERATE_THUNK(_ASM_SP) ++#ifdef CONFIG_64BIT ++GENERATE_THUNK(r8) ++GENERATE_THUNK(r9) ++GENERATE_THUNK(r10) ++GENERATE_THUNK(r11) ++GENERATE_THUNK(r12) ++GENERATE_THUNK(r13) ++GENERATE_THUNK(r14) ++GENERATE_THUNK(r15) ++#endif +diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c +index 43d4a4a29037..ce38f165489b 100644 +--- a/arch/x86/mm/pti.c ++++ b/arch/x86/mm/pti.c +@@ -149,7 +149,7 @@ pgd_t __pti_set_user_pgd(pgd_t *pgdp, pgd_t pgd) + * + * Returns a pointer to a P4D on success, or NULL on failure. + */ +-static p4d_t *pti_user_pagetable_walk_p4d(unsigned long address) ++static __init p4d_t *pti_user_pagetable_walk_p4d(unsigned long address) + { + pgd_t *pgd = kernel_to_user_pgdp(pgd_offset_k(address)); + gfp_t gfp = (GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO); +@@ -164,12 +164,7 @@ static p4d_t *pti_user_pagetable_walk_p4d(unsigned long address) + if (!new_p4d_page) + return NULL; + +- if (pgd_none(*pgd)) { +- set_pgd(pgd, __pgd(_KERNPG_TABLE | __pa(new_p4d_page))); +- new_p4d_page = 0; +- } +- if (new_p4d_page) +- free_page(new_p4d_page); ++ set_pgd(pgd, __pgd(_KERNPG_TABLE | __pa(new_p4d_page))); + } + BUILD_BUG_ON(pgd_large(*pgd) != 0); + +@@ -182,7 +177,7 @@ static p4d_t *pti_user_pagetable_walk_p4d(unsigned long address) + * + * Returns a pointer to a PMD on success, or NULL on failure. 
+ */ +-static pmd_t *pti_user_pagetable_walk_pmd(unsigned long address) ++static __init pmd_t *pti_user_pagetable_walk_pmd(unsigned long address) + { + gfp_t gfp = (GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO); + p4d_t *p4d = pti_user_pagetable_walk_p4d(address); +@@ -194,12 +189,7 @@ static pmd_t *pti_user_pagetable_walk_pmd(unsigned long address) + if (!new_pud_page) + return NULL; + +- if (p4d_none(*p4d)) { +- set_p4d(p4d, __p4d(_KERNPG_TABLE | __pa(new_pud_page))); +- new_pud_page = 0; +- } +- if (new_pud_page) +- free_page(new_pud_page); ++ set_p4d(p4d, __p4d(_KERNPG_TABLE | __pa(new_pud_page))); + } + + pud = pud_offset(p4d, address); +@@ -213,12 +203,7 @@ static pmd_t *pti_user_pagetable_walk_pmd(unsigned long address) + if (!new_pmd_page) + return NULL; + +- if (pud_none(*pud)) { +- set_pud(pud, __pud(_KERNPG_TABLE | __pa(new_pmd_page))); +- new_pmd_page = 0; +- } +- if (new_pmd_page) +- free_page(new_pmd_page); ++ set_pud(pud, __pud(_KERNPG_TABLE | __pa(new_pmd_page))); + } + + return pmd_offset(pud, address); +@@ -251,12 +236,7 @@ static __init pte_t *pti_user_pagetable_walk_pte(unsigned long address) + if (!new_pte_page) + return NULL; + +- if (pmd_none(*pmd)) { +- set_pmd(pmd, __pmd(_KERNPG_TABLE | __pa(new_pte_page))); +- new_pte_page = 0; +- } +- if (new_pte_page) +- free_page(new_pte_page); ++ set_pmd(pmd, __pmd(_KERNPG_TABLE | __pa(new_pte_page))); + } + + pte = pte_offset_kernel(pmd, address); +diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c +index 39c4b35ac7a4..61975b6bcb1a 100644 +--- a/arch/x86/platform/efi/efi_64.c ++++ b/arch/x86/platform/efi/efi_64.c +@@ -134,7 +134,9 @@ pgd_t * __init efi_call_phys_prolog(void) + pud[j] = *pud_offset(p4d_k, vaddr); + } + } ++ pgd_offset_k(pgd * PGDIR_SIZE)->pgd &= ~_PAGE_NX; + } ++ + out: + __flush_tlb_all(); + +diff --git a/crypto/algapi.c b/crypto/algapi.c +index aa699ff6c876..50eb828db767 100644 +--- a/crypto/algapi.c ++++ b/crypto/algapi.c +@@ -167,6 +167,18 @@ void 
crypto_remove_spawns(struct crypto_alg *alg, struct list_head *list, + + spawn->alg = NULL; + spawns = &inst->alg.cra_users; ++ ++ /* ++ * We may encounter an unregistered instance here, since ++ * an instance's spawns are set up prior to the instance ++ * being registered. An unregistered instance will have ++ * NULL ->cra_users.next, since ->cra_users isn't ++ * properly initialized until registration. But an ++ * unregistered instance cannot have any users, so treat ++ * it the same as ->cra_users being empty. ++ */ ++ if (spawns->next == NULL) ++ break; + } + } while ((spawns = crypto_more_spawns(alg, &stack, &top, + &secondary_spawns))); +diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig +index bdc87907d6a1..2415ad9f6dd4 100644 +--- a/drivers/base/Kconfig ++++ b/drivers/base/Kconfig +@@ -236,6 +236,9 @@ config GENERIC_CPU_DEVICES + config GENERIC_CPU_AUTOPROBE + bool + ++config GENERIC_CPU_VULNERABILITIES ++ bool ++ + config SOC_BUS + bool + select GLOB +diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c +index 321cd7b4d817..825964efda1d 100644 +--- a/drivers/base/cpu.c ++++ b/drivers/base/cpu.c +@@ -501,10 +501,58 @@ static void __init cpu_dev_register_generic(void) + #endif + } + ++#ifdef CONFIG_GENERIC_CPU_VULNERABILITIES ++ ++ssize_t __weak cpu_show_meltdown(struct device *dev, ++ struct device_attribute *attr, char *buf) ++{ ++ return sprintf(buf, "Not affected\n"); ++} ++ ++ssize_t __weak cpu_show_spectre_v1(struct device *dev, ++ struct device_attribute *attr, char *buf) ++{ ++ return sprintf(buf, "Not affected\n"); ++} ++ ++ssize_t __weak cpu_show_spectre_v2(struct device *dev, ++ struct device_attribute *attr, char *buf) ++{ ++ return sprintf(buf, "Not affected\n"); ++} ++ ++static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL); ++static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL); ++static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL); ++ ++static struct attribute *cpu_root_vulnerabilities_attrs[] = { ++ 
&dev_attr_meltdown.attr, ++ &dev_attr_spectre_v1.attr, ++ &dev_attr_spectre_v2.attr, ++ NULL ++}; ++ ++static const struct attribute_group cpu_root_vulnerabilities_group = { ++ .name = "vulnerabilities", ++ .attrs = cpu_root_vulnerabilities_attrs, ++}; ++ ++static void __init cpu_register_vulnerabilities(void) ++{ ++ if (sysfs_create_group(&cpu_subsys.dev_root->kobj, ++ &cpu_root_vulnerabilities_group)) ++ pr_err("Unable to register CPU vulnerabilities\n"); ++} ++ ++#else ++static inline void cpu_register_vulnerabilities(void) { } ++#endif ++ + void __init cpu_dev_init(void) + { + if (subsys_system_register(&cpu_subsys, cpu_root_attr_groups)) + panic("Failed to register CPU subsystem"); + + cpu_dev_register_generic(); ++ cpu_register_vulnerabilities(); + } +diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c +index adc877dfef5c..609227211295 100644 +--- a/drivers/block/rbd.c ++++ b/drivers/block/rbd.c +@@ -3074,13 +3074,21 @@ static void format_lock_cookie(struct rbd_device *rbd_dev, char *buf) + mutex_unlock(&rbd_dev->watch_mutex); + } + ++static void __rbd_lock(struct rbd_device *rbd_dev, const char *cookie) ++{ ++ struct rbd_client_id cid = rbd_get_cid(rbd_dev); ++ ++ strcpy(rbd_dev->lock_cookie, cookie); ++ rbd_set_owner_cid(rbd_dev, &cid); ++ queue_work(rbd_dev->task_wq, &rbd_dev->acquired_lock_work); ++} ++ + /* + * lock_rwsem must be held for write + */ + static int rbd_lock(struct rbd_device *rbd_dev) + { + struct ceph_osd_client *osdc = &rbd_dev->rbd_client->client->osdc; +- struct rbd_client_id cid = rbd_get_cid(rbd_dev); + char cookie[32]; + int ret; + +@@ -3095,9 +3103,7 @@ static int rbd_lock(struct rbd_device *rbd_dev) + return ret; + + rbd_dev->lock_state = RBD_LOCK_STATE_LOCKED; +- strcpy(rbd_dev->lock_cookie, cookie); +- rbd_set_owner_cid(rbd_dev, &cid); +- queue_work(rbd_dev->task_wq, &rbd_dev->acquired_lock_work); ++ __rbd_lock(rbd_dev, cookie); + return 0; + } + +@@ -3883,7 +3889,7 @@ static void rbd_reacquire_lock(struct rbd_device *rbd_dev) 
+ queue_delayed_work(rbd_dev->task_wq, + &rbd_dev->lock_dwork, 0); + } else { +- strcpy(rbd_dev->lock_cookie, cookie); ++ __rbd_lock(rbd_dev, cookie); + } + } + +@@ -4415,7 +4421,7 @@ static int rbd_init_disk(struct rbd_device *rbd_dev) + segment_size = rbd_obj_bytes(&rbd_dev->header); + blk_queue_max_hw_sectors(q, segment_size / SECTOR_SIZE); + q->limits.max_sectors = queue_max_hw_sectors(q); +- blk_queue_max_segments(q, segment_size / SECTOR_SIZE); ++ blk_queue_max_segments(q, USHRT_MAX); + blk_queue_max_segment_size(q, segment_size); + blk_queue_io_min(q, segment_size); + blk_queue_io_opt(q, segment_size); +diff --git a/drivers/gpu/drm/i915/gvt/gtt.c b/drivers/gpu/drm/i915/gvt/gtt.c +index a385838e2919..dadacbe558ab 100644 +--- a/drivers/gpu/drm/i915/gvt/gtt.c ++++ b/drivers/gpu/drm/i915/gvt/gtt.c +@@ -1359,12 +1359,15 @@ static int ppgtt_handle_guest_write_page_table_bytes(void *gp, + return ret; + } else { + if (!test_bit(index, spt->post_shadow_bitmap)) { ++ int type = spt->shadow_page.type; ++ + ppgtt_get_shadow_entry(spt, &se, index); + ret = ppgtt_handle_guest_entry_removal(gpt, &se, index); + if (ret) + return ret; ++ ops->set_pfn(&se, vgpu->gtt.scratch_pt[type].page_mfn); ++ ppgtt_set_shadow_entry(spt, &se, index); + } +- + ppgtt_set_post_shadow(spt, index); + } + +diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c +index 82498f8232eb..5c5cb2ceee49 100644 +--- a/drivers/gpu/drm/i915/i915_drv.c ++++ b/drivers/gpu/drm/i915/i915_drv.c +@@ -1693,6 +1693,7 @@ static int i915_drm_resume(struct drm_device *dev) + intel_guc_resume(dev_priv); + + intel_modeset_init_hw(dev); ++ intel_init_clock_gating(dev_priv); + + spin_lock_irq(&dev_priv->irq_lock); + if (dev_priv->display.hpd_irq_setup) +diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h +index ce2ed16f2a30..920c8914cec1 100644 +--- a/drivers/gpu/drm/i915/i915_reg.h ++++ b/drivers/gpu/drm/i915/i915_reg.h +@@ -6987,6 +6987,8 @@ enum { + #define 
GEN9_SLICE_COMMON_ECO_CHICKEN0 _MMIO(0x7308) + #define DISABLE_PIXEL_MASK_CAMMING (1<<14) + ++#define GEN9_SLICE_COMMON_ECO_CHICKEN1 _MMIO(0x731c) ++ + #define GEN7_L3SQCREG1 _MMIO(0xB010) + #define VLV_B0_WA_L3SQCREG1_VALUE 0x00D30000 + +diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c +index 1c73d5542681..095a2240af4f 100644 +--- a/drivers/gpu/drm/i915/intel_display.c ++++ b/drivers/gpu/drm/i915/intel_display.c +@@ -3800,6 +3800,7 @@ void intel_finish_reset(struct drm_i915_private *dev_priv) + + intel_pps_unlock_regs_wa(dev_priv); + intel_modeset_init_hw(dev); ++ intel_init_clock_gating(dev_priv); + + spin_lock_irq(&dev_priv->irq_lock); + if (dev_priv->display.hpd_irq_setup) +@@ -14406,8 +14407,6 @@ void intel_modeset_init_hw(struct drm_device *dev) + + intel_update_cdclk(dev_priv); + dev_priv->cdclk.logical = dev_priv->cdclk.actual = dev_priv->cdclk.hw; +- +- intel_init_clock_gating(dev_priv); + } + + /* +@@ -15124,6 +15123,15 @@ intel_modeset_setup_hw_state(struct drm_device *dev, + struct intel_encoder *encoder; + int i; + ++ if (IS_HASWELL(dev_priv)) { ++ /* ++ * WaRsPkgCStateDisplayPMReq:hsw ++ * System hang if this isn't done before disabling all planes! ++ */ ++ I915_WRITE(CHICKEN_PAR1_1, ++ I915_READ(CHICKEN_PAR1_1) | FORCE_ARB_IDLE_PLANES); ++ } ++ + intel_modeset_readout_hw_state(dev); + + /* HW state is read out, now we need to sanitize this mess. 
*/ +@@ -15220,6 +15228,8 @@ void intel_modeset_gem_init(struct drm_device *dev) + + intel_init_gt_powersave(dev_priv); + ++ intel_init_clock_gating(dev_priv); ++ + intel_setup_overlay(dev_priv); + } + +diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c +index 3c2d9cf22ed5..b6a7e492c1a3 100644 +--- a/drivers/gpu/drm/i915/intel_engine_cs.c ++++ b/drivers/gpu/drm/i915/intel_engine_cs.c +@@ -1125,6 +1125,11 @@ static int glk_init_workarounds(struct intel_engine_cs *engine) + if (ret) + return ret; + ++ /* WA #0862: Userspace has to set "Barrier Mode" to avoid hangs. */ ++ ret = wa_ring_whitelist_reg(engine, GEN9_SLICE_COMMON_ECO_CHICKEN1); ++ if (ret) ++ return ret; ++ + /* WaToEnableHwFixForPushConstHWBug:glk */ + WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2, + GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION); +diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c +index cb950752c346..014e5c08571a 100644 +--- a/drivers/gpu/drm/i915/intel_pm.c ++++ b/drivers/gpu/drm/i915/intel_pm.c +@@ -5669,12 +5669,30 @@ void vlv_wm_sanitize(struct drm_i915_private *dev_priv) + mutex_unlock(&dev_priv->wm.wm_mutex); + } + ++/* ++ * FIXME should probably kill this and improve ++ * the real watermark readout/sanitation instead ++ */ ++static void ilk_init_lp_watermarks(struct drm_i915_private *dev_priv) ++{ ++ I915_WRITE(WM3_LP_ILK, I915_READ(WM3_LP_ILK) & ~WM1_LP_SR_EN); ++ I915_WRITE(WM2_LP_ILK, I915_READ(WM2_LP_ILK) & ~WM1_LP_SR_EN); ++ I915_WRITE(WM1_LP_ILK, I915_READ(WM1_LP_ILK) & ~WM1_LP_SR_EN); ++ ++ /* ++ * Don't touch WM1S_LP_EN here. ++ * Doing so could cause underruns. 
++ */ ++} ++ + void ilk_wm_get_hw_state(struct drm_device *dev) + { + struct drm_i915_private *dev_priv = to_i915(dev); + struct ilk_wm_values *hw = &dev_priv->wm.hw; + struct drm_crtc *crtc; + ++ ilk_init_lp_watermarks(dev_priv); ++ + for_each_crtc(dev, crtc) + ilk_pipe_wm_get_hw_state(crtc); + +@@ -7959,18 +7977,6 @@ static void g4x_disable_trickle_feed(struct drm_i915_private *dev_priv) + } + } + +-static void ilk_init_lp_watermarks(struct drm_i915_private *dev_priv) +-{ +- I915_WRITE(WM3_LP_ILK, I915_READ(WM3_LP_ILK) & ~WM1_LP_SR_EN); +- I915_WRITE(WM2_LP_ILK, I915_READ(WM2_LP_ILK) & ~WM1_LP_SR_EN); +- I915_WRITE(WM1_LP_ILK, I915_READ(WM1_LP_ILK) & ~WM1_LP_SR_EN); +- +- /* +- * Don't touch WM1S_LP_EN here. +- * Doing so could cause underruns. +- */ +-} +- + static void ironlake_init_clock_gating(struct drm_i915_private *dev_priv) + { + uint32_t dspclk_gate = ILK_VRHUNIT_CLOCK_GATE_DISABLE; +@@ -8004,8 +8010,6 @@ static void ironlake_init_clock_gating(struct drm_i915_private *dev_priv) + (I915_READ(DISP_ARB_CTL) | + DISP_FBC_WM_DIS)); + +- ilk_init_lp_watermarks(dev_priv); +- + /* + * Based on the document from hardware guys the following bits + * should be set unconditionally in order to enable FBC. 
+@@ -8118,8 +8122,6 @@ static void gen6_init_clock_gating(struct drm_i915_private *dev_priv) + I915_WRITE(GEN6_GT_MODE, + _MASKED_FIELD(GEN6_WIZ_HASHING_MASK, GEN6_WIZ_HASHING_16x4)); + +- ilk_init_lp_watermarks(dev_priv); +- + I915_WRITE(CACHE_MODE_0, + _MASKED_BIT_DISABLE(CM0_STC_EVICT_DISABLE_LRA_SNB)); + +@@ -8293,8 +8295,6 @@ static void broadwell_init_clock_gating(struct drm_i915_private *dev_priv) + { + enum pipe pipe; + +- ilk_init_lp_watermarks(dev_priv); +- + /* WaSwitchSolVfFArbitrationPriority:bdw */ + I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) | HSW_ECOCHK_ARB_PRIO_SOL); + +@@ -8349,8 +8349,6 @@ static void broadwell_init_clock_gating(struct drm_i915_private *dev_priv) + + static void haswell_init_clock_gating(struct drm_i915_private *dev_priv) + { +- ilk_init_lp_watermarks(dev_priv); +- + /* L3 caching of data atomics doesn't work -- disable it. */ + I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE); + I915_WRITE(HSW_ROW_CHICKEN3, +@@ -8394,10 +8392,6 @@ static void haswell_init_clock_gating(struct drm_i915_private *dev_priv) + /* WaSwitchSolVfFArbitrationPriority:hsw */ + I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) | HSW_ECOCHK_ARB_PRIO_SOL); + +- /* WaRsPkgCStateDisplayPMReq:hsw */ +- I915_WRITE(CHICKEN_PAR1_1, +- I915_READ(CHICKEN_PAR1_1) | FORCE_ARB_IDLE_PLANES); +- + lpt_init_clock_gating(dev_priv); + } + +@@ -8405,8 +8399,6 @@ static void ivybridge_init_clock_gating(struct drm_i915_private *dev_priv) + { + uint32_t snpcr; + +- ilk_init_lp_watermarks(dev_priv); +- + I915_WRITE(ILK_DSPCLK_GATE_D, ILK_VRHUNIT_CLOCK_GATE_DISABLE); + + /* WaDisableEarlyCull:ivb */ +diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c +index 21c62a34e558..87e8af5776a3 100644 +--- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c ++++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c +@@ -2731,6 +2731,8 @@ static int vmw_cmd_dx_view_define(struct vmw_private *dev_priv, + } + + view_type = vmw_view_cmd_to_type(header->id); ++ if 
(view_type == vmw_view_max) ++ return -EINVAL; + cmd = container_of(header, typeof(*cmd), header); + ret = vmw_cmd_res_check(dev_priv, sw_context, vmw_res_surface, + user_surface_converter, +diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c +index b850562fbdd6..62c2f4be8012 100644 +--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c ++++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c +@@ -697,7 +697,6 @@ vmw_du_plane_duplicate_state(struct drm_plane *plane) + vps->pinned = 0; + + /* Mapping is managed by prepare_fb/cleanup_fb */ +- memset(&vps->guest_map, 0, sizeof(vps->guest_map)); + memset(&vps->host_map, 0, sizeof(vps->host_map)); + vps->cpp = 0; + +@@ -760,11 +759,6 @@ vmw_du_plane_destroy_state(struct drm_plane *plane, + + + /* Should have been freed by cleanup_fb */ +- if (vps->guest_map.virtual) { +- DRM_ERROR("Guest mapping not freed\n"); +- ttm_bo_kunmap(&vps->guest_map); +- } +- + if (vps->host_map.virtual) { + DRM_ERROR("Host mapping not freed\n"); + ttm_bo_kunmap(&vps->host_map); +diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h +index ff9c8389ff21..cd9da2dd79af 100644 +--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h ++++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h +@@ -175,7 +175,7 @@ struct vmw_plane_state { + int pinned; + + /* For CPU Blit */ +- struct ttm_bo_kmap_obj host_map, guest_map; ++ struct ttm_bo_kmap_obj host_map; + unsigned int cpp; + }; + +diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c +index ca3afae2db1f..4dee05b15552 100644 +--- a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c ++++ b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c +@@ -114,7 +114,7 @@ struct vmw_screen_target_display_unit { + bool defined; + + /* For CPU Blit */ +- struct ttm_bo_kmap_obj host_map, guest_map; ++ struct ttm_bo_kmap_obj host_map; + unsigned int cpp; + }; + +@@ -695,7 +695,8 @@ static void vmw_stdu_dmabuf_cpu_commit(struct vmw_kms_dirty *dirty) + s32 src_pitch, dst_pitch; + u8 *src, *dst; + 
bool not_used; +- ++ struct ttm_bo_kmap_obj guest_map; ++ int ret; + + if (!dirty->num_hits) + return; +@@ -706,6 +707,13 @@ static void vmw_stdu_dmabuf_cpu_commit(struct vmw_kms_dirty *dirty) + if (width == 0 || height == 0) + return; + ++ ret = ttm_bo_kmap(&ddirty->buf->base, 0, ddirty->buf->base.num_pages, ++ &guest_map); ++ if (ret) { ++ DRM_ERROR("Failed mapping framebuffer for blit: %d\n", ++ ret); ++ goto out_cleanup; ++ } + + /* Assume we are blitting from Host (display_srf) to Guest (dmabuf) */ + src_pitch = stdu->display_srf->base_size.width * stdu->cpp; +@@ -713,7 +721,7 @@ static void vmw_stdu_dmabuf_cpu_commit(struct vmw_kms_dirty *dirty) + src += ddirty->top * src_pitch + ddirty->left * stdu->cpp; + + dst_pitch = ddirty->pitch; +- dst = ttm_kmap_obj_virtual(&stdu->guest_map, &not_used); ++ dst = ttm_kmap_obj_virtual(&guest_map, &not_used); + dst += ddirty->fb_top * dst_pitch + ddirty->fb_left * stdu->cpp; + + +@@ -772,6 +780,7 @@ static void vmw_stdu_dmabuf_cpu_commit(struct vmw_kms_dirty *dirty) + vmw_fifo_commit(dev_priv, sizeof(*cmd)); + } + ++ ttm_bo_kunmap(&guest_map); + out_cleanup: + ddirty->left = ddirty->top = ddirty->fb_left = ddirty->fb_top = S32_MAX; + ddirty->right = ddirty->bottom = S32_MIN; +@@ -1109,9 +1118,6 @@ vmw_stdu_primary_plane_cleanup_fb(struct drm_plane *plane, + { + struct vmw_plane_state *vps = vmw_plane_state_to_vps(old_state); + +- if (vps->guest_map.virtual) +- ttm_bo_kunmap(&vps->guest_map); +- + if (vps->host_map.virtual) + ttm_bo_kunmap(&vps->host_map); + +@@ -1277,33 +1283,11 @@ vmw_stdu_primary_plane_prepare_fb(struct drm_plane *plane, + */ + if (vps->content_fb_type == SEPARATE_DMA && + !(dev_priv->capabilities & SVGA_CAP_3D)) { +- +- struct vmw_framebuffer_dmabuf *new_vfbd; +- +- new_vfbd = vmw_framebuffer_to_vfbd(new_fb); +- +- ret = ttm_bo_reserve(&new_vfbd->buffer->base, false, false, +- NULL); +- if (ret) +- goto out_srf_unpin; +- +- ret = ttm_bo_kmap(&new_vfbd->buffer->base, 0, +- new_vfbd->buffer->base.num_pages, 
+- &vps->guest_map); +- +- ttm_bo_unreserve(&new_vfbd->buffer->base); +- +- if (ret) { +- DRM_ERROR("Failed to map content buffer to CPU\n"); +- goto out_srf_unpin; +- } +- + ret = ttm_bo_kmap(&vps->surf->res.backup->base, 0, + vps->surf->res.backup->base.num_pages, + &vps->host_map); + if (ret) { + DRM_ERROR("Failed to map display buffer to CPU\n"); +- ttm_bo_kunmap(&vps->guest_map); + goto out_srf_unpin; + } + +@@ -1350,7 +1334,6 @@ vmw_stdu_primary_plane_atomic_update(struct drm_plane *plane, + stdu->display_srf = vps->surf; + stdu->content_fb_type = vps->content_fb_type; + stdu->cpp = vps->cpp; +- memcpy(&stdu->guest_map, &vps->guest_map, sizeof(vps->guest_map)); + memcpy(&stdu->host_map, &vps->host_map, sizeof(vps->host_map)); + + if (!stdu->defined) +diff --git a/drivers/infiniband/hw/cxgb4/cq.c b/drivers/infiniband/hw/cxgb4/cq.c +index 514c1000ded1..73feeeeb4283 100644 +--- a/drivers/infiniband/hw/cxgb4/cq.c ++++ b/drivers/infiniband/hw/cxgb4/cq.c +@@ -410,7 +410,7 @@ void c4iw_flush_hw_cq(struct c4iw_cq *chp) + + static int cqe_completes_wr(struct t4_cqe *cqe, struct t4_wq *wq) + { +- if (CQE_OPCODE(cqe) == C4IW_DRAIN_OPCODE) { ++ if (DRAIN_CQE(cqe)) { + WARN_ONCE(1, "Unexpected DRAIN CQE qp id %u!\n", wq->sq.qid); + return 0; + } +@@ -509,7 +509,7 @@ static int poll_cq(struct t4_wq *wq, struct t4_cq *cq, struct t4_cqe *cqe, + /* + * Special cqe for drain WR completions... 
+ */ +- if (CQE_OPCODE(hw_cqe) == C4IW_DRAIN_OPCODE) { ++ if (DRAIN_CQE(hw_cqe)) { + *cookie = CQE_DRAIN_COOKIE(hw_cqe); + *cqe = *hw_cqe; + goto skip_cqe; +@@ -766,9 +766,6 @@ static int c4iw_poll_cq_one(struct c4iw_cq *chp, struct ib_wc *wc) + c4iw_invalidate_mr(qhp->rhp, + CQE_WRID_FR_STAG(&cqe)); + break; +- case C4IW_DRAIN_OPCODE: +- wc->opcode = IB_WC_SEND; +- break; + default: + pr_err("Unexpected opcode %d in the CQE received for QPID=0x%0x\n", + CQE_OPCODE(&cqe), CQE_QPID(&cqe)); +diff --git a/drivers/infiniband/hw/cxgb4/ev.c b/drivers/infiniband/hw/cxgb4/ev.c +index 8f963df0bffc..9d25298d96fa 100644 +--- a/drivers/infiniband/hw/cxgb4/ev.c ++++ b/drivers/infiniband/hw/cxgb4/ev.c +@@ -109,9 +109,11 @@ static void post_qp_event(struct c4iw_dev *dev, struct c4iw_cq *chp, + if (qhp->ibqp.event_handler) + (*qhp->ibqp.event_handler)(&event, qhp->ibqp.qp_context); + +- spin_lock_irqsave(&chp->comp_handler_lock, flag); +- (*chp->ibcq.comp_handler)(&chp->ibcq, chp->ibcq.cq_context); +- spin_unlock_irqrestore(&chp->comp_handler_lock, flag); ++ if (t4_clear_cq_armed(&chp->cq)) { ++ spin_lock_irqsave(&chp->comp_handler_lock, flag); ++ (*chp->ibcq.comp_handler)(&chp->ibcq, chp->ibcq.cq_context); ++ spin_unlock_irqrestore(&chp->comp_handler_lock, flag); ++ } + } + + void c4iw_ev_dispatch(struct c4iw_dev *dev, struct t4_cqe *err_cqe) +diff --git a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h +index 819a30635d53..20c481115a99 100644 +--- a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h ++++ b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h +@@ -631,8 +631,6 @@ static inline int to_ib_qp_state(int c4iw_qp_state) + return IB_QPS_ERR; + } + +-#define C4IW_DRAIN_OPCODE FW_RI_SGE_EC_CR_RETURN +- + static inline u32 c4iw_ib_to_tpt_access(int a) + { + return (a & IB_ACCESS_REMOTE_WRITE ? 
FW_RI_MEM_ACCESS_REM_WRITE : 0) | +diff --git a/drivers/infiniband/hw/cxgb4/qp.c b/drivers/infiniband/hw/cxgb4/qp.c +index e69453665a17..f311ea73c806 100644 +--- a/drivers/infiniband/hw/cxgb4/qp.c ++++ b/drivers/infiniband/hw/cxgb4/qp.c +@@ -794,21 +794,57 @@ static int ring_kernel_rq_db(struct c4iw_qp *qhp, u16 inc) + return 0; + } + +-static void complete_sq_drain_wr(struct c4iw_qp *qhp, struct ib_send_wr *wr) ++static int ib_to_fw_opcode(int ib_opcode) ++{ ++ int opcode; ++ ++ switch (ib_opcode) { ++ case IB_WR_SEND_WITH_INV: ++ opcode = FW_RI_SEND_WITH_INV; ++ break; ++ case IB_WR_SEND: ++ opcode = FW_RI_SEND; ++ break; ++ case IB_WR_RDMA_WRITE: ++ opcode = FW_RI_RDMA_WRITE; ++ break; ++ case IB_WR_RDMA_READ: ++ case IB_WR_RDMA_READ_WITH_INV: ++ opcode = FW_RI_READ_REQ; ++ break; ++ case IB_WR_REG_MR: ++ opcode = FW_RI_FAST_REGISTER; ++ break; ++ case IB_WR_LOCAL_INV: ++ opcode = FW_RI_LOCAL_INV; ++ break; ++ default: ++ opcode = -EINVAL; ++ } ++ return opcode; ++} ++ ++static int complete_sq_drain_wr(struct c4iw_qp *qhp, struct ib_send_wr *wr) + { + struct t4_cqe cqe = {}; + struct c4iw_cq *schp; + unsigned long flag; + struct t4_cq *cq; ++ int opcode; + + schp = to_c4iw_cq(qhp->ibqp.send_cq); + cq = &schp->cq; + ++ opcode = ib_to_fw_opcode(wr->opcode); ++ if (opcode < 0) ++ return opcode; ++ + cqe.u.drain_cookie = wr->wr_id; + cqe.header = cpu_to_be32(CQE_STATUS_V(T4_ERR_SWFLUSH) | +- CQE_OPCODE_V(C4IW_DRAIN_OPCODE) | ++ CQE_OPCODE_V(opcode) | + CQE_TYPE_V(1) | + CQE_SWCQE_V(1) | ++ CQE_DRAIN_V(1) | + CQE_QPID_V(qhp->wq.sq.qid)); + + spin_lock_irqsave(&schp->lock, flag); +@@ -817,10 +853,29 @@ static void complete_sq_drain_wr(struct c4iw_qp *qhp, struct ib_send_wr *wr) + t4_swcq_produce(cq); + spin_unlock_irqrestore(&schp->lock, flag); + +- spin_lock_irqsave(&schp->comp_handler_lock, flag); +- (*schp->ibcq.comp_handler)(&schp->ibcq, +- schp->ibcq.cq_context); +- spin_unlock_irqrestore(&schp->comp_handler_lock, flag); ++ if (t4_clear_cq_armed(&schp->cq)) { ++ 
spin_lock_irqsave(&schp->comp_handler_lock, flag); ++ (*schp->ibcq.comp_handler)(&schp->ibcq, ++ schp->ibcq.cq_context); ++ spin_unlock_irqrestore(&schp->comp_handler_lock, flag); ++ } ++ return 0; ++} ++ ++static int complete_sq_drain_wrs(struct c4iw_qp *qhp, struct ib_send_wr *wr, ++ struct ib_send_wr **bad_wr) ++{ ++ int ret = 0; ++ ++ while (wr) { ++ ret = complete_sq_drain_wr(qhp, wr); ++ if (ret) { ++ *bad_wr = wr; ++ break; ++ } ++ wr = wr->next; ++ } ++ return ret; + } + + static void complete_rq_drain_wr(struct c4iw_qp *qhp, struct ib_recv_wr *wr) +@@ -835,9 +890,10 @@ static void complete_rq_drain_wr(struct c4iw_qp *qhp, struct ib_recv_wr *wr) + + cqe.u.drain_cookie = wr->wr_id; + cqe.header = cpu_to_be32(CQE_STATUS_V(T4_ERR_SWFLUSH) | +- CQE_OPCODE_V(C4IW_DRAIN_OPCODE) | ++ CQE_OPCODE_V(FW_RI_SEND) | + CQE_TYPE_V(0) | + CQE_SWCQE_V(1) | ++ CQE_DRAIN_V(1) | + CQE_QPID_V(qhp->wq.sq.qid)); + + spin_lock_irqsave(&rchp->lock, flag); +@@ -846,10 +902,20 @@ static void complete_rq_drain_wr(struct c4iw_qp *qhp, struct ib_recv_wr *wr) + t4_swcq_produce(cq); + spin_unlock_irqrestore(&rchp->lock, flag); + +- spin_lock_irqsave(&rchp->comp_handler_lock, flag); +- (*rchp->ibcq.comp_handler)(&rchp->ibcq, +- rchp->ibcq.cq_context); +- spin_unlock_irqrestore(&rchp->comp_handler_lock, flag); ++ if (t4_clear_cq_armed(&rchp->cq)) { ++ spin_lock_irqsave(&rchp->comp_handler_lock, flag); ++ (*rchp->ibcq.comp_handler)(&rchp->ibcq, ++ rchp->ibcq.cq_context); ++ spin_unlock_irqrestore(&rchp->comp_handler_lock, flag); ++ } ++} ++ ++static void complete_rq_drain_wrs(struct c4iw_qp *qhp, struct ib_recv_wr *wr) ++{ ++ while (wr) { ++ complete_rq_drain_wr(qhp, wr); ++ wr = wr->next; ++ } + } + + int c4iw_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, +@@ -875,7 +941,7 @@ int c4iw_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, + */ + if (qhp->wq.flushed) { + spin_unlock_irqrestore(&qhp->lock, flag); +- complete_sq_drain_wr(qhp, wr); ++ err = complete_sq_drain_wrs(qhp, wr, 
bad_wr); + return err; + } + num_wrs = t4_sq_avail(&qhp->wq); +@@ -1024,7 +1090,7 @@ int c4iw_post_receive(struct ib_qp *ibqp, struct ib_recv_wr *wr, + */ + if (qhp->wq.flushed) { + spin_unlock_irqrestore(&qhp->lock, flag); +- complete_rq_drain_wr(qhp, wr); ++ complete_rq_drain_wrs(qhp, wr); + return err; + } + num_wrs = t4_rq_avail(&qhp->wq); +@@ -1267,48 +1333,51 @@ static void __flush_qp(struct c4iw_qp *qhp, struct c4iw_cq *rchp, + + pr_debug("%s qhp %p rchp %p schp %p\n", __func__, qhp, rchp, schp); + +- /* locking hierarchy: cq lock first, then qp lock. */ ++ /* locking hierarchy: cqs lock first, then qp lock. */ + spin_lock_irqsave(&rchp->lock, flag); ++ if (schp != rchp) ++ spin_lock(&schp->lock); + spin_lock(&qhp->lock); + + if (qhp->wq.flushed) { + spin_unlock(&qhp->lock); ++ if (schp != rchp) ++ spin_unlock(&schp->lock); + spin_unlock_irqrestore(&rchp->lock, flag); + return; + } + qhp->wq.flushed = 1; ++ t4_set_wq_in_error(&qhp->wq); + + c4iw_flush_hw_cq(rchp); + c4iw_count_rcqes(&rchp->cq, &qhp->wq, &count); + rq_flushed = c4iw_flush_rq(&qhp->wq, &rchp->cq, count); +- spin_unlock(&qhp->lock); +- spin_unlock_irqrestore(&rchp->lock, flag); + +- /* locking hierarchy: cq lock first, then qp lock. 
*/ +- spin_lock_irqsave(&schp->lock, flag); +- spin_lock(&qhp->lock); + if (schp != rchp) + c4iw_flush_hw_cq(schp); + sq_flushed = c4iw_flush_sq(qhp); ++ + spin_unlock(&qhp->lock); +- spin_unlock_irqrestore(&schp->lock, flag); ++ if (schp != rchp) ++ spin_unlock(&schp->lock); ++ spin_unlock_irqrestore(&rchp->lock, flag); + + if (schp == rchp) { +- if (t4_clear_cq_armed(&rchp->cq) && +- (rq_flushed || sq_flushed)) { ++ if ((rq_flushed || sq_flushed) && ++ t4_clear_cq_armed(&rchp->cq)) { + spin_lock_irqsave(&rchp->comp_handler_lock, flag); + (*rchp->ibcq.comp_handler)(&rchp->ibcq, + rchp->ibcq.cq_context); + spin_unlock_irqrestore(&rchp->comp_handler_lock, flag); + } + } else { +- if (t4_clear_cq_armed(&rchp->cq) && rq_flushed) { ++ if (rq_flushed && t4_clear_cq_armed(&rchp->cq)) { + spin_lock_irqsave(&rchp->comp_handler_lock, flag); + (*rchp->ibcq.comp_handler)(&rchp->ibcq, + rchp->ibcq.cq_context); + spin_unlock_irqrestore(&rchp->comp_handler_lock, flag); + } +- if (t4_clear_cq_armed(&schp->cq) && sq_flushed) { ++ if (sq_flushed && t4_clear_cq_armed(&schp->cq)) { + spin_lock_irqsave(&schp->comp_handler_lock, flag); + (*schp->ibcq.comp_handler)(&schp->ibcq, + schp->ibcq.cq_context); +@@ -1325,8 +1394,8 @@ static void flush_qp(struct c4iw_qp *qhp) + rchp = to_c4iw_cq(qhp->ibqp.recv_cq); + schp = to_c4iw_cq(qhp->ibqp.send_cq); + +- t4_set_wq_in_error(&qhp->wq); + if (qhp->ibqp.uobject) { ++ t4_set_wq_in_error(&qhp->wq); + t4_set_cq_in_error(&rchp->cq); + spin_lock_irqsave(&rchp->comp_handler_lock, flag); + (*rchp->ibcq.comp_handler)(&rchp->ibcq, rchp->ibcq.cq_context); +diff --git a/drivers/infiniband/hw/cxgb4/t4.h b/drivers/infiniband/hw/cxgb4/t4.h +index bcb80ca67d3d..80b390e861dc 100644 +--- a/drivers/infiniband/hw/cxgb4/t4.h ++++ b/drivers/infiniband/hw/cxgb4/t4.h +@@ -197,6 +197,11 @@ struct t4_cqe { + #define CQE_SWCQE_G(x) ((((x) >> CQE_SWCQE_S)) & CQE_SWCQE_M) + #define CQE_SWCQE_V(x) ((x)<> CQE_DRAIN_S)) & CQE_DRAIN_M) ++#define CQE_DRAIN_V(x) ((x)<> 
CQE_STATUS_S)) & CQE_STATUS_M) +@@ -213,6 +218,7 @@ struct t4_cqe { + #define CQE_OPCODE_V(x) ((x)<header))) ++#define DRAIN_CQE(x) (CQE_DRAIN_G(be32_to_cpu((x)->header))) + #define CQE_QPID(x) (CQE_QPID_G(be32_to_cpu((x)->header))) + #define CQE_TYPE(x) (CQE_TYPE_G(be32_to_cpu((x)->header))) + #define SQ_TYPE(x) (CQE_TYPE((x))) +diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c +index 95178b4e3565..ee578fa713c2 100644 +--- a/drivers/infiniband/ulp/srpt/ib_srpt.c ++++ b/drivers/infiniband/ulp/srpt/ib_srpt.c +@@ -1000,8 +1000,7 @@ static int srpt_init_ch_qp(struct srpt_rdma_ch *ch, struct ib_qp *qp) + return -ENOMEM; + + attr->qp_state = IB_QPS_INIT; +- attr->qp_access_flags = IB_ACCESS_LOCAL_WRITE | IB_ACCESS_REMOTE_READ | +- IB_ACCESS_REMOTE_WRITE; ++ attr->qp_access_flags = IB_ACCESS_LOCAL_WRITE; + attr->port_num = ch->sport->port; + attr->pkey_index = 0; + +@@ -1992,7 +1991,7 @@ static int srpt_cm_req_recv(struct ib_cm_id *cm_id, + goto destroy_ib; + } + +- guid = (__be16 *)&param->primary_path->sgid.global.interface_id; ++ guid = (__be16 *)&param->primary_path->dgid.global.interface_id; + snprintf(ch->ini_guid, sizeof(ch->ini_guid), "%04x:%04x:%04x:%04x", + be16_to_cpu(guid[0]), be16_to_cpu(guid[1]), + be16_to_cpu(guid[2]), be16_to_cpu(guid[3])); +diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c +index 8e3adcb46851..6d416fdc25cb 100644 +--- a/drivers/md/dm-bufio.c ++++ b/drivers/md/dm-bufio.c +@@ -1611,7 +1611,8 @@ static unsigned long __scan(struct dm_bufio_client *c, unsigned long nr_to_scan, + int l; + struct dm_buffer *b, *tmp; + unsigned long freed = 0; +- unsigned long count = nr_to_scan; ++ unsigned long count = c->n_buffers[LIST_CLEAN] + ++ c->n_buffers[LIST_DIRTY]; + unsigned long retain_target = get_retain_buffers(c); + + for (l = 0; l < LIST_SIZE; l++) { +@@ -1647,8 +1648,11 @@ static unsigned long + dm_bufio_shrink_count(struct shrinker *shrink, struct shrink_control *sc) + { + struct dm_bufio_client *c = 
container_of(shrink, struct dm_bufio_client, shrinker); ++ unsigned long count = ACCESS_ONCE(c->n_buffers[LIST_CLEAN]) + ++ ACCESS_ONCE(c->n_buffers[LIST_DIRTY]); ++ unsigned long retain_target = get_retain_buffers(c); + +- return ACCESS_ONCE(c->n_buffers[LIST_CLEAN]) + ACCESS_ONCE(c->n_buffers[LIST_DIRTY]); ++ return (count < retain_target) ? 0 : (count - retain_target); + } + + /* +diff --git a/drivers/mmc/host/renesas_sdhi_core.c b/drivers/mmc/host/renesas_sdhi_core.c +index fcf7235d5742..157e1d9e7725 100644 +--- a/drivers/mmc/host/renesas_sdhi_core.c ++++ b/drivers/mmc/host/renesas_sdhi_core.c +@@ -24,6 +24,7 @@ + #include + #include + #include ++#include + #include + #include + #include +@@ -667,3 +668,5 @@ int renesas_sdhi_remove(struct platform_device *pdev) + return 0; + } + EXPORT_SYMBOL_GPL(renesas_sdhi_remove); ++ ++MODULE_LICENSE("GPL v2"); +diff --git a/drivers/mux/core.c b/drivers/mux/core.c +index 2260063b0ea8..6e5cf9d9cd99 100644 +--- a/drivers/mux/core.c ++++ b/drivers/mux/core.c +@@ -413,6 +413,7 @@ static int of_dev_node_match(struct device *dev, const void *data) + return dev->of_node == data; + } + ++/* Note this function returns a reference to the mux_chip dev. 
*/ + static struct mux_chip *of_find_mux_chip_by_node(struct device_node *np) + { + struct device *dev; +@@ -466,6 +467,7 @@ struct mux_control *mux_control_get(struct device *dev, const char *mux_name) + (!args.args_count && (mux_chip->controllers > 1))) { + dev_err(dev, "%pOF: wrong #mux-control-cells for %pOF\n", + np, args.np); ++ put_device(&mux_chip->dev); + return ERR_PTR(-EINVAL); + } + +@@ -476,10 +478,10 @@ struct mux_control *mux_control_get(struct device *dev, const char *mux_name) + if (controller >= mux_chip->controllers) { + dev_err(dev, "%pOF: bad mux controller %u specified in %pOF\n", + np, controller, args.np); ++ put_device(&mux_chip->dev); + return ERR_PTR(-EINVAL); + } + +- get_device(&mux_chip->dev); + return &mux_chip->mux[controller]; + } + EXPORT_SYMBOL_GPL(mux_control_get); +diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c +index 68ac3e88a8ce..8bf80ad9dc44 100644 +--- a/drivers/net/can/usb/gs_usb.c ++++ b/drivers/net/can/usb/gs_usb.c +@@ -449,7 +449,7 @@ static int gs_usb_set_bittiming(struct net_device *netdev) + dev_err(netdev->dev.parent, "Couldn't set bittimings (err=%d)", + rc); + +- return rc; ++ return (rc > 0) ? 
0 : rc; + } + + static void gs_usb_xmit_callback(struct urb *urb) +diff --git a/drivers/net/can/vxcan.c b/drivers/net/can/vxcan.c +index 8404e8852a0f..b4c4a2c76437 100644 +--- a/drivers/net/can/vxcan.c ++++ b/drivers/net/can/vxcan.c +@@ -194,7 +194,7 @@ static int vxcan_newlink(struct net *net, struct net_device *dev, + tbp = peer_tb; + } + +- if (tbp[IFLA_IFNAME]) { ++ if (ifmp && tbp[IFLA_IFNAME]) { + nla_strlcpy(ifname, tbp[IFLA_IFNAME], IFNAMSIZ); + name_assign_type = NET_NAME_USER; + } else { +diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c +index faf7cdc97ebf..311539c6625f 100644 +--- a/drivers/net/ethernet/freescale/fec_main.c ++++ b/drivers/net/ethernet/freescale/fec_main.c +@@ -3458,6 +3458,10 @@ fec_probe(struct platform_device *pdev) + goto failed_regulator; + } + } else { ++ if (PTR_ERR(fep->reg_phy) == -EPROBE_DEFER) { ++ ret = -EPROBE_DEFER; ++ goto failed_regulator; ++ } + fep->reg_phy = NULL; + } + +@@ -3539,8 +3543,9 @@ fec_probe(struct platform_device *pdev) + failed_clk: + if (of_phy_is_fixed_link(np)) + of_phy_deregister_fixed_link(np); +-failed_phy: + of_node_put(phy_node); ++failed_phy: ++ dev_id--; + failed_ioremap: + free_netdev(ndev); + +diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c b/drivers/net/ethernet/intel/e1000e/ich8lan.c +index d6d4ed7acf03..31277d3bb7dc 100644 +--- a/drivers/net/ethernet/intel/e1000e/ich8lan.c ++++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c +@@ -1367,6 +1367,9 @@ static s32 e1000_disable_ulp_lpt_lp(struct e1000_hw *hw, bool force) + * Checks to see of the link status of the hardware has changed. If a + * change in link status has been detected, then we read the PHY registers + * to get the current speed/duplex if link exists. ++ * ++ * Returns a negative error code (-E1000_ERR_*) or 0 (link down) or 1 (link ++ * up). 
+ **/ + static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw) + { +@@ -1382,7 +1385,7 @@ static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw) + * Change or Rx Sequence Error interrupt. + */ + if (!mac->get_link_status) +- return 0; ++ return 1; + + /* First we want to see if the MII Status Register reports + * link. If so, then we want to get the current speed/duplex +@@ -1613,10 +1616,12 @@ static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw) + * different link partner. + */ + ret_val = e1000e_config_fc_after_link_up(hw); +- if (ret_val) ++ if (ret_val) { + e_dbg("Error configuring flow control\n"); ++ return ret_val; ++ } + +- return ret_val; ++ return 1; + } + + static s32 e1000_get_variants_ich8lan(struct e1000_adapter *adapter) +diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c +index 3ead7439821c..99bd6e88ebc7 100644 +--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c ++++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c +@@ -4235,7 +4235,10 @@ static int mlxsw_sp_netdevice_port_upper_event(struct net_device *lower_dev, + return -EINVAL; + if (!info->linking) + break; +- if (netdev_has_any_upper_dev(upper_dev)) ++ if (netdev_has_any_upper_dev(upper_dev) && ++ (!netif_is_bridge_master(upper_dev) || ++ !mlxsw_sp_bridge_device_is_offloaded(mlxsw_sp, ++ upper_dev))) + return -EINVAL; + if (netif_is_lag_master(upper_dev) && + !mlxsw_sp_master_lag_check(mlxsw_sp, upper_dev, +@@ -4347,6 +4350,7 @@ static int mlxsw_sp_netdevice_port_vlan_event(struct net_device *vlan_dev, + u16 vid) + { + struct mlxsw_sp_port *mlxsw_sp_port = netdev_priv(dev); ++ struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp; + struct netdev_notifier_changeupper_info *info = ptr; + struct net_device *upper_dev; + int err = 0; +@@ -4358,7 +4362,10 @@ static int mlxsw_sp_netdevice_port_vlan_event(struct net_device *vlan_dev, + return -EINVAL; + if (!info->linking) + break; +- if 
(netdev_has_any_upper_dev(upper_dev)) ++ if (netdev_has_any_upper_dev(upper_dev) && ++ (!netif_is_bridge_master(upper_dev) || ++ !mlxsw_sp_bridge_device_is_offloaded(mlxsw_sp, ++ upper_dev))) + return -EINVAL; + break; + case NETDEV_CHANGEUPPER: +diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h +index 84ce83acdc19..88892d47acae 100644 +--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h ++++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h +@@ -326,6 +326,8 @@ int mlxsw_sp_port_bridge_join(struct mlxsw_sp_port *mlxsw_sp_port, + void mlxsw_sp_port_bridge_leave(struct mlxsw_sp_port *mlxsw_sp_port, + struct net_device *brport_dev, + struct net_device *br_dev); ++bool mlxsw_sp_bridge_device_is_offloaded(const struct mlxsw_sp *mlxsw_sp, ++ const struct net_device *br_dev); + + /* spectrum.c */ + int mlxsw_sp_port_ets_set(struct mlxsw_sp_port *mlxsw_sp_port, +diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c +index 5189022a1c8c..c23cc51bb5a5 100644 +--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c ++++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c +@@ -2536,7 +2536,7 @@ static void __mlxsw_sp_nexthop_neigh_update(struct mlxsw_sp_nexthop *nh, + { + if (!removing) + nh->should_offload = 1; +- else if (nh->offloaded) ++ else + nh->should_offload = 0; + nh->update = 1; + } +diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c +index d39ffbfcc436..f5863e5bec81 100644 +--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c ++++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c +@@ -134,6 +134,12 @@ mlxsw_sp_bridge_device_find(const struct mlxsw_sp_bridge *bridge, + return NULL; + } + ++bool mlxsw_sp_bridge_device_is_offloaded(const struct mlxsw_sp *mlxsw_sp, ++ const struct net_device *br_dev) ++{ ++ return 
!!mlxsw_sp_bridge_device_find(mlxsw_sp->bridge, br_dev); ++} ++ + static struct mlxsw_sp_bridge_device * + mlxsw_sp_bridge_device_create(struct mlxsw_sp_bridge *bridge, + struct net_device *br_dev) +diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c +index d2e88a30f57b..db31963c5d9d 100644 +--- a/drivers/net/ethernet/renesas/sh_eth.c ++++ b/drivers/net/ethernet/renesas/sh_eth.c +@@ -3212,18 +3212,37 @@ static int sh_eth_drv_probe(struct platform_device *pdev) + /* ioremap the TSU registers */ + if (mdp->cd->tsu) { + struct resource *rtsu; ++ + rtsu = platform_get_resource(pdev, IORESOURCE_MEM, 1); +- mdp->tsu_addr = devm_ioremap_resource(&pdev->dev, rtsu); +- if (IS_ERR(mdp->tsu_addr)) { +- ret = PTR_ERR(mdp->tsu_addr); ++ if (!rtsu) { ++ dev_err(&pdev->dev, "no TSU resource\n"); ++ ret = -ENODEV; ++ goto out_release; ++ } ++ /* We can only request the TSU region for the first port ++ * of the two sharing this TSU for the probe to succeed... ++ */ ++ if (devno % 2 == 0 && ++ !devm_request_mem_region(&pdev->dev, rtsu->start, ++ resource_size(rtsu), ++ dev_name(&pdev->dev))) { ++ dev_err(&pdev->dev, "can't request TSU resource.\n"); ++ ret = -EBUSY; ++ goto out_release; ++ } ++ mdp->tsu_addr = devm_ioremap(&pdev->dev, rtsu->start, ++ resource_size(rtsu)); ++ if (!mdp->tsu_addr) { ++ dev_err(&pdev->dev, "TSU region ioremap() failed.\n"); ++ ret = -ENOMEM; + goto out_release; + } + mdp->port = devno % 2; + ndev->features = NETIF_F_HW_VLAN_CTAG_FILTER; + } + +- /* initialize first or needed device */ +- if (!devno || pd->needs_init) { ++ /* Need to init only the first port of the two sharing a TSU */ ++ if (devno % 2 == 0) { + if (mdp->cd->chip_reset) + mdp->cd->chip_reset(ndev); + +diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +index 28c4d6fa096c..0ad12c81a9e4 100644 +--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c ++++ 
b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +@@ -364,9 +364,15 @@ static void stmmac_eee_ctrl_timer(unsigned long arg) + bool stmmac_eee_init(struct stmmac_priv *priv) + { + struct net_device *ndev = priv->dev; ++ int interface = priv->plat->interface; + unsigned long flags; + bool ret = false; + ++ if ((interface != PHY_INTERFACE_MODE_MII) && ++ (interface != PHY_INTERFACE_MODE_GMII) && ++ !phy_interface_mode_is_rgmii(interface)) ++ goto out; ++ + /* Using PCS we cannot dial with the phy registers at this stage + * so we do not support extra feature like EEE. + */ +diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c +index 4b377b978a0b..cb85307f125b 100644 +--- a/drivers/net/phy/phylink.c ++++ b/drivers/net/phy/phylink.c +@@ -1428,9 +1428,8 @@ static void phylink_sfp_link_down(void *upstream) + WARN_ON(!lockdep_rtnl_is_held()); + + set_bit(PHYLINK_DISABLE_LINK, &pl->phylink_disable_state); ++ queue_work(system_power_efficient_wq, &pl->resolve); + flush_work(&pl->resolve); +- +- netif_carrier_off(pl->netdev); + } + + static void phylink_sfp_link_up(void *upstream) +diff --git a/drivers/net/phy/sfp-bus.c b/drivers/net/phy/sfp-bus.c +index 5cb5384697ea..7ae815bee52d 100644 +--- a/drivers/net/phy/sfp-bus.c ++++ b/drivers/net/phy/sfp-bus.c +@@ -359,7 +359,8 @@ EXPORT_SYMBOL_GPL(sfp_register_upstream); + void sfp_unregister_upstream(struct sfp_bus *bus) + { + rtnl_lock(); +- sfp_unregister_bus(bus); ++ if (bus->sfp) ++ sfp_unregister_bus(bus); + bus->upstream = NULL; + bus->netdev = NULL; + rtnl_unlock(); +@@ -464,7 +465,8 @@ EXPORT_SYMBOL_GPL(sfp_register_socket); + void sfp_unregister_socket(struct sfp_bus *bus) + { + rtnl_lock(); +- sfp_unregister_bus(bus); ++ if (bus->netdev) ++ sfp_unregister_bus(bus); + bus->sfp_dev = NULL; + bus->sfp = NULL; + bus->socket_ops = NULL; +diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h +index 4fb7647995c3..9875ab5ce18c 100644 +--- 
a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h ++++ b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h +@@ -666,11 +666,15 @@ static inline u8 iwl_pcie_get_cmd_index(struct iwl_txq *q, u32 index) + return index & (q->n_window - 1); + } + +-static inline void *iwl_pcie_get_tfd(struct iwl_trans_pcie *trans_pcie, ++static inline void *iwl_pcie_get_tfd(struct iwl_trans *trans, + struct iwl_txq *txq, int idx) + { +- return txq->tfds + trans_pcie->tfd_size * iwl_pcie_get_cmd_index(txq, +- idx); ++ struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); ++ ++ if (trans->cfg->use_tfh) ++ idx = iwl_pcie_get_cmd_index(txq, idx); ++ ++ return txq->tfds + trans_pcie->tfd_size * idx; + } + + static inline void iwl_enable_rfkill_int(struct iwl_trans *trans) +diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c b/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c +index d74613fcb756..6f45c8148b27 100644 +--- a/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c ++++ b/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c +@@ -171,8 +171,6 @@ static void iwl_pcie_gen2_tfd_unmap(struct iwl_trans *trans, + + static void iwl_pcie_gen2_free_tfd(struct iwl_trans *trans, struct iwl_txq *txq) + { +- struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); +- + /* rd_ptr is bounded by TFD_QUEUE_SIZE_MAX and + * idx is bounded by n_window + */ +@@ -181,7 +179,7 @@ static void iwl_pcie_gen2_free_tfd(struct iwl_trans *trans, struct iwl_txq *txq) + lockdep_assert_held(&txq->lock); + + iwl_pcie_gen2_tfd_unmap(trans, &txq->entries[idx].meta, +- iwl_pcie_get_tfd(trans_pcie, txq, idx)); ++ iwl_pcie_get_tfd(trans, txq, idx)); + + /* free SKB */ + if (txq->entries) { +@@ -367,11 +365,9 @@ struct iwl_tfh_tfd *iwl_pcie_gen2_build_tfd(struct iwl_trans *trans, + struct sk_buff *skb, + struct iwl_cmd_meta *out_meta) + { +- struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); + struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)skb->data; + int idx = 
iwl_pcie_get_cmd_index(txq, txq->write_ptr); +- struct iwl_tfh_tfd *tfd = +- iwl_pcie_get_tfd(trans_pcie, txq, idx); ++ struct iwl_tfh_tfd *tfd = iwl_pcie_get_tfd(trans, txq, idx); + dma_addr_t tb_phys; + bool amsdu; + int i, len, tb1_len, tb2_len, hdr_len; +@@ -568,8 +564,7 @@ static int iwl_pcie_gen2_enqueue_hcmd(struct iwl_trans *trans, + u8 group_id = iwl_cmd_groupid(cmd->id); + const u8 *cmddata[IWL_MAX_CMD_TBS_PER_TFD]; + u16 cmdlen[IWL_MAX_CMD_TBS_PER_TFD]; +- struct iwl_tfh_tfd *tfd = +- iwl_pcie_get_tfd(trans_pcie, txq, txq->write_ptr); ++ struct iwl_tfh_tfd *tfd = iwl_pcie_get_tfd(trans, txq, txq->write_ptr); + + memset(tfd, 0, sizeof(*tfd)); + +diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c +index c645d10d3707..4704137a26e0 100644 +--- a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c ++++ b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c +@@ -373,7 +373,7 @@ static void iwl_pcie_tfd_unmap(struct iwl_trans *trans, + { + struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); + int i, num_tbs; +- void *tfd = iwl_pcie_get_tfd(trans_pcie, txq, index); ++ void *tfd = iwl_pcie_get_tfd(trans, txq, index); + + /* Sanity check on number of chunks */ + num_tbs = iwl_pcie_tfd_get_num_tbs(trans, tfd); +@@ -1999,7 +1999,7 @@ static int iwl_fill_data_tbs(struct iwl_trans *trans, struct sk_buff *skb, + } + + trace_iwlwifi_dev_tx(trans->dev, skb, +- iwl_pcie_get_tfd(trans_pcie, txq, txq->write_ptr), ++ iwl_pcie_get_tfd(trans, txq, txq->write_ptr), + trans_pcie->tfd_size, + &dev_cmd->hdr, IWL_FIRST_TB_SIZE + tb1_len, + hdr_len); +@@ -2073,7 +2073,7 @@ static int iwl_fill_data_tbs_amsdu(struct iwl_trans *trans, struct sk_buff *skb, + IEEE80211_CCMP_HDR_LEN : 0; + + trace_iwlwifi_dev_tx(trans->dev, skb, +- iwl_pcie_get_tfd(trans_pcie, txq, txq->write_ptr), ++ iwl_pcie_get_tfd(trans, txq, txq->write_ptr), + trans_pcie->tfd_size, + &dev_cmd->hdr, IWL_FIRST_TB_SIZE + tb1_len, 0); + +@@ -2406,7 +2406,7 @@ int 
iwl_trans_pcie_tx(struct iwl_trans *trans, struct sk_buff *skb, + memcpy(&txq->first_tb_bufs[txq->write_ptr], &dev_cmd->hdr, + IWL_FIRST_TB_SIZE); + +- tfd = iwl_pcie_get_tfd(trans_pcie, txq, txq->write_ptr); ++ tfd = iwl_pcie_get_tfd(trans, txq, txq->write_ptr); + /* Set up entry for this TFD in Tx byte-count array */ + iwl_pcie_txq_update_byte_cnt_tbl(trans, txq, le16_to_cpu(tx_cmd->len), + iwl_pcie_tfd_get_num_tbs(trans, tfd)); +diff --git a/drivers/platform/x86/wmi.c b/drivers/platform/x86/wmi.c +index 0765b1797d4c..7f8fa42a1084 100644 +--- a/drivers/platform/x86/wmi.c ++++ b/drivers/platform/x86/wmi.c +@@ -1268,5 +1268,5 @@ static void __exit acpi_wmi_exit(void) + bus_unregister(&wmi_bus_type); + } + +-subsys_initcall(acpi_wmi_init); ++subsys_initcall_sync(acpi_wmi_init); + module_exit(acpi_wmi_exit); +diff --git a/drivers/staging/android/ashmem.c b/drivers/staging/android/ashmem.c +index 0f695df14c9d..372ce9913e6d 100644 +--- a/drivers/staging/android/ashmem.c ++++ b/drivers/staging/android/ashmem.c +@@ -765,10 +765,12 @@ static long ashmem_ioctl(struct file *file, unsigned int cmd, unsigned long arg) + break; + case ASHMEM_SET_SIZE: + ret = -EINVAL; ++ mutex_lock(&ashmem_mutex); + if (!asma->file) { + ret = 0; + asma->size = (size_t)arg; + } ++ mutex_unlock(&ashmem_mutex); + break; + case ASHMEM_GET_SIZE: + ret = asma->size; +diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c +index def1b05ffca0..284bd1a7b570 100644 +--- a/drivers/usb/gadget/udc/core.c ++++ b/drivers/usb/gadget/udc/core.c +@@ -1158,11 +1158,7 @@ int usb_add_gadget_udc_release(struct device *parent, struct usb_gadget *gadget, + + udc = kzalloc(sizeof(*udc), GFP_KERNEL); + if (!udc) +- goto err1; +- +- ret = device_add(&gadget->dev); +- if (ret) +- goto err2; ++ goto err_put_gadget; + + device_initialize(&udc->dev); + udc->dev.release = usb_udc_release; +@@ -1171,7 +1167,11 @@ int usb_add_gadget_udc_release(struct device *parent, struct usb_gadget *gadget, + 
udc->dev.parent = parent; + ret = dev_set_name(&udc->dev, "%s", kobject_name(&parent->kobj)); + if (ret) +- goto err3; ++ goto err_put_udc; ++ ++ ret = device_add(&gadget->dev); ++ if (ret) ++ goto err_put_udc; + + udc->gadget = gadget; + gadget->udc = udc; +@@ -1181,7 +1181,7 @@ int usb_add_gadget_udc_release(struct device *parent, struct usb_gadget *gadget, + + ret = device_add(&udc->dev); + if (ret) +- goto err4; ++ goto err_unlist_udc; + + usb_gadget_set_state(gadget, USB_STATE_NOTATTACHED); + udc->vbus = true; +@@ -1189,27 +1189,25 @@ int usb_add_gadget_udc_release(struct device *parent, struct usb_gadget *gadget, + /* pick up one of pending gadget drivers */ + ret = check_pending_gadget_drivers(udc); + if (ret) +- goto err5; ++ goto err_del_udc; + + mutex_unlock(&udc_lock); + + return 0; + +-err5: ++ err_del_udc: + device_del(&udc->dev); + +-err4: ++ err_unlist_udc: + list_del(&udc->list); + mutex_unlock(&udc_lock); + +-err3: +- put_device(&udc->dev); + device_del(&gadget->dev); + +-err2: +- kfree(udc); ++ err_put_udc: ++ put_device(&udc->dev); + +-err1: ++ err_put_gadget: + put_device(&gadget->dev); + return ret; + } +diff --git a/drivers/usb/misc/usb3503.c b/drivers/usb/misc/usb3503.c +index 8e7737d7ac0a..03be5d574f23 100644 +--- a/drivers/usb/misc/usb3503.c ++++ b/drivers/usb/misc/usb3503.c +@@ -292,6 +292,8 @@ static int usb3503_probe(struct usb3503 *hub) + if (gpio_is_valid(hub->gpio_reset)) { + err = devm_gpio_request_one(dev, hub->gpio_reset, + GPIOF_OUT_INIT_LOW, "usb3503 reset"); ++ /* Datasheet defines a hardware reset to be at least 100us */ ++ usleep_range(100, 10000); + if (err) { + dev_err(dev, + "unable to request GPIO %d as reset pin (%d)\n", +diff --git a/drivers/usb/mon/mon_bin.c b/drivers/usb/mon/mon_bin.c +index f6ae753ab99b..f932f40302df 100644 +--- a/drivers/usb/mon/mon_bin.c ++++ b/drivers/usb/mon/mon_bin.c +@@ -1004,7 +1004,9 @@ static long mon_bin_ioctl(struct file *file, unsigned int cmd, unsigned long arg + break; + + case 
MON_IOCQ_RING_SIZE: ++ mutex_lock(&rp->fetch_lock); + ret = rp->b_size; ++ mutex_unlock(&rp->fetch_lock); + break; + + case MON_IOCT_RING_SIZE: +@@ -1231,12 +1233,16 @@ static int mon_bin_vma_fault(struct vm_fault *vmf) + unsigned long offset, chunk_idx; + struct page *pageptr; + ++ mutex_lock(&rp->fetch_lock); + offset = vmf->pgoff << PAGE_SHIFT; +- if (offset >= rp->b_size) ++ if (offset >= rp->b_size) { ++ mutex_unlock(&rp->fetch_lock); + return VM_FAULT_SIGBUS; ++ } + chunk_idx = offset / CHUNK_SIZE; + pageptr = rp->b_vec[chunk_idx].pg; + get_page(pageptr); ++ mutex_unlock(&rp->fetch_lock); + vmf->page = pageptr; + return 0; + } +diff --git a/drivers/usb/serial/cp210x.c b/drivers/usb/serial/cp210x.c +index 412f812522ee..aed182d24d23 100644 +--- a/drivers/usb/serial/cp210x.c ++++ b/drivers/usb/serial/cp210x.c +@@ -127,6 +127,7 @@ static const struct usb_device_id id_table[] = { + { USB_DEVICE(0x10C4, 0x8470) }, /* Juniper Networks BX Series System Console */ + { USB_DEVICE(0x10C4, 0x8477) }, /* Balluff RFID */ + { USB_DEVICE(0x10C4, 0x84B6) }, /* Starizona Hyperion */ ++ { USB_DEVICE(0x10C4, 0x85A7) }, /* LifeScan OneTouch Verio IQ */ + { USB_DEVICE(0x10C4, 0x85EA) }, /* AC-Services IBUS-IF */ + { USB_DEVICE(0x10C4, 0x85EB) }, /* AC-Services CIS-IBUS */ + { USB_DEVICE(0x10C4, 0x85F8) }, /* Virtenio Preon32 */ +@@ -177,6 +178,7 @@ static const struct usb_device_id id_table[] = { + { USB_DEVICE(0x1843, 0x0200) }, /* Vaisala USB Instrument Cable */ + { USB_DEVICE(0x18EF, 0xE00F) }, /* ELV USB-I2C-Interface */ + { USB_DEVICE(0x18EF, 0xE025) }, /* ELV Marble Sound Board 1 */ ++ { USB_DEVICE(0x18EF, 0xE030) }, /* ELV ALC 8xxx Battery Charger */ + { USB_DEVICE(0x18EF, 0xE032) }, /* ELV TFD500 Data Logger */ + { USB_DEVICE(0x1901, 0x0190) }, /* GE B850 CP2105 Recorder interface */ + { USB_DEVICE(0x1901, 0x0193) }, /* GE B650 CP2104 PMC interface */ +diff --git a/drivers/usb/storage/unusual_uas.h b/drivers/usb/storage/unusual_uas.h +index 9f356f7cf7d5..719ec68ae309 
100644 +--- a/drivers/usb/storage/unusual_uas.h ++++ b/drivers/usb/storage/unusual_uas.h +@@ -156,6 +156,13 @@ UNUSUAL_DEV(0x2109, 0x0711, 0x0000, 0x9999, + USB_SC_DEVICE, USB_PR_DEVICE, NULL, + US_FL_NO_ATA_1X), + ++/* Reported-by: Icenowy Zheng */ ++UNUSUAL_DEV(0x2537, 0x1068, 0x0000, 0x9999, ++ "Norelsys", ++ "NS1068X", ++ USB_SC_DEVICE, USB_PR_DEVICE, NULL, ++ US_FL_IGNORE_UAS), ++ + /* Reported-by: Takeo Nakayama */ + UNUSUAL_DEV(0x357d, 0x7788, 0x0000, 0x9999, + "JMicron", +diff --git a/drivers/usb/usbip/usbip_common.c b/drivers/usb/usbip/usbip_common.c +index 17b599b923f3..7f0d22131121 100644 +--- a/drivers/usb/usbip/usbip_common.c ++++ b/drivers/usb/usbip/usbip_common.c +@@ -105,7 +105,7 @@ static void usbip_dump_usb_device(struct usb_device *udev) + dev_dbg(dev, " devnum(%d) devpath(%s) usb speed(%s)", + udev->devnum, udev->devpath, usb_speed_string(udev->speed)); + +- pr_debug("tt %p, ttport %d\n", udev->tt, udev->ttport); ++ pr_debug("tt hub ttport %d\n", udev->ttport); + + dev_dbg(dev, " "); + for (i = 0; i < 16; i++) +@@ -138,12 +138,8 @@ static void usbip_dump_usb_device(struct usb_device *udev) + } + pr_debug("\n"); + +- dev_dbg(dev, "parent %p, bus %p\n", udev->parent, udev->bus); +- +- dev_dbg(dev, +- "descriptor %p, config %p, actconfig %p, rawdescriptors %p\n", +- &udev->descriptor, udev->config, +- udev->actconfig, udev->rawdescriptors); ++ dev_dbg(dev, "parent %s, bus %s\n", dev_name(&udev->parent->dev), ++ udev->bus->bus_name); + + dev_dbg(dev, "have_langid %d, string_langid %d\n", + udev->have_langid, udev->string_langid); +@@ -251,9 +247,6 @@ void usbip_dump_urb(struct urb *urb) + + dev = &urb->dev->dev; + +- dev_dbg(dev, " urb :%p\n", urb); +- dev_dbg(dev, " dev :%p\n", urb->dev); +- + usbip_dump_usb_device(urb->dev); + + dev_dbg(dev, " pipe :%08x ", urb->pipe); +@@ -262,11 +255,9 @@ void usbip_dump_urb(struct urb *urb) + + dev_dbg(dev, " status :%d\n", urb->status); + dev_dbg(dev, " transfer_flags :%08X\n", urb->transfer_flags); +- 
dev_dbg(dev, " transfer_buffer :%p\n", urb->transfer_buffer); + dev_dbg(dev, " transfer_buffer_length:%d\n", + urb->transfer_buffer_length); + dev_dbg(dev, " actual_length :%d\n", urb->actual_length); +- dev_dbg(dev, " setup_packet :%p\n", urb->setup_packet); + + if (urb->setup_packet && usb_pipetype(urb->pipe) == PIPE_CONTROL) + usbip_dump_usb_ctrlrequest( +@@ -276,8 +267,6 @@ void usbip_dump_urb(struct urb *urb) + dev_dbg(dev, " number_of_packets :%d\n", urb->number_of_packets); + dev_dbg(dev, " interval :%d\n", urb->interval); + dev_dbg(dev, " error_count :%d\n", urb->error_count); +- dev_dbg(dev, " context :%p\n", urb->context); +- dev_dbg(dev, " complete :%p\n", urb->complete); + } + EXPORT_SYMBOL_GPL(usbip_dump_urb); + +diff --git a/drivers/usb/usbip/vudc_rx.c b/drivers/usb/usbip/vudc_rx.c +index e429b59f6f8a..d020e72b3122 100644 +--- a/drivers/usb/usbip/vudc_rx.c ++++ b/drivers/usb/usbip/vudc_rx.c +@@ -132,6 +132,25 @@ static int v_recv_cmd_submit(struct vudc *udc, + urb_p->new = 1; + urb_p->seqnum = pdu->base.seqnum; + ++ if (urb_p->ep->type == USB_ENDPOINT_XFER_ISOC) { ++ /* validate packet size and number of packets */ ++ unsigned int maxp, packets, bytes; ++ ++ maxp = usb_endpoint_maxp(urb_p->ep->desc); ++ maxp *= usb_endpoint_maxp_mult(urb_p->ep->desc); ++ bytes = pdu->u.cmd_submit.transfer_buffer_length; ++ packets = DIV_ROUND_UP(bytes, maxp); ++ ++ if (pdu->u.cmd_submit.number_of_packets < 0 || ++ pdu->u.cmd_submit.number_of_packets > packets) { ++ dev_err(&udc->gadget.dev, ++ "CMD_SUBMIT: isoc invalid num packets %d\n", ++ pdu->u.cmd_submit.number_of_packets); ++ ret = -EMSGSIZE; ++ goto free_urbp; ++ } ++ } ++ + ret = alloc_urb_from_cmd(&urb_p->urb, pdu, urb_p->ep->type); + if (ret) { + usbip_event_add(&udc->ud, VUDC_EVENT_ERROR_MALLOC); +diff --git a/drivers/usb/usbip/vudc_tx.c b/drivers/usb/usbip/vudc_tx.c +index 234661782fa0..3ab4c86486a7 100644 +--- a/drivers/usb/usbip/vudc_tx.c ++++ b/drivers/usb/usbip/vudc_tx.c +@@ -97,6 +97,13 @@ static int 
v_send_ret_submit(struct vudc *udc, struct urbp *urb_p) + memset(&pdu_header, 0, sizeof(pdu_header)); + memset(&msg, 0, sizeof(msg)); + ++ if (urb->actual_length > 0 && !urb->transfer_buffer) { ++ dev_err(&udc->gadget.dev, ++ "urb: actual_length %d transfer_buffer null\n", ++ urb->actual_length); ++ return -1; ++ } ++ + if (urb_p->type == USB_ENDPOINT_XFER_ISOC) + iovnum = 2 + urb->number_of_packets; + else +@@ -112,8 +119,8 @@ static int v_send_ret_submit(struct vudc *udc, struct urbp *urb_p) + + /* 1. setup usbip_header */ + setup_ret_submit_pdu(&pdu_header, urb_p); +- usbip_dbg_stub_tx("setup txdata seqnum: %d urb: %p\n", +- pdu_header.base.seqnum, urb); ++ usbip_dbg_stub_tx("setup txdata seqnum: %d\n", ++ pdu_header.base.seqnum); + usbip_header_correct_endian(&pdu_header, 1); + + iov[iovnum].iov_base = &pdu_header; +diff --git a/include/linux/bpf.h b/include/linux/bpf.h +index f1af7d63d678..0bcf803f20de 100644 +--- a/include/linux/bpf.h ++++ b/include/linux/bpf.h +@@ -51,6 +51,7 @@ struct bpf_map { + u32 pages; + u32 id; + int numa_node; ++ bool unpriv_array; + struct user_struct *user; + const struct bpf_map_ops *ops; + struct work_struct work; +@@ -195,6 +196,7 @@ struct bpf_prog_aux { + struct bpf_array { + struct bpf_map map; + u32 elem_size; ++ u32 index_mask; + /* 'ownership' of prog_array is claimed by the first program that + * is going to use this map or by the first program which FD is stored + * in the map to make sure that all callers and callees have the same +diff --git a/include/linux/cpu.h b/include/linux/cpu.h +index 938ea8ae0ba4..c816e6f2730c 100644 +--- a/include/linux/cpu.h ++++ b/include/linux/cpu.h +@@ -47,6 +47,13 @@ extern void cpu_remove_dev_attr(struct device_attribute *attr); + extern int cpu_add_dev_attr_group(struct attribute_group *attrs); + extern void cpu_remove_dev_attr_group(struct attribute_group *attrs); + ++extern ssize_t cpu_show_meltdown(struct device *dev, ++ struct device_attribute *attr, char *buf); ++extern ssize_t 
cpu_show_spectre_v1(struct device *dev, ++ struct device_attribute *attr, char *buf); ++extern ssize_t cpu_show_spectre_v2(struct device *dev, ++ struct device_attribute *attr, char *buf); ++ + extern __printf(4, 5) + struct device *cpu_device_create(struct device *parent, void *drvdata, + const struct attribute_group **groups, +diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h +index 06097ef30449..b511f6d24b42 100644 +--- a/include/linux/crash_core.h ++++ b/include/linux/crash_core.h +@@ -42,6 +42,8 @@ phys_addr_t paddr_vmcoreinfo_note(void); + vmcoreinfo_append_str("PAGESIZE=%ld\n", value) + #define VMCOREINFO_SYMBOL(name) \ + vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name) ++#define VMCOREINFO_SYMBOL_ARRAY(name) \ ++ vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name) + #define VMCOREINFO_SIZE(name) \ + vmcoreinfo_append_str("SIZE(%s)=%lu\n", #name, \ + (unsigned long)sizeof(name)) +diff --git a/include/linux/sh_eth.h b/include/linux/sh_eth.h +index ff3642d267f7..94081e9a5010 100644 +--- a/include/linux/sh_eth.h ++++ b/include/linux/sh_eth.h +@@ -17,7 +17,6 @@ struct sh_eth_plat_data { + unsigned char mac_addr[ETH_ALEN]; + unsigned no_ether_link:1; + unsigned ether_link_active_low:1; +- unsigned needs_init:1; + }; + + #endif +diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h +index 0477945de1a3..8e1e1dc490fd 100644 +--- a/include/net/sctp/structs.h ++++ b/include/net/sctp/structs.h +@@ -955,7 +955,7 @@ void sctp_transport_burst_limited(struct sctp_transport *); + void sctp_transport_burst_reset(struct sctp_transport *); + unsigned long sctp_transport_timeout(struct sctp_transport *); + void sctp_transport_reset(struct sctp_transport *t); +-void sctp_transport_update_pmtu(struct sctp_transport *t, u32 pmtu); ++bool sctp_transport_update_pmtu(struct sctp_transport *t, u32 pmtu); + void sctp_transport_immediate_rtx(struct sctp_transport *); + void sctp_transport_dst_release(struct 
sctp_transport *t); + void sctp_transport_dst_confirm(struct sctp_transport *t); +diff --git a/include/trace/events/kvm.h b/include/trace/events/kvm.h +index e4b0b8e09932..2c735a3e6613 100644 +--- a/include/trace/events/kvm.h ++++ b/include/trace/events/kvm.h +@@ -211,7 +211,7 @@ TRACE_EVENT(kvm_ack_irq, + { KVM_TRACE_MMIO_WRITE, "write" } + + TRACE_EVENT(kvm_mmio, +- TP_PROTO(int type, int len, u64 gpa, u64 val), ++ TP_PROTO(int type, int len, u64 gpa, void *val), + TP_ARGS(type, len, gpa, val), + + TP_STRUCT__entry( +@@ -225,7 +225,10 @@ TRACE_EVENT(kvm_mmio, + __entry->type = type; + __entry->len = len; + __entry->gpa = gpa; +- __entry->val = val; ++ __entry->val = 0; ++ if (val) ++ memcpy(&__entry->val, val, ++ min_t(u32, sizeof(__entry->val), len)); + ), + + TP_printk("mmio %s len %u gpa 0x%llx val 0x%llx", +diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c +index e2636737b69b..a4ae1ca44a57 100644 +--- a/kernel/bpf/arraymap.c ++++ b/kernel/bpf/arraymap.c +@@ -50,9 +50,10 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr) + { + bool percpu = attr->map_type == BPF_MAP_TYPE_PERCPU_ARRAY; + int numa_node = bpf_map_attr_numa_node(attr); ++ u32 elem_size, index_mask, max_entries; ++ bool unpriv = !capable(CAP_SYS_ADMIN); + struct bpf_array *array; +- u64 array_size; +- u32 elem_size; ++ u64 array_size, mask64; + + /* check sanity of attributes */ + if (attr->max_entries == 0 || attr->key_size != 4 || +@@ -68,11 +69,32 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr) + + elem_size = round_up(attr->value_size, 8); + ++ max_entries = attr->max_entries; ++ ++ /* On 32 bit archs roundup_pow_of_two() with max_entries that has ++ * upper most bit set in u32 space is undefined behavior due to ++ * resulting 1U << 32, so do it manually here in u64 space. 
++ */ ++ mask64 = fls_long(max_entries - 1); ++ mask64 = 1ULL << mask64; ++ mask64 -= 1; ++ ++ index_mask = mask64; ++ if (unpriv) { ++ /* round up array size to nearest power of 2, ++ * since cpu will speculate within index_mask limits ++ */ ++ max_entries = index_mask + 1; ++ /* Check for overflows. */ ++ if (max_entries < attr->max_entries) ++ return ERR_PTR(-E2BIG); ++ } ++ + array_size = sizeof(*array); + if (percpu) +- array_size += (u64) attr->max_entries * sizeof(void *); ++ array_size += (u64) max_entries * sizeof(void *); + else +- array_size += (u64) attr->max_entries * elem_size; ++ array_size += (u64) max_entries * elem_size; + + /* make sure there is no u32 overflow later in round_up() */ + if (array_size >= U32_MAX - PAGE_SIZE) +@@ -82,6 +104,8 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr) + array = bpf_map_area_alloc(array_size, numa_node); + if (!array) + return ERR_PTR(-ENOMEM); ++ array->index_mask = index_mask; ++ array->map.unpriv_array = unpriv; + + /* copy mandatory map attributes */ + array->map.map_type = attr->map_type; +@@ -117,12 +141,13 @@ static void *array_map_lookup_elem(struct bpf_map *map, void *key) + if (unlikely(index >= array->map.max_entries)) + return NULL; + +- return array->value + array->elem_size * index; ++ return array->value + array->elem_size * (index & array->index_mask); + } + + /* emit BPF instructions equivalent to C code of array_map_lookup_elem() */ + static u32 array_map_gen_lookup(struct bpf_map *map, struct bpf_insn *insn_buf) + { ++ struct bpf_array *array = container_of(map, struct bpf_array, map); + struct bpf_insn *insn = insn_buf; + u32 elem_size = round_up(map->value_size, 8); + const int ret = BPF_REG_0; +@@ -131,7 +156,12 @@ static u32 array_map_gen_lookup(struct bpf_map *map, struct bpf_insn *insn_buf) + + *insn++ = BPF_ALU64_IMM(BPF_ADD, map_ptr, offsetof(struct bpf_array, value)); + *insn++ = BPF_LDX_MEM(BPF_W, ret, index, 0); +- *insn++ = BPF_JMP_IMM(BPF_JGE, ret, 
map->max_entries, 3); ++ if (map->unpriv_array) { ++ *insn++ = BPF_JMP_IMM(BPF_JGE, ret, map->max_entries, 4); ++ *insn++ = BPF_ALU32_IMM(BPF_AND, ret, array->index_mask); ++ } else { ++ *insn++ = BPF_JMP_IMM(BPF_JGE, ret, map->max_entries, 3); ++ } + + if (is_power_of_2(elem_size)) { + *insn++ = BPF_ALU64_IMM(BPF_LSH, ret, ilog2(elem_size)); +@@ -153,7 +183,7 @@ static void *percpu_array_map_lookup_elem(struct bpf_map *map, void *key) + if (unlikely(index >= array->map.max_entries)) + return NULL; + +- return this_cpu_ptr(array->pptrs[index]); ++ return this_cpu_ptr(array->pptrs[index & array->index_mask]); + } + + int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value) +@@ -173,7 +203,7 @@ int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value) + */ + size = round_up(map->value_size, 8); + rcu_read_lock(); +- pptr = array->pptrs[index]; ++ pptr = array->pptrs[index & array->index_mask]; + for_each_possible_cpu(cpu) { + bpf_long_memcpy(value + off, per_cpu_ptr(pptr, cpu), size); + off += size; +@@ -221,10 +251,11 @@ static int array_map_update_elem(struct bpf_map *map, void *key, void *value, + return -EEXIST; + + if (array->map.map_type == BPF_MAP_TYPE_PERCPU_ARRAY) +- memcpy(this_cpu_ptr(array->pptrs[index]), ++ memcpy(this_cpu_ptr(array->pptrs[index & array->index_mask]), + value, map->value_size); + else +- memcpy(array->value + array->elem_size * index, ++ memcpy(array->value + ++ array->elem_size * (index & array->index_mask), + value, map->value_size); + return 0; + } +@@ -258,7 +289,7 @@ int bpf_percpu_array_update(struct bpf_map *map, void *key, void *value, + */ + size = round_up(map->value_size, 8); + rcu_read_lock(); +- pptr = array->pptrs[index]; ++ pptr = array->pptrs[index & array->index_mask]; + for_each_possible_cpu(cpu) { + bpf_long_memcpy(per_cpu_ptr(pptr, cpu), value + off, size); + off += size; +@@ -609,6 +640,7 @@ static void *array_of_map_lookup_elem(struct bpf_map *map, void *key) + static u32 
array_of_map_gen_lookup(struct bpf_map *map, + struct bpf_insn *insn_buf) + { ++ struct bpf_array *array = container_of(map, struct bpf_array, map); + u32 elem_size = round_up(map->value_size, 8); + struct bpf_insn *insn = insn_buf; + const int ret = BPF_REG_0; +@@ -617,7 +649,12 @@ static u32 array_of_map_gen_lookup(struct bpf_map *map, + + *insn++ = BPF_ALU64_IMM(BPF_ADD, map_ptr, offsetof(struct bpf_array, value)); + *insn++ = BPF_LDX_MEM(BPF_W, ret, index, 0); +- *insn++ = BPF_JMP_IMM(BPF_JGE, ret, map->max_entries, 5); ++ if (map->unpriv_array) { ++ *insn++ = BPF_JMP_IMM(BPF_JGE, ret, map->max_entries, 6); ++ *insn++ = BPF_ALU32_IMM(BPF_AND, ret, array->index_mask); ++ } else { ++ *insn++ = BPF_JMP_IMM(BPF_JGE, ret, map->max_entries, 5); ++ } + if (is_power_of_2(elem_size)) + *insn++ = BPF_ALU64_IMM(BPF_LSH, ret, ilog2(elem_size)); + else +diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c +index c5ff809e86d0..75a5c3312f46 100644 +--- a/kernel/bpf/verifier.c ++++ b/kernel/bpf/verifier.c +@@ -1701,6 +1701,13 @@ static int check_call(struct bpf_verifier_env *env, int func_id, int insn_idx) + err = check_func_arg(env, BPF_REG_2, fn->arg2_type, &meta); + if (err) + return err; ++ if (func_id == BPF_FUNC_tail_call) { ++ if (meta.map_ptr == NULL) { ++ verbose("verifier bug\n"); ++ return -EINVAL; ++ } ++ env->insn_aux_data[insn_idx].map_ptr = meta.map_ptr; ++ } + err = check_func_arg(env, BPF_REG_3, fn->arg3_type, &meta); + if (err) + return err; +@@ -2486,6 +2493,11 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) + return -EINVAL; + } + ++ if (opcode == BPF_ARSH && BPF_CLASS(insn->code) != BPF_ALU64) { ++ verbose("BPF_ARSH not supported for 32 bit ALU\n"); ++ return -EINVAL; ++ } ++ + if ((opcode == BPF_LSH || opcode == BPF_RSH || + opcode == BPF_ARSH) && BPF_SRC(insn->code) == BPF_K) { + int size = BPF_CLASS(insn->code) == BPF_ALU64 ? 
64 : 32; +@@ -4315,6 +4327,35 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env) + */ + insn->imm = 0; + insn->code = BPF_JMP | BPF_TAIL_CALL; ++ ++ /* instead of changing every JIT dealing with tail_call ++ * emit two extra insns: ++ * if (index >= max_entries) goto out; ++ * index &= array->index_mask; ++ * to avoid out-of-bounds cpu speculation ++ */ ++ map_ptr = env->insn_aux_data[i + delta].map_ptr; ++ if (map_ptr == BPF_MAP_PTR_POISON) { ++ verbose("tail_call obusing map_ptr\n"); ++ return -EINVAL; ++ } ++ if (!map_ptr->unpriv_array) ++ continue; ++ insn_buf[0] = BPF_JMP_IMM(BPF_JGE, BPF_REG_3, ++ map_ptr->max_entries, 2); ++ insn_buf[1] = BPF_ALU32_IMM(BPF_AND, BPF_REG_3, ++ container_of(map_ptr, ++ struct bpf_array, ++ map)->index_mask); ++ insn_buf[2] = *insn; ++ cnt = 3; ++ new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt); ++ if (!new_prog) ++ return -ENOMEM; ++ ++ delta += cnt - 1; ++ env->prog = prog = new_prog; ++ insn = new_prog->insnsi + i + delta; + continue; + } + +diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c +index 44857278eb8a..030e4286f14c 100644 +--- a/kernel/cgroup/cgroup.c ++++ b/kernel/cgroup/cgroup.c +@@ -4059,26 +4059,24 @@ static void css_task_iter_advance_css_set(struct css_task_iter *it) + + static void css_task_iter_advance(struct css_task_iter *it) + { +- struct list_head *l = it->task_pos; ++ struct list_head *next; + + lockdep_assert_held(&css_set_lock); +- WARN_ON_ONCE(!l); +- + repeat: + /* + * Advance iterator to find next entry. cset->tasks is consumed + * first and then ->mg_tasks. After ->mg_tasks, we move onto the + * next cset. 
+ */ +- l = l->next; ++ next = it->task_pos->next; + +- if (l == it->tasks_head) +- l = it->mg_tasks_head->next; ++ if (next == it->tasks_head) ++ next = it->mg_tasks_head->next; + +- if (l == it->mg_tasks_head) ++ if (next == it->mg_tasks_head) + css_task_iter_advance_css_set(it); + else +- it->task_pos = l; ++ it->task_pos = next; + + /* if PROCS, skip over tasks which aren't group leaders */ + if ((it->flags & CSS_TASK_ITER_PROCS) && it->task_pos && +diff --git a/kernel/crash_core.c b/kernel/crash_core.c +index 6db80fc0810b..2d90996dbe77 100644 +--- a/kernel/crash_core.c ++++ b/kernel/crash_core.c +@@ -409,7 +409,7 @@ static int __init crash_save_vmcoreinfo_init(void) + VMCOREINFO_SYMBOL(contig_page_data); + #endif + #ifdef CONFIG_SPARSEMEM +- VMCOREINFO_SYMBOL(mem_section); ++ VMCOREINFO_SYMBOL_ARRAY(mem_section); + VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS); + VMCOREINFO_STRUCT_SIZE(mem_section); + VMCOREINFO_OFFSET(mem_section, section_mem_map); +diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c +index dd7908743dab..9bcbacba82a8 100644 +--- a/kernel/sched/membarrier.c ++++ b/kernel/sched/membarrier.c +@@ -89,7 +89,9 @@ static int membarrier_private_expedited(void) + rcu_read_unlock(); + } + if (!fallback) { ++ preempt_disable(); + smp_call_function_many(tmpmask, ipi_mb, NULL, 1); ++ preempt_enable(); + free_cpumask_var(tmpmask); + } + cpus_read_unlock(); +diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c +index 4a72ee4e2ae9..cf2e70003a53 100644 +--- a/net/8021q/vlan.c ++++ b/net/8021q/vlan.c +@@ -111,12 +111,7 @@ void unregister_vlan_dev(struct net_device *dev, struct list_head *head) + vlan_gvrp_uninit_applicant(real_dev); + } + +- /* Take it out of our own structures, but be sure to interlock with +- * HW accelerating devices or SW vlan input packet processing if +- * VLAN is not 0 (leave it there for 802.1p). 
+- */ +- if (vlan_id) +- vlan_vid_del(real_dev, vlan->vlan_proto, vlan_id); ++ vlan_vid_del(real_dev, vlan->vlan_proto, vlan_id); + + /* Get rid of the vlan's reference to real_dev */ + dev_put(real_dev); +diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c +index 43ba91c440bc..fc6615d59165 100644 +--- a/net/bluetooth/l2cap_core.c ++++ b/net/bluetooth/l2cap_core.c +@@ -3363,9 +3363,10 @@ static int l2cap_parse_conf_req(struct l2cap_chan *chan, void *data, size_t data + break; + + case L2CAP_CONF_EFS: +- remote_efs = 1; +- if (olen == sizeof(efs)) ++ if (olen == sizeof(efs)) { ++ remote_efs = 1; + memcpy(&efs, (void *) val, olen); ++ } + break; + + case L2CAP_CONF_EWS: +@@ -3584,16 +3585,17 @@ static int l2cap_parse_conf_rsp(struct l2cap_chan *chan, void *rsp, int len, + break; + + case L2CAP_CONF_EFS: +- if (olen == sizeof(efs)) ++ if (olen == sizeof(efs)) { + memcpy(&efs, (void *)val, olen); + +- if (chan->local_stype != L2CAP_SERV_NOTRAFIC && +- efs.stype != L2CAP_SERV_NOTRAFIC && +- efs.stype != chan->local_stype) +- return -ECONNREFUSED; ++ if (chan->local_stype != L2CAP_SERV_NOTRAFIC && ++ efs.stype != L2CAP_SERV_NOTRAFIC && ++ efs.stype != chan->local_stype) ++ return -ECONNREFUSED; + +- l2cap_add_conf_opt(&ptr, L2CAP_CONF_EFS, sizeof(efs), +- (unsigned long) &efs, endptr - ptr); ++ l2cap_add_conf_opt(&ptr, L2CAP_CONF_EFS, sizeof(efs), ++ (unsigned long) &efs, endptr - ptr); ++ } + break; + + case L2CAP_CONF_FCS: +diff --git a/net/core/ethtool.c b/net/core/ethtool.c +index 9a9a3d77e327..d374a904f1b1 100644 +--- a/net/core/ethtool.c ++++ b/net/core/ethtool.c +@@ -754,15 +754,6 @@ static int ethtool_set_link_ksettings(struct net_device *dev, + return dev->ethtool_ops->set_link_ksettings(dev, &link_ksettings); + } + +-static void +-warn_incomplete_ethtool_legacy_settings_conversion(const char *details) +-{ +- char name[sizeof(current->comm)]; +- +- pr_info_once("warning: `%s' uses legacy ethtool link settings API, %s\n", +- get_task_comm(name, 
current), details); +-} +- + /* Query device for its ethtool_cmd settings. + * + * Backward compatibility note: for compatibility with legacy ethtool, +@@ -789,10 +780,8 @@ static int ethtool_get_settings(struct net_device *dev, void __user *useraddr) + &link_ksettings); + if (err < 0) + return err; +- if (!convert_link_ksettings_to_legacy_settings(&cmd, +- &link_ksettings)) +- warn_incomplete_ethtool_legacy_settings_conversion( +- "link modes are only partially reported"); ++ convert_link_ksettings_to_legacy_settings(&cmd, ++ &link_ksettings); + + /* send a sensible cmd tag back to user */ + cmd.cmd = ETHTOOL_GSET; +diff --git a/net/core/sock_diag.c b/net/core/sock_diag.c +index 217f4e3b82f6..146b50e30659 100644 +--- a/net/core/sock_diag.c ++++ b/net/core/sock_diag.c +@@ -288,7 +288,7 @@ static int sock_diag_bind(struct net *net, int group) + case SKNLGRP_INET6_UDP_DESTROY: + if (!sock_diag_handlers[AF_INET6]) + request_module("net-pf-%d-proto-%d-type-%d", PF_NETLINK, +- NETLINK_SOCK_DIAG, AF_INET); ++ NETLINK_SOCK_DIAG, AF_INET6); + break; + } + return 0; +diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c +index 95516138e861..d6189c2a35e4 100644 +--- a/net/ipv6/exthdrs.c ++++ b/net/ipv6/exthdrs.c +@@ -884,6 +884,15 @@ static void ipv6_push_rthdr4(struct sk_buff *skb, u8 *proto, + sr_phdr->segments[0] = **addr_p; + *addr_p = &sr_ihdr->segments[sr_ihdr->segments_left]; + ++ if (sr_ihdr->hdrlen > hops * 2) { ++ int tlvs_offset, tlvs_length; ++ ++ tlvs_offset = (1 + hops * 2) << 3; ++ tlvs_length = (sr_ihdr->hdrlen - hops * 2) << 3; ++ memcpy((char *)sr_phdr + tlvs_offset, ++ (char *)sr_ihdr + tlvs_offset, tlvs_length); ++ } ++ + #ifdef CONFIG_IPV6_SEG6_HMAC + if (sr_has_hmac(sr_phdr)) { + struct net *net = NULL; +diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c +index f7dd51c42314..688ba5f7516b 100644 +--- a/net/ipv6/ip6_output.c ++++ b/net/ipv6/ip6_output.c +@@ -1735,9 +1735,10 @@ struct sk_buff *ip6_make_skb(struct sock *sk, + cork.base.opt = NULL; + 
v6_cork.opt = NULL; + err = ip6_setup_cork(sk, &cork, &v6_cork, ipc6, rt, fl6); +- if (err) ++ if (err) { ++ ip6_cork_release(&cork, &v6_cork); + return ERR_PTR(err); +- ++ } + if (ipc6->dontfrag < 0) + ipc6->dontfrag = inet6_sk(sk)->dontfrag; + +diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c +index ef958d50746b..3f46121ad139 100644 +--- a/net/ipv6/ip6_tunnel.c ++++ b/net/ipv6/ip6_tunnel.c +@@ -1081,10 +1081,11 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield, + memcpy(&fl6->daddr, addr6, sizeof(fl6->daddr)); + neigh_release(neigh); + } +- } else if (!(t->parms.flags & +- (IP6_TNL_F_USE_ORIG_TCLASS | IP6_TNL_F_USE_ORIG_FWMARK))) { +- /* enable the cache only only if the routing decision does +- * not depend on the current inner header value ++ } else if (t->parms.proto != 0 && !(t->parms.flags & ++ (IP6_TNL_F_USE_ORIG_TCLASS | ++ IP6_TNL_F_USE_ORIG_FWMARK))) { ++ /* enable the cache only if neither the outer protocol nor the ++ * routing decision depends on the current inner header value + */ + use_cache = true; + } +diff --git a/net/rds/rdma.c b/net/rds/rdma.c +index bc2f1e0977d6..634cfcb7bba6 100644 +--- a/net/rds/rdma.c ++++ b/net/rds/rdma.c +@@ -525,6 +525,9 @@ int rds_rdma_extra_size(struct rds_rdma_args *args) + + local_vec = (struct rds_iovec __user *)(unsigned long) args->local_vec_addr; + ++ if (args->nr_local == 0) ++ return -EINVAL; ++ + /* figure out the number of pages in the vector */ + for (i = 0; i < args->nr_local; i++) { + if (copy_from_user(&vec, &local_vec[i], +@@ -874,6 +877,7 @@ int rds_cmsg_atomic(struct rds_sock *rs, struct rds_message *rm, + err: + if (page) + put_page(page); ++ rm->atomic.op_active = 0; + kfree(rm->atomic.op_notifier); + + return ret; +diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c +index e29a48ef7fc3..a0ac42b3ed06 100644 +--- a/net/sched/act_gact.c ++++ b/net/sched/act_gact.c +@@ -159,7 +159,7 @@ static void tcf_gact_stats_update(struct tc_action *a, u64 bytes, u32 
packets, + if (action == TC_ACT_SHOT) + this_cpu_ptr(gact->common.cpu_qstats)->drops += packets; + +- tm->lastuse = lastuse; ++ tm->lastuse = max_t(u64, tm->lastuse, lastuse); + } + + static int tcf_gact_dump(struct sk_buff *skb, struct tc_action *a, +diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c +index 416627c66f08..6ce8de373f83 100644 +--- a/net/sched/act_mirred.c ++++ b/net/sched/act_mirred.c +@@ -238,7 +238,7 @@ static void tcf_stats_update(struct tc_action *a, u64 bytes, u32 packets, + struct tcf_t *tm = &m->tcf_tm; + + _bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets); +- tm->lastuse = lastuse; ++ tm->lastuse = max_t(u64, tm->lastuse, lastuse); + } + + static int tcf_mirred_dump(struct sk_buff *skb, struct tc_action *a, int bind, +diff --git a/net/sctp/input.c b/net/sctp/input.c +index 621b5ca3fd1c..141c9c466ec1 100644 +--- a/net/sctp/input.c ++++ b/net/sctp/input.c +@@ -399,20 +399,24 @@ void sctp_icmp_frag_needed(struct sock *sk, struct sctp_association *asoc, + return; + } + +- if (t->param_flags & SPP_PMTUD_ENABLE) { +- /* Update transports view of the MTU */ +- sctp_transport_update_pmtu(t, pmtu); +- +- /* Update association pmtu. */ +- sctp_assoc_sync_pmtu(asoc); +- } ++ if (!(t->param_flags & SPP_PMTUD_ENABLE)) ++ /* We can't allow retransmitting in such case, as the ++ * retransmission would be sized just as before, and thus we ++ * would get another icmp, and retransmit again. ++ */ ++ return; + +- /* Retransmit with the new pmtu setting. +- * Normally, if PMTU discovery is disabled, an ICMP Fragmentation +- * Needed will never be sent, but if a message was sent before +- * PMTU discovery was disabled that was larger than the PMTU, it +- * would not be fragmented, so it must be re-transmitted fragmented. ++ /* Update transports view of the MTU. Return if no update was needed. ++ * If an update wasn't needed/possible, it also doesn't make sense to ++ * try to retransmit now. 
+ */ ++ if (!sctp_transport_update_pmtu(t, pmtu)) ++ return; ++ ++ /* Update association pmtu. */ ++ sctp_assoc_sync_pmtu(asoc); ++ ++ /* Retransmit with the new pmtu setting. */ + sctp_retransmit(&asoc->outqueue, t, SCTP_RTXR_PMTUD); + } + +diff --git a/net/sctp/transport.c b/net/sctp/transport.c +index 2d9bd3776bc8..7ef77fd7b52a 100644 +--- a/net/sctp/transport.c ++++ b/net/sctp/transport.c +@@ -251,28 +251,37 @@ void sctp_transport_pmtu(struct sctp_transport *transport, struct sock *sk) + transport->pathmtu = SCTP_DEFAULT_MAXSEGMENT; + } + +-void sctp_transport_update_pmtu(struct sctp_transport *t, u32 pmtu) ++bool sctp_transport_update_pmtu(struct sctp_transport *t, u32 pmtu) + { + struct dst_entry *dst = sctp_transport_dst_check(t); ++ bool change = true; + + if (unlikely(pmtu < SCTP_DEFAULT_MINSEGMENT)) { +- pr_warn("%s: Reported pmtu %d too low, using default minimum of %d\n", +- __func__, pmtu, SCTP_DEFAULT_MINSEGMENT); +- /* Use default minimum segment size and disable +- * pmtu discovery on this transport. 
+- */ +- t->pathmtu = SCTP_DEFAULT_MINSEGMENT; +- } else { +- t->pathmtu = pmtu; ++ pr_warn_ratelimited("%s: Reported pmtu %d too low, using default minimum of %d\n", ++ __func__, pmtu, SCTP_DEFAULT_MINSEGMENT); ++ /* Use default minimum segment instead */ ++ pmtu = SCTP_DEFAULT_MINSEGMENT; + } ++ pmtu = SCTP_TRUNC4(pmtu); + + if (dst) { + dst->ops->update_pmtu(dst, t->asoc->base.sk, NULL, pmtu); + dst = sctp_transport_dst_check(t); + } + +- if (!dst) ++ if (!dst) { + t->af_specific->get_dst(t, &t->saddr, &t->fl, t->asoc->base.sk); ++ dst = t->dst; ++ } ++ ++ if (dst) { ++ /* Re-fetch, as under layers may have a higher minimum size */ ++ pmtu = SCTP_TRUNC4(dst_mtu(dst)); ++ change = t->pathmtu != pmtu; ++ } ++ t->pathmtu = pmtu; ++ ++ return change; + } + + /* Caches the dst entry and source address for a transport's destination +diff --git a/security/Kconfig b/security/Kconfig +index 6614b9312b45..b5c2b5d0c6c0 100644 +--- a/security/Kconfig ++++ b/security/Kconfig +@@ -63,7 +63,7 @@ config PAGE_TABLE_ISOLATION + ensuring that the majority of kernel addresses are not mapped + into userspace. + +- See Documentation/x86/pagetable-isolation.txt for more details. ++ See Documentation/x86/pti.txt for more details. + + config SECURITY_INFINIBAND + bool "Infiniband Security Hooks" +diff --git a/security/apparmor/include/perms.h b/security/apparmor/include/perms.h +index 2b27bb79aec4..d7b7e7115160 100644 +--- a/security/apparmor/include/perms.h ++++ b/security/apparmor/include/perms.h +@@ -133,6 +133,9 @@ extern struct aa_perms allperms; + #define xcheck_labels_profiles(L1, L2, FN, args...) 
\ + xcheck_ns_labels((L1), (L2), xcheck_ns_profile_label, (FN), args) + ++#define xcheck_labels(L1, L2, P, FN1, FN2) \ ++ xcheck(fn_for_each((L1), (P), (FN1)), fn_for_each((L2), (P), (FN2))) ++ + + void aa_perm_mask_to_str(char *str, const char *chrs, u32 mask); + void aa_audit_perm_names(struct audit_buffer *ab, const char **names, u32 mask); +diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c +index 7ca0032e7ba9..b40678f3c1d5 100644 +--- a/security/apparmor/ipc.c ++++ b/security/apparmor/ipc.c +@@ -64,40 +64,48 @@ static void audit_ptrace_cb(struct audit_buffer *ab, void *va) + FLAGS_NONE, GFP_ATOMIC); + } + ++/* assumes check for PROFILE_MEDIATES is already done */ + /* TODO: conditionals */ + static int profile_ptrace_perm(struct aa_profile *profile, +- struct aa_profile *peer, u32 request, +- struct common_audit_data *sa) ++ struct aa_label *peer, u32 request, ++ struct common_audit_data *sa) + { + struct aa_perms perms = { }; + +- /* need because of peer in cross check */ +- if (profile_unconfined(profile) || +- !PROFILE_MEDIATES(profile, AA_CLASS_PTRACE)) +- return 0; +- +- aad(sa)->peer = &peer->label; +- aa_profile_match_label(profile, &peer->label, AA_CLASS_PTRACE, request, ++ aad(sa)->peer = peer; ++ aa_profile_match_label(profile, peer, AA_CLASS_PTRACE, request, + &perms); + aa_apply_modes_to_perms(profile, &perms); + return aa_check_perms(profile, &perms, request, sa, audit_ptrace_cb); + } + +-static int cross_ptrace_perm(struct aa_profile *tracer, +- struct aa_profile *tracee, u32 request, +- struct common_audit_data *sa) ++static int profile_tracee_perm(struct aa_profile *tracee, ++ struct aa_label *tracer, u32 request, ++ struct common_audit_data *sa) + { ++ if (profile_unconfined(tracee) || unconfined(tracer) || ++ !PROFILE_MEDIATES(tracee, AA_CLASS_PTRACE)) ++ return 0; ++ ++ return profile_ptrace_perm(tracee, tracer, request, sa); ++} ++ ++static int profile_tracer_perm(struct aa_profile *tracer, ++ struct aa_label *tracee, u32 
request, ++ struct common_audit_data *sa) ++{ ++ if (profile_unconfined(tracer)) ++ return 0; ++ + if (PROFILE_MEDIATES(tracer, AA_CLASS_PTRACE)) +- return xcheck(profile_ptrace_perm(tracer, tracee, request, sa), +- profile_ptrace_perm(tracee, tracer, +- request << PTRACE_PERM_SHIFT, +- sa)); +- /* policy uses the old style capability check for ptrace */ +- if (profile_unconfined(tracer) || tracer == tracee) ++ return profile_ptrace_perm(tracer, tracee, request, sa); ++ ++ /* profile uses the old style capability check for ptrace */ ++ if (&tracer->label == tracee) + return 0; + + aad(sa)->label = &tracer->label; +- aad(sa)->peer = &tracee->label; ++ aad(sa)->peer = tracee; + aad(sa)->request = 0; + aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE, 1); + +@@ -115,10 +123,13 @@ static int cross_ptrace_perm(struct aa_profile *tracer, + int aa_may_ptrace(struct aa_label *tracer, struct aa_label *tracee, + u32 request) + { ++ struct aa_profile *profile; ++ u32 xrequest = request << PTRACE_PERM_SHIFT; + DEFINE_AUDIT_DATA(sa, LSM_AUDIT_DATA_NONE, OP_PTRACE); + +- return xcheck_labels_profiles(tracer, tracee, cross_ptrace_perm, +- request, &sa); ++ return xcheck_labels(tracer, tracee, profile, ++ profile_tracer_perm(profile, tracee, request, &sa), ++ profile_tracee_perm(profile, tracer, xrequest, &sa)); + } + + +diff --git a/sound/core/oss/pcm_oss.c b/sound/core/oss/pcm_oss.c +index e49f448ee04f..c2db7e905f7d 100644 +--- a/sound/core/oss/pcm_oss.c ++++ b/sound/core/oss/pcm_oss.c +@@ -455,7 +455,6 @@ static int snd_pcm_hw_param_near(struct snd_pcm_substream *pcm, + v = snd_pcm_hw_param_last(pcm, params, var, dir); + else + v = snd_pcm_hw_param_first(pcm, params, var, dir); +- snd_BUG_ON(v < 0); + return v; + } + +@@ -1335,8 +1334,11 @@ static ssize_t snd_pcm_oss_write1(struct snd_pcm_substream *substream, const cha + + if ((tmp = snd_pcm_oss_make_ready(substream)) < 0) + return tmp; +- mutex_lock(&runtime->oss.params_lock); + while (bytes > 0) { ++ if 
(mutex_lock_interruptible(&runtime->oss.params_lock)) { ++ tmp = -ERESTARTSYS; ++ break; ++ } + if (bytes < runtime->oss.period_bytes || runtime->oss.buffer_used > 0) { + tmp = bytes; + if (tmp + runtime->oss.buffer_used > runtime->oss.period_bytes) +@@ -1380,14 +1382,18 @@ static ssize_t snd_pcm_oss_write1(struct snd_pcm_substream *substream, const cha + xfer += tmp; + if ((substream->f_flags & O_NONBLOCK) != 0 && + tmp != runtime->oss.period_bytes) +- break; ++ tmp = -EAGAIN; + } +- } +- mutex_unlock(&runtime->oss.params_lock); +- return xfer; +- + err: +- mutex_unlock(&runtime->oss.params_lock); ++ mutex_unlock(&runtime->oss.params_lock); ++ if (tmp < 0) ++ break; ++ if (signal_pending(current)) { ++ tmp = -ERESTARTSYS; ++ break; ++ } ++ tmp = 0; ++ } + return xfer > 0 ? (snd_pcm_sframes_t)xfer : tmp; + } + +@@ -1435,8 +1441,11 @@ static ssize_t snd_pcm_oss_read1(struct snd_pcm_substream *substream, char __use + + if ((tmp = snd_pcm_oss_make_ready(substream)) < 0) + return tmp; +- mutex_lock(&runtime->oss.params_lock); + while (bytes > 0) { ++ if (mutex_lock_interruptible(&runtime->oss.params_lock)) { ++ tmp = -ERESTARTSYS; ++ break; ++ } + if (bytes < runtime->oss.period_bytes || runtime->oss.buffer_used > 0) { + if (runtime->oss.buffer_used == 0) { + tmp = snd_pcm_oss_read2(substream, runtime->oss.buffer, runtime->oss.period_bytes, 1); +@@ -1467,12 +1476,16 @@ static ssize_t snd_pcm_oss_read1(struct snd_pcm_substream *substream, char __use + bytes -= tmp; + xfer += tmp; + } +- } +- mutex_unlock(&runtime->oss.params_lock); +- return xfer; +- + err: +- mutex_unlock(&runtime->oss.params_lock); ++ mutex_unlock(&runtime->oss.params_lock); ++ if (tmp < 0) ++ break; ++ if (signal_pending(current)) { ++ tmp = -ERESTARTSYS; ++ break; ++ } ++ tmp = 0; ++ } + return xfer > 0 ? 
(snd_pcm_sframes_t)xfer : tmp; + } + +diff --git a/sound/core/oss/pcm_plugin.c b/sound/core/oss/pcm_plugin.c +index cadc93792868..85a56af104bd 100644 +--- a/sound/core/oss/pcm_plugin.c ++++ b/sound/core/oss/pcm_plugin.c +@@ -592,18 +592,26 @@ snd_pcm_sframes_t snd_pcm_plug_write_transfer(struct snd_pcm_substream *plug, st + snd_pcm_sframes_t frames = size; + + plugin = snd_pcm_plug_first(plug); +- while (plugin && frames > 0) { ++ while (plugin) { ++ if (frames <= 0) ++ return frames; + if ((next = plugin->next) != NULL) { + snd_pcm_sframes_t frames1 = frames; +- if (plugin->dst_frames) ++ if (plugin->dst_frames) { + frames1 = plugin->dst_frames(plugin, frames); ++ if (frames1 <= 0) ++ return frames1; ++ } + if ((err = next->client_channels(next, frames1, &dst_channels)) < 0) { + return err; + } + if (err != frames1) { + frames = err; +- if (plugin->src_frames) ++ if (plugin->src_frames) { + frames = plugin->src_frames(plugin, frames1); ++ if (frames <= 0) ++ return frames; ++ } + } + } else + dst_channels = NULL; +diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c +index 10e7ef7a8804..db7894bb028c 100644 +--- a/sound/core/pcm_lib.c ++++ b/sound/core/pcm_lib.c +@@ -1632,7 +1632,7 @@ int snd_pcm_hw_param_first(struct snd_pcm_substream *pcm, + return changed; + if (params->rmask) { + int err = snd_pcm_hw_refine(pcm, params); +- if (snd_BUG_ON(err < 0)) ++ if (err < 0) + return err; + } + return snd_pcm_hw_param_value(params, var, dir); +@@ -1678,7 +1678,7 @@ int snd_pcm_hw_param_last(struct snd_pcm_substream *pcm, + return changed; + if (params->rmask) { + int err = snd_pcm_hw_refine(pcm, params); +- if (snd_BUG_ON(err < 0)) ++ if (err < 0) + return err; + } + return snd_pcm_hw_param_value(params, var, dir); +diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c +index 2fec2feac387..499f75b18e09 100644 +--- a/sound/core/pcm_native.c ++++ b/sound/core/pcm_native.c +@@ -2582,7 +2582,7 @@ static snd_pcm_sframes_t forward_appl_ptr(struct snd_pcm_substream 
*substream, + return ret < 0 ? ret : frames; + } + +-/* decrease the appl_ptr; returns the processed frames or a negative error */ ++/* decrease the appl_ptr; returns the processed frames or zero for error */ + static snd_pcm_sframes_t rewind_appl_ptr(struct snd_pcm_substream *substream, + snd_pcm_uframes_t frames, + snd_pcm_sframes_t avail) +@@ -2599,7 +2599,12 @@ static snd_pcm_sframes_t rewind_appl_ptr(struct snd_pcm_substream *substream, + if (appl_ptr < 0) + appl_ptr += runtime->boundary; + ret = pcm_lib_apply_appl_ptr(substream, appl_ptr); +- return ret < 0 ? ret : frames; ++ /* NOTE: we return zero for errors because PulseAudio gets depressed ++ * upon receiving an error from rewind ioctl and stops processing ++ * any longer. Returning zero means that no rewind is done, so ++ * it's not absolutely wrong to answer like that. ++ */ ++ return ret < 0 ? 0 : frames; + } + + static snd_pcm_sframes_t snd_pcm_playback_rewind(struct snd_pcm_substream *substream, +diff --git a/sound/drivers/aloop.c b/sound/drivers/aloop.c +index 135adb17703c..386ee829c655 100644 +--- a/sound/drivers/aloop.c ++++ b/sound/drivers/aloop.c +@@ -39,6 +39,7 @@ + #include + #include + #include ++#include + #include + #include + +@@ -305,19 +306,6 @@ static int loopback_trigger(struct snd_pcm_substream *substream, int cmd) + return 0; + } + +-static void params_change_substream(struct loopback_pcm *dpcm, +- struct snd_pcm_runtime *runtime) +-{ +- struct snd_pcm_runtime *dst_runtime; +- +- if (dpcm == NULL || dpcm->substream == NULL) +- return; +- dst_runtime = dpcm->substream->runtime; +- if (dst_runtime == NULL) +- return; +- dst_runtime->hw = dpcm->cable->hw; +-} +- + static void params_change(struct snd_pcm_substream *substream) + { + struct snd_pcm_runtime *runtime = substream->runtime; +@@ -329,10 +317,6 @@ static void params_change(struct snd_pcm_substream *substream) + cable->hw.rate_max = runtime->rate; + cable->hw.channels_min = runtime->channels; + cable->hw.channels_max = 
runtime->channels; +- params_change_substream(cable->streams[SNDRV_PCM_STREAM_PLAYBACK], +- runtime); +- params_change_substream(cable->streams[SNDRV_PCM_STREAM_CAPTURE], +- runtime); + } + + static int loopback_prepare(struct snd_pcm_substream *substream) +@@ -620,26 +604,29 @@ static unsigned int get_cable_index(struct snd_pcm_substream *substream) + static int rule_format(struct snd_pcm_hw_params *params, + struct snd_pcm_hw_rule *rule) + { ++ struct loopback_pcm *dpcm = rule->private; ++ struct loopback_cable *cable = dpcm->cable; ++ struct snd_mask m; + +- struct snd_pcm_hardware *hw = rule->private; +- struct snd_mask *maskp = hw_param_mask(params, rule->var); +- +- maskp->bits[0] &= (u_int32_t)hw->formats; +- maskp->bits[1] &= (u_int32_t)(hw->formats >> 32); +- memset(maskp->bits + 2, 0, (SNDRV_MASK_MAX-64) / 8); /* clear rest */ +- if (! maskp->bits[0] && ! maskp->bits[1]) +- return -EINVAL; +- return 0; ++ snd_mask_none(&m); ++ mutex_lock(&dpcm->loopback->cable_lock); ++ m.bits[0] = (u_int32_t)cable->hw.formats; ++ m.bits[1] = (u_int32_t)(cable->hw.formats >> 32); ++ mutex_unlock(&dpcm->loopback->cable_lock); ++ return snd_mask_refine(hw_param_mask(params, rule->var), &m); + } + + static int rule_rate(struct snd_pcm_hw_params *params, + struct snd_pcm_hw_rule *rule) + { +- struct snd_pcm_hardware *hw = rule->private; ++ struct loopback_pcm *dpcm = rule->private; ++ struct loopback_cable *cable = dpcm->cable; + struct snd_interval t; + +- t.min = hw->rate_min; +- t.max = hw->rate_max; ++ mutex_lock(&dpcm->loopback->cable_lock); ++ t.min = cable->hw.rate_min; ++ t.max = cable->hw.rate_max; ++ mutex_unlock(&dpcm->loopback->cable_lock); + t.openmin = t.openmax = 0; + t.integer = 0; + return snd_interval_refine(hw_param_interval(params, rule->var), &t); +@@ -648,22 +635,44 @@ static int rule_rate(struct snd_pcm_hw_params *params, + static int rule_channels(struct snd_pcm_hw_params *params, + struct snd_pcm_hw_rule *rule) + { +- struct snd_pcm_hardware *hw = 
rule->private; ++ struct loopback_pcm *dpcm = rule->private; ++ struct loopback_cable *cable = dpcm->cable; + struct snd_interval t; + +- t.min = hw->channels_min; +- t.max = hw->channels_max; ++ mutex_lock(&dpcm->loopback->cable_lock); ++ t.min = cable->hw.channels_min; ++ t.max = cable->hw.channels_max; ++ mutex_unlock(&dpcm->loopback->cable_lock); + t.openmin = t.openmax = 0; + t.integer = 0; + return snd_interval_refine(hw_param_interval(params, rule->var), &t); + } + ++static void free_cable(struct snd_pcm_substream *substream) ++{ ++ struct loopback *loopback = substream->private_data; ++ int dev = get_cable_index(substream); ++ struct loopback_cable *cable; ++ ++ cable = loopback->cables[substream->number][dev]; ++ if (!cable) ++ return; ++ if (cable->streams[!substream->stream]) { ++ /* other stream is still alive */ ++ cable->streams[substream->stream] = NULL; ++ } else { ++ /* free the cable */ ++ loopback->cables[substream->number][dev] = NULL; ++ kfree(cable); ++ } ++} ++ + static int loopback_open(struct snd_pcm_substream *substream) + { + struct snd_pcm_runtime *runtime = substream->runtime; + struct loopback *loopback = substream->private_data; + struct loopback_pcm *dpcm; +- struct loopback_cable *cable; ++ struct loopback_cable *cable = NULL; + int err = 0; + int dev = get_cable_index(substream); + +@@ -682,7 +691,6 @@ static int loopback_open(struct snd_pcm_substream *substream) + if (!cable) { + cable = kzalloc(sizeof(*cable), GFP_KERNEL); + if (!cable) { +- kfree(dpcm); + err = -ENOMEM; + goto unlock; + } +@@ -700,19 +708,19 @@ static int loopback_open(struct snd_pcm_substream *substream) + /* are cached -> they do not reflect the actual state */ + err = snd_pcm_hw_rule_add(runtime, 0, + SNDRV_PCM_HW_PARAM_FORMAT, +- rule_format, &runtime->hw, ++ rule_format, dpcm, + SNDRV_PCM_HW_PARAM_FORMAT, -1); + if (err < 0) + goto unlock; + err = snd_pcm_hw_rule_add(runtime, 0, + SNDRV_PCM_HW_PARAM_RATE, +- rule_rate, &runtime->hw, ++ rule_rate, dpcm, + 
SNDRV_PCM_HW_PARAM_RATE, -1); + if (err < 0) + goto unlock; + err = snd_pcm_hw_rule_add(runtime, 0, + SNDRV_PCM_HW_PARAM_CHANNELS, +- rule_channels, &runtime->hw, ++ rule_channels, dpcm, + SNDRV_PCM_HW_PARAM_CHANNELS, -1); + if (err < 0) + goto unlock; +@@ -724,6 +732,10 @@ static int loopback_open(struct snd_pcm_substream *substream) + else + runtime->hw = cable->hw; + unlock: ++ if (err < 0) { ++ free_cable(substream); ++ kfree(dpcm); ++ } + mutex_unlock(&loopback->cable_lock); + return err; + } +@@ -732,20 +744,10 @@ static int loopback_close(struct snd_pcm_substream *substream) + { + struct loopback *loopback = substream->private_data; + struct loopback_pcm *dpcm = substream->runtime->private_data; +- struct loopback_cable *cable; +- int dev = get_cable_index(substream); + + loopback_timer_stop(dpcm); + mutex_lock(&loopback->cable_lock); +- cable = loopback->cables[substream->number][dev]; +- if (cable->streams[!substream->stream]) { +- /* other stream is still alive */ +- cable->streams[substream->stream] = NULL; +- } else { +- /* free the cable */ +- loopback->cables[substream->number][dev] = NULL; +- kfree(cable); +- } ++ free_cable(substream); + mutex_unlock(&loopback->cable_lock); + return 0; + } +diff --git a/tools/objtool/check.c b/tools/objtool/check.c +index 9b341584eb1b..f40d46e24bcc 100644 +--- a/tools/objtool/check.c ++++ b/tools/objtool/check.c +@@ -427,6 +427,40 @@ static void add_ignores(struct objtool_file *file) + } + } + ++/* ++ * FIXME: For now, just ignore any alternatives which add retpolines. This is ++ * a temporary hack, as it doesn't allow ORC to unwind from inside a retpoline. ++ * But it at least allows objtool to understand the control flow *around* the ++ * retpoline. 
++ */ ++static int add_nospec_ignores(struct objtool_file *file) ++{ ++ struct section *sec; ++ struct rela *rela; ++ struct instruction *insn; ++ ++ sec = find_section_by_name(file->elf, ".rela.discard.nospec"); ++ if (!sec) ++ return 0; ++ ++ list_for_each_entry(rela, &sec->rela_list, list) { ++ if (rela->sym->type != STT_SECTION) { ++ WARN("unexpected relocation symbol type in %s", sec->name); ++ return -1; ++ } ++ ++ insn = find_insn(file, rela->sym->sec, rela->addend); ++ if (!insn) { ++ WARN("bad .discard.nospec entry"); ++ return -1; ++ } ++ ++ insn->ignore_alts = true; ++ } ++ ++ return 0; ++} ++ + /* + * Find the destination instructions for all jumps. + */ +@@ -456,6 +490,13 @@ static int add_jump_destinations(struct objtool_file *file) + } else if (rela->sym->sec->idx) { + dest_sec = rela->sym->sec; + dest_off = rela->sym->sym.st_value + rela->addend + 4; ++ } else if (strstr(rela->sym->name, "_indirect_thunk_")) { ++ /* ++ * Retpoline jumps are really dynamic jumps in ++ * disguise, so convert them accordingly. ++ */ ++ insn->type = INSN_JUMP_DYNAMIC; ++ continue; + } else { + /* sibling call */ + insn->jump_dest = 0; +@@ -502,11 +543,18 @@ static int add_call_destinations(struct objtool_file *file) + dest_off = insn->offset + insn->len + insn->immediate; + insn->call_dest = find_symbol_by_offset(insn->sec, + dest_off); ++ /* ++ * FIXME: Thanks to retpolines, it's now considered ++ * normal for a function to call within itself. So ++ * disable this warning for now. 
++ */ ++#if 0 + if (!insn->call_dest) { + WARN_FUNC("can't find call dest symbol at offset 0x%lx", + insn->sec, insn->offset, dest_off); + return -1; + } ++#endif + } else if (rela->sym->type == STT_SECTION) { + insn->call_dest = find_symbol_by_offset(rela->sym->sec, + rela->addend+4); +@@ -671,12 +719,6 @@ static int add_special_section_alts(struct objtool_file *file) + return ret; + + list_for_each_entry_safe(special_alt, tmp, &special_alts, list) { +- alt = malloc(sizeof(*alt)); +- if (!alt) { +- WARN("malloc failed"); +- ret = -1; +- goto out; +- } + + orig_insn = find_insn(file, special_alt->orig_sec, + special_alt->orig_off); +@@ -687,6 +729,10 @@ static int add_special_section_alts(struct objtool_file *file) + goto out; + } + ++ /* Ignore retpoline alternatives. */ ++ if (orig_insn->ignore_alts) ++ continue; ++ + new_insn = NULL; + if (!special_alt->group || special_alt->new_len) { + new_insn = find_insn(file, special_alt->new_sec, +@@ -712,6 +758,13 @@ static int add_special_section_alts(struct objtool_file *file) + goto out; + } + ++ alt = malloc(sizeof(*alt)); ++ if (!alt) { ++ WARN("malloc failed"); ++ ret = -1; ++ goto out; ++ } ++ + alt->insn = new_insn; + list_add_tail(&alt->list, &orig_insn->alts); + +@@ -1028,6 +1081,10 @@ static int decode_sections(struct objtool_file *file) + + add_ignores(file); + ++ ret = add_nospec_ignores(file); ++ if (ret) ++ return ret; ++ + ret = add_jump_destinations(file); + if (ret) + return ret; +diff --git a/tools/objtool/check.h b/tools/objtool/check.h +index 47d9ea70a83d..dbadb304a410 100644 +--- a/tools/objtool/check.h ++++ b/tools/objtool/check.h +@@ -44,7 +44,7 @@ struct instruction { + unsigned int len; + unsigned char type; + unsigned long immediate; +- bool alt_group, visited, dead_end, ignore, hint, save, restore; ++ bool alt_group, visited, dead_end, ignore, hint, save, restore, ignore_alts; + struct symbol *call_dest; + struct instruction *jump_dest; + struct list_head alts; +diff --git 
a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c +index 7a2d221c4702..1241487de93f 100644 +--- a/tools/testing/selftests/bpf/test_verifier.c ++++ b/tools/testing/selftests/bpf/test_verifier.c +@@ -272,6 +272,46 @@ static struct bpf_test tests[] = { + .errstr = "invalid bpf_ld_imm64 insn", + .result = REJECT, + }, ++ { ++ "arsh32 on imm", ++ .insns = { ++ BPF_MOV64_IMM(BPF_REG_0, 1), ++ BPF_ALU32_IMM(BPF_ARSH, BPF_REG_0, 5), ++ BPF_EXIT_INSN(), ++ }, ++ .result = REJECT, ++ .errstr = "BPF_ARSH not supported for 32 bit ALU", ++ }, ++ { ++ "arsh32 on reg", ++ .insns = { ++ BPF_MOV64_IMM(BPF_REG_0, 1), ++ BPF_MOV64_IMM(BPF_REG_1, 5), ++ BPF_ALU32_REG(BPF_ARSH, BPF_REG_0, BPF_REG_1), ++ BPF_EXIT_INSN(), ++ }, ++ .result = REJECT, ++ .errstr = "BPF_ARSH not supported for 32 bit ALU", ++ }, ++ { ++ "arsh64 on imm", ++ .insns = { ++ BPF_MOV64_IMM(BPF_REG_0, 1), ++ BPF_ALU64_IMM(BPF_ARSH, BPF_REG_0, 5), ++ BPF_EXIT_INSN(), ++ }, ++ .result = ACCEPT, ++ }, ++ { ++ "arsh64 on reg", ++ .insns = { ++ BPF_MOV64_IMM(BPF_REG_0, 1), ++ BPF_MOV64_IMM(BPF_REG_1, 5), ++ BPF_ALU64_REG(BPF_ARSH, BPF_REG_0, BPF_REG_1), ++ BPF_EXIT_INSN(), ++ }, ++ .result = ACCEPT, ++ }, + { + "no bpf_exit", + .insns = { +diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile +index 7b1adeee4b0f..91fbfa8fdc15 100644 +--- a/tools/testing/selftests/x86/Makefile ++++ b/tools/testing/selftests/x86/Makefile +@@ -7,7 +7,7 @@ include ../lib.mk + + TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt ptrace_syscall test_mremap_vdso \ + check_initial_reg_state sigreturn ldt_gdt iopl mpx-mini-test ioperm \ +- protection_keys test_vdso ++ protection_keys test_vdso test_vsyscall + TARGETS_C_32BIT_ONLY := entry_from_vm86 syscall_arg_fault test_syscall_vdso unwind_vdso \ + test_FCMOV test_FCOMI test_FISTTP \ + vdso_restorer +diff --git a/tools/testing/selftests/x86/test_vsyscall.c b/tools/testing/selftests/x86/test_vsyscall.c 
+new file mode 100644 +index 000000000000..6e0bd52ad53d +--- /dev/null ++++ b/tools/testing/selftests/x86/test_vsyscall.c +@@ -0,0 +1,500 @@ ++/* SPDX-License-Identifier: GPL-2.0 */ ++ ++#define _GNU_SOURCE ++ ++#include ++#include ++#include ++#include ++#include ++#include ++#include ++#include ++#include ++#include ++#include ++#include ++#include ++#include ++#include ++#include ++ ++#ifdef __x86_64__ ++# define VSYS(x) (x) ++#else ++# define VSYS(x) 0 ++#endif ++ ++#ifndef SYS_getcpu ++# ifdef __x86_64__ ++# define SYS_getcpu 309 ++# else ++# define SYS_getcpu 318 ++# endif ++#endif ++ ++static void sethandler(int sig, void (*handler)(int, siginfo_t *, void *), ++ int flags) ++{ ++ struct sigaction sa; ++ memset(&sa, 0, sizeof(sa)); ++ sa.sa_sigaction = handler; ++ sa.sa_flags = SA_SIGINFO | flags; ++ sigemptyset(&sa.sa_mask); ++ if (sigaction(sig, &sa, 0)) ++ err(1, "sigaction"); ++} ++ ++/* vsyscalls and vDSO */ ++bool should_read_vsyscall = false; ++ ++typedef long (*gtod_t)(struct timeval *tv, struct timezone *tz); ++gtod_t vgtod = (gtod_t)VSYS(0xffffffffff600000); ++gtod_t vdso_gtod; ++ ++typedef int (*vgettime_t)(clockid_t, struct timespec *); ++vgettime_t vdso_gettime; ++ ++typedef long (*time_func_t)(time_t *t); ++time_func_t vtime = (time_func_t)VSYS(0xffffffffff600400); ++time_func_t vdso_time; ++ ++typedef long (*getcpu_t)(unsigned *, unsigned *, void *); ++getcpu_t vgetcpu = (getcpu_t)VSYS(0xffffffffff600800); ++getcpu_t vdso_getcpu; ++ ++static void init_vdso(void) ++{ ++ void *vdso = dlopen("linux-vdso.so.1", RTLD_LAZY | RTLD_LOCAL | RTLD_NOLOAD); ++ if (!vdso) ++ vdso = dlopen("linux-gate.so.1", RTLD_LAZY | RTLD_LOCAL | RTLD_NOLOAD); ++ if (!vdso) { ++ printf("[WARN]\tfailed to find vDSO\n"); ++ return; ++ } ++ ++ vdso_gtod = (gtod_t)dlsym(vdso, "__vdso_gettimeofday"); ++ if (!vdso_gtod) ++ printf("[WARN]\tfailed to find gettimeofday in vDSO\n"); ++ ++ vdso_gettime = (vgettime_t)dlsym(vdso, "__vdso_clock_gettime"); ++ if (!vdso_gettime) ++ 
printf("[WARN]\tfailed to find clock_gettime in vDSO\n"); ++ ++ vdso_time = (time_func_t)dlsym(vdso, "__vdso_time"); ++ if (!vdso_time) ++ printf("[WARN]\tfailed to find time in vDSO\n"); ++ ++ vdso_getcpu = (getcpu_t)dlsym(vdso, "__vdso_getcpu"); ++ if (!vdso_getcpu) { ++ /* getcpu() was never wired up in the 32-bit vDSO. */ ++ printf("[%s]\tfailed to find getcpu in vDSO\n", ++ sizeof(long) == 8 ? "WARN" : "NOTE"); ++ } ++} ++ ++static int init_vsys(void) ++{ ++#ifdef __x86_64__ ++ int nerrs = 0; ++ FILE *maps; ++ char line[128]; ++ bool found = false; ++ ++ maps = fopen("/proc/self/maps", "r"); ++ if (!maps) { ++ printf("[WARN]\tCould not open /proc/self/maps -- assuming vsyscall is r-x\n"); ++ should_read_vsyscall = true; ++ return 0; ++ } ++ ++ while (fgets(line, sizeof(line), maps)) { ++ char r, x; ++ void *start, *end; ++ char name[128]; ++ if (sscanf(line, "%p-%p %c-%cp %*x %*x:%*x %*u %s", ++ &start, &end, &r, &x, name) != 5) ++ continue; ++ ++ if (strcmp(name, "[vsyscall]")) ++ continue; ++ ++ printf("\tvsyscall map: %s", line); ++ ++ if (start != (void *)0xffffffffff600000 || ++ end != (void *)0xffffffffff601000) { ++ printf("[FAIL]\taddress range is nonsense\n"); ++ nerrs++; ++ } ++ ++ printf("\tvsyscall permissions are %c-%c\n", r, x); ++ should_read_vsyscall = (r == 'r'); ++ if (x != 'x') { ++ vgtod = NULL; ++ vtime = NULL; ++ vgetcpu = NULL; ++ } ++ ++ found = true; ++ break; ++ } ++ ++ fclose(maps); ++ ++ if (!found) { ++ printf("\tno vsyscall map in /proc/self/maps\n"); ++ should_read_vsyscall = false; ++ vgtod = NULL; ++ vtime = NULL; ++ vgetcpu = NULL; ++ } ++ ++ return nerrs; ++#else ++ return 0; ++#endif ++} ++ ++/* syscalls */ ++static inline long sys_gtod(struct timeval *tv, struct timezone *tz) ++{ ++ return syscall(SYS_gettimeofday, tv, tz); ++} ++ ++static inline int sys_clock_gettime(clockid_t id, struct timespec *ts) ++{ ++ return syscall(SYS_clock_gettime, id, ts); ++} ++ ++static inline long sys_time(time_t *t) ++{ ++ return 
syscall(SYS_time, t); ++} ++ ++static inline long sys_getcpu(unsigned * cpu, unsigned * node, ++ void* cache) ++{ ++ return syscall(SYS_getcpu, cpu, node, cache); ++} ++ ++static jmp_buf jmpbuf; ++ ++static void sigsegv(int sig, siginfo_t *info, void *ctx_void) ++{ ++ siglongjmp(jmpbuf, 1); ++} ++ ++static double tv_diff(const struct timeval *a, const struct timeval *b) ++{ ++ return (double)(a->tv_sec - b->tv_sec) + ++ (double)((int)a->tv_usec - (int)b->tv_usec) * 1e-6; ++} ++ ++static int check_gtod(const struct timeval *tv_sys1, ++ const struct timeval *tv_sys2, ++ const struct timezone *tz_sys, ++ const char *which, ++ const struct timeval *tv_other, ++ const struct timezone *tz_other) ++{ ++ int nerrs = 0; ++ double d1, d2; ++ ++ if (tz_other && (tz_sys->tz_minuteswest != tz_other->tz_minuteswest || tz_sys->tz_dsttime != tz_other->tz_dsttime)) { ++ printf("[FAIL] %s tz mismatch\n", which); ++ nerrs++; ++ } ++ ++ d1 = tv_diff(tv_other, tv_sys1); ++ d2 = tv_diff(tv_sys2, tv_other); ++ printf("\t%s time offsets: %lf %lf\n", which, d1, d2); ++ ++ if (d1 < 0 || d2 < 0) { ++ printf("[FAIL]\t%s time was inconsistent with the syscall\n", which); ++ nerrs++; ++ } else { ++ printf("[OK]\t%s gettimeofday()'s timeval was okay\n", which); ++ } ++ ++ return nerrs; ++} ++ ++static int test_gtod(void) ++{ ++ struct timeval tv_sys1, tv_sys2, tv_vdso, tv_vsys; ++ struct timezone tz_sys, tz_vdso, tz_vsys; ++ long ret_vdso = -1; ++ long ret_vsys = -1; ++ int nerrs = 0; ++ ++ printf("[RUN]\ttest gettimeofday()\n"); ++ ++ if (sys_gtod(&tv_sys1, &tz_sys) != 0) ++ err(1, "syscall gettimeofday"); ++ if (vdso_gtod) ++ ret_vdso = vdso_gtod(&tv_vdso, &tz_vdso); ++ if (vgtod) ++ ret_vsys = vgtod(&tv_vsys, &tz_vsys); ++ if (sys_gtod(&tv_sys2, &tz_sys) != 0) ++ err(1, "syscall gettimeofday"); ++ ++ if (vdso_gtod) { ++ if (ret_vdso == 0) { ++ nerrs += check_gtod(&tv_sys1, &tv_sys2, &tz_sys, "vDSO", &tv_vdso, &tz_vdso); ++ } else { ++ printf("[FAIL]\tvDSO gettimeofday() failed: %ld\n", 
ret_vdso); ++ nerrs++; ++ } ++ } ++ ++ if (vgtod) { ++ if (ret_vsys == 0) { ++ nerrs += check_gtod(&tv_sys1, &tv_sys2, &tz_sys, "vsyscall", &tv_vsys, &tz_vsys); ++ } else { ++ printf("[FAIL]\tvsys gettimeofday() failed: %ld\n", ret_vsys); ++ nerrs++; ++ } ++ } ++ ++ return nerrs; ++} ++ ++static int test_time(void) { ++ int nerrs = 0; ++ ++ printf("[RUN]\ttest time()\n"); ++ long t_sys1, t_sys2, t_vdso = 0, t_vsys = 0; ++ long t2_sys1 = -1, t2_sys2 = -1, t2_vdso = -1, t2_vsys = -1; ++ t_sys1 = sys_time(&t2_sys1); ++ if (vdso_time) ++ t_vdso = vdso_time(&t2_vdso); ++ if (vtime) ++ t_vsys = vtime(&t2_vsys); ++ t_sys2 = sys_time(&t2_sys2); ++ if (t_sys1 < 0 || t_sys1 != t2_sys1 || t_sys2 < 0 || t_sys2 != t2_sys2) { ++ printf("[FAIL]\tsyscall failed (ret1:%ld output1:%ld ret2:%ld output2:%ld)\n", t_sys1, t2_sys1, t_sys2, t2_sys2); ++ nerrs++; ++ return nerrs; ++ } ++ ++ if (vdso_time) { ++ if (t_vdso < 0 || t_vdso != t2_vdso) { ++ printf("[FAIL]\tvDSO failed (ret:%ld output:%ld)\n", t_vdso, t2_vdso); ++ nerrs++; ++ } else if (t_vdso < t_sys1 || t_vdso > t_sys2) { ++ printf("[FAIL]\tvDSO returned the wrong time (%ld %ld %ld)\n", t_sys1, t_vdso, t_sys2); ++ nerrs++; ++ } else { ++ printf("[OK]\tvDSO time() is okay\n"); ++ } ++ } ++ ++ if (vtime) { ++ if (t_vsys < 0 || t_vsys != t2_vsys) { ++ printf("[FAIL]\tvsyscall failed (ret:%ld output:%ld)\n", t_vsys, t2_vsys); ++ nerrs++; ++ } else if (t_vsys < t_sys1 || t_vsys > t_sys2) { ++ printf("[FAIL]\tvsyscall returned the wrong time (%ld %ld %ld)\n", t_sys1, t_vsys, t_sys2); ++ nerrs++; ++ } else { ++ printf("[OK]\tvsyscall time() is okay\n"); ++ } ++ } ++ ++ return nerrs; ++} ++ ++static int test_getcpu(int cpu) ++{ ++ int nerrs = 0; ++ long ret_sys, ret_vdso = -1, ret_vsys = -1; ++ ++ printf("[RUN]\tgetcpu() on CPU %d\n", cpu); ++ ++ cpu_set_t cpuset; ++ CPU_ZERO(&cpuset); ++ CPU_SET(cpu, &cpuset); ++ if (sched_setaffinity(0, sizeof(cpuset), &cpuset) != 0) { ++ printf("[SKIP]\tfailed to force CPU %d\n", cpu); ++ return 
nerrs; ++ } ++ ++ unsigned cpu_sys, cpu_vdso, cpu_vsys, node_sys, node_vdso, node_vsys; ++ unsigned node = 0; ++ bool have_node = false; ++ ret_sys = sys_getcpu(&cpu_sys, &node_sys, 0); ++ if (vdso_getcpu) ++ ret_vdso = vdso_getcpu(&cpu_vdso, &node_vdso, 0); ++ if (vgetcpu) ++ ret_vsys = vgetcpu(&cpu_vsys, &node_vsys, 0); ++ ++ if (ret_sys == 0) { ++ if (cpu_sys != cpu) { ++ printf("[FAIL]\tsyscall reported CPU %hu but should be %d\n", cpu_sys, cpu); ++ nerrs++; ++ } ++ ++ have_node = true; ++ node = node_sys; ++ } ++ ++ if (vdso_getcpu) { ++ if (ret_vdso) { ++ printf("[FAIL]\tvDSO getcpu() failed\n"); ++ nerrs++; ++ } else { ++ if (!have_node) { ++ have_node = true; ++ node = node_vdso; ++ } ++ ++ if (cpu_vdso != cpu) { ++ printf("[FAIL]\tvDSO reported CPU %hu but should be %d\n", cpu_vdso, cpu); ++ nerrs++; ++ } else { ++ printf("[OK]\tvDSO reported correct CPU\n"); ++ } ++ ++ if (node_vdso != node) { ++ printf("[FAIL]\tvDSO reported node %hu but should be %hu\n", node_vdso, node); ++ nerrs++; ++ } else { ++ printf("[OK]\tvDSO reported correct node\n"); ++ } ++ } ++ } ++ ++ if (vgetcpu) { ++ if (ret_vsys) { ++ printf("[FAIL]\tvsyscall getcpu() failed\n"); ++ nerrs++; ++ } else { ++ if (!have_node) { ++ have_node = true; ++ node = node_vsys; ++ } ++ ++ if (cpu_vsys != cpu) { ++ printf("[FAIL]\tvsyscall reported CPU %hu but should be %d\n", cpu_vsys, cpu); ++ nerrs++; ++ } else { ++ printf("[OK]\tvsyscall reported correct CPU\n"); ++ } ++ ++ if (node_vsys != node) { ++ printf("[FAIL]\tvsyscall reported node %hu but should be %hu\n", node_vsys, node); ++ nerrs++; ++ } else { ++ printf("[OK]\tvsyscall reported correct node\n"); ++ } ++ } ++ } ++ ++ return nerrs; ++} ++ ++static int test_vsys_r(void) ++{ ++#ifdef __x86_64__ ++ printf("[RUN]\tChecking read access to the vsyscall page\n"); ++ bool can_read; ++ if (sigsetjmp(jmpbuf, 1) == 0) { ++ *(volatile int *)0xffffffffff600000; ++ can_read = true; ++ } else { ++ can_read = false; ++ } ++ ++ if (can_read && 
!should_read_vsyscall) { ++ printf("[FAIL]\tWe have read access, but we shouldn't\n"); ++ return 1; ++ } else if (!can_read && should_read_vsyscall) { ++ printf("[FAIL]\tWe don't have read access, but we should\n"); ++ return 1; ++ } else { ++ printf("[OK]\tgot expected result\n"); ++ } ++#endif ++ ++ return 0; ++} ++ ++ ++#ifdef __x86_64__ ++#define X86_EFLAGS_TF (1UL << 8) ++static volatile sig_atomic_t num_vsyscall_traps; ++ ++static unsigned long get_eflags(void) ++{ ++ unsigned long eflags; ++ asm volatile ("pushfq\n\tpopq %0" : "=rm" (eflags)); ++ return eflags; ++} ++ ++static void set_eflags(unsigned long eflags) ++{ ++ asm volatile ("pushq %0\n\tpopfq" : : "rm" (eflags) : "flags"); ++} ++ ++static void sigtrap(int sig, siginfo_t *info, void *ctx_void) ++{ ++ ucontext_t *ctx = (ucontext_t *)ctx_void; ++ unsigned long ip = ctx->uc_mcontext.gregs[REG_RIP]; ++ ++ if (((ip ^ 0xffffffffff600000UL) & ~0xfffUL) == 0) ++ num_vsyscall_traps++; ++} ++ ++static int test_native_vsyscall(void) ++{ ++ time_t tmp; ++ bool is_native; ++ ++ if (!vtime) ++ return 0; ++ ++ printf("[RUN]\tchecking for native vsyscall\n"); ++ sethandler(SIGTRAP, sigtrap, 0); ++ set_eflags(get_eflags() | X86_EFLAGS_TF); ++ vtime(&tmp); ++ set_eflags(get_eflags() & ~X86_EFLAGS_TF); ++ ++ /* ++ * If vsyscalls are emulated, we expect a single trap in the ++ * vsyscall page -- the call instruction will trap with RIP ++ * pointing to the entry point before emulation takes over. ++ * In native mode, we expect two traps, since whatever code ++ * the vsyscall page contains will be more than just a ret ++ * instruction. ++ */ ++ is_native = (num_vsyscall_traps > 1); ++ ++ printf("\tvsyscalls are %s (%d instructions in vsyscall page)\n", ++ (is_native ? 
"native" : "emulated"), ++ (int)num_vsyscall_traps); ++ ++ return 0; ++} ++#endif ++ ++int main(int argc, char **argv) ++{ ++ int nerrs = 0; ++ ++ init_vdso(); ++ nerrs += init_vsys(); ++ ++ nerrs += test_gtod(); ++ nerrs += test_time(); ++ nerrs += test_getcpu(0); ++ nerrs += test_getcpu(1); ++ ++ sethandler(SIGSEGV, sigsegv, 0); ++ nerrs += test_vsys_r(); ++ ++#ifdef __x86_64__ ++ nerrs += test_native_vsyscall(); ++#endif ++ ++ return nerrs ? 1 : 0; ++} +diff --git a/virt/kvm/arm/mmio.c b/virt/kvm/arm/mmio.c +index b6e715fd3c90..dac7ceb1a677 100644 +--- a/virt/kvm/arm/mmio.c ++++ b/virt/kvm/arm/mmio.c +@@ -112,7 +112,7 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu, struct kvm_run *run) + } + + trace_kvm_mmio(KVM_TRACE_MMIO_READ, len, run->mmio.phys_addr, +- data); ++ &data); + data = vcpu_data_host_to_guest(vcpu, data, len); + vcpu_set_reg(vcpu, vcpu->arch.mmio_decode.rt, data); + } +@@ -182,14 +182,14 @@ int io_mem_abort(struct kvm_vcpu *vcpu, struct kvm_run *run, + data = vcpu_data_guest_to_host(vcpu, vcpu_get_reg(vcpu, rt), + len); + +- trace_kvm_mmio(KVM_TRACE_MMIO_WRITE, len, fault_ipa, data); ++ trace_kvm_mmio(KVM_TRACE_MMIO_WRITE, len, fault_ipa, &data); + kvm_mmio_write_buf(data_buf, len, data); + + ret = kvm_io_bus_write(vcpu, KVM_MMIO_BUS, fault_ipa, len, + data_buf); + } else { + trace_kvm_mmio(KVM_TRACE_MMIO_READ_UNSATISFIED, len, +- fault_ipa, 0); ++ fault_ipa, NULL); + + ret = kvm_io_bus_read(vcpu, KVM_MMIO_BUS, fault_ipa, len, + data_buf);