From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80])
	by finch.gentoo.org (Postfix) with ESMTP id 834B4138D21
	for ; Fri, 17 Jul 2015 15:24:52 +0000 (UTC)
Received: from pigeon.gentoo.org (localhost [127.0.0.1])
	by pigeon.gentoo.org (Postfix) with SMTP id DF20B14014;
	Fri, 17 Jul 2015 15:24:48 +0000 (UTC)
Received: from smtp.gentoo.org (smtp.gentoo.org [140.211.166.183])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by pigeon.gentoo.org (Postfix) with ESMTPS id 43A7414014
	for ; Fri, 17 Jul 2015 15:24:48 +0000 (UTC)
Received: from oystercatcher.gentoo.org (unknown [IPv6:2a01:4f8:202:4333:225:90ff:fed9:fc84])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by smtp.gentoo.org (Postfix) with ESMTPS id 8F7BA340D6E
	for ; Fri, 17 Jul 2015 15:24:46 +0000 (UTC)
Received: from localhost.localdomain (localhost [127.0.0.1])
	by oystercatcher.gentoo.org (Postfix) with ESMTP id 8F662AC
	for ; Fri, 17 Jul 2015 15:24:43 +0000 (UTC)
From: "Mike Pagano" 
To: gentoo-commits@lists.gentoo.org
Content-Transfer-Encoding: 8bit
Content-type: text/plain; charset=UTF-8
Reply-To: gentoo-dev@lists.gentoo.org, "Mike Pagano" 
Message-ID: <1437146679.d5f2323178576f3fb33aa8881f5c6475085fea06.mpagano@gentoo>
Subject: [gentoo-commits] proj/linux-patches:4.1 commit in: /
X-VCS-Repository: proj/linux-patches
X-VCS-Files: 0000_README 5015_kdbus-6-27-15.patch 5015_kdbus-7-17-15.patch
X-VCS-Directories: /
X-VCS-Committer: mpagano
X-VCS-Committer-Name: Mike Pagano
X-VCS-Revision: d5f2323178576f3fb33aa8881f5c6475085fea06
X-VCS-Branch: 4.1
Date: Fri, 17 Jul 2015 15:24:43 +0000 (UTC)
Precedence: bulk
List-Post: 
List-Help: 
List-Unsubscribe: 
List-Subscribe: 
List-Id: Gentoo Linux mail 
X-BeenThere: gentoo-commits@lists.gentoo.org
X-Archives-Salt: c706e057-f39f-41e4-9d01-aa654cd2d2f2
X-Archives-Hash: 7daca1028554e1440fe6c0bb9688dca0

commit: 
d5f2323178576f3fb33aa8881f5c6475085fea06 Author: Mike Pagano gentoo org> AuthorDate: Fri Jul 17 15:24:39 2015 +0000 Commit: Mike Pagano gentoo org> CommitDate: Fri Jul 17 15:24:39 2015 +0000 URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=d5f23231 Update for kdbus patch 0000_README | 2 +- ...kdbus-6-27-15.patch => 5015_kdbus-7-17-15.patch | 6808 ++++++++++++++------ 2 files changed, 4790 insertions(+), 2020 deletions(-) diff --git a/0000_README b/0000_README index 784c55d..43154ce 100644 --- a/0000_README +++ b/0000_README @@ -91,6 +91,6 @@ Patch: 5010_enable-additional-cpu-optimizations-for-gcc-4.9.patch From: https://github.com/graysky2/kernel_gcc_patch/ Desc: Kernel patch enables gcc >= v4.9 optimizations for additional CPUs. -Patch: 5015_kdbus-6-27-15.patch +Patch: 5015_kdbus-7-17-15.patch From: https://lkml.org Desc: Kernel-level IPC implementation diff --git a/5015_kdbus-6-27-15.patch b/5015_kdbus-7-17-15.patch similarity index 86% rename from 5015_kdbus-6-27-15.patch rename to 5015_kdbus-7-17-15.patch index bc17abe..61102dd 100644 --- a/5015_kdbus-6-27-15.patch +++ b/5015_kdbus-7-17-15.patch @@ -8,6 +8,19 @@ index bc05482..e2127a7 100644 + filesystems filesystems ia64 kdbus laptops mic misc-devices \ networking pcmcia prctl ptp spi timers vDSO video4linux \ watchdog +diff --git a/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt b/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt +index f5a8ca2..750d577 100644 +--- a/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt ++++ b/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt +@@ -1,7 +1,7 @@ + * Marvell Armada 370 / Armada XP Ethernet Controller (NETA) + + Required properties: +-- compatible: "marvell,armada-370-neta" or "marvell,armada-xp-neta". ++- compatible: should be "marvell,armada-370-neta". + - reg: address and length of the register set for the device. 
+ - interrupts: interrupt for the device + - phy: See ethernet.txt file in the same directory. diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt index 51f4221..ec7c81b 100644 --- a/Documentation/ioctl/ioctl-number.txt @@ -30,10 +43,10 @@ index 0000000..b4a77cc +*.html diff --git a/Documentation/kdbus/Makefile b/Documentation/kdbus/Makefile new file mode 100644 -index 0000000..af87641 +index 0000000..8caffe5 --- /dev/null +++ b/Documentation/kdbus/Makefile -@@ -0,0 +1,40 @@ +@@ -0,0 +1,44 @@ +DOCS := \ + kdbus.xml \ + kdbus.bus.xml \ @@ -74,12 +87,16 @@ index 0000000..af87641 +htmldocs: $(HTMLFILES) + +clean-files := $(MANFILES) $(HTMLFILES) ++ ++# we don't support other %docs targets right now ++%docs: ++ @true diff --git a/Documentation/kdbus/kdbus.bus.xml b/Documentation/kdbus/kdbus.bus.xml new file mode 100644 -index 0000000..4b9a0ac +index 0000000..83f1198 --- /dev/null +++ b/Documentation/kdbus/kdbus.bus.xml -@@ -0,0 +1,359 @@ +@@ -0,0 +1,344 @@ + + @@ -280,21 +297,6 @@ index 0000000..4b9a0ac + + + -+ KDBUS_ITEM_ATTACH_FLAGS_RECV -+ -+ -+ An optional item that contains a set of required attach flags -+ that connections must allow. This item is used as a -+ negotiation measure during connection creation. If connections -+ do not satisfy the bus requirements, they are not allowed on -+ the bus. If not set, the bus does not require any metadata to -+ be attached; in this case connections are free to set their -+ own attach flags. -+ -+ -+ -+ -+ + KDBUS_ITEM_ATTACH_FLAGS_SEND + + @@ -441,10 +443,10 @@ index 0000000..4b9a0ac + diff --git a/Documentation/kdbus/kdbus.connection.xml b/Documentation/kdbus/kdbus.connection.xml new file mode 100644 -index 0000000..cefb419 +index 0000000..4bb5f30 --- /dev/null +++ b/Documentation/kdbus/kdbus.connection.xml -@@ -0,0 +1,1250 @@ +@@ -0,0 +1,1244 @@ + + @@ -802,13 +804,7 @@ index 0000000..cefb419 + Set the bits for metadata this connection permits to be sent to the + receiving peer. 
Only metadata items that are both allowed to be sent + by the sender and that are requested by the receiver will be attached -+ to the message. Note, however, that the bus may optionally require -+ some of those bits to be set. If the match fails, the ioctl will fail -+ with errno set to -+ ECONNREFUSED. In either case, when returning the -+ field will be set to the mask of metadata items that are enforced by -+ the bus with the KDBUS_FLAGS_KERNEL bit set as -+ well. ++ to the message. + + + @@ -7474,9 +7470,20 @@ index d8afd29..02f7668 100644 M: Vivek Goyal M: Haren Myneni diff --git a/Makefile b/Makefile -index f5c8983..a1c8d57 100644 +index cef84c0..a1c8d57 100644 --- a/Makefile +++ b/Makefile +@@ -1,8 +1,8 @@ + VERSION = 4 + PATCHLEVEL = 1 +-SUBLEVEL = 2 ++SUBLEVEL = 0 + EXTRAVERSION = +-NAME = Series 4800 ++NAME = Hurr durr I'ma sheep + + # *DOCUMENTATION* + # To see a list of typical targets execute "make help" @@ -1343,6 +1343,7 @@ $(help-board-dirs): help-%: %docs: scripts_basic FORCE $(Q)$(MAKE) $(build)=scripts build_docproc @@ -7485,6 +7492,2075 @@ index f5c8983..a1c8d57 100644 else # KBUILD_EXTMOD +diff --git a/arch/arm/boot/dts/armada-370-xp.dtsi b/arch/arm/boot/dts/armada-370-xp.dtsi +index 06a2f2a..ec96f0b 100644 +--- a/arch/arm/boot/dts/armada-370-xp.dtsi ++++ b/arch/arm/boot/dts/armada-370-xp.dtsi +@@ -270,6 +270,7 @@ + }; + + eth0: ethernet@70000 { ++ compatible = "marvell,armada-370-neta"; + reg = <0x70000 0x4000>; + interrupts = <8>; + clocks = <&gateclk 4>; +@@ -285,6 +286,7 @@ + }; + + eth1: ethernet@74000 { ++ compatible = "marvell,armada-370-neta"; + reg = <0x74000 0x4000>; + interrupts = <10>; + clocks = <&gateclk 3>; +diff --git a/arch/arm/boot/dts/armada-370.dtsi b/arch/arm/boot/dts/armada-370.dtsi +index ca4257b..00b50db5 100644 +--- a/arch/arm/boot/dts/armada-370.dtsi ++++ b/arch/arm/boot/dts/armada-370.dtsi +@@ -307,14 +307,6 @@ + dmacap,memset; + }; + }; +- +- ethernet@70000 { +- compatible = "marvell,armada-370-neta"; +- }; +- +- 
ethernet@74000 { +- compatible = "marvell,armada-370-neta"; +- }; + }; + }; + }; +diff --git a/arch/arm/boot/dts/armada-xp-mv78260.dtsi b/arch/arm/boot/dts/armada-xp-mv78260.dtsi +index c5fdc99..8479fdc 100644 +--- a/arch/arm/boot/dts/armada-xp-mv78260.dtsi ++++ b/arch/arm/boot/dts/armada-xp-mv78260.dtsi +@@ -318,7 +318,7 @@ + }; + + eth3: ethernet@34000 { +- compatible = "marvell,armada-xp-neta"; ++ compatible = "marvell,armada-370-neta"; + reg = <0x34000 0x4000>; + interrupts = <14>; + clocks = <&gateclk 1>; +diff --git a/arch/arm/boot/dts/armada-xp-mv78460.dtsi b/arch/arm/boot/dts/armada-xp-mv78460.dtsi +index 0e24f1a..661d54c 100644 +--- a/arch/arm/boot/dts/armada-xp-mv78460.dtsi ++++ b/arch/arm/boot/dts/armada-xp-mv78460.dtsi +@@ -356,7 +356,7 @@ + }; + + eth3: ethernet@34000 { +- compatible = "marvell,armada-xp-neta"; ++ compatible = "marvell,armada-370-neta"; + reg = <0x34000 0x4000>; + interrupts = <14>; + clocks = <&gateclk 1>; +diff --git a/arch/arm/boot/dts/armada-xp.dtsi b/arch/arm/boot/dts/armada-xp.dtsi +index 8fdd6d7..013d63f 100644 +--- a/arch/arm/boot/dts/armada-xp.dtsi ++++ b/arch/arm/boot/dts/armada-xp.dtsi +@@ -177,7 +177,7 @@ + }; + + eth2: ethernet@30000 { +- compatible = "marvell,armada-xp-neta"; ++ compatible = "marvell,armada-370-neta"; + reg = <0x30000 0x4000>; + interrupts = <12>; + clocks = <&gateclk 2>; +@@ -220,14 +220,6 @@ + }; + }; + +- ethernet@70000 { +- compatible = "marvell,armada-xp-neta"; +- }; +- +- ethernet@74000 { +- compatible = "marvell,armada-xp-neta"; +- }; +- + xor@f0900 { + compatible = "marvell,orion-xor"; + reg = <0xF0900 0x100 +diff --git a/arch/arm/boot/dts/sun5i-a10s.dtsi b/arch/arm/boot/dts/sun5i-a10s.dtsi +index 3794ca1..2fd8988 100644 +--- a/arch/arm/boot/dts/sun5i-a10s.dtsi ++++ b/arch/arm/boot/dts/sun5i-a10s.dtsi +@@ -573,7 +573,7 @@ + }; + + rtp: rtp@01c25000 { +- compatible = "allwinner,sun5i-a13-ts"; ++ compatible = "allwinner,sun4i-a10-ts"; + reg = <0x01c25000 0x100>; + interrupts = <29>; + 
#thermal-sensor-cells = <0>; +diff --git a/arch/arm/boot/dts/sun5i-a13.dtsi b/arch/arm/boot/dts/sun5i-a13.dtsi +index 5098185..883cb48 100644 +--- a/arch/arm/boot/dts/sun5i-a13.dtsi ++++ b/arch/arm/boot/dts/sun5i-a13.dtsi +@@ -555,7 +555,7 @@ + }; + + rtp: rtp@01c25000 { +- compatible = "allwinner,sun5i-a13-ts"; ++ compatible = "allwinner,sun4i-a10-ts"; + reg = <0x01c25000 0x100>; + interrupts = <29>; + #thermal-sensor-cells = <0>; +diff --git a/arch/arm/boot/dts/sun7i-a20.dtsi b/arch/arm/boot/dts/sun7i-a20.dtsi +index 2b4847c..fdd1817 100644 +--- a/arch/arm/boot/dts/sun7i-a20.dtsi ++++ b/arch/arm/boot/dts/sun7i-a20.dtsi +@@ -1042,7 +1042,7 @@ + }; + + rtp: rtp@01c25000 { +- compatible = "allwinner,sun5i-a13-ts"; ++ compatible = "allwinner,sun4i-a10-ts"; + reg = <0x01c25000 0x100>; + interrupts = ; + #thermal-sensor-cells = <0>; +diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S +index f7db3a5..79caf79 100644 +--- a/arch/arm/kvm/interrupts.S ++++ b/arch/arm/kvm/interrupts.S +@@ -170,9 +170,13 @@ __kvm_vcpu_return: + @ Don't trap coprocessor accesses for host kernel + set_hstr vmexit + set_hdcr vmexit +- set_hcptr vmexit, (HCPTR_TTA | HCPTR_TCP(10) | HCPTR_TCP(11)), after_vfp_restore ++ set_hcptr vmexit, (HCPTR_TTA | HCPTR_TCP(10) | HCPTR_TCP(11)) + + #ifdef CONFIG_VFPv3 ++ @ Save floating point registers we if let guest use them. ++ tst r2, #(HCPTR_TCP(10) | HCPTR_TCP(11)) ++ bne after_vfp_restore ++ + @ Switch VFP/NEON hardware state to the host's + add r7, vcpu, #VCPU_VFP_GUEST + store_vfp_state r7 +@@ -184,8 +188,6 @@ after_vfp_restore: + @ Restore FPEXC_EN which we clobbered on entry + pop {r2} + VFPFMXR FPEXC, r2 +-#else +-after_vfp_restore: + #endif + + @ Reset Hyp-role +@@ -481,7 +483,7 @@ switch_to_guest_vfp: + push {r3-r7} + + @ NEON/VFP used. Turn on VFP access. 
+- set_hcptr vmtrap, (HCPTR_TCP(10) | HCPTR_TCP(11)) ++ set_hcptr vmexit, (HCPTR_TCP(10) | HCPTR_TCP(11)) + + @ Switch VFP/NEON hardware state to the guest's + add r7, r0, #VCPU_VFP_HOST +diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S +index 48efe2e..35e4a3a 100644 +--- a/arch/arm/kvm/interrupts_head.S ++++ b/arch/arm/kvm/interrupts_head.S +@@ -591,13 +591,8 @@ ARM_BE8(rev r6, r6 ) + .endm + + /* Configures the HCPTR (Hyp Coprocessor Trap Register) on entry/return +- * (hardware reset value is 0). Keep previous value in r2. +- * An ISB is emited on vmexit/vmtrap, but executed on vmexit only if +- * VFP wasn't already enabled (always executed on vmtrap). +- * If a label is specified with vmexit, it is branched to if VFP wasn't +- * enabled. +- */ +-.macro set_hcptr operation, mask, label = none ++ * (hardware reset value is 0). Keep previous value in r2. */ ++.macro set_hcptr operation, mask + mrc p15, 4, r2, c1, c1, 2 + ldr r3, =\mask + .if \operation == vmentry +@@ -606,17 +601,6 @@ ARM_BE8(rev r6, r6 ) + bic r3, r2, r3 @ Don't trap defined coproc-accesses + .endif + mcr p15, 4, r3, c1, c1, 2 +- .if \operation != vmentry +- .if \operation == vmexit +- tst r2, #(HCPTR_TCP(10) | HCPTR_TCP(11)) +- beq 1f +- .endif +- isb +- .if \label != none +- b \label +- .endif +-1: +- .endif + .endm + + /* Configures the HDCR (Hyp Debug Configuration Register) on entry/return +diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c +index 531e922..02fa8ef 100644 +--- a/arch/arm/kvm/psci.c ++++ b/arch/arm/kvm/psci.c +@@ -230,6 +230,10 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu) + case PSCI_0_2_FN64_AFFINITY_INFO: + val = kvm_psci_vcpu_affinity_info(vcpu); + break; ++ case PSCI_0_2_FN_MIGRATE: ++ case PSCI_0_2_FN64_MIGRATE: ++ val = PSCI_RET_NOT_SUPPORTED; ++ break; + case PSCI_0_2_FN_MIGRATE_INFO_TYPE: + /* + * Trusted OS is MP hence does not require migration +@@ -238,6 +242,10 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu) + */ + 
val = PSCI_0_2_TOS_MP; + break; ++ case PSCI_0_2_FN_MIGRATE_INFO_UP_CPU: ++ case PSCI_0_2_FN64_MIGRATE_INFO_UP_CPU: ++ val = PSCI_RET_NOT_SUPPORTED; ++ break; + case PSCI_0_2_FN_SYSTEM_OFF: + kvm_psci_system_off(vcpu); + /* +@@ -263,8 +271,7 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu) + ret = 0; + break; + default: +- val = PSCI_RET_NOT_SUPPORTED; +- break; ++ return -EINVAL; + } + + *vcpu_reg(vcpu, 0) = val; +@@ -284,9 +291,12 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu) + case KVM_PSCI_FN_CPU_ON: + val = kvm_psci_vcpu_on(vcpu); + break; +- default: ++ case KVM_PSCI_FN_CPU_SUSPEND: ++ case KVM_PSCI_FN_MIGRATE: + val = PSCI_RET_NOT_SUPPORTED; + break; ++ default: ++ return -EINVAL; + } + + *vcpu_reg(vcpu, 0) = val; +diff --git a/arch/arm/mach-imx/clk-imx6q.c b/arch/arm/mach-imx/clk-imx6q.c +index a2e8ef3..469a150 100644 +--- a/arch/arm/mach-imx/clk-imx6q.c ++++ b/arch/arm/mach-imx/clk-imx6q.c +@@ -443,7 +443,7 @@ static void __init imx6q_clocks_init(struct device_node *ccm_node) + clk[IMX6QDL_CLK_GPMI_IO] = imx_clk_gate2("gpmi_io", "enfc", base + 0x78, 28); + clk[IMX6QDL_CLK_GPMI_APB] = imx_clk_gate2("gpmi_apb", "usdhc3", base + 0x78, 30); + clk[IMX6QDL_CLK_ROM] = imx_clk_gate2("rom", "ahb", base + 0x7c, 0); +- clk[IMX6QDL_CLK_SATA] = imx_clk_gate2("sata", "ahb", base + 0x7c, 4); ++ clk[IMX6QDL_CLK_SATA] = imx_clk_gate2("sata", "ipg", base + 0x7c, 4); + clk[IMX6QDL_CLK_SDMA] = imx_clk_gate2("sdma", "ahb", base + 0x7c, 6); + clk[IMX6QDL_CLK_SPBA] = imx_clk_gate2("spba", "ipg", base + 0x7c, 12); + clk[IMX6QDL_CLK_SPDIF] = imx_clk_gate2("spdif", "spdif_podf", base + 0x7c, 14); +diff --git a/arch/arm/mach-mvebu/pm-board.c b/arch/arm/mach-mvebu/pm-board.c +index 301ab38..6dfd4ab 100644 +--- a/arch/arm/mach-mvebu/pm-board.c ++++ b/arch/arm/mach-mvebu/pm-board.c +@@ -43,9 +43,6 @@ static void mvebu_armada_xp_gp_pm_enter(void __iomem *sdram_reg, u32 srcmd) + for (i = 0; i < ARMADA_XP_GP_PIC_NR_GPIOS; i++) + ackcmd |= BIT(pic_raw_gpios[i]); + +- srcmd = 
cpu_to_le32(srcmd); +- ackcmd = cpu_to_le32(ackcmd); +- + /* + * Wait a while, the PIC needs quite a bit of time between the + * two GPIO commands. +diff --git a/arch/arm/mach-tegra/cpuidle-tegra20.c b/arch/arm/mach-tegra/cpuidle-tegra20.c +index 7469347..88de2dc 100644 +--- a/arch/arm/mach-tegra/cpuidle-tegra20.c ++++ b/arch/arm/mach-tegra/cpuidle-tegra20.c +@@ -34,7 +34,6 @@ + #include "iomap.h" + #include "irq.h" + #include "pm.h" +-#include "reset.h" + #include "sleep.h" + + #ifdef CONFIG_PM_SLEEP +@@ -71,13 +70,15 @@ static struct cpuidle_driver tegra_idle_driver = { + + #ifdef CONFIG_PM_SLEEP + #ifdef CONFIG_SMP ++static void __iomem *pmc = IO_ADDRESS(TEGRA_PMC_BASE); ++ + static int tegra20_reset_sleeping_cpu_1(void) + { + int ret = 0; + + tegra_pen_lock(); + +- if (readb(tegra20_cpu1_resettable_status) == CPU_RESETTABLE) ++ if (readl(pmc + PMC_SCRATCH41) == CPU_RESETTABLE) + tegra20_cpu_shutdown(1); + else + ret = -EINVAL; +diff --git a/arch/arm/mach-tegra/reset-handler.S b/arch/arm/mach-tegra/reset-handler.S +index e3070fd..71be4af 100644 +--- a/arch/arm/mach-tegra/reset-handler.S ++++ b/arch/arm/mach-tegra/reset-handler.S +@@ -169,10 +169,10 @@ after_errata: + cmp r6, #TEGRA20 + bne 1f + /* If not CPU0, don't let CPU0 reset CPU1 now that CPU1 is coming up. */ +- mov32 r5, TEGRA_IRAM_BASE + TEGRA_IRAM_RESET_HANDLER_OFFSET +- mov r0, #CPU_NOT_RESETTABLE ++ mov32 r5, TEGRA_PMC_BASE ++ mov r0, #0 + cmp r10, #0 +- strneb r0, [r5, #__tegra20_cpu1_resettable_status_offset] ++ strne r0, [r5, #PMC_SCRATCH41] + 1: + #endif + +@@ -281,10 +281,6 @@ __tegra_cpu_reset_handler_data: + .rept TEGRA_RESET_DATA_SIZE + .long 0 + .endr +- .globl __tegra20_cpu1_resettable_status_offset +- .equ __tegra20_cpu1_resettable_status_offset, \ +- . 
- __tegra_cpu_reset_handler_start +- .byte 0 + .align L1_CACHE_SHIFT + + ENTRY(__tegra_cpu_reset_handler_end) +diff --git a/arch/arm/mach-tegra/reset.h b/arch/arm/mach-tegra/reset.h +index 29c3dec..76a9343 100644 +--- a/arch/arm/mach-tegra/reset.h ++++ b/arch/arm/mach-tegra/reset.h +@@ -35,7 +35,6 @@ extern unsigned long __tegra_cpu_reset_handler_data[TEGRA_RESET_DATA_SIZE]; + + void __tegra_cpu_reset_handler_start(void); + void __tegra_cpu_reset_handler(void); +-void __tegra20_cpu1_resettable_status_offset(void); + void __tegra_cpu_reset_handler_end(void); + void tegra_secondary_startup(void); + +@@ -48,9 +47,6 @@ void tegra_secondary_startup(void); + (IO_ADDRESS(TEGRA_IRAM_BASE + TEGRA_IRAM_RESET_HANDLER_OFFSET + \ + ((u32)&__tegra_cpu_reset_handler_data[TEGRA_RESET_MASK_LP2] - \ + (u32)__tegra_cpu_reset_handler_start))) +-#define tegra20_cpu1_resettable_status \ +- (IO_ADDRESS(TEGRA_IRAM_BASE + TEGRA_IRAM_RESET_HANDLER_OFFSET + \ +- (u32)__tegra20_cpu1_resettable_status_offset)) + #endif + + #define tegra_cpu_reset_handler_offset \ +diff --git a/arch/arm/mach-tegra/sleep-tegra20.S b/arch/arm/mach-tegra/sleep-tegra20.S +index e6b684e..be4bc5f 100644 +--- a/arch/arm/mach-tegra/sleep-tegra20.S ++++ b/arch/arm/mach-tegra/sleep-tegra20.S +@@ -97,10 +97,9 @@ ENDPROC(tegra20_hotplug_shutdown) + ENTRY(tegra20_cpu_shutdown) + cmp r0, #0 + reteq lr @ must not be called for CPU 0 +- mov32 r1, TEGRA_IRAM_RESET_BASE_VIRT +- ldr r2, =__tegra20_cpu1_resettable_status_offset ++ mov32 r1, TEGRA_PMC_VIRT + PMC_SCRATCH41 + mov r12, #CPU_RESETTABLE +- strb r12, [r1, r2] ++ str r12, [r1] + + cpu_to_halt_reg r1, r0 + ldr r3, =TEGRA_FLOW_CTRL_VIRT +@@ -183,41 +182,38 @@ ENDPROC(tegra_pen_unlock) + /* + * tegra20_cpu_clear_resettable(void) + * +- * Called to clear the "resettable soon" flag in IRAM variable when ++ * Called to clear the "resettable soon" flag in PMC_SCRATCH41 when + * it is expected that the secondary CPU will be idle soon. 
+ */ + ENTRY(tegra20_cpu_clear_resettable) +- mov32 r1, TEGRA_IRAM_RESET_BASE_VIRT +- ldr r2, =__tegra20_cpu1_resettable_status_offset ++ mov32 r1, TEGRA_PMC_VIRT + PMC_SCRATCH41 + mov r12, #CPU_NOT_RESETTABLE +- strb r12, [r1, r2] ++ str r12, [r1] + ret lr + ENDPROC(tegra20_cpu_clear_resettable) + + /* + * tegra20_cpu_set_resettable_soon(void) + * +- * Called to set the "resettable soon" flag in IRAM variable when ++ * Called to set the "resettable soon" flag in PMC_SCRATCH41 when + * it is expected that the secondary CPU will be idle soon. + */ + ENTRY(tegra20_cpu_set_resettable_soon) +- mov32 r1, TEGRA_IRAM_RESET_BASE_VIRT +- ldr r2, =__tegra20_cpu1_resettable_status_offset ++ mov32 r1, TEGRA_PMC_VIRT + PMC_SCRATCH41 + mov r12, #CPU_RESETTABLE_SOON +- strb r12, [r1, r2] ++ str r12, [r1] + ret lr + ENDPROC(tegra20_cpu_set_resettable_soon) + + /* + * tegra20_cpu_is_resettable_soon(void) + * +- * Returns true if the "resettable soon" flag in IRAM variable has been ++ * Returns true if the "resettable soon" flag in PMC_SCRATCH41 has been + * set because it is expected that the secondary CPU will be idle soon. 
+ */ + ENTRY(tegra20_cpu_is_resettable_soon) +- mov32 r1, TEGRA_IRAM_RESET_BASE_VIRT +- ldr r2, =__tegra20_cpu1_resettable_status_offset +- ldrb r12, [r1, r2] ++ mov32 r1, TEGRA_PMC_VIRT + PMC_SCRATCH41 ++ ldr r12, [r1] + cmp r12, #CPU_RESETTABLE_SOON + moveq r0, #1 + movne r0, #0 +@@ -260,10 +256,9 @@ ENTRY(tegra20_sleep_cpu_secondary_finish) + mov r0, #TEGRA_FLUSH_CACHE_LOUIS + bl tegra_disable_clean_inv_dcache + +- mov32 r0, TEGRA_IRAM_RESET_BASE_VIRT +- ldr r4, =__tegra20_cpu1_resettable_status_offset ++ mov32 r0, TEGRA_PMC_VIRT + PMC_SCRATCH41 + mov r3, #CPU_RESETTABLE +- strb r3, [r0, r4] ++ str r3, [r0] + + bl tegra_cpu_do_idle + +@@ -279,10 +274,10 @@ ENTRY(tegra20_sleep_cpu_secondary_finish) + + bl tegra_pen_lock + +- mov32 r0, TEGRA_IRAM_RESET_BASE_VIRT +- ldr r4, =__tegra20_cpu1_resettable_status_offset ++ mov32 r3, TEGRA_PMC_VIRT ++ add r0, r3, #PMC_SCRATCH41 + mov r3, #CPU_NOT_RESETTABLE +- strb r3, [r0, r4] ++ str r3, [r0] + + bl tegra_pen_unlock + +diff --git a/arch/arm/mach-tegra/sleep.h b/arch/arm/mach-tegra/sleep.h +index 0d59360..92d46ec 100644 +--- a/arch/arm/mach-tegra/sleep.h ++++ b/arch/arm/mach-tegra/sleep.h +@@ -18,7 +18,6 @@ + #define __MACH_TEGRA_SLEEP_H + + #include "iomap.h" +-#include "irammap.h" + + #define TEGRA_ARM_PERIF_VIRT (TEGRA_ARM_PERIF_BASE - IO_CPU_PHYS \ + + IO_CPU_VIRT) +@@ -30,9 +29,6 @@ + + IO_APB_VIRT) + #define TEGRA_PMC_VIRT (TEGRA_PMC_BASE - IO_APB_PHYS + IO_APB_VIRT) + +-#define TEGRA_IRAM_RESET_BASE_VIRT (IO_IRAM_VIRT + \ +- TEGRA_IRAM_RESET_HANDLER_OFFSET) +- + /* PMC_SCRATCH37-39 and 41 are used for tegra_pen_lock and idle */ + #define PMC_SCRATCH37 0x130 + #define PMC_SCRATCH38 0x134 +diff --git a/arch/mips/include/asm/mach-generic/spaces.h b/arch/mips/include/asm/mach-generic/spaces.h +index afc96ec..9488fa5 100644 +--- a/arch/mips/include/asm/mach-generic/spaces.h ++++ b/arch/mips/include/asm/mach-generic/spaces.h +@@ -94,11 +94,7 @@ + #endif + + #ifndef FIXADDR_TOP +-#ifdef CONFIG_KVM_GUEST +-#define 
FIXADDR_TOP ((unsigned long)(long)(int)0x7ffe0000) +-#else + #define FIXADDR_TOP ((unsigned long)(long)(int)0xfffe0000) + #endif +-#endif + + #endif /* __ASM_MACH_GENERIC_SPACES_H */ +diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c +index 52f205a..bb68e8d 100644 +--- a/arch/mips/kvm/mips.c ++++ b/arch/mips/kvm/mips.c +@@ -982,7 +982,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log) + + /* If nothing is dirty, don't bother messing with page tables. */ + if (is_dirty) { +- memslot = id_to_memslot(kvm->memslots, log->slot); ++ memslot = &kvm->memslots->memslots[log->slot]; + + ga = memslot->base_gfn << PAGE_SHIFT; + ga_end = ga + (memslot->npages << PAGE_SHIFT); +diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c +index d90893b..12b6384 100644 +--- a/arch/powerpc/perf/core-book3s.c ++++ b/arch/powerpc/perf/core-book3s.c +@@ -131,16 +131,7 @@ static void pmao_restore_workaround(bool ebb) { } + + static bool regs_use_siar(struct pt_regs *regs) + { +- /* +- * When we take a performance monitor exception the regs are setup +- * using perf_read_regs() which overloads some fields, in particular +- * regs->result to tell us whether to use SIAR. +- * +- * However if the regs are from another exception, eg. a syscall, then +- * they have not been setup using perf_read_regs() and so regs->result +- * is something random. 
+- */ +- return ((TRAP(regs) == 0xf00) && regs->result); ++ return !!regs->result; + } + + /* +diff --git a/arch/s390/kernel/crash_dump.c b/arch/s390/kernel/crash_dump.c +index 49b7445..9f73c80 100644 +--- a/arch/s390/kernel/crash_dump.c ++++ b/arch/s390/kernel/crash_dump.c +@@ -415,7 +415,7 @@ static void *nt_s390_vx_low(void *ptr, __vector128 *vx_regs) + ptr += len; + /* Copy lower halves of SIMD registers 0-15 */ + for (i = 0; i < 16; i++) { +- memcpy(ptr, &vx_regs[i].u[2], 8); ++ memcpy(ptr, &vx_regs[i], 8); + ptr += 8; + } + return ptr; +diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c +index b745a10..9de4726 100644 +--- a/arch/s390/kvm/interrupt.c ++++ b/arch/s390/kvm/interrupt.c +@@ -1061,7 +1061,7 @@ static int __inject_extcall(struct kvm_vcpu *vcpu, struct kvm_s390_irq *irq) + if (sclp_has_sigpif()) + return __inject_extcall_sigpif(vcpu, src_id); + +- if (test_and_set_bit(IRQ_PEND_EXT_EXTERNAL, &li->pending_irqs)) ++ if (!test_and_set_bit(IRQ_PEND_EXT_EXTERNAL, &li->pending_irqs)) + return -EBUSY; + *extcall = irq->u.extcall; + atomic_set_mask(CPUSTAT_EXT_INT, li->cpuflags); +@@ -1606,9 +1606,6 @@ void kvm_s390_clear_float_irqs(struct kvm *kvm) + int i; + + spin_lock(&fi->lock); +- fi->pending_irqs = 0; +- memset(&fi->srv_signal, 0, sizeof(fi->srv_signal)); +- memset(&fi->mchk, 0, sizeof(fi->mchk)); + for (i = 0; i < FIRQ_LIST_COUNT; i++) + clear_irq_list(&fi->lists[i]); + for (i = 0; i < FIRQ_MAX_COUNT; i++) +diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c +index 9afb9d6..55423d8 100644 +--- a/arch/s390/net/bpf_jit_comp.c ++++ b/arch/s390/net/bpf_jit_comp.c +@@ -227,7 +227,7 @@ static inline void reg_set_seen(struct bpf_jit *jit, u32 b1) + ({ \ + /* Branch instruction needs 6 bytes */ \ + int rel = (addrs[i + off + 1] - (addrs[i + 1] - 6)) / 2;\ +- _EMIT6(op1 | reg(b1, b2) << 16 | (rel & 0xffff), op2 | mask); \ ++ _EMIT6(op1 | reg(b1, b2) << 16 | rel, op2 | mask); \ + REG_SET_SEEN(b1); \ + REG_SET_SEEN(b2); \ + 
}) +diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h +index 41b06fc..f4a555b 100644 +--- a/arch/x86/include/asm/kvm_host.h ++++ b/arch/x86/include/asm/kvm_host.h +@@ -591,7 +591,7 @@ struct kvm_arch { + struct kvm_pic *vpic; + struct kvm_ioapic *vioapic; + struct kvm_pit *vpit; +- atomic_t vapics_in_nmi_mode; ++ int vapics_in_nmi_mode; + struct mutex apic_map_lock; + struct kvm_apic_map *apic_map; + +diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c +index aa4e3a7..4f7001f 100644 +--- a/arch/x86/kernel/cpu/perf_event.c ++++ b/arch/x86/kernel/cpu/perf_event.c +@@ -270,7 +270,11 @@ msr_fail: + + static void hw_perf_event_destroy(struct perf_event *event) + { +- x86_release_hardware(); ++ if (atomic_dec_and_mutex_lock(&active_events, &pmc_reserve_mutex)) { ++ release_pmc_hardware(); ++ release_ds_buffers(); ++ mutex_unlock(&pmc_reserve_mutex); ++ } + } + + void hw_perf_lbr_event_destroy(struct perf_event *event) +@@ -320,35 +324,6 @@ set_ext_hw_attr(struct hw_perf_event *hwc, struct perf_event *event) + return x86_pmu_extra_regs(val, event); + } + +-int x86_reserve_hardware(void) +-{ +- int err = 0; +- +- if (!atomic_inc_not_zero(&active_events)) { +- mutex_lock(&pmc_reserve_mutex); +- if (atomic_read(&active_events) == 0) { +- if (!reserve_pmc_hardware()) +- err = -EBUSY; +- else +- reserve_ds_buffers(); +- } +- if (!err) +- atomic_inc(&active_events); +- mutex_unlock(&pmc_reserve_mutex); +- } +- +- return err; +-} +- +-void x86_release_hardware(void) +-{ +- if (atomic_dec_and_mutex_lock(&active_events, &pmc_reserve_mutex)) { +- release_pmc_hardware(); +- release_ds_buffers(); +- mutex_unlock(&pmc_reserve_mutex); +- } +-} +- + /* + * Check if we can create event of a certain type (that no conflicting events + * are present). 
+@@ -361,10 +336,9 @@ int x86_add_exclusive(unsigned int what) + return 0; + + mutex_lock(&pmc_reserve_mutex); +- for (i = 0; i < ARRAY_SIZE(x86_pmu.lbr_exclusive); i++) { ++ for (i = 0; i < ARRAY_SIZE(x86_pmu.lbr_exclusive); i++) + if (i != what && atomic_read(&x86_pmu.lbr_exclusive[i])) + goto out; +- } + + atomic_inc(&x86_pmu.lbr_exclusive[what]); + ret = 0; +@@ -553,7 +527,19 @@ static int __x86_pmu_event_init(struct perf_event *event) + if (!x86_pmu_initialized()) + return -ENODEV; + +- err = x86_reserve_hardware(); ++ err = 0; ++ if (!atomic_inc_not_zero(&active_events)) { ++ mutex_lock(&pmc_reserve_mutex); ++ if (atomic_read(&active_events) == 0) { ++ if (!reserve_pmc_hardware()) ++ err = -EBUSY; ++ else ++ reserve_ds_buffers(); ++ } ++ if (!err) ++ atomic_inc(&active_events); ++ mutex_unlock(&pmc_reserve_mutex); ++ } + if (err) + return err; + +diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h +index f068695..ef78516 100644 +--- a/arch/x86/kernel/cpu/perf_event.h ++++ b/arch/x86/kernel/cpu/perf_event.h +@@ -703,10 +703,6 @@ int x86_add_exclusive(unsigned int what); + + void x86_del_exclusive(unsigned int what); + +-int x86_reserve_hardware(void); +- +-void x86_release_hardware(void); +- + void hw_perf_lbr_event_destroy(struct perf_event *event); + + int x86_setup_perfctr(struct perf_event *event); +diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c +index 2813ea0..a1e35c9 100644 +--- a/arch/x86/kernel/cpu/perf_event_intel.c ++++ b/arch/x86/kernel/cpu/perf_event_intel.c +@@ -3253,8 +3253,6 @@ __init int intel_pmu_init(void) + + case 61: /* 14nm Broadwell Core-M */ + case 86: /* 14nm Broadwell Xeon D */ +- case 71: /* 14nm Broadwell + GT3e (Intel Iris Pro graphics) */ +- case 79: /* 14nm Broadwell Server */ + x86_pmu.late_ack = true; + memcpy(hw_cache_event_ids, hsw_hw_cache_event_ids, sizeof(hw_cache_event_ids)); + memcpy(hw_cache_extra_regs, hsw_hw_cache_extra_regs, 
sizeof(hw_cache_extra_regs)); +@@ -3324,13 +3322,13 @@ __init int intel_pmu_init(void) + * counter, so do not extend mask to generic counters + */ + for_each_event_constraint(c, x86_pmu.event_constraints) { +- if (c->cmask == FIXED_EVENT_FLAGS +- && c->idxmsk64 != INTEL_PMC_MSK_FIXED_REF_CYCLES) { +- c->idxmsk64 |= (1ULL << x86_pmu.num_counters) - 1; ++ if (c->cmask != FIXED_EVENT_FLAGS ++ || c->idxmsk64 == INTEL_PMC_MSK_FIXED_REF_CYCLES) { ++ continue; + } +- c->idxmsk64 &= +- ~(~0UL << (INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed)); +- c->weight = hweight64(c->idxmsk64); ++ ++ c->idxmsk64 |= (1ULL << x86_pmu.num_counters) - 1; ++ c->weight += x86_pmu.num_counters; + } + } + +diff --git a/arch/x86/kernel/cpu/perf_event_intel_bts.c b/arch/x86/kernel/cpu/perf_event_intel_bts.c +index 7795f3f..ac1f0c5 100644 +--- a/arch/x86/kernel/cpu/perf_event_intel_bts.c ++++ b/arch/x86/kernel/cpu/perf_event_intel_bts.c +@@ -483,26 +483,17 @@ static int bts_event_add(struct perf_event *event, int mode) + + static void bts_event_destroy(struct perf_event *event) + { +- x86_release_hardware(); + x86_del_exclusive(x86_lbr_exclusive_bts); + } + + static int bts_event_init(struct perf_event *event) + { +- int ret; +- + if (event->attr.type != bts_pmu.type) + return -ENOENT; + + if (x86_add_exclusive(x86_lbr_exclusive_bts)) + return -EBUSY; + +- ret = x86_reserve_hardware(); +- if (ret) { +- x86_del_exclusive(x86_lbr_exclusive_bts); +- return ret; +- } +- + event->destroy = bts_event_destroy; + + return 0; +diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S +index 7e429c9..53eeb22 100644 +--- a/arch/x86/kernel/head_32.S ++++ b/arch/x86/kernel/head_32.S +@@ -62,16 +62,9 @@ + #define PAGE_TABLE_SIZE(pages) ((pages) / PTRS_PER_PGD) + #endif + +-/* +- * Number of possible pages in the lowmem region. 
+- * +- * We shift 2 by 31 instead of 1 by 32 to the left in order to avoid a +- * gas warning about overflowing shift count when gas has been compiled +- * with only a host target support using a 32-bit type for internal +- * representation. +- */ +-LOWMEM_PAGES = (((2<<31) - __PAGE_OFFSET) >> PAGE_SHIFT) +- ++/* Number of possible pages in the lowmem region */ ++LOWMEM_PAGES = (((1<<32) - __PAGE_OFFSET) >> PAGE_SHIFT) ++ + /* Enough space to fit pagetables for the low memory linear map */ + MAPPING_BEYOND_END = PAGE_TABLE_SIZE(LOWMEM_PAGES) << PAGE_SHIFT + +diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c +index f90952f..4dce6f8 100644 +--- a/arch/x86/kvm/i8254.c ++++ b/arch/x86/kvm/i8254.c +@@ -305,7 +305,7 @@ static void pit_do_work(struct kthread_work *work) + * LVT0 to NMI delivery. Other PIC interrupts are just sent to + * VCPU0, and only if its LVT0 is in EXTINT mode. + */ +- if (atomic_read(&kvm->arch.vapics_in_nmi_mode) > 0) ++ if (kvm->arch.vapics_in_nmi_mode > 0) + kvm_for_each_vcpu(i, vcpu, kvm) + kvm_apic_nmi_wd_deliver(vcpu); + } +diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c +index 67d07e0..4c7deb4 100644 +--- a/arch/x86/kvm/lapic.c ++++ b/arch/x86/kvm/lapic.c +@@ -1250,10 +1250,10 @@ static void apic_manage_nmi_watchdog(struct kvm_lapic *apic, u32 lvt0_val) + if (!nmi_wd_enabled) { + apic_debug("Receive NMI setting on APIC_LVT0 " + "for cpu %d\n", apic->vcpu->vcpu_id); +- atomic_inc(&apic->vcpu->kvm->arch.vapics_in_nmi_mode); ++ apic->vcpu->kvm->arch.vapics_in_nmi_mode++; + } + } else if (nmi_wd_enabled) +- atomic_dec(&apic->vcpu->kvm->arch.vapics_in_nmi_mode); ++ apic->vcpu->kvm->arch.vapics_in_nmi_mode--; + } + + static int apic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val) +@@ -1808,7 +1808,6 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu, + apic_update_ppr(apic); + hrtimer_cancel(&apic->lapic_timer.timer); + apic_update_lvtt(apic); +- apic_manage_nmi_watchdog(apic, kvm_apic_get_reg(apic, APIC_LVT0)); + 
update_divide_count(apic); + start_apic_timer(apic); + apic->irr_pending = true; +diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c +index 4911bf1..9afa233 100644 +--- a/arch/x86/kvm/svm.c ++++ b/arch/x86/kvm/svm.c +@@ -511,10 +511,8 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu) + { + struct vcpu_svm *svm = to_svm(vcpu); + +- if (svm->vmcb->control.next_rip != 0) { +- WARN_ON(!static_cpu_has(X86_FEATURE_NRIPS)); ++ if (svm->vmcb->control.next_rip != 0) + svm->next_rip = svm->vmcb->control.next_rip; +- } + + if (!svm->next_rip) { + if (emulate_instruction(vcpu, EMULTYPE_SKIP) != +@@ -4319,9 +4317,7 @@ static int svm_check_intercept(struct kvm_vcpu *vcpu, + break; + } + +- /* TODO: Advertise NRIPS to guest hypervisor unconditionally */ +- if (static_cpu_has(X86_FEATURE_NRIPS)) +- vmcb->control.next_rip = info->next_rip; ++ vmcb->control.next_rip = info->next_rip; + vmcb->control.exit_code = icpt_info.exit_code; + vmexit = nested_svm_exit_handled(svm); + +diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c +index ff99117..14a63ed 100644 +--- a/arch/x86/pci/acpi.c ++++ b/arch/x86/pci/acpi.c +@@ -81,17 +81,6 @@ static const struct dmi_system_id pci_crs_quirks[] __initconst = { + DMI_MATCH(DMI_BIOS_VENDOR, "Phoenix Technologies, LTD"), + }, + }, +- /* https://bugs.launchpad.net/ubuntu/+source/alsa-driver/+bug/931368 */ +- /* https://bugs.launchpad.net/ubuntu/+source/alsa-driver/+bug/1033299 */ +- { +- .callback = set_use_crs, +- .ident = "Foxconn K8M890-8237A", +- .matches = { +- DMI_MATCH(DMI_BOARD_VENDOR, "Foxconn"), +- DMI_MATCH(DMI_BOARD_NAME, "K8M890-8237A"), +- DMI_MATCH(DMI_BIOS_VENDOR, "Phoenix Technologies, LTD"), +- }, +- }, + + /* Now for the blacklist.. 
*/ + +@@ -132,10 +121,8 @@ void __init pci_acpi_crs_quirks(void) + { + int year; + +- if (dmi_get_date(DMI_BIOS_DATE, &year, NULL, NULL) && year < 2008) { +- if (iomem_resource.end <= 0xffffffff) +- pci_use_crs = false; +- } ++ if (dmi_get_date(DMI_BIOS_DATE, &year, NULL, NULL) && year < 2008) ++ pci_use_crs = false; + + dmi_check_system(pci_crs_quirks); + +diff --git a/drivers/bluetooth/ath3k.c b/drivers/bluetooth/ath3k.c +index e527a3e..8c81af6 100644 +--- a/drivers/bluetooth/ath3k.c ++++ b/drivers/bluetooth/ath3k.c +@@ -80,7 +80,6 @@ static const struct usb_device_id ath3k_table[] = { + { USB_DEVICE(0x0489, 0xe057) }, + { USB_DEVICE(0x0489, 0xe056) }, + { USB_DEVICE(0x0489, 0xe05f) }, +- { USB_DEVICE(0x0489, 0xe076) }, + { USB_DEVICE(0x0489, 0xe078) }, + { USB_DEVICE(0x04c5, 0x1330) }, + { USB_DEVICE(0x04CA, 0x3004) }, +@@ -89,7 +88,6 @@ static const struct usb_device_id ath3k_table[] = { + { USB_DEVICE(0x04CA, 0x3007) }, + { USB_DEVICE(0x04CA, 0x3008) }, + { USB_DEVICE(0x04CA, 0x300b) }, +- { USB_DEVICE(0x04CA, 0x300d) }, + { USB_DEVICE(0x04CA, 0x300f) }, + { USB_DEVICE(0x04CA, 0x3010) }, + { USB_DEVICE(0x0930, 0x0219) }, +@@ -115,7 +113,6 @@ static const struct usb_device_id ath3k_table[] = { + { USB_DEVICE(0x13d3, 0x3408) }, + { USB_DEVICE(0x13d3, 0x3423) }, + { USB_DEVICE(0x13d3, 0x3432) }, +- { USB_DEVICE(0x13d3, 0x3474) }, + + /* Atheros AR5BBU12 with sflash firmware */ + { USB_DEVICE(0x0489, 0xE02C) }, +@@ -140,7 +137,6 @@ static const struct usb_device_id ath3k_blist_tbl[] = { + { USB_DEVICE(0x0489, 0xe056), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x0489, 0xe057), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x0489, 0xe05f), .driver_info = BTUSB_ATH3012 }, +- { USB_DEVICE(0x0489, 0xe076), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x0489, 0xe078), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x04c5, 0x1330), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x04ca, 0x3004), .driver_info = BTUSB_ATH3012 }, +@@ -149,7 +145,6 @@ static const 
struct usb_device_id ath3k_blist_tbl[] = { + { USB_DEVICE(0x04ca, 0x3007), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x04ca, 0x3008), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x04ca, 0x300b), .driver_info = BTUSB_ATH3012 }, +- { USB_DEVICE(0x04ca, 0x300d), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x04ca, 0x300f), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x04ca, 0x3010), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x0930, 0x0219), .driver_info = BTUSB_ATH3012 }, +@@ -175,7 +170,6 @@ static const struct usb_device_id ath3k_blist_tbl[] = { + { USB_DEVICE(0x13d3, 0x3408), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x13d3, 0x3423), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x13d3, 0x3432), .driver_info = BTUSB_ATH3012 }, +- { USB_DEVICE(0x13d3, 0x3474), .driver_info = BTUSB_ATH3012 }, + + /* Atheros AR5BBU22 with sflash firmware */ + { USB_DEVICE(0x0489, 0xE036), .driver_info = BTUSB_ATH3012 }, +diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c +index 420cc9f..3c10d4d 100644 +--- a/drivers/bluetooth/btusb.c ++++ b/drivers/bluetooth/btusb.c +@@ -178,7 +178,6 @@ static const struct usb_device_id blacklist_table[] = { + { USB_DEVICE(0x0489, 0xe056), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x0489, 0xe057), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x0489, 0xe05f), .driver_info = BTUSB_ATH3012 }, +- { USB_DEVICE(0x0489, 0xe076), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x0489, 0xe078), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x04c5, 0x1330), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x04ca, 0x3004), .driver_info = BTUSB_ATH3012 }, +@@ -187,7 +186,6 @@ static const struct usb_device_id blacklist_table[] = { + { USB_DEVICE(0x04ca, 0x3007), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x04ca, 0x3008), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x04ca, 0x300b), .driver_info = BTUSB_ATH3012 }, +- { USB_DEVICE(0x04ca, 0x300d), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x04ca, 0x300f), 
.driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x04ca, 0x3010), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x0930, 0x0219), .driver_info = BTUSB_ATH3012 }, +@@ -213,7 +211,6 @@ static const struct usb_device_id blacklist_table[] = { + { USB_DEVICE(0x13d3, 0x3408), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x13d3, 0x3423), .driver_info = BTUSB_ATH3012 }, + { USB_DEVICE(0x13d3, 0x3432), .driver_info = BTUSB_ATH3012 }, +- { USB_DEVICE(0x13d3, 0x3474), .driver_info = BTUSB_ATH3012 }, + + /* Atheros AR5BBU12 with sflash firmware */ + { USB_DEVICE(0x0489, 0xe02c), .driver_info = BTUSB_IGNORE }, +diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c +index c45d274..6414661 100644 +--- a/drivers/cpufreq/intel_pstate.c ++++ b/drivers/cpufreq/intel_pstate.c +@@ -535,7 +535,7 @@ static void byt_set_pstate(struct cpudata *cpudata, int pstate) + + val |= vid; + +- wrmsrl_on_cpu(cpudata->cpu, MSR_IA32_PERF_CTL, val); ++ wrmsrl(MSR_IA32_PERF_CTL, val); + } + + #define BYT_BCLK_FREQS 5 +diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c +index 3442764..5937207 100644 +--- a/drivers/cpuidle/cpuidle-powernv.c ++++ b/drivers/cpuidle/cpuidle-powernv.c +@@ -60,8 +60,6 @@ static int nap_loop(struct cpuidle_device *dev, + return index; + } + +-/* Register for fastsleep only in oneshot mode of broadcast */ +-#ifdef CONFIG_TICK_ONESHOT + static int fastsleep_loop(struct cpuidle_device *dev, + struct cpuidle_driver *drv, + int index) +@@ -85,7 +83,7 @@ static int fastsleep_loop(struct cpuidle_device *dev, + + return index; + } +-#endif ++ + /* + * States for dedicated partition case. + */ +@@ -211,14 +209,7 @@ static int powernv_add_idle_states(void) + powernv_states[nr_idle_states].flags = 0; + powernv_states[nr_idle_states].target_residency = 100; + powernv_states[nr_idle_states].enter = &nap_loop; +- } +- +- /* +- * All cpuidle states with CPUIDLE_FLAG_TIMER_STOP set must come +- * within this config dependency check. 
+- */ +-#ifdef CONFIG_TICK_ONESHOT +- if (flags[i] & OPAL_PM_SLEEP_ENABLED || ++ } else if (flags[i] & OPAL_PM_SLEEP_ENABLED || + flags[i] & OPAL_PM_SLEEP_ENABLED_ER1) { + /* Add FASTSLEEP state */ + strcpy(powernv_states[nr_idle_states].name, "FastSleep"); +@@ -227,7 +218,7 @@ static int powernv_add_idle_states(void) + powernv_states[nr_idle_states].target_residency = 300000; + powernv_states[nr_idle_states].enter = &fastsleep_loop; + } +-#endif ++ + powernv_states[nr_idle_states].exit_latency = + ((unsigned int)latency_ns[i]) / 1000; + +diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c +index f062158..857414a 100644 +--- a/drivers/crypto/talitos.c ++++ b/drivers/crypto/talitos.c +@@ -925,8 +925,7 @@ static int sg_to_link_tbl(struct scatterlist *sg, int sg_count, + sg_count--; + link_tbl_ptr--; + } +- link_tbl_ptr->len = cpu_to_be16(be16_to_cpu(link_tbl_ptr->len) +- + cryptlen); ++ be16_add_cpu(&link_tbl_ptr->len, cryptlen); + + /* tag end of link table */ + link_tbl_ptr->j_extent = DESC_PTR_LNKTBL_RETURN; +@@ -2562,7 +2561,6 @@ static struct talitos_crypto_alg *talitos_alg_alloc(struct device *dev, + break; + default: + dev_err(dev, "unknown algorithm type %d\n", t_alg->algt.type); +- kfree(t_alg); + return ERR_PTR(-EINVAL); + } + +diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c +index ca9f4ed..e1c7e9e 100644 +--- a/drivers/iommu/amd_iommu.c ++++ b/drivers/iommu/amd_iommu.c +@@ -1869,15 +1869,9 @@ static void free_pt_##LVL (unsigned long __pt) \ + pt = (u64 *)__pt; \ + \ + for (i = 0; i < 512; ++i) { \ +- /* PTE present? */ \ + if (!IOMMU_PTE_PRESENT(pt[i])) \ + continue; \ + \ +- /* Large PTE? 
*/ \ +- if (PM_PTE_LEVEL(pt[i]) == 0 || \ +- PM_PTE_LEVEL(pt[i]) == 7) \ +- continue; \ +- \ + p = (unsigned long)IOMMU_PTE_PAGE(pt[i]); \ + FN(p); \ + } \ +diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c +index 65075ef..66a803b 100644 +--- a/drivers/iommu/arm-smmu.c ++++ b/drivers/iommu/arm-smmu.c +@@ -1567,7 +1567,7 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu) + return -ENODEV; + } + +- if ((id & ID0_S1TS) && ((smmu->version == 1) || !(id & ID0_ATOSNS))) { ++ if ((id & ID0_S1TS) && ((smmu->version == 1) || (id & ID0_ATOSNS))) { + smmu->features |= ARM_SMMU_FEAT_TRANS_OPS; + dev_notice(smmu->dev, "\taddress translation ops\n"); + } +diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c +index 9231cdf..c80287a 100644 +--- a/drivers/mmc/host/sdhci.c ++++ b/drivers/mmc/host/sdhci.c +@@ -848,7 +848,7 @@ static void sdhci_prepare_data(struct sdhci_host *host, struct mmc_command *cmd) + int sg_cnt; + + sg_cnt = sdhci_pre_dma_transfer(host, data, NULL); +- if (sg_cnt <= 0) { ++ if (sg_cnt == 0) { + /* + * This only happens when someone fed + * us an invalid request. 
+diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c +index e9b1810..b0f6924 100644 +--- a/drivers/net/can/dev.c ++++ b/drivers/net/can/dev.c +@@ -440,9 +440,6 @@ unsigned int can_get_echo_skb(struct net_device *dev, unsigned int idx) + struct can_frame *cf = (struct can_frame *)skb->data; + u8 dlc = cf->can_dlc; + +- if (!(skb->tstamp.tv64)) +- __net_timestamp(skb); +- + netif_rx(priv->echo_skb[idx]); + priv->echo_skb[idx] = NULL; + +@@ -578,7 +575,6 @@ struct sk_buff *alloc_can_skb(struct net_device *dev, struct can_frame **cf) + if (unlikely(!skb)) + return NULL; + +- __net_timestamp(skb); + skb->protocol = htons(ETH_P_CAN); + skb->pkt_type = PACKET_BROADCAST; + skb->ip_summed = CHECKSUM_UNNECESSARY; +@@ -607,7 +603,6 @@ struct sk_buff *alloc_canfd_skb(struct net_device *dev, + if (unlikely(!skb)) + return NULL; + +- __net_timestamp(skb); + skb->protocol = htons(ETH_P_CANFD); + skb->pkt_type = PACKET_BROADCAST; + skb->ip_summed = CHECKSUM_UNNECESSARY; +diff --git a/drivers/net/can/slcan.c b/drivers/net/can/slcan.c +index f64f529..c837eb9 100644 +--- a/drivers/net/can/slcan.c ++++ b/drivers/net/can/slcan.c +@@ -207,7 +207,6 @@ static void slc_bump(struct slcan *sl) + if (!skb) + return; + +- __net_timestamp(skb); + skb->dev = sl->dev; + skb->protocol = htons(ETH_P_CAN); + skb->pkt_type = PACKET_BROADCAST; +diff --git a/drivers/net/can/vcan.c b/drivers/net/can/vcan.c +index 0ce868d..674f367 100644 +--- a/drivers/net/can/vcan.c ++++ b/drivers/net/can/vcan.c +@@ -78,9 +78,6 @@ static void vcan_rx(struct sk_buff *skb, struct net_device *dev) + skb->dev = dev; + skb->ip_summed = CHECKSUM_UNNECESSARY; + +- if (!(skb->tstamp.tv64)) +- __net_timestamp(skb); +- + netif_rx_ni(skb); + } + +diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-desc.c b/drivers/net/ethernet/amd/xgbe/xgbe-desc.c +index 5c92fb7..d81fc6b 100644 +--- a/drivers/net/ethernet/amd/xgbe/xgbe-desc.c ++++ b/drivers/net/ethernet/amd/xgbe/xgbe-desc.c +@@ -263,7 +263,7 @@ static int 
xgbe_alloc_pages(struct xgbe_prv_data *pdata, + int ret; + + /* Try to obtain pages, decreasing order if necessary */ +- gfp |= __GFP_COLD | __GFP_COMP | __GFP_NOWARN; ++ gfp |= __GFP_COLD | __GFP_COMP; + while (order >= 0) { + pages = alloc_pages(gfp, order); + if (pages) +diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c +index 8a97d28..33501bc 100644 +--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c ++++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c +@@ -9323,8 +9323,7 @@ unload_error: + * function stop ramrod is sent, since as part of this ramrod FW access + * PTP registers. + */ +- if (bp->flags & PTP_SUPPORTED) +- bnx2x_stop_ptp(bp); ++ bnx2x_stop_ptp(bp); + + /* Disable HW interrupts, NAPI */ + bnx2x_netif_stop(bp, 1); +diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c +index 74d0389..ce5f7f9 100644 +--- a/drivers/net/ethernet/marvell/mvneta.c ++++ b/drivers/net/ethernet/marvell/mvneta.c +@@ -310,7 +310,6 @@ struct mvneta_port { + unsigned int link; + unsigned int duplex; + unsigned int speed; +- unsigned int tx_csum_limit; + int use_inband_status:1; + }; + +@@ -1014,12 +1013,6 @@ static void mvneta_defaults_set(struct mvneta_port *pp) + val = mvreg_read(pp, MVNETA_GMAC_CLOCK_DIVIDER); + val |= MVNETA_GMAC_1MS_CLOCK_ENABLE; + mvreg_write(pp, MVNETA_GMAC_CLOCK_DIVIDER, val); +- } else { +- val = mvreg_read(pp, MVNETA_GMAC_AUTONEG_CONFIG); +- val &= ~(MVNETA_GMAC_INBAND_AN_ENABLE | +- MVNETA_GMAC_AN_SPEED_EN | +- MVNETA_GMAC_AN_DUPLEX_EN); +- mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, val); + } + + mvneta_set_ucast_table(pp, -1); +@@ -2509,10 +2502,8 @@ static int mvneta_change_mtu(struct net_device *dev, int mtu) + + dev->mtu = mtu; + +- if (!netif_running(dev)) { +- netdev_update_features(dev); ++ if (!netif_running(dev)) + return 0; +- } + + /* The interface is running, so we have to force a + * reallocation of the queues +@@ -2541,26 +2532,9 
@@ static int mvneta_change_mtu(struct net_device *dev, int mtu) + mvneta_start_dev(pp); + mvneta_port_up(pp); + +- netdev_update_features(dev); +- + return 0; + } + +-static netdev_features_t mvneta_fix_features(struct net_device *dev, +- netdev_features_t features) +-{ +- struct mvneta_port *pp = netdev_priv(dev); +- +- if (pp->tx_csum_limit && dev->mtu > pp->tx_csum_limit) { +- features &= ~(NETIF_F_IP_CSUM | NETIF_F_TSO); +- netdev_info(dev, +- "Disable IP checksum for MTU greater than %dB\n", +- pp->tx_csum_limit); +- } +- +- return features; +-} +- + /* Get mac address */ + static void mvneta_get_mac_addr(struct mvneta_port *pp, unsigned char *addr) + { +@@ -2882,7 +2856,6 @@ static const struct net_device_ops mvneta_netdev_ops = { + .ndo_set_rx_mode = mvneta_set_rx_mode, + .ndo_set_mac_address = mvneta_set_mac_addr, + .ndo_change_mtu = mvneta_change_mtu, +- .ndo_fix_features = mvneta_fix_features, + .ndo_get_stats64 = mvneta_get_stats64, + .ndo_do_ioctl = mvneta_ioctl, + }; +@@ -3128,9 +3101,6 @@ static int mvneta_probe(struct platform_device *pdev) + } + } + +- if (of_device_is_compatible(dn, "marvell,armada-370-neta")) +- pp->tx_csum_limit = 1600; +- + pp->tx_ring_size = MVNETA_MAX_TXD; + pp->rx_ring_size = MVNETA_MAX_RXD; + +@@ -3209,7 +3179,6 @@ static int mvneta_remove(struct platform_device *pdev) + + static const struct of_device_id mvneta_match[] = { + { .compatible = "marvell,armada-370-neta" }, +- { .compatible = "marvell,armada-xp-neta" }, + { } + }; + MODULE_DEVICE_TABLE(of, mvneta_match); +diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c +index a5a0b84..cf467a9 100644 +--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c ++++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c +@@ -1973,6 +1973,10 @@ void mlx4_en_free_resources(struct mlx4_en_priv *priv) + mlx4_en_destroy_cq(priv, &priv->rx_cq[i]); + } + ++ if (priv->base_tx_qpn) { ++ mlx4_qp_release_range(priv->mdev->dev, 
priv->base_tx_qpn, priv->tx_ring_num); ++ priv->base_tx_qpn = 0; ++ } + } + + int mlx4_en_alloc_resources(struct mlx4_en_priv *priv) +diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c +index eab4e08..2a77a6b 100644 +--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c ++++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c +@@ -723,7 +723,7 @@ static int get_fixed_ipv6_csum(__wsum hw_checksum, struct sk_buff *skb, + } + #endif + static int check_csum(struct mlx4_cqe *cqe, struct sk_buff *skb, void *va, +- netdev_features_t dev_features) ++ int hwtstamp_rx_filter) + { + __wsum hw_checksum = 0; + +@@ -731,8 +731,14 @@ static int check_csum(struct mlx4_cqe *cqe, struct sk_buff *skb, void *va, + + hw_checksum = csum_unfold((__force __sum16)cqe->checksum); + +- if (cqe->vlan_my_qpn & cpu_to_be32(MLX4_CQE_VLAN_PRESENT_MASK) && +- !(dev_features & NETIF_F_HW_VLAN_CTAG_RX)) { ++ if (((struct ethhdr *)va)->h_proto == htons(ETH_P_8021Q) && ++ hwtstamp_rx_filter != HWTSTAMP_FILTER_NONE) { ++ /* next protocol non IPv4 or IPv6 */ ++ if (((struct vlan_hdr *)hdr)->h_vlan_encapsulated_proto ++ != htons(ETH_P_IP) && ++ ((struct vlan_hdr *)hdr)->h_vlan_encapsulated_proto ++ != htons(ETH_P_IPV6)) ++ return -1; + hw_checksum = get_fixed_vlan_csum(hw_checksum, hdr); + hdr += sizeof(struct vlan_hdr); + } +@@ -895,8 +901,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud + + if (ip_summed == CHECKSUM_COMPLETE) { + void *va = skb_frag_address(skb_shinfo(gro_skb)->frags); +- if (check_csum(cqe, gro_skb, va, +- dev->features)) { ++ if (check_csum(cqe, gro_skb, va, ring->hwtstamp_rx_filter)) { + ip_summed = CHECKSUM_NONE; + ring->csum_none++; + ring->csum_complete--; +@@ -951,7 +956,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud + } + + if (ip_summed == CHECKSUM_COMPLETE) { +- if (check_csum(cqe, skb, skb->data, dev->features)) { ++ if (check_csum(cqe, skb, skb->data, 
ring->hwtstamp_rx_filter)) { + ip_summed = CHECKSUM_NONE; + ring->csum_complete--; + ring->csum_none++; +diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c +index c10d98f..7bed3a8 100644 +--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c ++++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c +@@ -66,7 +66,6 @@ int mlx4_en_create_tx_ring(struct mlx4_en_priv *priv, + ring->size = size; + ring->size_mask = size - 1; + ring->stride = stride; +- ring->full_size = ring->size - HEADROOM - MAX_DESC_TXBBS; + + tmp = size * sizeof(struct mlx4_en_tx_info); + ring->tx_info = kmalloc_node(tmp, GFP_KERNEL | __GFP_NOWARN, node); +@@ -181,7 +180,6 @@ void mlx4_en_destroy_tx_ring(struct mlx4_en_priv *priv, + mlx4_bf_free(mdev->dev, &ring->bf); + mlx4_qp_remove(mdev->dev, &ring->qp); + mlx4_qp_free(mdev->dev, &ring->qp); +- mlx4_qp_release_range(priv->mdev->dev, ring->qpn, 1); + mlx4_en_unmap_buffer(&ring->wqres.buf); + mlx4_free_hwq_res(mdev->dev, &ring->wqres, ring->buf_size); + kfree(ring->bounce_buf); +@@ -233,11 +231,6 @@ void mlx4_en_deactivate_tx_ring(struct mlx4_en_priv *priv, + MLX4_QP_STATE_RST, NULL, 0, 0, &ring->qp); + } + +-static inline bool mlx4_en_is_tx_ring_full(struct mlx4_en_tx_ring *ring) +-{ +- return ring->prod - ring->cons > ring->full_size; +-} +- + static void mlx4_en_stamp_wqe(struct mlx4_en_priv *priv, + struct mlx4_en_tx_ring *ring, int index, + u8 owner) +@@ -480,10 +473,11 @@ static bool mlx4_en_process_tx_cq(struct net_device *dev, + + netdev_tx_completed_queue(ring->tx_queue, packets, bytes); + +- /* Wakeup Tx queue if this stopped, and ring is not full. 
++ /* ++ * Wakeup Tx queue if this stopped, and at least 1 packet ++ * was completed + */ +- if (netif_tx_queue_stopped(ring->tx_queue) && +- !mlx4_en_is_tx_ring_full(ring)) { ++ if (netif_tx_queue_stopped(ring->tx_queue) && txbbs_skipped > 0) { + netif_tx_wake_queue(ring->tx_queue); + ring->wake_queue++; + } +@@ -927,7 +921,8 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev) + skb_tx_timestamp(skb); + + /* Check available TXBBs And 2K spare for prefetch */ +- stop_queue = mlx4_en_is_tx_ring_full(ring); ++ stop_queue = (int)(ring->prod - ring_cons) > ++ ring->size - HEADROOM - MAX_DESC_TXBBS; + if (unlikely(stop_queue)) { + netif_tx_stop_queue(ring->tx_queue); + ring->queue_stopped++; +@@ -996,7 +991,8 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev) + smp_rmb(); + + ring_cons = ACCESS_ONCE(ring->cons); +- if (unlikely(!mlx4_en_is_tx_ring_full(ring))) { ++ if (unlikely(((int)(ring->prod - ring_cons)) <= ++ ring->size - HEADROOM - MAX_DESC_TXBBS)) { + netif_tx_wake_queue(ring->tx_queue); + ring->wake_queue++; + } +diff --git a/drivers/net/ethernet/mellanox/mlx4/intf.c b/drivers/net/ethernet/mellanox/mlx4/intf.c +index 0d80aed..6fce587 100644 +--- a/drivers/net/ethernet/mellanox/mlx4/intf.c ++++ b/drivers/net/ethernet/mellanox/mlx4/intf.c +@@ -93,14 +93,8 @@ int mlx4_register_interface(struct mlx4_interface *intf) + mutex_lock(&intf_mutex); + + list_add_tail(&intf->list, &intf_list); +- list_for_each_entry(priv, &dev_list, dev_list) { +- if (mlx4_is_mfunc(&priv->dev) && (intf->flags & MLX4_INTFF_BONDING)) { +- mlx4_dbg(&priv->dev, +- "SRIOV, disabling HA mode for intf proto %d\n", intf->protocol); +- intf->flags &= ~MLX4_INTFF_BONDING; +- } ++ list_for_each_entry(priv, &dev_list, dev_list) + mlx4_add_device(intf, priv); +- } + + mutex_unlock(&intf_mutex); + +diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h +index 909fcf8..d021f07 100644 +--- 
a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h ++++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h +@@ -279,7 +279,6 @@ struct mlx4_en_tx_ring { + u32 size; /* number of TXBBs */ + u32 size_mask; + u16 stride; +- u32 full_size; + u16 cqn; /* index of port CQ associated with this ring */ + u32 buf_size; + __be32 doorbell_qpn; +@@ -580,6 +579,7 @@ struct mlx4_en_priv { + int vids[128]; + bool wol; + struct device *ddev; ++ int base_tx_qpn; + struct hlist_head mac_hash[MLX4_EN_MAC_HASH_SIZE]; + struct hwtstamp_config hwtstamp_config; + +diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c +index d551df6..bdfe51f 100644 +--- a/drivers/net/phy/phy_device.c ++++ b/drivers/net/phy/phy_device.c +@@ -796,11 +796,10 @@ static int genphy_config_advert(struct phy_device *phydev) + if (phydev->supported & (SUPPORTED_1000baseT_Half | + SUPPORTED_1000baseT_Full)) { + adv |= ethtool_adv_to_mii_ctrl1000_t(advertise); ++ if (adv != oldadv) ++ changed = 1; + } + +- if (adv != oldadv) +- changed = 1; +- + err = phy_write(phydev, MII_CTRL1000, adv); + if (err < 0) + return err; +diff --git a/drivers/net/wireless/b43/main.c b/drivers/net/wireless/b43/main.c +index 4cdac78..b2f9521 100644 +--- a/drivers/net/wireless/b43/main.c ++++ b/drivers/net/wireless/b43/main.c +@@ -5365,10 +5365,6 @@ static void b43_supported_bands(struct b43_wldev *dev, bool *have_2ghz_phy, + *have_5ghz_phy = true; + return; + case 0x4321: /* BCM4306 */ +- /* There are 14e4:4321 PCI devs with 2.4 GHz BCM4321 (N-PHY) */ +- if (dev->phy.type != B43_PHYTYPE_G) +- break; +- /* fall through */ + case 0x4313: /* BCM4311 */ + case 0x431a: /* BCM4318 */ + case 0x432a: /* BCM4321 */ +diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c +index ec383b0..968787a 100644 +--- a/drivers/net/xen-netback/xenbus.c ++++ b/drivers/net/xen-netback/xenbus.c +@@ -681,9 +681,6 @@ static int xen_register_watchers(struct xenbus_device *dev, struct xenvif *vif) + char *node; + unsigned maxlen = 
strlen(dev->nodename) + sizeof("/rate"); + +- if (vif->credit_watch.node) +- return -EADDRINUSE; +- + node = kmalloc(maxlen, GFP_KERNEL); + if (!node) + return -ENOMEM; +@@ -773,7 +770,6 @@ static void connect(struct backend_info *be) + } + + xen_net_read_rate(dev, &credit_bytes, &credit_usec); +- xen_unregister_watchers(be->vif); + xen_register_watchers(dev, be->vif); + read_xenbus_vif_flags(be); + +diff --git a/drivers/s390/kvm/virtio_ccw.c b/drivers/s390/kvm/virtio_ccw.c +index f8d8fdb..6f1fa17 100644 +--- a/drivers/s390/kvm/virtio_ccw.c ++++ b/drivers/s390/kvm/virtio_ccw.c +@@ -65,7 +65,6 @@ struct virtio_ccw_device { + bool is_thinint; + bool going_away; + bool device_lost; +- unsigned int config_ready; + void *airq_info; + }; + +@@ -834,11 +833,8 @@ static void virtio_ccw_get_config(struct virtio_device *vdev, + if (ret) + goto out_free; + +- memcpy(vcdev->config, config_area, offset + len); +- if (buf) +- memcpy(buf, &vcdev->config[offset], len); +- if (vcdev->config_ready < offset + len) +- vcdev->config_ready = offset + len; ++ memcpy(vcdev->config, config_area, sizeof(vcdev->config)); ++ memcpy(buf, &vcdev->config[offset], len); + + out_free: + kfree(config_area); +@@ -861,9 +857,6 @@ static void virtio_ccw_set_config(struct virtio_device *vdev, + if (!config_area) + goto out_free; + +- /* Make sure we don't overwrite fields. */ +- if (vcdev->config_ready < offset) +- virtio_ccw_get_config(vdev, 0, NULL, offset); + memcpy(&vcdev->config[offset], buf, len); + /* Write the config area to the host. 
*/ + memcpy(config_area, vcdev->config, sizeof(vcdev->config)); +diff --git a/drivers/usb/class/cdc-acm.c b/drivers/usb/class/cdc-acm.c +index a086e1d..5c8f581 100644 +--- a/drivers/usb/class/cdc-acm.c ++++ b/drivers/usb/class/cdc-acm.c +@@ -1477,11 +1477,6 @@ skip_countries: + goto alloc_fail8; + } + +- if (quirks & CLEAR_HALT_CONDITIONS) { +- usb_clear_halt(usb_dev, usb_rcvbulkpipe(usb_dev, epread->bEndpointAddress)); +- usb_clear_halt(usb_dev, usb_sndbulkpipe(usb_dev, epwrite->bEndpointAddress)); +- } +- + return 0; + alloc_fail8: + if (acm->country_codes) { +@@ -1761,10 +1756,6 @@ static const struct usb_device_id acm_ids[] = { + .driver_info = NO_UNION_NORMAL, /* reports zero length descriptor */ + }, + +- { USB_DEVICE(0x2912, 0x0001), /* ATOL FPrint */ +- .driver_info = CLEAR_HALT_CONDITIONS, +- }, +- + /* Nokia S60 phones expose two ACM channels. The first is + * a modem and is picked up by the standard AT-command + * information below. The second is 'vendor-specific' but +diff --git a/drivers/usb/class/cdc-acm.h b/drivers/usb/class/cdc-acm.h +index b3b6c9d..ffeb3c8 100644 +--- a/drivers/usb/class/cdc-acm.h ++++ b/drivers/usb/class/cdc-acm.h +@@ -133,4 +133,3 @@ struct acm { + #define NO_DATA_INTERFACE BIT(4) + #define IGNORE_DEVICE BIT(5) + #define QUIRK_CONTROL_LINE_STATE BIT(6) +-#define CLEAR_HALT_CONDITIONS BIT(7) +diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c +index 45b8c8b..3507f88 100644 +--- a/drivers/usb/gadget/function/f_fs.c ++++ b/drivers/usb/gadget/function/f_fs.c +@@ -3435,7 +3435,6 @@ done: + static void ffs_closed(struct ffs_data *ffs) + { + struct ffs_dev *ffs_obj; +- struct f_fs_opts *opts; + + ENTER(); + ffs_dev_lock(); +@@ -3450,13 +3449,8 @@ static void ffs_closed(struct ffs_data *ffs) + ffs_obj->ffs_closed_callback) + ffs_obj->ffs_closed_callback(ffs); + +- if (ffs_obj->opts) +- opts = ffs_obj->opts; +- else +- goto done; +- +- if (opts->no_configfs || !opts->func_inst.group.cg_item.ci_parent +- 
|| !atomic_read(&opts->func_inst.group.cg_item.ci_kref.refcount)) ++ if (!ffs_obj->opts || ffs_obj->opts->no_configfs ++ || !ffs_obj->opts->func_inst.group.cg_item.ci_parent) + goto done; + + unregister_gadget_item(ffs_obj->opts-> +diff --git a/fs/dcache.c b/fs/dcache.c +index 50bb3c2..37b5afd 100644 +--- a/fs/dcache.c ++++ b/fs/dcache.c +@@ -2927,6 +2927,17 @@ restart: + vfsmnt = &mnt->mnt; + continue; + } ++ /* ++ * Filesystems needing to implement special "root names" ++ * should do so with ->d_dname() ++ */ ++ if (IS_ROOT(dentry) && ++ (dentry->d_name.len != 1 || ++ dentry->d_name.name[0] != '/')) { ++ WARN(1, "Root dentry has weird name <%.*s>\n", ++ (int) dentry->d_name.len, ++ dentry->d_name.name); ++ } + if (!error) + error = is_mounted(vfsmnt) ? 1 : 2; + break; +diff --git a/fs/inode.c b/fs/inode.c +index 6e342ca..ea37cd1 100644 +--- a/fs/inode.c ++++ b/fs/inode.c +@@ -1693,8 +1693,8 @@ int file_remove_suid(struct file *file) + error = security_inode_killpriv(dentry); + if (!error && killsuid) + error = __remove_suid(dentry, killsuid); +- if (!error) +- inode_has_no_xattr(inode); ++ if (!error && (inode->i_sb->s_flags & MS_NOSEC)) ++ inode->i_flags |= S_NOSEC; + + return error; + } +diff --git a/fs/namespace.c b/fs/namespace.c +index 1d4a97c..1b9e111 100644 +--- a/fs/namespace.c ++++ b/fs/namespace.c +@@ -3185,15 +3185,11 @@ bool fs_fully_visible(struct file_system_type *type) + if (mnt->mnt.mnt_root != mnt->mnt.mnt_sb->s_root) + continue; + +- /* This mount is not fully visible if there are any +- * locked child mounts that cover anything except for +- * empty directories. ++ /* This mount is not fully visible if there are any child mounts ++ * that cover anything except for empty directories. 
+ */ + list_for_each_entry(child, &mnt->mnt_mounts, mnt_child) { + struct inode *inode = child->mnt_mountpoint->d_inode; +- /* Only worry about locked mounts */ +- if (!(mnt->mnt.mnt_flags & MNT_LOCKED)) +- continue; + if (!S_ISDIR(inode->i_mode)) + goto next; + if (inode->i_nlink > 2) +diff --git a/fs/ufs/balloc.c b/fs/ufs/balloc.c +index a7106ed..2c10360 100644 +--- a/fs/ufs/balloc.c ++++ b/fs/ufs/balloc.c +@@ -51,8 +51,8 @@ void ufs_free_fragments(struct inode *inode, u64 fragment, unsigned count) + + if (ufs_fragnum(fragment) + count > uspi->s_fpg) + ufs_error (sb, "ufs_free_fragments", "internal error"); +- +- mutex_lock(&UFS_SB(sb)->s_lock); ++ ++ lock_ufs(sb); + + cgno = ufs_dtog(uspi, fragment); + bit = ufs_dtogd(uspi, fragment); +@@ -115,13 +115,13 @@ void ufs_free_fragments(struct inode *inode, u64 fragment, unsigned count) + if (sb->s_flags & MS_SYNCHRONOUS) + ubh_sync_block(UCPI_UBH(ucpi)); + ufs_mark_sb_dirty(sb); +- +- mutex_unlock(&UFS_SB(sb)->s_lock); ++ ++ unlock_ufs(sb); + UFSD("EXIT\n"); + return; + + failed: +- mutex_unlock(&UFS_SB(sb)->s_lock); ++ unlock_ufs(sb); + UFSD("EXIT (FAILED)\n"); + return; + } +@@ -151,7 +151,7 @@ void ufs_free_blocks(struct inode *inode, u64 fragment, unsigned count) + goto failed; + } + +- mutex_lock(&UFS_SB(sb)->s_lock); ++ lock_ufs(sb); + + do_more: + overflow = 0; +@@ -211,12 +211,12 @@ do_more: + } + + ufs_mark_sb_dirty(sb); +- mutex_unlock(&UFS_SB(sb)->s_lock); ++ unlock_ufs(sb); + UFSD("EXIT\n"); + return; + + failed_unlock: +- mutex_unlock(&UFS_SB(sb)->s_lock); ++ unlock_ufs(sb); + failed: + UFSD("EXIT (FAILED)\n"); + return; +@@ -357,7 +357,7 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment, + usb1 = ubh_get_usb_first(uspi); + *err = -ENOSPC; + +- mutex_lock(&UFS_SB(sb)->s_lock); ++ lock_ufs(sb); + tmp = ufs_data_ptr_to_cpu(sb, p); + + if (count + ufs_fragnum(fragment) > uspi->s_fpb) { +@@ -378,19 +378,19 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment, + "fragment 
%llu, tmp %llu\n", + (unsigned long long)fragment, + (unsigned long long)tmp); +- mutex_unlock(&UFS_SB(sb)->s_lock); ++ unlock_ufs(sb); + return INVBLOCK; + } + if (fragment < UFS_I(inode)->i_lastfrag) { + UFSD("EXIT (ALREADY ALLOCATED)\n"); +- mutex_unlock(&UFS_SB(sb)->s_lock); ++ unlock_ufs(sb); + return 0; + } + } + else { + if (tmp) { + UFSD("EXIT (ALREADY ALLOCATED)\n"); +- mutex_unlock(&UFS_SB(sb)->s_lock); ++ unlock_ufs(sb); + return 0; + } + } +@@ -399,7 +399,7 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment, + * There is not enough space for user on the device + */ + if (!capable(CAP_SYS_RESOURCE) && ufs_freespace(uspi, UFS_MINFREE) <= 0) { +- mutex_unlock(&UFS_SB(sb)->s_lock); ++ unlock_ufs(sb); + UFSD("EXIT (FAILED)\n"); + return 0; + } +@@ -424,7 +424,7 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment, + ufs_clear_frags(inode, result + oldcount, + newcount - oldcount, locked_page != NULL); + } +- mutex_unlock(&UFS_SB(sb)->s_lock); ++ unlock_ufs(sb); + UFSD("EXIT, result %llu\n", (unsigned long long)result); + return result; + } +@@ -439,7 +439,7 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment, + fragment + count); + ufs_clear_frags(inode, result + oldcount, newcount - oldcount, + locked_page != NULL); +- mutex_unlock(&UFS_SB(sb)->s_lock); ++ unlock_ufs(sb); + UFSD("EXIT, result %llu\n", (unsigned long long)result); + return result; + } +@@ -477,7 +477,7 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment, + *err = 0; + UFS_I(inode)->i_lastfrag = max(UFS_I(inode)->i_lastfrag, + fragment + count); +- mutex_unlock(&UFS_SB(sb)->s_lock); ++ unlock_ufs(sb); + if (newcount < request) + ufs_free_fragments (inode, result + newcount, request - newcount); + ufs_free_fragments (inode, tmp, oldcount); +@@ -485,7 +485,7 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment, + return result; + } + +- mutex_unlock(&UFS_SB(sb)->s_lock); ++ unlock_ufs(sb); + UFSD("EXIT (FAILED)\n"); 
+ return 0; + } +diff --git a/fs/ufs/ialloc.c b/fs/ufs/ialloc.c +index fd0203c..7caa016 100644 +--- a/fs/ufs/ialloc.c ++++ b/fs/ufs/ialloc.c +@@ -69,11 +69,11 @@ void ufs_free_inode (struct inode * inode) + + ino = inode->i_ino; + +- mutex_lock(&UFS_SB(sb)->s_lock); ++ lock_ufs(sb); + + if (!((ino > 1) && (ino < (uspi->s_ncg * uspi->s_ipg )))) { + ufs_warning(sb, "ufs_free_inode", "reserved inode or nonexistent inode %u\n", ino); +- mutex_unlock(&UFS_SB(sb)->s_lock); ++ unlock_ufs(sb); + return; + } + +@@ -81,7 +81,7 @@ void ufs_free_inode (struct inode * inode) + bit = ufs_inotocgoff (ino); + ucpi = ufs_load_cylinder (sb, cg); + if (!ucpi) { +- mutex_unlock(&UFS_SB(sb)->s_lock); ++ unlock_ufs(sb); + return; + } + ucg = ubh_get_ucg(UCPI_UBH(ucpi)); +@@ -115,7 +115,7 @@ void ufs_free_inode (struct inode * inode) + ubh_sync_block(UCPI_UBH(ucpi)); + + ufs_mark_sb_dirty(sb); +- mutex_unlock(&UFS_SB(sb)->s_lock); ++ unlock_ufs(sb); + UFSD("EXIT\n"); + } + +@@ -193,7 +193,7 @@ struct inode *ufs_new_inode(struct inode *dir, umode_t mode) + sbi = UFS_SB(sb); + uspi = sbi->s_uspi; + +- mutex_lock(&sbi->s_lock); ++ lock_ufs(sb); + + /* + * Try to place the inode in its parent directory +@@ -331,21 +331,21 @@ cg_found: + sync_dirty_buffer(bh); + brelse(bh); + } +- mutex_unlock(&sbi->s_lock); ++ unlock_ufs(sb); + + UFSD("allocating inode %lu\n", inode->i_ino); + UFSD("EXIT\n"); + return inode; + + fail_remove_inode: +- mutex_unlock(&sbi->s_lock); ++ unlock_ufs(sb); + clear_nlink(inode); + unlock_new_inode(inode); + iput(inode); + UFSD("EXIT (FAILED): err %d\n", err); + return ERR_PTR(err); + failed: +- mutex_unlock(&sbi->s_lock); ++ unlock_ufs(sb); + make_bad_inode(inode); + iput (inode); + UFSD("EXIT (FAILED): err %d\n", err); +diff --git a/fs/ufs/inode.c b/fs/ufs/inode.c +index 2d93ab0..be7d42c 100644 +--- a/fs/ufs/inode.c ++++ b/fs/ufs/inode.c +@@ -902,9 +902,6 @@ void ufs_evict_inode(struct inode * inode) + invalidate_inode_buffers(inode); + clear_inode(inode); + +- if 
(want_delete) { +- lock_ufs(inode->i_sb); ++ if (want_delete) + ufs_free_inode(inode); +- unlock_ufs(inode->i_sb); +- } + } +diff --git a/fs/ufs/namei.c b/fs/ufs/namei.c +index 60ee322..e491a93 100644 +--- a/fs/ufs/namei.c ++++ b/fs/ufs/namei.c +@@ -128,12 +128,12 @@ static int ufs_symlink (struct inode * dir, struct dentry * dentry, + if (l > sb->s_blocksize) + goto out_notlocked; + +- lock_ufs(dir->i_sb); + inode = ufs_new_inode(dir, S_IFLNK | S_IRWXUGO); + err = PTR_ERR(inode); + if (IS_ERR(inode)) +- goto out; ++ goto out_notlocked; + ++ lock_ufs(dir->i_sb); + if (l > UFS_SB(sb)->s_uspi->s_maxsymlinklen) { + /* slow symlink */ + inode->i_op = &ufs_symlink_inode_operations; +@@ -174,12 +174,7 @@ static int ufs_link (struct dentry * old_dentry, struct inode * dir, + inode_inc_link_count(inode); + ihold(inode); + +- error = ufs_add_link(dentry, inode); +- if (error) { +- inode_dec_link_count(inode); +- iput(inode); +- } else +- d_instantiate(dentry, inode); ++ error = ufs_add_nondir(dentry, inode); + unlock_ufs(dir->i_sb); + return error; + } +@@ -189,13 +184,9 @@ static int ufs_mkdir(struct inode * dir, struct dentry * dentry, umode_t mode) + struct inode * inode; + int err; + +- lock_ufs(dir->i_sb); +- inode_inc_link_count(dir); +- + inode = ufs_new_inode(dir, S_IFDIR|mode); +- err = PTR_ERR(inode); + if (IS_ERR(inode)) +- goto out_dir; ++ return PTR_ERR(inode); + + inode->i_op = &ufs_dir_inode_operations; + inode->i_fop = &ufs_dir_operations; +@@ -203,6 +194,9 @@ static int ufs_mkdir(struct inode * dir, struct dentry * dentry, umode_t mode) + + inode_inc_link_count(inode); + ++ lock_ufs(dir->i_sb); ++ inode_inc_link_count(dir); ++ + err = ufs_make_empty(inode, dir); + if (err) + goto out_fail; +@@ -212,7 +206,6 @@ static int ufs_mkdir(struct inode * dir, struct dentry * dentry, umode_t mode) + goto out_fail; + unlock_ufs(dir->i_sb); + +- unlock_new_inode(inode); + d_instantiate(dentry, inode); + out: + return err; +@@ -222,7 +215,6 @@ out_fail: + 
inode_dec_link_count(inode); + unlock_new_inode(inode); + iput (inode); +-out_dir: + inode_dec_link_count(dir); + unlock_ufs(dir->i_sb); + goto out; +diff --git a/fs/ufs/super.c b/fs/ufs/super.c +index dc33f94..b3bc3e7 100644 +--- a/fs/ufs/super.c ++++ b/fs/ufs/super.c +@@ -694,7 +694,6 @@ static int ufs_sync_fs(struct super_block *sb, int wait) + unsigned flags; + + lock_ufs(sb); +- mutex_lock(&UFS_SB(sb)->s_lock); + + UFSD("ENTER\n"); + +@@ -712,7 +711,6 @@ static int ufs_sync_fs(struct super_block *sb, int wait) + ufs_put_cstotal(sb); + + UFSD("EXIT\n"); +- mutex_unlock(&UFS_SB(sb)->s_lock); + unlock_ufs(sb); + + return 0; +@@ -801,7 +799,6 @@ static int ufs_fill_super(struct super_block *sb, void *data, int silent) + UFSD("flag %u\n", (int)(sb->s_flags & MS_RDONLY)); + + mutex_init(&sbi->mutex); +- mutex_init(&sbi->s_lock); + spin_lock_init(&sbi->work_lock); + INIT_DELAYED_WORK(&sbi->sync_work, delayed_sync_fs); + /* +@@ -1280,7 +1277,6 @@ static int ufs_remount (struct super_block *sb, int *mount_flags, char *data) + + sync_filesystem(sb); + lock_ufs(sb); +- mutex_lock(&UFS_SB(sb)->s_lock); + uspi = UFS_SB(sb)->s_uspi; + flags = UFS_SB(sb)->s_flags; + usb1 = ubh_get_usb_first(uspi); +@@ -1294,7 +1290,6 @@ static int ufs_remount (struct super_block *sb, int *mount_flags, char *data) + new_mount_opt = 0; + ufs_set_opt (new_mount_opt, ONERROR_LOCK); + if (!ufs_parse_options (data, &new_mount_opt)) { +- mutex_unlock(&UFS_SB(sb)->s_lock); + unlock_ufs(sb); + return -EINVAL; + } +@@ -1302,14 +1297,12 @@ static int ufs_remount (struct super_block *sb, int *mount_flags, char *data) + new_mount_opt |= ufstype; + } else if ((new_mount_opt & UFS_MOUNT_UFSTYPE) != ufstype) { + pr_err("ufstype can't be changed during remount\n"); +- mutex_unlock(&UFS_SB(sb)->s_lock); + unlock_ufs(sb); + return -EINVAL; + } + + if ((*mount_flags & MS_RDONLY) == (sb->s_flags & MS_RDONLY)) { + UFS_SB(sb)->s_mount_opt = new_mount_opt; +- mutex_unlock(&UFS_SB(sb)->s_lock); + unlock_ufs(sb); + 
return 0; + } +@@ -1333,7 +1326,6 @@ static int ufs_remount (struct super_block *sb, int *mount_flags, char *data) + */ + #ifndef CONFIG_UFS_FS_WRITE + pr_err("ufs was compiled with read-only support, can't be mounted as read-write\n"); +- mutex_unlock(&UFS_SB(sb)->s_lock); + unlock_ufs(sb); + return -EINVAL; + #else +@@ -1343,13 +1335,11 @@ static int ufs_remount (struct super_block *sb, int *mount_flags, char *data) + ufstype != UFS_MOUNT_UFSTYPE_SUNx86 && + ufstype != UFS_MOUNT_UFSTYPE_UFS2) { + pr_err("this ufstype is read-only supported\n"); +- mutex_unlock(&UFS_SB(sb)->s_lock); + unlock_ufs(sb); + return -EINVAL; + } + if (!ufs_read_cylinder_structures(sb)) { + pr_err("failed during remounting\n"); +- mutex_unlock(&UFS_SB(sb)->s_lock); + unlock_ufs(sb); + return -EPERM; + } +@@ -1357,7 +1347,6 @@ static int ufs_remount (struct super_block *sb, int *mount_flags, char *data) + #endif + } + UFS_SB(sb)->s_mount_opt = new_mount_opt; +- mutex_unlock(&UFS_SB(sb)->s_lock); + unlock_ufs(sb); + return 0; + } +diff --git a/fs/ufs/ufs.h b/fs/ufs/ufs.h +index cf6368d..2a07396 100644 +--- a/fs/ufs/ufs.h ++++ b/fs/ufs/ufs.h +@@ -30,7 +30,6 @@ struct ufs_sb_info { + int work_queued; /* non-zero if the delayed work is queued */ + struct delayed_work sync_work; /* FS sync delayed work */ + spinlock_t work_lock; /* protects sync_work and work_queued */ +- struct mutex s_lock; + }; + + struct ufs_inode_info { +diff --git a/include/net/netns/sctp.h b/include/net/netns/sctp.h +index 8ba379f..3573a81 100644 +--- a/include/net/netns/sctp.h ++++ b/include/net/netns/sctp.h +@@ -31,7 +31,6 @@ struct netns_sctp { + struct list_head addr_waitq; + struct timer_list addr_wq_timer; + struct list_head auto_asconf_splist; +- /* Lock that protects both addr_waitq and auto_asconf_splist */ + spinlock_t addr_wq_lock; + + /* Lock that protects the local_addr_list writers */ +diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h +index 495c87e..2bb2fcf 100644 +--- 
a/include/net/sctp/structs.h ++++ b/include/net/sctp/structs.h +@@ -223,10 +223,6 @@ struct sctp_sock { + atomic_t pd_mode; + /* Receive to here while partial delivery is in effect. */ + struct sk_buff_head pd_lobby; +- +- /* These must be the last fields, as they will skipped on copies, +- * like on accept and peeloff operations +- */ + struct list_head auto_asconf_list; + int do_auto_asconf; + }; diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild index 1a0006a..4842a98 100644 --- a/include/uapi/linux/Kbuild @@ -7499,10 +9575,10 @@ index 1a0006a..4842a98 100644 header-y += kernelcapi.h diff --git a/include/uapi/linux/kdbus.h b/include/uapi/linux/kdbus.h new file mode 100644 -index 0000000..00a6e14 +index 0000000..ecffc6b --- /dev/null +++ b/include/uapi/linux/kdbus.h -@@ -0,0 +1,979 @@ +@@ -0,0 +1,980 @@ +/* + * kdbus is free software; you can redistribute it and/or modify it under + * the terms of the GNU Lesser General Public License as published by the @@ -7879,6 +9955,7 @@ index 0000000..00a6e14 + KDBUS_ITEM_ATTACH_FLAGS_RECV, + KDBUS_ITEM_ID, + KDBUS_ITEM_NAME, ++ KDBUS_ITEM_DST_ID, + + /* keep these item types in sync with KDBUS_ATTACH_* flags */ + _KDBUS_ITEM_ATTACH_BASE = 0x1000, @@ -8529,11 +10606,22 @@ index 86c7300..68ec416 100644 +obj-$(CONFIG_KDBUS) += kdbus/ diff --git a/ipc/kdbus/Makefile b/ipc/kdbus/Makefile new file mode 100644 -index 0000000..7ee9271 +index 0000000..66663a1 --- /dev/null +++ b/ipc/kdbus/Makefile -@@ -0,0 +1,22 @@ -+kdbus-y := \ +@@ -0,0 +1,33 @@ ++# ++# By setting KDBUS_EXT=2, the kdbus module will be built as kdbus2.ko, and ++# KBUILD_MODNAME=kdbus2. This has the effect that all exported objects have ++# different names than usually (kdbus2fs, /sys/fs/kdbus2/) and you can run ++# your test-infrastructure against the kdbus2.ko, while running your system ++# on kdbus.ko. 
++# ++# To just build the module, use: ++# make KDBUS_EXT=2 M=ipc/kdbus ++# ++ ++kdbus$(KDBUS_EXT)-y := \ + bus.o \ + connection.o \ + endpoint.o \ @@ -8554,13 +10642,13 @@ index 0000000..7ee9271 + queue.o \ + util.o + -+obj-$(CONFIG_KDBUS) += kdbus.o ++obj-$(CONFIG_KDBUS) += kdbus$(KDBUS_EXT).o diff --git a/ipc/kdbus/bus.c b/ipc/kdbus/bus.c new file mode 100644 -index 0000000..bbdf0f2 +index 0000000..a67f825 --- /dev/null +++ b/ipc/kdbus/bus.c -@@ -0,0 +1,542 @@ +@@ -0,0 +1,514 @@ +/* + * Copyright (C) 2013-2015 Kay Sievers + * Copyright (C) 2013-2015 Greg Kroah-Hartman @@ -8629,23 +10717,16 @@ index 0000000..bbdf0f2 + const char *name, + struct kdbus_bloom_parameter *bloom, + const u64 *pattach_owner, -+ const u64 *pattach_recv, + u64 flags, kuid_t uid, kgid_t gid) +{ + struct kdbus_bus *b; + u64 attach_owner; -+ u64 attach_recv; + int ret; + + if (bloom->size < 8 || bloom->size > KDBUS_BUS_BLOOM_MAX_SIZE || + !KDBUS_IS_ALIGNED8(bloom->size) || bloom->n_hash < 1) + return ERR_PTR(-EINVAL); + -+ ret = kdbus_sanitize_attach_flags(pattach_recv ? *pattach_recv : 0, -+ &attach_recv); -+ if (ret < 0) -+ return ERR_PTR(ret); -+ + ret = kdbus_sanitize_attach_flags(pattach_owner ? *pattach_owner : 0, + &attach_owner); + if (ret < 0) @@ -8674,7 +10755,6 @@ index 0000000..bbdf0f2 + + b->id = atomic64_inc_return(&domain->last_id); + b->bus_flags = flags; -+ b->attach_flags_req = attach_recv; + b->attach_flags_owner = attach_owner; + generate_random_uuid(b->id128); + b->bloom = *bloom; @@ -8803,9 +10883,9 @@ index 0000000..bbdf0f2 + * kdbus_bus_broadcast() - send a message to all subscribed connections + * @bus: The bus the connections are connected to + * @conn_src: The source connection, may be %NULL for kernel notifications -+ * @kmsg: The message to send. ++ * @staging: Staging object containing the message to send + * -+ * Send @kmsg to all connections that are currently active on the bus. ++ * Send message to all connections that are currently active on the bus. 
+ * Connections must still have matches installed in order to let the message + * pass. + * @@ -8813,7 +10893,7 @@ index 0000000..bbdf0f2 + */ +void kdbus_bus_broadcast(struct kdbus_bus *bus, + struct kdbus_conn *conn_src, -+ struct kdbus_kmsg *kmsg) ++ struct kdbus_staging *staging) +{ + struct kdbus_conn *conn_dst; + unsigned int i; @@ -8830,12 +10910,10 @@ index 0000000..bbdf0f2 + * can re-construct order via sequence numbers), but we should at least + * try to avoid re-ordering for monitors. + */ -+ kdbus_bus_eavesdrop(bus, conn_src, kmsg); ++ kdbus_bus_eavesdrop(bus, conn_src, staging); + + down_read(&bus->conn_rwlock); + hash_for_each(bus->conn_hash, i, conn_dst, hentry) { -+ if (conn_dst->id == kmsg->msg.src_id) -+ continue; + if (!kdbus_conn_is_ordinary(conn_dst)) + continue; + @@ -8843,8 +10921,8 @@ index 0000000..bbdf0f2 + * Check if there is a match for the kmsg object in + * the destination connection match db + */ -+ if (!kdbus_match_db_match_kmsg(conn_dst->match_db, conn_src, -+ kmsg)) ++ if (!kdbus_match_db_match_msg(conn_dst->match_db, conn_src, ++ staging)) + continue; + + if (conn_src) { @@ -8855,13 +10933,6 @@ index 0000000..bbdf0f2 + */ + if (!kdbus_conn_policy_talk(conn_dst, NULL, conn_src)) + continue; -+ -+ ret = kdbus_kmsg_collect_metadata(kmsg, conn_src, -+ conn_dst); -+ if (ret < 0) { -+ kdbus_conn_lost_message(conn_dst); -+ continue; -+ } + } else { + /* + * Check if there is a policy db that prevents the @@ -8869,11 +10940,12 @@ index 0000000..bbdf0f2 + * notification + */ + if (!kdbus_conn_policy_see_notification(conn_dst, NULL, -+ kmsg)) ++ staging->msg)) + continue; + } + -+ ret = kdbus_conn_entry_insert(conn_src, conn_dst, kmsg, NULL); ++ ret = kdbus_conn_entry_insert(conn_src, conn_dst, staging, ++ NULL, NULL); + if (ret < 0) + kdbus_conn_lost_message(conn_dst); + } @@ -8884,16 +10956,16 @@ index 0000000..bbdf0f2 + * kdbus_bus_eavesdrop() - send a message to all subscribed monitors + * @bus: The bus the monitors are connected to + * 
@conn_src: The source connection, may be %NULL for kernel notifications -+ * @kmsg: The message to send. ++ * @staging: Staging object containing the message to send + * -+ * Send @kmsg to all monitors that are currently active on the bus. Monitors ++ * Send message to all monitors that are currently active on the bus. Monitors + * must still have matches installed in order to let the message pass. + * + * The caller must hold the name-registry lock of @bus. + */ +void kdbus_bus_eavesdrop(struct kdbus_bus *bus, + struct kdbus_conn *conn_src, -+ struct kdbus_kmsg *kmsg) ++ struct kdbus_staging *staging) +{ + struct kdbus_conn *conn_dst; + int ret; @@ -8907,16 +10979,8 @@ index 0000000..bbdf0f2 + + down_read(&bus->conn_rwlock); + list_for_each_entry(conn_dst, &bus->monitors_list, monitor_entry) { -+ if (conn_src) { -+ ret = kdbus_kmsg_collect_metadata(kmsg, conn_src, -+ conn_dst); -+ if (ret < 0) { -+ kdbus_conn_lost_message(conn_dst); -+ continue; -+ } -+ } -+ -+ ret = kdbus_conn_entry_insert(conn_src, conn_dst, kmsg, NULL); ++ ret = kdbus_conn_entry_insert(conn_src, conn_dst, staging, ++ NULL, NULL); + if (ret < 0) + kdbus_conn_lost_message(conn_dst); + } @@ -8943,7 +11007,6 @@ index 0000000..bbdf0f2 + { .type = KDBUS_ITEM_MAKE_NAME, .mandatory = true }, + { .type = KDBUS_ITEM_BLOOM_PARAMETER, .mandatory = true }, + { .type = KDBUS_ITEM_ATTACH_FLAGS_SEND }, -+ { .type = KDBUS_ITEM_ATTACH_FLAGS_RECV }, + }; + struct kdbus_args args = { + .allowed_flags = KDBUS_FLAG_NEGOTIATE | @@ -8962,7 +11025,6 @@ index 0000000..bbdf0f2 + bus = kdbus_bus_new(domain, + argv[1].item->str, &argv[2].item->bloom_parameter, + argv[3].item ? argv[3].item->data64 : NULL, -+ argv[4].item ? 
argv[4].item->data64 : NULL, + cmd->flags, current_euid(), current_egid()); + if (IS_ERR(bus)) { + ret = PTR_ERR(bus); @@ -9029,13 +11091,12 @@ index 0000000..bbdf0f2 + struct kdbus_cmd_info *cmd; + struct kdbus_bus *bus = conn->ep->bus; + struct kdbus_pool_slice *slice = NULL; ++ struct kdbus_item *meta_items = NULL; + struct kdbus_item_header item_hdr; + struct kdbus_info info = {}; -+ size_t meta_size, name_len; -+ struct kvec kvec[5]; -+ u64 hdr_size = 0; -+ u64 attach_flags; -+ size_t cnt = 0; ++ size_t meta_size, name_len, cnt = 0; ++ struct kvec kvec[6]; ++ u64 attach_flags, size = 0; + int ret; + + struct kdbus_arg argv[] = { @@ -9057,8 +11118,8 @@ index 0000000..bbdf0f2 + + attach_flags &= bus->attach_flags_owner; + -+ ret = kdbus_meta_export_prepare(bus->creator_meta, NULL, -+ &attach_flags, &meta_size); ++ ret = kdbus_meta_emit(bus->creator_meta, NULL, NULL, conn, ++ attach_flags, &meta_items, &meta_size); + if (ret < 0) + goto exit; + @@ -9068,30 +11129,29 @@ index 0000000..bbdf0f2 + item_hdr.type = KDBUS_ITEM_MAKE_NAME; + item_hdr.size = KDBUS_ITEM_HEADER_SIZE + name_len; + -+ kdbus_kvec_set(&kvec[cnt++], &info, sizeof(info), &hdr_size); -+ kdbus_kvec_set(&kvec[cnt++], &item_hdr, sizeof(item_hdr), &hdr_size); -+ kdbus_kvec_set(&kvec[cnt++], bus->node.name, name_len, &hdr_size); -+ cnt += !!kdbus_kvec_pad(&kvec[cnt], &hdr_size); ++ kdbus_kvec_set(&kvec[cnt++], &info, sizeof(info), &size); ++ kdbus_kvec_set(&kvec[cnt++], &item_hdr, sizeof(item_hdr), &size); ++ kdbus_kvec_set(&kvec[cnt++], bus->node.name, name_len, &size); ++ cnt += !!kdbus_kvec_pad(&kvec[cnt], &size); ++ if (meta_size > 0) { ++ kdbus_kvec_set(&kvec[cnt++], meta_items, meta_size, &size); ++ cnt += !!kdbus_kvec_pad(&kvec[cnt], &size); ++ } ++ ++ info.size = size; + -+ slice = kdbus_pool_slice_alloc(conn->pool, hdr_size + meta_size, false); ++ slice = kdbus_pool_slice_alloc(conn->pool, size, false); + if (IS_ERR(slice)) { + ret = PTR_ERR(slice); + slice = NULL; + goto exit; + } + -+ ret = 
kdbus_meta_export(bus->creator_meta, NULL, attach_flags, -+ slice, hdr_size, &meta_size); ++ ret = kdbus_pool_slice_copy_kvec(slice, 0, kvec, cnt, size); + if (ret < 0) + goto exit; + -+ info.size = hdr_size + meta_size; -+ -+ ret = kdbus_pool_slice_copy_kvec(slice, 0, kvec, cnt, hdr_size); -+ if (ret < 0) -+ goto exit; -+ -+ kdbus_pool_slice_publish(slice, &cmd->offset, &cmd->info_size); ++ kdbus_pool_slice_publish(slice, &cmd->offset, &cmd->info_size); + + if (kdbus_member_set_user(&cmd->offset, argp, typeof(*cmd), offset) || + kdbus_member_set_user(&cmd->info_size, argp, @@ -9100,15 +11160,15 @@ index 0000000..bbdf0f2 + +exit: + kdbus_pool_slice_release(slice); -+ ++ kfree(meta_items); + return kdbus_args_clear(&args, ret); +} diff --git a/ipc/kdbus/bus.h b/ipc/kdbus/bus.h new file mode 100644 -index 0000000..5bea5ef +index 0000000..238986e --- /dev/null +++ b/ipc/kdbus/bus.h -@@ -0,0 +1,101 @@ +@@ -0,0 +1,99 @@ +/* + * Copyright (C) 2013-2015 Kay Sievers + * Copyright (C) 2013-2015 Greg Kroah-Hartman @@ -9140,7 +11200,7 @@ index 0000000..5bea5ef + +struct kdbus_conn; +struct kdbus_domain; -+struct kdbus_kmsg; ++struct kdbus_staging; +struct kdbus_user; + +/** @@ -9148,7 +11208,6 @@ index 0000000..5bea5ef + * @node: kdbus_node + * @id: ID of this bus in the domain + * @bus_flags: Simple pass-through flags from userspace to userspace -+ * @attach_flags_req: KDBUS_ATTACH_* flags required by connecting peers + * @attach_flags_owner: KDBUS_ATTACH_* flags of bus creator that other + * connections can see or query + * @id128: Unique random 128 bit ID of this bus @@ -9171,7 +11230,6 @@ index 0000000..5bea5ef + /* static */ + u64 id; + u64 bus_flags; -+ u64 attach_flags_req; + u64 attach_flags_owner; + u8 id128[16]; + struct kdbus_bloom_parameter bloom; @@ -9200,10 +11258,10 @@ index 0000000..5bea5ef +struct kdbus_conn *kdbus_bus_find_conn_by_id(struct kdbus_bus *bus, u64 id); +void kdbus_bus_broadcast(struct kdbus_bus *bus, + struct kdbus_conn *conn_src, -+ struct 
kdbus_kmsg *kmsg); ++ struct kdbus_staging *staging); +void kdbus_bus_eavesdrop(struct kdbus_bus *bus, + struct kdbus_conn *conn_src, -+ struct kdbus_kmsg *kmsg); ++ struct kdbus_staging *staging); + +struct kdbus_bus *kdbus_cmd_bus_make(struct kdbus_domain *domain, + void __user *argp); @@ -9212,10 +11270,10 @@ index 0000000..5bea5ef +#endif diff --git a/ipc/kdbus/connection.c b/ipc/kdbus/connection.c new file mode 100644 -index 0000000..9993753 +index 0000000..d94b417e --- /dev/null +++ b/ipc/kdbus/connection.c -@@ -0,0 +1,2178 @@ +@@ -0,0 +1,2207 @@ +/* + * Copyright (C) 2013-2015 Kay Sievers + * Copyright (C) 2013-2015 Greg Kroah-Hartman @@ -9330,10 +11388,6 @@ index 0000000..9993753 + if (ret < 0) + return ERR_PTR(ret); + -+ /* The attach flags must always satisfy the bus requirements. */ -+ if (bus->attach_flags_req & ~attach_flags_send) -+ return ERR_PTR(-ECONNREFUSED); -+ + conn = kzalloc(sizeof(*conn), GFP_KERNEL); + if (!conn) + return ERR_PTR(-ENOMEM); @@ -9352,6 +11406,8 @@ index 0000000..9993753 + atomic_set(&conn->lost_count, 0); + INIT_DELAYED_WORK(&conn->work, kdbus_reply_list_scan_work); + conn->cred = get_current_cred(); ++ conn->pid = get_pid(task_pid(current)); ++ get_fs_root(current->fs, &conn->root_path); + init_waitqueue_head(&conn->wait); + kdbus_queue_init(&conn->queue); + conn->privileged = privileged; @@ -9391,22 +11447,28 @@ index 0000000..9993753 + BUILD_BUG_ON(sizeof(bus->id128) != sizeof(hello->id128)); + memcpy(hello->id128, bus->id128, sizeof(hello->id128)); + -+ conn->meta = kdbus_meta_proc_new(); -+ if (IS_ERR(conn->meta)) { -+ ret = PTR_ERR(conn->meta); -+ conn->meta = NULL; -+ goto exit_unref; -+ } -+ + /* privileged processes can impersonate somebody else */ + if (creds || pids || seclabel) { -+ ret = kdbus_meta_proc_fake(conn->meta, creds, pids, seclabel); -+ if (ret < 0) ++ conn->meta_fake = kdbus_meta_fake_new(); ++ if (IS_ERR(conn->meta_fake)) { ++ ret = PTR_ERR(conn->meta_fake); ++ conn->meta_fake = NULL; + goto 
exit_unref; ++ } + -+ conn->faked_meta = true; ++ ret = kdbus_meta_fake_collect(conn->meta_fake, ++ creds, pids, seclabel); ++ if (ret < 0) ++ goto exit_unref; + } else { -+ ret = kdbus_meta_proc_collect(conn->meta, ++ conn->meta_proc = kdbus_meta_proc_new(); ++ if (IS_ERR(conn->meta_proc)) { ++ ret = PTR_ERR(conn->meta_proc); ++ conn->meta_proc = NULL; ++ goto exit_unref; ++ } ++ ++ ret = kdbus_meta_proc_collect(conn->meta_proc, + KDBUS_ATTACH_CREDS | + KDBUS_ATTACH_PIDS | + KDBUS_ATTACH_AUXGROUPS | @@ -9489,10 +11551,13 @@ index 0000000..9993753 + kdbus_user_unref(conn->user); + } + -+ kdbus_meta_proc_unref(conn->meta); ++ kdbus_meta_fake_free(conn->meta_fake); ++ kdbus_meta_proc_unref(conn->meta_proc); + kdbus_match_db_free(conn->match_db); + kdbus_pool_free(conn->pool); + kdbus_ep_unref(conn->ep); ++ path_put(&conn->root_path); ++ put_pid(conn->pid); + put_cred(conn->cred); + kfree(conn->description); + kfree(conn->quota); @@ -9824,9 +11889,9 @@ index 0000000..9993753 +} + +struct kdbus_quota { -+ uint32_t memory; -+ uint16_t msgs; -+ uint8_t fds; ++ u32 memory; ++ u16 msgs; ++ u8 fds; +}; + +/** @@ -9864,7 +11929,7 @@ index 0000000..9993753 + * allocation schemes. Furthermore, resource utilization should be + * maximized, so only minimal resources stay reserved. However, we need + * to adapt to a dynamic number of users, as we cannot know how many -+ * users will talk to a connection. Therefore, the current allocations ++ * users will talk to a connection. Therefore, the current allocation + * works like this: + * We limit the number of bytes in a destination's pool per sending + * user. The space available for a user is 33% of the unused pool space @@ -9906,7 +11971,7 @@ index 0000000..9993753 + + /* + * Pool owner slices are un-accounted slices; they can claim more -+ * than 50% of the queue. However, the slice we're dealing with here ++ * than 50% of the queue. 
However, the slices we're dealing with here + * belong to the incoming queue, hence they are 'accounted' slices + * to which the 50%-limit applies. + */ @@ -9988,9 +12053,9 @@ index 0000000..9993753 + +/* Callers should take the conn_dst lock */ +static struct kdbus_queue_entry * -+kdbus_conn_entry_make(struct kdbus_conn *conn_dst, -+ const struct kdbus_kmsg *kmsg, -+ struct kdbus_user *user) ++kdbus_conn_entry_make(struct kdbus_conn *conn_src, ++ struct kdbus_conn *conn_dst, ++ struct kdbus_staging *staging) +{ + /* The remote connection was disconnected */ + if (!kdbus_conn_active(conn_dst)) @@ -10005,10 +12070,10 @@ index 0000000..9993753 + */ + if (!kdbus_conn_is_monitor(conn_dst) && + !(conn_dst->flags & KDBUS_HELLO_ACCEPT_FD) && -+ kmsg->res && kmsg->res->fds_count > 0) ++ staging->gaps && staging->gaps->n_fds > 0) + return ERR_PTR(-ECOMM); + -+ return kdbus_queue_entry_new(conn_dst, kmsg, user); ++ return kdbus_queue_entry_new(conn_src, conn_dst, staging); +} + +/* @@ -10017,12 +12082,11 @@ index 0000000..9993753 + * The connection's queue will never get to see it. 
+ */ +static int kdbus_conn_entry_sync_attach(struct kdbus_conn *conn_dst, -+ const struct kdbus_kmsg *kmsg, ++ struct kdbus_staging *staging, + struct kdbus_reply *reply_wake) +{ + struct kdbus_queue_entry *entry; -+ int remote_ret; -+ int ret = 0; ++ int remote_ret, ret = 0; + + mutex_lock(&reply_wake->reply_dst->lock); + @@ -10031,8 +12095,8 @@ index 0000000..9993753 + * entry and attach it to the reply object + */ + if (reply_wake->waiting) { -+ entry = kdbus_conn_entry_make(conn_dst, kmsg, -+ reply_wake->reply_src->user); ++ entry = kdbus_conn_entry_make(reply_wake->reply_src, conn_dst, ++ staging); + if (IS_ERR(entry)) + ret = PTR_ERR(entry); + else @@ -10073,23 +12137,24 @@ index 0000000..9993753 + * kdbus_conn_entry_insert() - enqueue a message into the receiver's pool + * @conn_src: The sending connection + * @conn_dst: The connection to queue into -+ * @kmsg: The kmsg to queue ++ * @staging: Message to send + * @reply: The reply tracker to attach to the queue entry ++ * @name: Destination name this msg is sent to, or NULL + * + * Return: 0 on success. negative error otherwise. + */ +int kdbus_conn_entry_insert(struct kdbus_conn *conn_src, + struct kdbus_conn *conn_dst, -+ const struct kdbus_kmsg *kmsg, -+ struct kdbus_reply *reply) ++ struct kdbus_staging *staging, ++ struct kdbus_reply *reply, ++ const struct kdbus_name_entry *name) +{ + struct kdbus_queue_entry *entry; + int ret; + + kdbus_conn_lock2(conn_src, conn_dst); + -+ entry = kdbus_conn_entry_make(conn_dst, kmsg, -+ conn_src ? conn_src->user : NULL); ++ entry = kdbus_conn_entry_make(conn_src, conn_dst, staging); + if (IS_ERR(entry)) { + ret = PTR_ERR(entry); + goto exit_unlock; @@ -10101,6 +12166,14 @@ index 0000000..9993753 + schedule_delayed_work(&conn_src->work, 0); + } + ++ /* ++ * Record the sequence number of the registered name; it will ++ * be remembered by the queue, in case messages addressed to a ++ * name need to be moved from or to an activator. 
++ */ ++ if (name) ++ entry->dst_name_id = name->name_id; ++ + kdbus_queue_entry_enqueue(entry, reply); + wake_up_interruptible(&conn_dst->wait); + @@ -10233,22 +12306,18 @@ index 0000000..9993753 +} + +static int kdbus_pin_dst(struct kdbus_bus *bus, -+ struct kdbus_kmsg *kmsg, ++ struct kdbus_staging *staging, + struct kdbus_name_entry **out_name, + struct kdbus_conn **out_dst) +{ -+ struct kdbus_msg_resources *res = kmsg->res; ++ const struct kdbus_msg *msg = staging->msg; + struct kdbus_name_entry *name = NULL; + struct kdbus_conn *dst = NULL; -+ struct kdbus_msg *msg = &kmsg->msg; + int ret; + -+ if (WARN_ON(!res)) -+ return -EINVAL; -+ + lockdep_assert_held(&bus->name_registry->rwlock); + -+ if (!res->dst_name) { ++ if (!staging->dst_name) { + dst = kdbus_bus_find_conn_by_id(bus, msg->dst_id); + if (!dst) + return -ENXIO; @@ -10259,7 +12328,7 @@ index 0000000..9993753 + } + } else { + name = kdbus_name_lookup_unlocked(bus->name_registry, -+ res->dst_name); ++ staging->dst_name); + if (!name) + return -ESRCH; + @@ -10285,13 +12354,6 @@ index 0000000..9993753 + ret = -EADDRNOTAVAIL; + goto error; + } -+ -+ /* -+ * Record the sequence number of the registered name; it will -+ * be passed on to the queue, in case messages addressed to a -+ * name need to be moved from or to an activator. 
-+ */ -+ kmsg->dst_name_id = name->name_id; + } + + *out_name = name; @@ -10303,17 +12365,19 @@ index 0000000..9993753 + return ret; +} + -+static int kdbus_conn_reply(struct kdbus_conn *src, struct kdbus_kmsg *kmsg) ++static int kdbus_conn_reply(struct kdbus_conn *src, ++ struct kdbus_staging *staging) +{ ++ const struct kdbus_msg *msg = staging->msg; + struct kdbus_name_entry *name = NULL; + struct kdbus_reply *reply, *wake = NULL; + struct kdbus_conn *dst = NULL; + struct kdbus_bus *bus = src->ep->bus; + int ret; + -+ if (WARN_ON(kmsg->msg.dst_id == KDBUS_DST_ID_BROADCAST) || -+ WARN_ON(kmsg->msg.flags & KDBUS_MSG_EXPECT_REPLY) || -+ WARN_ON(kmsg->msg.flags & KDBUS_MSG_SIGNAL)) ++ if (WARN_ON(msg->dst_id == KDBUS_DST_ID_BROADCAST) || ++ WARN_ON(msg->flags & KDBUS_MSG_EXPECT_REPLY) || ++ WARN_ON(msg->flags & KDBUS_MSG_SIGNAL)) + return -EINVAL; + + /* name-registry must be locked for lookup *and* collecting data */ @@ -10321,12 +12385,12 @@ index 0000000..9993753 + + /* find and pin destination */ + -+ ret = kdbus_pin_dst(bus, kmsg, &name, &dst); ++ ret = kdbus_pin_dst(bus, staging, &name, &dst); + if (ret < 0) + goto exit; + + mutex_lock(&dst->lock); -+ reply = kdbus_reply_find(src, dst, kmsg->msg.cookie_reply); ++ reply = kdbus_reply_find(src, dst, msg->cookie_reply); + if (reply) { + if (reply->sync) + wake = kdbus_reply_ref(reply); @@ -10339,20 +12403,14 @@ index 0000000..9993753 + goto exit; + } + -+ /* attach metadata */ -+ -+ ret = kdbus_kmsg_collect_metadata(kmsg, src, dst); -+ if (ret < 0) -+ goto exit; -+ + /* send message */ + -+ kdbus_bus_eavesdrop(bus, src, kmsg); ++ kdbus_bus_eavesdrop(bus, src, staging); + + if (wake) -+ ret = kdbus_conn_entry_sync_attach(dst, kmsg, wake); ++ ret = kdbus_conn_entry_sync_attach(dst, staging, wake); + else -+ ret = kdbus_conn_entry_insert(src, dst, kmsg, NULL); ++ ret = kdbus_conn_entry_insert(src, dst, staging, NULL, name); + +exit: + up_read(&bus->name_registry->rwlock); @@ -10362,24 +12420,25 @@ index 
0000000..9993753 +} + +static struct kdbus_reply *kdbus_conn_call(struct kdbus_conn *src, -+ struct kdbus_kmsg *kmsg, ++ struct kdbus_staging *staging, + ktime_t exp) +{ ++ const struct kdbus_msg *msg = staging->msg; + struct kdbus_name_entry *name = NULL; + struct kdbus_reply *wait = NULL; + struct kdbus_conn *dst = NULL; + struct kdbus_bus *bus = src->ep->bus; + int ret; + -+ if (WARN_ON(kmsg->msg.dst_id == KDBUS_DST_ID_BROADCAST) || -+ WARN_ON(kmsg->msg.flags & KDBUS_MSG_SIGNAL) || -+ WARN_ON(!(kmsg->msg.flags & KDBUS_MSG_EXPECT_REPLY))) ++ if (WARN_ON(msg->dst_id == KDBUS_DST_ID_BROADCAST) || ++ WARN_ON(msg->flags & KDBUS_MSG_SIGNAL) || ++ WARN_ON(!(msg->flags & KDBUS_MSG_EXPECT_REPLY))) + return ERR_PTR(-EINVAL); + + /* resume previous wait-context, if available */ + + mutex_lock(&src->lock); -+ wait = kdbus_reply_find(NULL, src, kmsg->msg.cookie); ++ wait = kdbus_reply_find(NULL, src, msg->cookie); + if (wait) { + if (wait->interrupted) { + kdbus_reply_ref(wait); @@ -10401,7 +12460,7 @@ index 0000000..9993753 + + /* find and pin destination */ + -+ ret = kdbus_pin_dst(bus, kmsg, &name, &dst); ++ ret = kdbus_pin_dst(bus, staging, &name, &dst); + if (ret < 0) + goto exit; + @@ -10410,24 +12469,18 @@ index 0000000..9993753 + goto exit; + } + -+ wait = kdbus_reply_new(dst, src, &kmsg->msg, name, true); ++ wait = kdbus_reply_new(dst, src, msg, name, true); + if (IS_ERR(wait)) { + ret = PTR_ERR(wait); + wait = NULL; + goto exit; + } + -+ /* attach metadata */ -+ -+ ret = kdbus_kmsg_collect_metadata(kmsg, src, dst); -+ if (ret < 0) -+ goto exit; -+ + /* send message */ + -+ kdbus_bus_eavesdrop(bus, src, kmsg); ++ kdbus_bus_eavesdrop(bus, src, staging); + -+ ret = kdbus_conn_entry_insert(src, dst, kmsg, wait); ++ ret = kdbus_conn_entry_insert(src, dst, staging, wait, name); + if (ret < 0) + goto exit; + @@ -10443,18 +12496,20 @@ index 0000000..9993753 + return wait; +} + -+static int kdbus_conn_unicast(struct kdbus_conn *src, struct kdbus_kmsg *kmsg) ++static int 
kdbus_conn_unicast(struct kdbus_conn *src, ++ struct kdbus_staging *staging) +{ ++ const struct kdbus_msg *msg = staging->msg; + struct kdbus_name_entry *name = NULL; + struct kdbus_reply *wait = NULL; + struct kdbus_conn *dst = NULL; + struct kdbus_bus *bus = src->ep->bus; -+ bool is_signal = (kmsg->msg.flags & KDBUS_MSG_SIGNAL); ++ bool is_signal = (msg->flags & KDBUS_MSG_SIGNAL); + int ret = 0; + -+ if (WARN_ON(kmsg->msg.dst_id == KDBUS_DST_ID_BROADCAST) || -+ WARN_ON(!(kmsg->msg.flags & KDBUS_MSG_EXPECT_REPLY) && -+ kmsg->msg.cookie_reply != 0)) ++ if (WARN_ON(msg->dst_id == KDBUS_DST_ID_BROADCAST) || ++ WARN_ON(!(msg->flags & KDBUS_MSG_EXPECT_REPLY) && ++ msg->cookie_reply != 0)) + return -EINVAL; + + /* name-registry must be locked for lookup *and* collecting data */ @@ -10462,23 +12517,23 @@ index 0000000..9993753 + + /* find and pin destination */ + -+ ret = kdbus_pin_dst(bus, kmsg, &name, &dst); ++ ret = kdbus_pin_dst(bus, staging, &name, &dst); + if (ret < 0) + goto exit; + + if (is_signal) { + /* like broadcasts we eavesdrop even if the msg is dropped */ -+ kdbus_bus_eavesdrop(bus, src, kmsg); ++ kdbus_bus_eavesdrop(bus, src, staging); + + /* drop silently if peer is not interested or not privileged */ -+ if (!kdbus_match_db_match_kmsg(dst->match_db, src, kmsg) || ++ if (!kdbus_match_db_match_msg(dst->match_db, src, staging) || + !kdbus_conn_policy_talk(dst, NULL, src)) + goto exit; + } else if (!kdbus_conn_policy_talk(src, current_cred(), dst)) { + ret = -EPERM; + goto exit; -+ } else if (kmsg->msg.flags & KDBUS_MSG_EXPECT_REPLY) { -+ wait = kdbus_reply_new(dst, src, &kmsg->msg, name, false); ++ } else if (msg->flags & KDBUS_MSG_EXPECT_REPLY) { ++ wait = kdbus_reply_new(dst, src, msg, name, false); + if (IS_ERR(wait)) { + ret = PTR_ERR(wait); + wait = NULL; @@ -10486,18 +12541,12 @@ index 0000000..9993753 + } + } + -+ /* attach metadata */ -+ -+ ret = kdbus_kmsg_collect_metadata(kmsg, src, dst); -+ if (ret < 0) -+ goto exit; -+ + /* send message */ + + 
if (!is_signal) -+ kdbus_bus_eavesdrop(bus, src, kmsg); ++ kdbus_bus_eavesdrop(bus, src, staging); + -+ ret = kdbus_conn_entry_insert(src, dst, kmsg, wait); ++ ret = kdbus_conn_entry_insert(src, dst, staging, wait, name); + if (ret < 0 && !is_signal) + goto exit; + @@ -10567,7 +12616,7 @@ index 0000000..9993753 + continue; + + if (!(conn_dst->flags & KDBUS_HELLO_ACCEPT_FD) && -+ e->msg_res && e->msg_res->fds_count > 0) { ++ e->gaps && e->gaps->n_fds > 0) { + kdbus_conn_lost_message(conn_dst); + kdbus_queue_entry_free(e); + continue; @@ -10751,19 +12800,16 @@ index 0000000..9993753 + * receive a given kernel notification + * @conn: Connection + * @conn_creds: Credentials of @conn to use for policy check -+ * @kmsg: The message carrying the notification ++ * @msg: Notification message + * -+ * This checks whether @conn is allowed to see the kernel notification @kmsg. ++ * This checks whether @conn is allowed to see the kernel notification. + * + * Return: true if allowed, false if not. + */ +bool kdbus_conn_policy_see_notification(struct kdbus_conn *conn, + const struct cred *conn_creds, -+ const struct kdbus_kmsg *kmsg) ++ const struct kdbus_msg *msg) +{ -+ if (WARN_ON(kmsg->msg.src_id != KDBUS_SRC_ID_KERNEL)) -+ return false; -+ + /* + * Depending on the notification type, broadcasted kernel notifications + * have to be filtered: @@ -10776,12 +12822,12 @@ index 0000000..9993753 + * broadcast to everyone, to allow tracking peers. 
+ */ + -+ switch (kmsg->notify_type) { ++ switch (msg->items[0].type) { + case KDBUS_ITEM_NAME_ADD: + case KDBUS_ITEM_NAME_REMOVE: + case KDBUS_ITEM_NAME_CHANGE: + return kdbus_conn_policy_see_name(conn, conn_creds, -+ kmsg->notify_name); ++ msg->items[0].name_change.name); + + case KDBUS_ITEM_ID_ADD: + case KDBUS_ITEM_ID_REMOVE: @@ -10789,7 +12835,7 @@ index 0000000..9993753 + + default: + WARN(1, "Invalid type for notification broadcast: %llu\n", -+ (unsigned long long)kmsg->notify_type); ++ (unsigned long long)msg->items[0].type); + return false; + } +} @@ -10927,13 +12973,14 @@ index 0000000..9993753 + struct kdbus_pool_slice *slice = NULL; + struct kdbus_name_entry *entry = NULL; + struct kdbus_conn *owner_conn = NULL; ++ struct kdbus_item *meta_items = NULL; + struct kdbus_info info = {}; + struct kdbus_cmd_info *cmd; + struct kdbus_bus *bus = conn->ep->bus; -+ struct kvec kvec; -+ size_t meta_size; ++ struct kvec kvec[3]; ++ size_t meta_size, cnt = 0; + const char *name; -+ u64 attach_flags; ++ u64 attach_flags, size = 0; + int ret; + + struct kdbus_arg argv[] = { @@ -10983,10 +13030,6 @@ index 0000000..9993753 + goto exit; + } + -+ info.id = owner_conn->id; -+ info.flags = owner_conn->flags; -+ kdbus_kvec_set(&kvec, &info, sizeof(info), &info.size); -+ + attach_flags &= atomic64_read(&owner_conn->attach_flags_send); + + conn_meta = kdbus_meta_conn_new(); @@ -10996,32 +13039,35 @@ index 0000000..9993753 + goto exit; + } + -+ ret = kdbus_meta_conn_collect(conn_meta, NULL, owner_conn, -+ attach_flags); ++ ret = kdbus_meta_conn_collect(conn_meta, owner_conn, 0, attach_flags); + if (ret < 0) + goto exit; + -+ ret = kdbus_meta_export_prepare(owner_conn->meta, conn_meta, -+ &attach_flags, &meta_size); ++ ret = kdbus_meta_emit(owner_conn->meta_proc, owner_conn->meta_fake, ++ conn_meta, conn, attach_flags, ++ &meta_items, &meta_size); + if (ret < 0) + goto exit; + -+ slice = kdbus_pool_slice_alloc(conn->pool, -+ info.size + meta_size, false); ++ info.id = 
owner_conn->id; ++ info.flags = owner_conn->flags; ++ ++ kdbus_kvec_set(&kvec[cnt++], &info, sizeof(info), &size); ++ if (meta_size > 0) { ++ kdbus_kvec_set(&kvec[cnt++], meta_items, meta_size, &size); ++ cnt += !!kdbus_kvec_pad(&kvec[cnt], &size); ++ } ++ ++ info.size = size; ++ ++ slice = kdbus_pool_slice_alloc(conn->pool, size, false); + if (IS_ERR(slice)) { + ret = PTR_ERR(slice); + slice = NULL; + goto exit; + } + -+ ret = kdbus_meta_export(owner_conn->meta, conn_meta, attach_flags, -+ slice, sizeof(info), &meta_size); -+ if (ret < 0) -+ goto exit; -+ -+ info.size += meta_size; -+ -+ ret = kdbus_pool_slice_copy_kvec(slice, 0, &kvec, 1, sizeof(info)); ++ ret = kdbus_pool_slice_copy_kvec(slice, 0, kvec, cnt, size); + if (ret < 0) + goto exit; + @@ -11039,6 +13085,7 @@ index 0000000..9993753 +exit: + up_read(&bus->name_registry->rwlock); + kdbus_pool_slice_release(slice); ++ kfree(meta_items); + kdbus_meta_conn_unref(conn_meta); + kdbus_conn_unref(owner_conn); + return kdbus_args_clear(&args, ret); @@ -11053,7 +13100,6 @@ index 0000000..9993753 + */ +int kdbus_cmd_update(struct kdbus_conn *conn, void __user *argp) +{ -+ struct kdbus_bus *bus = conn->ep->bus; + struct kdbus_item *item_policy; + u64 *item_attach_send = NULL; + u64 *item_attach_recv = NULL; @@ -11094,11 +13140,6 @@ index 0000000..9993753 + &attach_send); + if (ret < 0) + goto exit; -+ -+ if (bus->attach_flags_req & ~attach_send) { -+ ret = -EINVAL; -+ goto exit; -+ } + } + + if (item_attach_recv) { @@ -11151,10 +13192,12 @@ index 0000000..9993753 +int kdbus_cmd_send(struct kdbus_conn *conn, struct file *f, void __user *argp) +{ + struct kdbus_cmd_send *cmd; -+ struct kdbus_kmsg *kmsg = NULL; ++ struct kdbus_staging *staging = NULL; ++ struct kdbus_msg *msg = NULL; + struct file *cancel_fd = NULL; -+ int ret; ++ int ret, ret2; + ++ /* command arguments */ + struct kdbus_arg argv[] = { + { .type = KDBUS_ITEM_NEGOTIATE }, + { .type = KDBUS_ITEM_CANCEL_FD }, @@ -11166,12 +13209,48 @@ index 
0000000..9993753 + .argc = ARRAY_SIZE(argv), + }; + ++ /* message arguments */ ++ struct kdbus_arg msg_argv[] = { ++ { .type = KDBUS_ITEM_NEGOTIATE }, ++ { .type = KDBUS_ITEM_PAYLOAD_VEC, .multiple = true }, ++ { .type = KDBUS_ITEM_PAYLOAD_MEMFD, .multiple = true }, ++ { .type = KDBUS_ITEM_FDS }, ++ { .type = KDBUS_ITEM_BLOOM_FILTER }, ++ { .type = KDBUS_ITEM_DST_NAME }, ++ }; ++ struct kdbus_args msg_args = { ++ .allowed_flags = KDBUS_FLAG_NEGOTIATE | ++ KDBUS_MSG_EXPECT_REPLY | ++ KDBUS_MSG_NO_AUTO_START | ++ KDBUS_MSG_SIGNAL, ++ .argv = msg_argv, ++ .argc = ARRAY_SIZE(msg_argv), ++ }; ++ + if (!kdbus_conn_is_ordinary(conn)) + return -EOPNOTSUPP; + ++ /* make sure to parse both, @cmd and @msg on negotiation */ ++ + ret = kdbus_args_parse(&args, argp, &cmd); -+ if (ret != 0) -+ return ret; ++ if (ret < 0) ++ goto exit; ++ else if (ret > 0 && !cmd->msg_address) /* negotiation without msg */ ++ goto exit; ++ ++ ret2 = kdbus_args_parse_msg(&msg_args, KDBUS_PTR(cmd->msg_address), ++ &msg); ++ if (ret2 < 0) { /* cannot parse message */ ++ ret = ret2; ++ goto exit; ++ } else if (ret2 > 0 && !ret) { /* msg-negot implies cmd-negot */ ++ ret = -EINVAL; ++ goto exit; ++ } else if (ret > 0) { /* negotiation */ ++ goto exit; ++ } ++ ++ /* here we parsed both, @cmd and @msg, and neither wants negotiation */ + + cmd->reply.return_flags = 0; + kdbus_pool_publish_empty(conn->pool, &cmd->reply.offset, @@ -11190,23 +13269,30 @@ index 0000000..9993753 + } + } + -+ kmsg = kdbus_kmsg_new_from_cmd(conn, cmd); -+ if (IS_ERR(kmsg)) { -+ ret = PTR_ERR(kmsg); -+ kmsg = NULL; ++ /* patch-in the source of this message */ ++ if (msg->src_id > 0 && msg->src_id != conn->id) { ++ ret = -EINVAL; ++ goto exit; ++ } ++ msg->src_id = conn->id; ++ ++ staging = kdbus_staging_new_user(conn->ep->bus, cmd, msg); ++ if (IS_ERR(staging)) { ++ ret = PTR_ERR(staging); ++ staging = NULL; + goto exit; + } + -+ if (kmsg->msg.dst_id == KDBUS_DST_ID_BROADCAST) { ++ if (msg->dst_id == KDBUS_DST_ID_BROADCAST) { + 
down_read(&conn->ep->bus->name_registry->rwlock); -+ kdbus_bus_broadcast(conn->ep->bus, conn, kmsg); ++ kdbus_bus_broadcast(conn->ep->bus, conn, staging); + up_read(&conn->ep->bus->name_registry->rwlock); + } else if (cmd->flags & KDBUS_SEND_SYNC_REPLY) { + struct kdbus_reply *r; + ktime_t exp; + -+ exp = ns_to_ktime(kmsg->msg.timeout_ns); -+ r = kdbus_conn_call(conn, kmsg, exp); ++ exp = ns_to_ktime(msg->timeout_ns); ++ r = kdbus_conn_call(conn, staging, exp); + if (IS_ERR(r)) { + ret = PTR_ERR(r); + goto exit; @@ -11216,13 +13302,13 @@ index 0000000..9993753 + kdbus_reply_unref(r); + if (ret < 0) + goto exit; -+ } else if ((kmsg->msg.flags & KDBUS_MSG_EXPECT_REPLY) || -+ kmsg->msg.cookie_reply == 0) { -+ ret = kdbus_conn_unicast(conn, kmsg); ++ } else if ((msg->flags & KDBUS_MSG_EXPECT_REPLY) || ++ msg->cookie_reply == 0) { ++ ret = kdbus_conn_unicast(conn, staging); + if (ret < 0) + goto exit; + } else { -+ ret = kdbus_conn_reply(conn, kmsg); ++ ret = kdbus_conn_reply(conn, staging); + if (ret < 0) + goto exit; + } @@ -11233,7 +13319,8 @@ index 0000000..9993753 +exit: + if (cancel_fd) + fput(cancel_fd); -+ kdbus_kmsg_free(kmsg); ++ kdbus_staging_free(staging); ++ ret = kdbus_args_clear(&msg_args, ret); + return kdbus_args_clear(&args, ret); +} + @@ -11396,10 +13483,10 @@ index 0000000..9993753 +} diff --git a/ipc/kdbus/connection.h b/ipc/kdbus/connection.h new file mode 100644 -index 0000000..d1ffe90 +index 0000000..5ee864e --- /dev/null +++ b/ipc/kdbus/connection.h -@@ -0,0 +1,257 @@ +@@ -0,0 +1,261 @@ +/* + * Copyright (C) 2013-2015 Kay Sievers + * Copyright (C) 2013-2015 Greg Kroah-Hartman @@ -11433,7 +13520,7 @@ index 0000000..d1ffe90 + KDBUS_HELLO_MONITOR) + +struct kdbus_quota; -+struct kdbus_kmsg; ++struct kdbus_staging; + +/** + * struct kdbus_conn - connection to a bus @@ -11456,11 +13543,13 @@ index 0000000..d1ffe90 + * @work: Delayed work to handle timeouts + * activator for + * @match_db: Subscription filter to broadcast messages -+ * @meta: Active 
connection creator's metadata/credentials, -+ * either from the handle or from HELLO ++ * @meta_proc: Process metadata of connection creator, or NULL ++ * @meta_fake: Faked metadata, or NULL + * @pool: The user's buffer to receive messages + * @user: Owner of the connection + * @cred: The credentials of the connection at creation time ++ * @pid: Pid at creation time ++ * @root_path: Root path at creation time + * @name_count: Number of owned well-known names + * @request_count: Number of pending requests issued by this + * connection that are waiting for replies from @@ -11474,7 +13563,6 @@ index 0000000..d1ffe90 + * @names_list: List of well-known names + * @names_queue_list: Well-known names this connection waits for + * @privileged: Whether this connection is privileged on the bus -+ * @faked_meta: Whether the metadata was faked on HELLO + */ +struct kdbus_conn { + struct kref kref; @@ -11495,10 +13583,13 @@ index 0000000..d1ffe90 + struct list_head reply_list; + struct delayed_work work; + struct kdbus_match_db *match_db; -+ struct kdbus_meta_proc *meta; ++ struct kdbus_meta_proc *meta_proc; ++ struct kdbus_meta_fake *meta_fake; + struct kdbus_pool *pool; + struct kdbus_user *user; + const struct cred *cred; ++ struct pid *pid; ++ struct path root_path; + atomic_t name_count; + atomic_t request_count; + atomic_t lost_count; @@ -11514,7 +13605,6 @@ index 0000000..d1ffe90 + struct list_head names_queue_list; + + bool privileged:1; -+ bool faked_meta:1; +}; + +struct kdbus_conn *kdbus_conn_ref(struct kdbus_conn *conn); @@ -11531,8 +13621,9 @@ index 0000000..d1ffe90 +void kdbus_conn_lost_message(struct kdbus_conn *c); +int kdbus_conn_entry_insert(struct kdbus_conn *conn_src, + struct kdbus_conn *conn_dst, -+ const struct kdbus_kmsg *kmsg, -+ struct kdbus_reply *reply); ++ struct kdbus_staging *staging, ++ struct kdbus_reply *reply, ++ const struct kdbus_name_entry *name); +void kdbus_conn_move_messages(struct kdbus_conn *conn_dst, + struct kdbus_conn *conn_src, + 
u64 name_id); @@ -11549,7 +13640,7 @@ index 0000000..d1ffe90 + const char *name); +bool kdbus_conn_policy_see_notification(struct kdbus_conn *conn, + const struct cred *curr_creds, -+ const struct kdbus_kmsg *kmsg); ++ const struct kdbus_msg *msg); + +/* command dispatcher */ +struct kdbus_conn *kdbus_cmd_hello(struct kdbus_ep *ep, bool privileged, @@ -12044,7 +14135,7 @@ index 0000000..447a2bd +#endif diff --git a/ipc/kdbus/endpoint.c b/ipc/kdbus/endpoint.c new file mode 100644 -index 0000000..9a95a5e +index 0000000..977964d --- /dev/null +++ b/ipc/kdbus/endpoint.c @@ -0,0 +1,275 @@ @@ -12128,7 +14219,7 @@ index 0000000..9a95a5e + * @gid: The gid of the node + * @is_custom: Whether this is a custom endpoint + * -+ * This function will create a new enpoint with the given ++ * This function will create a new endpoint with the given + * name and properties for a given bus. + * + * Return: a new kdbus_ep on success, ERR_PTR on failure. @@ -12325,7 +14416,7 @@ index 0000000..9a95a5e +} diff --git a/ipc/kdbus/endpoint.h b/ipc/kdbus/endpoint.h new file mode 100644 -index 0000000..d31954b +index 0000000..bc1b94a --- /dev/null +++ b/ipc/kdbus/endpoint.h @@ -0,0 +1,67 @@ @@ -12356,7 +14447,7 @@ index 0000000..d31954b +struct kdbus_user; + +/** -+ * struct kdbus_ep - enpoint to access a bus ++ * struct kdbus_ep - endpoint to access a bus + * @node: The kdbus node + * @lock: Endpoint data lock + * @bus: Bus behind this endpoint @@ -12364,7 +14455,7 @@ index 0000000..d31954b + * @policy_db: Uploaded policy + * @conn_list: Connections of this endpoint + * -+ * An enpoint offers access to a bus; the default endpoint node name is "bus". ++ * An endpoint offers access to a bus; the default endpoint node name is "bus". + * Additional custom endpoints to the same bus can be created and they can + * carry their own policies/filters. 
+ */ @@ -12398,10 +14489,10 @@ index 0000000..d31954b +#endif diff --git a/ipc/kdbus/fs.c b/ipc/kdbus/fs.c new file mode 100644 -index 0000000..d01f33b +index 0000000..09c4809 --- /dev/null +++ b/ipc/kdbus/fs.c -@@ -0,0 +1,510 @@ +@@ -0,0 +1,508 @@ +/* + * Copyright (C) 2013-2015 Kay Sievers + * Copyright (C) 2013-2015 Greg Kroah-Hartman @@ -12478,7 +14569,7 @@ index 0000000..d01f33b + * closest node to that position and cannot use our node pointer. This + * means iterating the rb-tree to find the closest match and start over + * from there. -+ * Note that hash values are not neccessarily unique. Therefore, llseek ++ * Note that hash values are not necessarily unique. Therefore, llseek + * is not guaranteed to seek to the same node that you got when you + * retrieved the position. Seeking to 0, 1, 2 and >=INT_MAX is safe, + * though. We could use the inode-number as position, but this would @@ -12729,9 +14820,7 @@ index 0000000..d01f33b + } + + kill_anon_super(sb); -+ -+ if (domain) -+ kdbus_domain_unref(domain); ++ kdbus_domain_unref(domain); +} + +static int fs_super_set(struct super_block *sb, void *data) @@ -12948,10 +15037,10 @@ index 0000000..62f7d6a +#endif diff --git a/ipc/kdbus/handle.c b/ipc/kdbus/handle.c new file mode 100644 -index 0000000..0752799 +index 0000000..e0e06b0 --- /dev/null +++ b/ipc/kdbus/handle.c -@@ -0,0 +1,702 @@ +@@ -0,0 +1,709 @@ +/* + * Copyright (C) 2013-2015 Kay Sievers + * Copyright (C) 2013-2015 Greg Kroah-Hartman @@ -13080,6 +15169,7 @@ index 0000000..0752799 +/** + * __kdbus_args_parse() - parse payload of kdbus command + * @args: object to parse data into ++ * @is_cmd: whether this is a command or msg payload + * @argp: user-space location of command payload to parse + * @type_size: overall size of command payload to parse + * @items_offset: offset of items array in command payload @@ -13094,10 +15184,14 @@ index 0000000..0752799 + * If this function succeeded, you must call kdbus_args_clear() to release + * allocated resources 
before destroying @args. + * ++ * This can also be used to import kdbus_msg objects. In that case, @is_cmd must ++ * be set to 'false' and the 'return_flags' field will not be touched (as it ++ * doesn't exist on kdbus_msg). ++ * + * Return: On failure a negative error code is returned. Otherwise, 1 is + * returned if negotiation was requested, 0 if not. + */ -+int __kdbus_args_parse(struct kdbus_args *args, void __user *argp, ++int __kdbus_args_parse(struct kdbus_args *args, bool is_cmd, void __user *argp, + size_t type_size, size_t items_offset, void **out) +{ + u64 user_size; @@ -13127,10 +15221,12 @@ index 0000000..0752799 + goto error; + } + -+ args->cmd->return_flags = 0; ++ if (is_cmd) ++ args->cmd->return_flags = 0; + args->user = argp; + args->items = (void *)((u8 *)args->cmd + items_offset); + args->items_size = args->cmd->size - items_offset; ++ args->is_cmd = is_cmd; + + if (args->cmd->flags & ~args->allowed_flags) { + ret = -EINVAL; @@ -13179,8 +15275,8 @@ index 0000000..0752799 + return ret; + + if (!IS_ERR_OR_NULL(args->cmd)) { -+ if (put_user(args->cmd->return_flags, -+ &args->user->return_flags)) ++ if (args->is_cmd && put_user(args->cmd->return_flags, ++ &args->user->return_flags)) + ret = -EFAULT; + if (args->cmd != (void*)args->cmd_buf) + kfree(args->cmd); @@ -13656,10 +15752,10 @@ index 0000000..0752799 +}; diff --git a/ipc/kdbus/handle.h b/ipc/kdbus/handle.h new file mode 100644 -index 0000000..13c59d9 +index 0000000..8a36c05 --- /dev/null +++ b/ipc/kdbus/handle.h -@@ -0,0 +1,90 @@ +@@ -0,0 +1,103 @@ +/* + * Copyright (C) 2013-2015 Kay Sievers + * Copyright (C) 2013-2015 Greg Kroah-Hartman @@ -13710,6 +15806,7 @@ index 0000000..13c59d9 + * @cmd_buf: 512 bytes inline buf to avoid kmalloc() on small cmds + * @items: points to item array in @cmd + * @items_size: size of @items in bytes ++ * @is_cmd: whether this is a command-payload or msg-payload + * + * This structure is used to parse ioctl command payloads on each invocation. 
+ * The ioctl handler has to pre-fill the flags and allowed items before passing @@ -13730,9 +15827,10 @@ index 0000000..13c59d9 + + struct kdbus_item *items; + size_t items_size; ++ bool is_cmd : 1; +}; + -+int __kdbus_args_parse(struct kdbus_args *args, void __user *argp, ++int __kdbus_args_parse(struct kdbus_args *args, bool is_cmd, void __user *argp, + size_t type_size, size_t items_offset, void **out); +int kdbus_args_clear(struct kdbus_args *args, int ret); + @@ -13744,7 +15842,18 @@ index 0000000..13c59d9 + offsetof(struct kdbus_cmd, flags)); \ + BUILD_BUG_ON(offsetof(typeof(**(_v)), return_flags) != \ + offsetof(struct kdbus_cmd, return_flags)); \ -+ __kdbus_args_parse((_args), (_argp), sizeof(**(_v)), \ ++ __kdbus_args_parse((_args), 1, (_argp), sizeof(**(_v)), \ ++ offsetof(typeof(**(_v)), items), \ ++ (void **)(_v)); \ ++ }) ++ ++#define kdbus_args_parse_msg(_args, _argp, _v) \ ++ ({ \ ++ BUILD_BUG_ON(offsetof(typeof(**(_v)), size) != \ ++ offsetof(struct kdbus_cmd, size)); \ ++ BUILD_BUG_ON(offsetof(typeof(**(_v)), flags) != \ ++ offsetof(struct kdbus_cmd, flags)); \ ++ __kdbus_args_parse((_args), 0, (_argp), sizeof(**(_v)), \ + offsetof(typeof(**(_v)), items), \ + (void **)(_v)); \ + }) @@ -13752,10 +15861,10 @@ index 0000000..13c59d9 +#endif diff --git a/ipc/kdbus/item.c b/ipc/kdbus/item.c new file mode 100644 -index 0000000..1ee72c2 +index 0000000..ce78dba --- /dev/null +++ b/ipc/kdbus/item.c -@@ -0,0 +1,333 @@ +@@ -0,0 +1,293 @@ +/* + * Copyright (C) 2013-2015 Kay Sievers + * Copyright (C) 2013-2015 Greg Kroah-Hartman @@ -13905,6 +16014,7 @@ index 0000000..1ee72c2 + case KDBUS_ITEM_ATTACH_FLAGS_SEND: + case KDBUS_ITEM_ATTACH_FLAGS_RECV: + case KDBUS_ITEM_ID: ++ case KDBUS_ITEM_DST_ID: + if (payload_size != sizeof(u64)) + return -EINVAL; + break; @@ -14018,47 +16128,6 @@ index 0000000..1ee72c2 + return 0; +} + -+static struct kdbus_item *kdbus_items_get(const struct kdbus_item *items, -+ size_t items_size, -+ unsigned int item_type) -+{ -+ const 
struct kdbus_item *iter, *found = NULL; -+ -+ KDBUS_ITEMS_FOREACH(iter, items, items_size) { -+ if (iter->type == item_type) { -+ if (found) -+ return ERR_PTR(-EEXIST); -+ found = iter; -+ } -+ } -+ -+ return (struct kdbus_item *)found ? : ERR_PTR(-EBADMSG); -+} -+ -+/** -+ * kdbus_items_get_str() - get string from a list of items -+ * @items: The items to walk -+ * @items_size: The size of all items -+ * @item_type: The item type to look for -+ * -+ * This function walks a list of items and searches for items of type -+ * @item_type. If it finds exactly one such item, @str_ret will be set to -+ * the .str member of the item. -+ * -+ * Return: the string, if the item was found exactly once, ERR_PTR(-EEXIST) -+ * if the item was found more than once, and ERR_PTR(-EBADMSG) if there was -+ * no item of the given type. -+ */ -+const char *kdbus_items_get_str(const struct kdbus_item *items, -+ size_t items_size, -+ unsigned int item_type) -+{ -+ const struct kdbus_item *item; -+ -+ item = kdbus_items_get(items, items_size, item_type); -+ return IS_ERR(item) ? 
ERR_CAST(item) : item->str; -+} -+ +/** + * kdbus_item_set() - Set item content + * @item: The item to modify @@ -14091,10 +16160,10 @@ index 0000000..1ee72c2 +} diff --git a/ipc/kdbus/item.h b/ipc/kdbus/item.h new file mode 100644 -index 0000000..bca63b4 +index 0000000..3a7e6cc --- /dev/null +++ b/ipc/kdbus/item.h -@@ -0,0 +1,64 @@ +@@ -0,0 +1,61 @@ +/* + * Copyright (C) 2013-2015 Kay Sievers + * Copyright (C) 2013-2015 Greg Kroah-Hartman @@ -14152,19 +16221,16 @@ index 0000000..bca63b4 +int kdbus_item_validate_name(const struct kdbus_item *item); +int kdbus_item_validate(const struct kdbus_item *item); +int kdbus_items_validate(const struct kdbus_item *items, size_t items_size); -+const char *kdbus_items_get_str(const struct kdbus_item *items, -+ size_t items_size, -+ unsigned int item_type); +struct kdbus_item *kdbus_item_set(struct kdbus_item *item, u64 type, + const void *data, size_t len); + +#endif diff --git a/ipc/kdbus/limits.h b/ipc/kdbus/limits.h new file mode 100644 -index 0000000..6450f58 +index 0000000..c54925a --- /dev/null +++ b/ipc/kdbus/limits.h -@@ -0,0 +1,64 @@ +@@ -0,0 +1,61 @@ +/* + * Copyright (C) 2013-2015 Kay Sievers + * Copyright (C) 2013-2015 Greg Kroah-Hartman @@ -14186,9 +16252,6 @@ index 0000000..6450f58 +/* maximum size of message header and items */ +#define KDBUS_MSG_MAX_SIZE SZ_8K + -+/* maximum number of message items */ -+#define KDBUS_MSG_MAX_ITEMS 128 -+ +/* maximum number of memfd items per message */ +#define KDBUS_MSG_MAX_MEMFD_ITEMS 16 + @@ -14351,10 +16414,10 @@ index 0000000..1ad4dc8 +MODULE_ALIAS_FS(KBUILD_MODNAME "fs"); diff --git a/ipc/kdbus/match.c b/ipc/kdbus/match.c new file mode 100644 -index 0000000..cc083b4 +index 0000000..4ee6a1f --- /dev/null +++ b/ipc/kdbus/match.c -@@ -0,0 +1,559 @@ +@@ -0,0 +1,546 @@ +/* + * Copyright (C) 2013-2015 Kay Sievers + * Copyright (C) 2013-2015 Greg Kroah-Hartman @@ -14423,7 +16486,7 @@ index 0000000..cc083b4 + +/** + * struct kdbus_match_rule - a rule appended to a match entry -+ 
* @type: An item type to match agains ++ * @type: An item type to match against + * @bloom_mask: Bloom mask to match a message's filter against, used + * with KDBUS_ITEM_BLOOM_MASK + * @name: Name to match against, used with KDBUS_ITEM_NAME, @@ -14435,6 +16498,7 @@ index 0000000..cc083b4 + * KDBUS_ITEM_NAME_{ADD,REMOVE,CHANGE}, + * KDBUS_ITEM_ID_REMOVE + * @src_id: ID to match against, used with KDBUS_ITEM_ID ++ * @dst_id: Message destination ID, used with KDBUS_ITEM_DST_ID + * @rules_entry: Entry in the entry's rules list + */ +struct kdbus_match_rule { @@ -14447,6 +16511,7 @@ index 0000000..cc083b4 + u64 new_id; + }; + u64 src_id; ++ u64 dst_id; + }; + struct list_head rules_entry; +}; @@ -14469,6 +16534,7 @@ index 0000000..cc083b4 + break; + + case KDBUS_ITEM_ID: ++ case KDBUS_ITEM_DST_ID: + case KDBUS_ITEM_ID_ADD: + case KDBUS_ITEM_ID_REMOVE: + break; @@ -14561,96 +16627,74 @@ index 0000000..cc083b4 + return true; +} + -+static bool kdbus_match_rules(const struct kdbus_match_entry *entry, -+ struct kdbus_conn *conn_src, -+ struct kdbus_kmsg *kmsg) ++static bool kdbus_match_rule_conn(const struct kdbus_match_rule *r, ++ struct kdbus_conn *c, ++ const struct kdbus_staging *s) +{ -+ struct kdbus_match_rule *r; -+ -+ if (conn_src) -+ lockdep_assert_held(&conn_src->ep->bus->name_registry->rwlock); -+ -+ /* -+ * Walk all the rules and bail out immediately -+ * if any of them is unsatisfied. 
-+ */ -+ -+ list_for_each_entry(r, &entry->rules_list, rules_entry) { -+ if (conn_src) { -+ /* messages from userspace */ -+ -+ switch (r->type) { -+ case KDBUS_ITEM_BLOOM_MASK: -+ if (!kdbus_match_bloom(kmsg->bloom_filter, -+ &r->bloom_mask, -+ conn_src)) -+ return false; -+ break; -+ -+ case KDBUS_ITEM_ID: -+ if (r->src_id != conn_src->id && -+ r->src_id != KDBUS_MATCH_ID_ANY) -+ return false; -+ -+ break; -+ -+ case KDBUS_ITEM_NAME: -+ if (!kdbus_conn_has_name(conn_src, r->name)) -+ return false; -+ -+ break; -+ -+ default: -+ return false; -+ } -+ } else { -+ /* kernel notifications */ -+ -+ if (kmsg->notify_type != r->type) -+ return false; -+ -+ switch (r->type) { -+ case KDBUS_ITEM_ID_ADD: -+ if (r->new_id != KDBUS_MATCH_ID_ANY && -+ r->new_id != kmsg->notify_new_id) -+ return false; ++ lockdep_assert_held(&c->ep->bus->name_registry->rwlock); + -+ break; ++ switch (r->type) { ++ case KDBUS_ITEM_BLOOM_MASK: ++ return kdbus_match_bloom(s->bloom_filter, &r->bloom_mask, c); ++ case KDBUS_ITEM_ID: ++ return r->src_id == c->id || r->src_id == KDBUS_MATCH_ID_ANY; ++ case KDBUS_ITEM_DST_ID: ++ return r->dst_id == s->msg->dst_id || ++ r->dst_id == KDBUS_MATCH_ID_ANY; ++ case KDBUS_ITEM_NAME: ++ return kdbus_conn_has_name(c, r->name); ++ default: ++ return false; ++ } ++} + -+ case KDBUS_ITEM_ID_REMOVE: -+ if (r->old_id != KDBUS_MATCH_ID_ANY && -+ r->old_id != kmsg->notify_old_id) -+ return false; ++static bool kdbus_match_rule_kernel(const struct kdbus_match_rule *r, ++ const struct kdbus_staging *s) ++{ ++ struct kdbus_item *n = s->notify; + -+ break; ++ if (WARN_ON(!n) || n->type != r->type) ++ return false; + -+ case KDBUS_ITEM_NAME_ADD: -+ case KDBUS_ITEM_NAME_CHANGE: -+ case KDBUS_ITEM_NAME_REMOVE: -+ if ((r->old_id != KDBUS_MATCH_ID_ANY && -+ r->old_id != kmsg->notify_old_id) || -+ (r->new_id != KDBUS_MATCH_ID_ANY && -+ r->new_id != kmsg->notify_new_id) || -+ (r->name && kmsg->notify_name && -+ strcmp(r->name, kmsg->notify_name) != 0)) -+ return false; ++ 
switch (r->type) { ++ case KDBUS_ITEM_ID_ADD: ++ return r->new_id == KDBUS_MATCH_ID_ANY || ++ r->new_id == n->id_change.id; ++ case KDBUS_ITEM_ID_REMOVE: ++ return r->old_id == KDBUS_MATCH_ID_ANY || ++ r->old_id == n->id_change.id; ++ case KDBUS_ITEM_NAME_ADD: ++ case KDBUS_ITEM_NAME_CHANGE: ++ case KDBUS_ITEM_NAME_REMOVE: ++ return (r->old_id == KDBUS_MATCH_ID_ANY || ++ r->old_id == n->name_change.old_id.id) && ++ (r->new_id == KDBUS_MATCH_ID_ANY || ++ r->new_id == n->name_change.new_id.id) && ++ (!r->name || !strcmp(r->name, n->name_change.name)); ++ default: ++ return false; ++ } ++} + -+ break; ++static bool kdbus_match_rules(const struct kdbus_match_entry *entry, ++ struct kdbus_conn *c, ++ const struct kdbus_staging *s) ++{ ++ struct kdbus_match_rule *r; + -+ default: -+ return false; -+ } -+ } -+ } ++ list_for_each_entry(r, &entry->rules_list, rules_entry) ++ if ((c && !kdbus_match_rule_conn(r, c, s)) || ++ (!c && !kdbus_match_rule_kernel(r, s))) ++ return false; + + return true; +} + +/** -+ * kdbus_match_db_match_kmsg() - match a kmsg object agains the database entries ++ * kdbus_match_db_match_msg() - match a msg object agains the database entries + * @mdb: The match database + * @conn_src: The connection object originating the message -+ * @kmsg: The kmsg to perform the match on ++ * @staging: Staging object containing the message to match against + * + * This function will walk through all the database entries previously uploaded + * with kdbus_match_db_add(). As soon as any of them has an all-satisfied rule @@ -14661,16 +16705,16 @@ index 0000000..cc083b4 + * + * Return: true if there was a matching database entry, false otherwise. 
+ */ -+bool kdbus_match_db_match_kmsg(struct kdbus_match_db *mdb, -+ struct kdbus_conn *conn_src, -+ struct kdbus_kmsg *kmsg) ++bool kdbus_match_db_match_msg(struct kdbus_match_db *mdb, ++ struct kdbus_conn *conn_src, ++ const struct kdbus_staging *staging) +{ + struct kdbus_match_entry *entry; + bool matched = false; + + down_read(&mdb->mdb_rwlock); + list_for_each_entry(entry, &mdb->entries_list, list_entry) { -+ matched = kdbus_match_rules(entry, conn_src, kmsg); ++ matched = kdbus_match_rules(entry, conn_src, staging); + if (matched) + break; + } @@ -14710,6 +16754,7 @@ index 0000000..cc083b4 + * KDBUS_ITEM_BLOOM_MASK: A bloom mask + * KDBUS_ITEM_NAME: A connection's source name + * KDBUS_ITEM_ID: A connection ID ++ * KDBUS_ITEM_DST_ID: A connection ID + * KDBUS_ITEM_NAME_ADD: + * KDBUS_ITEM_NAME_REMOVE: + * KDBUS_ITEM_NAME_CHANGE: Well-known name changes, carry @@ -14721,9 +16766,9 @@ index 0000000..cc083b4 + * For kdbus_notify_{id,name}_change structs, only the ID and name fields + * are looked at when adding an entry. The flags are unused. + * -+ * Also note that KDBUS_ITEM_BLOOM_MASK, KDBUS_ITEM_NAME and KDBUS_ITEM_ID -+ * are used to match messages from userspace, while the others apply to -+ * kernel-generated notifications. ++ * Also note that KDBUS_ITEM_BLOOM_MASK, KDBUS_ITEM_NAME, KDBUS_ITEM_ID, ++ * and KDBUS_ITEM_DST_ID are used to match messages from userspace, while the ++ * others apply to kernel-generated notifications. + * + * Return: >=0 on success, negative error code on failure. 
+ */ @@ -14740,6 +16785,7 @@ index 0000000..cc083b4 + { .type = KDBUS_ITEM_BLOOM_MASK, .multiple = true }, + { .type = KDBUS_ITEM_NAME, .multiple = true }, + { .type = KDBUS_ITEM_ID, .multiple = true }, ++ { .type = KDBUS_ITEM_DST_ID, .multiple = true }, + { .type = KDBUS_ITEM_NAME_ADD, .multiple = true }, + { .type = KDBUS_ITEM_NAME_REMOVE, .multiple = true }, + { .type = KDBUS_ITEM_NAME_CHANGE, .multiple = true }, @@ -14822,6 +16868,10 @@ index 0000000..cc083b4 + rule->src_id = item->id; + break; + ++ case KDBUS_ITEM_DST_ID: ++ rule->dst_id = item->id; ++ break; ++ + case KDBUS_ITEM_NAME_ADD: + case KDBUS_ITEM_NAME_REMOVE: + case KDBUS_ITEM_NAME_CHANGE: @@ -14916,7 +16966,7 @@ index 0000000..cc083b4 +} diff --git a/ipc/kdbus/match.h b/ipc/kdbus/match.h new file mode 100644 -index 0000000..ea42929 +index 0000000..ceb492f --- /dev/null +++ b/ipc/kdbus/match.h @@ -0,0 +1,35 @@ @@ -14938,8 +16988,8 @@ index 0000000..ea42929 +#define __KDBUS_MATCH_H + +struct kdbus_conn; -+struct kdbus_kmsg; +struct kdbus_match_db; ++struct kdbus_staging; + +struct kdbus_match_db *kdbus_match_db_new(void); +void kdbus_match_db_free(struct kdbus_match_db *db); @@ -14947,9 +16997,9 @@ index 0000000..ea42929 + struct kdbus_cmd_match *cmd); +int kdbus_match_db_remove(struct kdbus_conn *conn, + struct kdbus_cmd_match *cmd); -+bool kdbus_match_db_match_kmsg(struct kdbus_match_db *db, -+ struct kdbus_conn *conn_src, -+ struct kdbus_kmsg *kmsg); ++bool kdbus_match_db_match_msg(struct kdbus_match_db *db, ++ struct kdbus_conn *conn_src, ++ const struct kdbus_staging *staging); + +int kdbus_cmd_match_add(struct kdbus_conn *conn, void __user *argp); +int kdbus_cmd_match_remove(struct kdbus_conn *conn, void __user *argp); @@ -14957,10 +17007,10 @@ index 0000000..ea42929 +#endif diff --git a/ipc/kdbus/message.c b/ipc/kdbus/message.c new file mode 100644 -index 0000000..066e816 +index 0000000..3520f45 --- /dev/null +++ b/ipc/kdbus/message.c -@@ -0,0 +1,640 @@ +@@ -0,0 +1,1040 @@ +/* + * Copyright 
(C) 2013-2015 Kay Sievers + * Copyright (C) 2013-2015 Greg Kroah-Hartman @@ -15000,613 +17050,1013 @@ index 0000000..066e816 +#include "names.h" +#include "policy.h" + -+#define KDBUS_KMSG_HEADER_SIZE offsetof(struct kdbus_kmsg, msg) ++static const char * const zeros = "\0\0\0\0\0\0\0"; + -+static struct kdbus_msg_resources *kdbus_msg_resources_new(void) ++static struct kdbus_gaps *kdbus_gaps_new(size_t n_memfds, size_t n_fds) +{ -+ struct kdbus_msg_resources *r; ++ size_t size_offsets, size_memfds, size_fds, size; ++ struct kdbus_gaps *gaps; + -+ r = kzalloc(sizeof(*r), GFP_KERNEL); -+ if (!r) ++ size_offsets = n_memfds * sizeof(*gaps->memfd_offsets); ++ size_memfds = n_memfds * sizeof(*gaps->memfd_files); ++ size_fds = n_fds * sizeof(*gaps->fd_files); ++ size = sizeof(*gaps) + size_offsets + size_memfds + size_fds; ++ ++ gaps = kzalloc(size, GFP_KERNEL); ++ if (!gaps) + return ERR_PTR(-ENOMEM); + -+ kref_init(&r->kref); ++ kref_init(&gaps->kref); ++ gaps->n_memfds = 0; /* we reserve n_memfds, but don't enforce them */ ++ gaps->memfd_offsets = (void *)(gaps + 1); ++ gaps->memfd_files = (void *)((u8 *)gaps->memfd_offsets + size_offsets); ++ gaps->n_fds = 0; /* we reserve n_fds, but don't enforce them */ ++ gaps->fd_files = (void *)((u8 *)gaps->memfd_files + size_memfds); + -+ return r; ++ return gaps; +} + -+static void __kdbus_msg_resources_free(struct kref *kref) ++static void kdbus_gaps_free(struct kref *kref) +{ -+ struct kdbus_msg_resources *r = -+ container_of(kref, struct kdbus_msg_resources, kref); ++ struct kdbus_gaps *gaps = container_of(kref, struct kdbus_gaps, kref); + size_t i; + -+ for (i = 0; i < r->data_count; ++i) { -+ switch (r->data[i].type) { -+ case KDBUS_MSG_DATA_VEC: -+ /* nothing to do */ -+ break; -+ case KDBUS_MSG_DATA_MEMFD: -+ if (r->data[i].memfd.file) -+ fput(r->data[i].memfd.file); -+ break; -+ } -+ } -+ -+ for (i = 0; i < r->fds_count; i++) -+ if (r->fds[i]) -+ fput(r->fds[i]); ++ for (i = 0; i < gaps->n_fds; ++i) ++ if 
(gaps->fd_files[i])
++			fput(gaps->fd_files[i]);
++	for (i = 0; i < gaps->n_memfds; ++i)
++		if (gaps->memfd_files[i])
++			fput(gaps->memfd_files[i]);
 +
-+	kfree(r->dst_name);
-+	kfree(r->data);
-+	kfree(r->fds);
-+	kfree(r);
++	kfree(gaps);
 +}
 +
 +/**
-+ * kdbus_msg_resources_ref() - Acquire reference to msg resources
-+ * @r:		resources to acquire ref to
++ * kdbus_gaps_ref() - gain reference
++ * @gaps:	gaps object
 + *
-+ * Return: The acquired resource
++ * Return: @gaps is returned
 + */
-+struct kdbus_msg_resources *
-+kdbus_msg_resources_ref(struct kdbus_msg_resources *r)
++struct kdbus_gaps *kdbus_gaps_ref(struct kdbus_gaps *gaps)
 +{
-+	if (r)
-+		kref_get(&r->kref);
-+	return r;
++	if (gaps)
++		kref_get(&gaps->kref);
++	return gaps;
 +}
 +
 +/**
-+ * kdbus_msg_resources_unref() - Drop reference to msg resources
-+ * @r:		resources to drop reference of
++ * kdbus_gaps_unref() - drop reference
++ * @gaps:	gaps object
 + *
 + * Return: NULL
 + */
-+struct kdbus_msg_resources *
-+kdbus_msg_resources_unref(struct kdbus_msg_resources *r)
++struct kdbus_gaps *kdbus_gaps_unref(struct kdbus_gaps *gaps)
 +{
-+	if (r)
-+		kref_put(&r->kref, __kdbus_msg_resources_free);
++	if (gaps)
++		kref_put(&gaps->kref, kdbus_gaps_free);
 +	return NULL;
 +}
 +
 +/**
-+ * kdbus_kmsg_free() - free allocated message
-+ * @kmsg:		Message
++ * kdbus_gaps_install() - install file-descriptors
++ * @gaps:		gaps object, or NULL
++ * @slice:		pool slice that contains the message
++ * @out_incomplete:	output variable to note incomplete fds
++ *
++ * This function installs all file-descriptors of @gaps into the current
++ * process and copies the file-descriptor numbers into the target pool slice.
++ *
++ * If the file-descriptors were only partially installed, then @out_incomplete
++ * will be set to true. Otherwise, it's set to false.
++ * ++ * Return: 0 on success, negative error code on failure + */ -+void kdbus_kmsg_free(struct kdbus_kmsg *kmsg) ++int kdbus_gaps_install(struct kdbus_gaps *gaps, struct kdbus_pool_slice *slice, ++ bool *out_incomplete) +{ -+ if (!kmsg) -+ return; ++ bool incomplete_fds = false; ++ struct kvec kvec; ++ size_t i, n_fds; ++ int ret, *fds; + -+ kdbus_msg_resources_unref(kmsg->res); -+ kdbus_meta_conn_unref(kmsg->conn_meta); -+ kdbus_meta_proc_unref(kmsg->proc_meta); -+ kfree(kmsg->iov); -+ kfree(kmsg); -+} ++ if (!gaps) { ++ /* nothing to do */ ++ *out_incomplete = incomplete_fds; ++ return 0; ++ } + -+/** -+ * kdbus_kmsg_new() - allocate message -+ * @bus: Bus this message is allocated on -+ * @extra_size: Additional size to reserve for data -+ * -+ * Return: new kdbus_kmsg on success, ERR_PTR on failure. -+ */ -+struct kdbus_kmsg *kdbus_kmsg_new(struct kdbus_bus *bus, size_t extra_size) -+{ -+ struct kdbus_kmsg *m; -+ size_t size; -+ int ret; ++ n_fds = gaps->n_fds + gaps->n_memfds; ++ if (n_fds < 1) { ++ /* nothing to do */ ++ *out_incomplete = incomplete_fds; ++ return 0; ++ } + -+ size = sizeof(struct kdbus_kmsg) + KDBUS_ITEM_SIZE(extra_size); -+ m = kzalloc(size, GFP_KERNEL); -+ if (!m) -+ return ERR_PTR(-ENOMEM); ++ fds = kmalloc_array(n_fds, sizeof(*fds), GFP_TEMPORARY); ++ n_fds = 0; ++ if (!fds) ++ return -ENOMEM; + -+ m->seq = atomic64_inc_return(&bus->domain->last_id); -+ m->msg.size = size - KDBUS_KMSG_HEADER_SIZE; -+ m->msg.items[0].size = KDBUS_ITEM_SIZE(extra_size); ++ /* 1) allocate fds and copy them over */ + -+ m->proc_meta = kdbus_meta_proc_new(); -+ if (IS_ERR(m->proc_meta)) { -+ ret = PTR_ERR(m->proc_meta); -+ m->proc_meta = NULL; -+ goto exit; ++ if (gaps->n_fds > 0) { ++ for (i = 0; i < gaps->n_fds; ++i) { ++ int fd; ++ ++ fd = get_unused_fd_flags(O_CLOEXEC); ++ if (fd < 0) ++ incomplete_fds = true; ++ ++ WARN_ON(!gaps->fd_files[i]); ++ ++ fds[n_fds++] = fd < 0 ? 
-1 : fd; ++ } ++ ++ /* ++ * The file-descriptor array can only be present once per ++ * message. Hence, prepare all fds and then copy them over with ++ * a single kvec. ++ */ ++ ++ WARN_ON(!gaps->fd_offset); ++ ++ kvec.iov_base = fds; ++ kvec.iov_len = gaps->n_fds * sizeof(*fds); ++ ret = kdbus_pool_slice_copy_kvec(slice, gaps->fd_offset, ++ &kvec, 1, kvec.iov_len); ++ if (ret < 0) ++ goto exit; + } + -+ m->conn_meta = kdbus_meta_conn_new(); -+ if (IS_ERR(m->conn_meta)) { -+ ret = PTR_ERR(m->conn_meta); -+ m->conn_meta = NULL; -+ goto exit; ++ for (i = 0; i < gaps->n_memfds; ++i) { ++ int memfd; ++ ++ memfd = get_unused_fd_flags(O_CLOEXEC); ++ if (memfd < 0) { ++ incomplete_fds = true; ++ /* memfds are initialized to -1, skip copying it */ ++ continue; ++ } ++ ++ fds[n_fds++] = memfd; ++ ++ /* ++ * memfds have to be copied individually as they each are put ++ * into a separate item. This should not be an issue, though, ++ * as usually there is no need to send more than one memfd per ++ * message. 
++ */ ++ ++ WARN_ON(!gaps->memfd_offsets[i]); ++ WARN_ON(!gaps->memfd_files[i]); ++ ++ kvec.iov_base = &memfd; ++ kvec.iov_len = sizeof(memfd); ++ ret = kdbus_pool_slice_copy_kvec(slice, gaps->memfd_offsets[i], ++ &kvec, 1, kvec.iov_len); ++ if (ret < 0) ++ goto exit; + } + -+ return m; ++ /* 2) install fds now that everything was successful */ ++ ++ for (i = 0; i < gaps->n_fds; ++i) ++ if (fds[i] >= 0) ++ fd_install(fds[i], get_file(gaps->fd_files[i])); ++ for (i = 0; i < gaps->n_memfds; ++i) ++ if (fds[gaps->n_fds + i] >= 0) ++ fd_install(fds[gaps->n_fds + i], ++ get_file(gaps->memfd_files[i])); ++ ++ ret = 0; + +exit: -+ kdbus_kmsg_free(m); -+ return ERR_PTR(ret); ++ if (ret < 0) ++ for (i = 0; i < n_fds; ++i) ++ put_unused_fd(fds[i]); ++ kfree(fds); ++ *out_incomplete = incomplete_fds; ++ return ret; +} + -+static int kdbus_handle_check_file(struct file *file) ++static struct file *kdbus_get_fd(int fd) +{ -+ struct inode *inode = file_inode(file); ++ struct file *f, *ret; ++ struct inode *inode; + struct socket *sock; + -+ /* -+ * Don't allow file descriptors in the transport that themselves allow -+ * file descriptor queueing. This will eventually be allowed once both -+ * unix domain sockets and kdbus share a generic garbage collector. -+ */ ++ if (fd < 0) ++ return ERR_PTR(-EBADF); + -+ if (file->f_op == &kdbus_handle_ops) -+ return -EOPNOTSUPP; ++ f = fget_raw(fd); ++ if (!f) ++ return ERR_PTR(-EBADF); + -+ if (!S_ISSOCK(inode->i_mode)) -+ return 0; ++ inode = file_inode(f); ++ sock = S_ISSOCK(inode->i_mode) ? 
SOCKET_I(inode) : NULL;
 +
-+	if (file->f_mode & FMODE_PATH)
-+		return 0;
++	if (f->f_mode & FMODE_PATH)
++		ret = f; /* O_PATH is always allowed */
++	else if (f->f_op == &kdbus_handle_ops)
++		ret = ERR_PTR(-EOPNOTSUPP); /* disallow kdbus-fd over kdbus */
++	else if (sock && sock->sk && sock->ops && sock->ops->family == PF_UNIX)
++		ret = ERR_PTR(-EOPNOTSUPP); /* disallow UDS over kdbus */
++	else
++		ret = f; /* all others are allowed */
 +
-+	sock = SOCKET_I(inode);
-+	if (sock->sk && sock->ops && sock->ops->family == PF_UNIX)
-+		return -EOPNOTSUPP;
++	if (f != ret)
++		fput(f);
 +
-+	return 0;
++	return ret;
 +}
 +
-+static const char * const zeros = "\0\0\0\0\0\0\0";
++static struct file *kdbus_get_memfd(const struct kdbus_memfd *memfd)
++{
++	const int m = F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE | F_SEAL_SEAL;
++	struct file *f, *ret;
++	int s;
 +
-+/*
-+ * kdbus_msg_scan_items() - validate incoming data and prepare parsing
-+ * @kmsg:		Message
-+ * @bus:		Bus the message is sent over
-+ *
-+ * Return: 0 on success, negative errno on failure.
-+ *
-+ * Files references in MEMFD or FDS items are pinned.
-+ * -+ * On errors, the caller should drop any taken reference with -+ * kdbus_kmsg_free() -+ */ -+static int kdbus_msg_scan_items(struct kdbus_kmsg *kmsg, -+ struct kdbus_bus *bus) ++ if (memfd->fd < 0) ++ return ERR_PTR(-EBADF); ++ ++ f = fget(memfd->fd); ++ if (!f) ++ return ERR_PTR(-EBADF); ++ ++ s = shmem_get_seals(f); ++ if (s < 0) ++ ret = ERR_PTR(-EMEDIUMTYPE); ++ else if ((s & m) != m) ++ ret = ERR_PTR(-ETXTBSY); ++ else if (memfd->start + memfd->size > (u64)i_size_read(file_inode(f))) ++ ret = ERR_PTR(-EFAULT); ++ else ++ ret = f; ++ ++ if (f != ret) ++ fput(f); ++ ++ return ret; ++} ++ ++static int kdbus_msg_examine(struct kdbus_msg *msg, struct kdbus_bus *bus, ++ struct kdbus_cmd_send *cmd, size_t *out_n_memfds, ++ size_t *out_n_fds, size_t *out_n_parts) +{ -+ struct kdbus_msg_resources *res = kmsg->res; -+ const struct kdbus_msg *msg = &kmsg->msg; -+ const struct kdbus_item *item; -+ size_t n, n_vecs, n_memfds; -+ bool has_bloom = false; -+ bool has_name = false; -+ bool has_fds = false; -+ bool is_broadcast; -+ bool is_signal; -+ u64 vec_size; -+ -+ is_broadcast = (msg->dst_id == KDBUS_DST_ID_BROADCAST); -+ is_signal = !!(msg->flags & KDBUS_MSG_SIGNAL); -+ -+ /* count data payloads */ -+ n_vecs = 0; -+ n_memfds = 0; -+ KDBUS_ITEMS_FOREACH(item, msg->items, KDBUS_ITEMS_SIZE(msg, items)) { -+ switch (item->type) { -+ case KDBUS_ITEM_PAYLOAD_VEC: -+ ++n_vecs; -+ break; -+ case KDBUS_ITEM_PAYLOAD_MEMFD: -+ ++n_memfds; -+ if (item->memfd.size % 8) -+ ++n_vecs; -+ break; -+ default: -+ break; -+ } -+ } ++ struct kdbus_item *item, *fds = NULL, *bloom = NULL, *dstname = NULL; ++ u64 n_parts, n_memfds, n_fds, vec_size; + -+ n = n_vecs + n_memfds; -+ if (n > 0) { -+ res->data = kcalloc(n, sizeof(*res->data), GFP_KERNEL); -+ if (!res->data) -+ return -ENOMEM; ++ /* ++ * Step 1: ++ * Validate the message and command parameters. 
++ */ ++ ++ /* KDBUS_PAYLOAD_KERNEL is reserved to kernel messages */ ++ if (msg->payload_type == KDBUS_PAYLOAD_KERNEL) ++ return -EINVAL; ++ ++ if (msg->dst_id == KDBUS_DST_ID_BROADCAST) { ++ /* broadcasts must be marked as signals */ ++ if (!(msg->flags & KDBUS_MSG_SIGNAL)) ++ return -EBADMSG; ++ /* broadcasts cannot have timeouts */ ++ if (msg->timeout_ns > 0) ++ return -ENOTUNIQ; + } + -+ if (n_vecs > 0) { -+ kmsg->iov = kcalloc(n_vecs, sizeof(*kmsg->iov), GFP_KERNEL); -+ if (!kmsg->iov) -+ return -ENOMEM; ++ if (msg->flags & KDBUS_MSG_EXPECT_REPLY) { ++ /* if you expect a reply, you must specify a timeout */ ++ if (msg->timeout_ns == 0) ++ return -EINVAL; ++ /* signals cannot have replies */ ++ if (msg->flags & KDBUS_MSG_SIGNAL) ++ return -ENOTUNIQ; ++ } else { ++ /* must expect reply if sent as synchronous call */ ++ if (cmd->flags & KDBUS_SEND_SYNC_REPLY) ++ return -EINVAL; ++ /* cannot mark replies as signal */ ++ if (msg->cookie_reply && (msg->flags & KDBUS_MSG_SIGNAL)) ++ return -EINVAL; + } + -+ /* import data payloads */ -+ n = 0; -+ vec_size = 0; -+ KDBUS_ITEMS_FOREACH(item, msg->items, KDBUS_ITEMS_SIZE(msg, items)) { -+ size_t payload_size = KDBUS_ITEM_PAYLOAD_SIZE(item); -+ struct iovec *iov = kmsg->iov + kmsg->iov_count; ++ /* ++ * Step 2: ++ * Validate all passed items. While at it, select some statistics that ++ * are required to allocate state objects later on. ++ * ++ * Generic item validation has already been done via ++ * kdbus_item_validate(). Furthermore, the number of items is naturally ++ * limited by the maximum message size. Hence, only non-generic item ++ * checks are performed here (mainly integer overflow tests). 
++ */ + -+ if (++n > KDBUS_MSG_MAX_ITEMS) -+ return -E2BIG; ++ n_parts = 0; ++ n_memfds = 0; ++ n_fds = 0; ++ vec_size = 0; + ++ KDBUS_ITEMS_FOREACH(item, msg->items, KDBUS_ITEMS_SIZE(msg, items)) { + switch (item->type) { + case KDBUS_ITEM_PAYLOAD_VEC: { -+ struct kdbus_msg_data *d = res->data + res->data_count; + void __force __user *ptr = KDBUS_PTR(item->vec.address); -+ size_t size = item->vec.size; ++ u64 size = item->vec.size; + + if (vec_size + size < vec_size) + return -EMSGSIZE; + if (vec_size + size > KDBUS_MSG_MAX_PAYLOAD_VEC_SIZE) + return -EMSGSIZE; ++ if (ptr && unlikely(!access_ok(VERIFY_READ, ptr, size))) ++ return -EFAULT; + -+ d->type = KDBUS_MSG_DATA_VEC; -+ d->size = size; -+ -+ if (ptr) { -+ if (unlikely(!access_ok(VERIFY_READ, ptr, -+ size))) -+ return -EFAULT; -+ -+ d->vec.off = kmsg->pool_size; -+ iov->iov_base = ptr; -+ iov->iov_len = size; -+ } else { -+ d->vec.off = ~0ULL; -+ iov->iov_base = (char __user *)zeros; -+ iov->iov_len = size % 8; -+ } -+ -+ if (kmsg->pool_size + iov->iov_len < kmsg->pool_size) -+ return -EMSGSIZE; -+ -+ kmsg->pool_size += iov->iov_len; -+ ++kmsg->iov_count; -+ ++res->vec_count; -+ ++res->data_count; -+ vec_size += size; -+ ++ if (ptr || size % 8) /* data or padding */ ++ ++n_parts; + break; + } -+ + case KDBUS_ITEM_PAYLOAD_MEMFD: { -+ struct kdbus_msg_data *d = res->data + res->data_count; + u64 start = item->memfd.start; + u64 size = item->memfd.size; -+ size_t pad = size % 8; -+ int seals, mask; -+ struct file *f; + -+ if (kmsg->pool_size + size % 8 < kmsg->pool_size) -+ return -EMSGSIZE; + if (start + size < start) + return -EMSGSIZE; -+ -+ if (item->memfd.fd < 0) -+ return -EBADF; -+ -+ if (res->memfd_count >= KDBUS_MSG_MAX_MEMFD_ITEMS) ++ if (n_memfds >= KDBUS_MSG_MAX_MEMFD_ITEMS) + return -E2BIG; + -+ f = fget(item->memfd.fd); -+ if (!f) -+ return -EBADF; -+ -+ if (pad) { -+ iov->iov_base = (char __user *)zeros; -+ iov->iov_len = pad; ++ ++n_memfds; ++ if (size % 8) /* vec-padding required */ ++ 
++n_parts; ++ break; ++ } ++ case KDBUS_ITEM_FDS: { ++ if (fds) ++ return -EEXIST; + -+ kmsg->pool_size += pad; -+ ++kmsg->iov_count; -+ } ++ fds = item; ++ n_fds = KDBUS_ITEM_PAYLOAD_SIZE(item) / sizeof(int); ++ if (n_fds > KDBUS_CONN_MAX_FDS_PER_USER) ++ return -EMFILE; + -+ ++res->data_count; -+ ++res->memfd_count; ++ break; ++ } ++ case KDBUS_ITEM_BLOOM_FILTER: { ++ u64 bloom_size; + -+ d->type = KDBUS_MSG_DATA_MEMFD; -+ d->size = size; -+ d->memfd.start = start; -+ d->memfd.file = f; ++ if (bloom) ++ return -EEXIST; + -+ /* -+ * We only accept a sealed memfd file whose content -+ * cannot be altered by the sender or anybody else -+ * while it is shared or in-flight. Other files need -+ * to be passed with KDBUS_MSG_FDS. -+ */ -+ seals = shmem_get_seals(f); -+ if (seals < 0) -+ return -EMEDIUMTYPE; ++ bloom = item; ++ bloom_size = KDBUS_ITEM_PAYLOAD_SIZE(item) - ++ offsetof(struct kdbus_bloom_filter, data); ++ if (!KDBUS_IS_ALIGNED8(bloom_size)) ++ return -EFAULT; ++ if (bloom_size != bus->bloom.size) ++ return -EDOM; + -+ mask = F_SEAL_SHRINK | F_SEAL_GROW | -+ F_SEAL_WRITE | F_SEAL_SEAL; -+ if ((seals & mask) != mask) -+ return -ETXTBSY; ++ break; ++ } ++ case KDBUS_ITEM_DST_NAME: { ++ if (dstname) ++ return -EEXIST; + -+ if (start + size > (u64)i_size_read(file_inode(f))) -+ return -EBADF; ++ dstname = item; ++ if (!kdbus_name_is_valid(item->str, false)) ++ return -EINVAL; ++ if (msg->dst_id == KDBUS_DST_ID_BROADCAST) ++ return -EBADMSG; + + break; + } ++ default: ++ return -EINVAL; ++ } ++ } + -+ case KDBUS_ITEM_FDS: { -+ unsigned int i; -+ unsigned int fds_count = payload_size / sizeof(int); ++ /* ++ * Step 3: ++ * Validate that required items were actually passed, and that no item ++ * contradicts the message flags. 
++ */ + -+ /* do not allow multiple fd arrays */ -+ if (has_fds) -+ return -EEXIST; -+ has_fds = true; ++ /* bloom filters must be attached _iff_ it's a signal */ ++ if (!(msg->flags & KDBUS_MSG_SIGNAL) != !bloom) ++ return -EBADMSG; ++ /* destination name is required if no ID is given */ ++ if (msg->dst_id == KDBUS_DST_ID_NAME && !dstname) ++ return -EDESTADDRREQ; ++ /* cannot send file-descriptors attached to broadcasts */ ++ if (msg->dst_id == KDBUS_DST_ID_BROADCAST && fds) ++ return -ENOTUNIQ; + -+ /* Do not allow to broadcast file descriptors */ -+ if (is_broadcast) -+ return -ENOTUNIQ; ++ *out_n_memfds = n_memfds; ++ *out_n_fds = n_fds; ++ *out_n_parts = n_parts; + -+ if (fds_count > KDBUS_CONN_MAX_FDS_PER_USER) -+ return -EMFILE; ++ return 0; ++} + -+ res->fds = kcalloc(fds_count, sizeof(struct file *), -+ GFP_KERNEL); -+ if (!res->fds) -+ return -ENOMEM; ++static bool kdbus_staging_merge_vecs(struct kdbus_staging *staging, ++ struct kdbus_item **prev_item, ++ struct iovec **prev_vec, ++ const struct kdbus_item *merge) ++{ ++ void __user *ptr = (void __user *)KDBUS_PTR(merge->vec.address); ++ u64 padding = merge->vec.size % 8; ++ struct kdbus_item *prev = *prev_item; ++ struct iovec *vec = *prev_vec; + -+ for (i = 0; i < fds_count; i++) { -+ int fd = item->fds[i]; -+ int ret; ++ /* XXX: merging is disabled so far */ ++ if (0 && prev && prev->type == KDBUS_ITEM_PAYLOAD_OFF && ++ !merge->vec.address == !prev->vec.address) { ++ /* ++ * If we merge two VECs, we can always drop the second ++ * PAYLOAD_VEC item. Hence, include its size in the previous ++ * one. ++ */ ++ prev->vec.size += merge->vec.size; + -+ /* -+ * Verify the fd and increment the usage count. -+ * Use fget_raw() to allow passing O_PATH fds. -+ */ -+ if (fd < 0) -+ return -EBADF; ++ if (ptr) { ++ /* ++ * If we merge two data VECs, we need two iovecs to copy ++ * the data. But the items can be easily merged by ++ * summing their lengths. 
++ */ ++ vec = &staging->parts[staging->n_parts++]; ++ vec->iov_len = merge->vec.size; ++ vec->iov_base = ptr; ++ staging->n_payload += vec->iov_len; ++ } else if (padding) { ++ /* ++ * If we merge two 0-vecs with the second 0-vec ++ * requiring padding, we need to insert an iovec to copy ++ * the 0-padding. We try merging it with the previous ++ * 0-padding iovec. This might end up with an ++ * iov_len==0, in which case we simply drop the iovec. ++ */ ++ if (vec) { ++ staging->n_payload -= vec->iov_len; ++ vec->iov_len = prev->vec.size % 8; ++ if (!vec->iov_len) { ++ --staging->n_parts; ++ vec = NULL; ++ } else { ++ staging->n_payload += vec->iov_len; ++ } ++ } else { ++ vec = &staging->parts[staging->n_parts++]; ++ vec->iov_len = padding; ++ vec->iov_base = (char __user *)zeros; ++ staging->n_payload += vec->iov_len; ++ } ++ } else { ++ /* ++ * If we merge two 0-vecs with the second 0-vec having ++ * no padding, we know the padding of the first stays ++ * the same. Hence, @vec needs no adjustment. ++ */ ++ } ++ ++ /* successfully merged with previous item */ ++ merge = prev; ++ } else { ++ /* ++ * If we cannot merge the payload item with the previous one, ++ * we simply insert a new iovec for the data/padding. 
++ */ ++ if (ptr) { ++ vec = &staging->parts[staging->n_parts++]; ++ vec->iov_len = merge->vec.size; ++ vec->iov_base = ptr; ++ staging->n_payload += vec->iov_len; ++ } else if (padding) { ++ vec = &staging->parts[staging->n_parts++]; ++ vec->iov_len = padding; ++ vec->iov_base = (char __user *)zeros; ++ staging->n_payload += vec->iov_len; ++ } else { ++ vec = NULL; ++ } ++ } + -+ res->fds[i] = fget_raw(fd); -+ if (!res->fds[i]) -+ return -EBADF; ++ *prev_item = (struct kdbus_item *)merge; ++ *prev_vec = vec; + -+ res->fds_count++; ++ return merge == prev; ++} + -+ ret = kdbus_handle_check_file(res->fds[i]); -+ if (ret < 0) -+ return ret; ++static int kdbus_staging_import(struct kdbus_staging *staging) ++{ ++ struct kdbus_item *it, *item, *last, *prev_payload; ++ struct kdbus_gaps *gaps = staging->gaps; ++ struct kdbus_msg *msg = staging->msg; ++ struct iovec *part, *prev_part; ++ bool drop_item; ++ ++ drop_item = false; ++ last = NULL; ++ prev_payload = NULL; ++ prev_part = NULL; ++ ++ /* ++ * We modify msg->items along the way; make sure to use @item as offset ++ * to the next item (instead of the iterator @it). ++ */ ++ for (it = item = msg->items; ++ it >= msg->items && ++ (u8 *)it < (u8 *)msg + msg->size && ++ (u8 *)it + it->size <= (u8 *)msg + msg->size; ) { ++ /* ++ * If we dropped items along the way, move current item to ++ * front. We must not access @it afterwards, but use @item ++ * instead! 
++ */ ++ if (it != item) ++ memmove(item, it, it->size); ++ it = (void *)((u8 *)it + KDBUS_ALIGN8(item->size)); ++ ++ switch (item->type) { ++ case KDBUS_ITEM_PAYLOAD_VEC: { ++ size_t offset = staging->n_payload; ++ ++ if (kdbus_staging_merge_vecs(staging, &prev_payload, ++ &prev_part, item)) { ++ drop_item = true; ++ } else if (item->vec.address) { ++ /* real offset is patched later on */ ++ item->type = KDBUS_ITEM_PAYLOAD_OFF; ++ item->vec.offset = offset; ++ } else { ++ item->type = KDBUS_ITEM_PAYLOAD_OFF; ++ item->vec.offset = ~0ULL; + } + + break; + } ++ case KDBUS_ITEM_PAYLOAD_MEMFD: { ++ struct file *f; + -+ case KDBUS_ITEM_BLOOM_FILTER: { -+ u64 bloom_size; ++ f = kdbus_get_memfd(&item->memfd); ++ if (IS_ERR(f)) ++ return PTR_ERR(f); ++ ++ gaps->memfd_files[gaps->n_memfds] = f; ++ gaps->memfd_offsets[gaps->n_memfds] = ++ (u8 *)&item->memfd.fd - (u8 *)msg; ++ ++gaps->n_memfds; ++ ++ /* memfds cannot be merged */ ++ prev_payload = item; ++ prev_part = NULL; ++ ++ /* insert padding to make following VECs aligned */ ++ if (item->memfd.size % 8) { ++ part = &staging->parts[staging->n_parts++]; ++ part->iov_len = item->memfd.size % 8; ++ part->iov_base = (char __user *)zeros; ++ staging->n_payload += part->iov_len; ++ } + -+ /* do not allow multiple bloom filters */ -+ if (has_bloom) -+ return -EEXIST; -+ has_bloom = true; ++ break; ++ } ++ case KDBUS_ITEM_FDS: { ++ size_t i, n_fds; + -+ bloom_size = payload_size - -+ offsetof(struct kdbus_bloom_filter, data); ++ n_fds = KDBUS_ITEM_PAYLOAD_SIZE(item) / sizeof(int); ++ for (i = 0; i < n_fds; ++i) { ++ struct file *f; + -+ /* -+ * Allow only bloom filter sizes of a multiple of 64bit. 
-+ */ -+ if (!KDBUS_IS_ALIGNED8(bloom_size)) -+ return -EFAULT; ++ f = kdbus_get_fd(item->fds[i]); ++ if (IS_ERR(f)) ++ return PTR_ERR(f); + -+ /* do not allow mismatching bloom filter sizes */ -+ if (bloom_size != bus->bloom.size) -+ return -EDOM; ++ gaps->fd_files[gaps->n_fds++] = f; ++ } ++ ++ gaps->fd_offset = (u8 *)item->fds - (u8 *)msg; + -+ kmsg->bloom_filter = &item->bloom_filter; + break; + } -+ ++ case KDBUS_ITEM_BLOOM_FILTER: ++ staging->bloom_filter = &item->bloom_filter; ++ break; + case KDBUS_ITEM_DST_NAME: -+ /* do not allow multiple names */ -+ if (has_name) -+ return -EEXIST; -+ has_name = true; -+ -+ if (!kdbus_name_is_valid(item->str, false)) -+ return -EINVAL; -+ -+ res->dst_name = kstrdup(item->str, GFP_KERNEL); -+ if (!res->dst_name) -+ return -ENOMEM; ++ staging->dst_name = item->str; + break; ++ } + -+ default: -+ return -EINVAL; ++ /* drop item if we merged it with a previous one */ ++ if (drop_item) { ++ drop_item = false; ++ } else { ++ last = item; ++ item = KDBUS_ITEM_NEXT(item); + } + } + -+ /* name is needed if no ID is given */ -+ if (msg->dst_id == KDBUS_DST_ID_NAME && !has_name) -+ return -EDESTADDRREQ; ++ /* adjust message size regarding dropped items */ ++ msg->size = offsetof(struct kdbus_msg, items); ++ if (last) ++ msg->size += ((u8 *)last - (u8 *)msg->items) + last->size; + -+ if (is_broadcast) { -+ /* Broadcasts can't take names */ -+ if (has_name) -+ return -EBADMSG; ++ return 0; ++} + -+ /* All broadcasts have to be signals */ -+ if (!is_signal) -+ return -EBADMSG; ++static void kdbus_staging_reserve(struct kdbus_staging *staging) ++{ ++ struct iovec *part; + -+ /* Timeouts are not allowed for broadcasts */ -+ if (msg->timeout_ns > 0) -+ return -ENOTUNIQ; ++ part = &staging->parts[staging->n_parts++]; ++ part->iov_base = (void __user *)zeros; ++ part->iov_len = 0; ++} ++ ++static struct kdbus_staging *kdbus_staging_new(struct kdbus_bus *bus, ++ size_t n_parts, ++ size_t msg_extra_size) ++{ ++ const size_t reserved_parts = 
5; /* see below for explanation */ ++ struct kdbus_staging *staging; ++ int ret; ++ ++ n_parts += reserved_parts; ++ ++ staging = kzalloc(sizeof(*staging) + n_parts * sizeof(*staging->parts) + ++ msg_extra_size, GFP_TEMPORARY); ++ if (!staging) ++ return ERR_PTR(-ENOMEM); ++ ++ staging->msg_seqnum = atomic64_inc_return(&bus->domain->last_id); ++ staging->n_parts = 0; /* we reserve n_parts, but don't enforce them */ ++ staging->parts = (void *)(staging + 1); ++ ++ if (msg_extra_size) /* if requested, allocate message, too */ ++ staging->msg = (void *)((u8 *)staging->parts + ++ n_parts * sizeof(*staging->parts)); ++ ++ staging->meta_proc = kdbus_meta_proc_new(); ++ if (IS_ERR(staging->meta_proc)) { ++ ret = PTR_ERR(staging->meta_proc); ++ staging->meta_proc = NULL; ++ goto error; ++ } ++ ++ staging->meta_conn = kdbus_meta_conn_new(); ++ if (IS_ERR(staging->meta_conn)) { ++ ret = PTR_ERR(staging->meta_conn); ++ staging->meta_conn = NULL; ++ goto error; + } + + /* -+ * Signal messages require a bloom filter, and bloom filters are -+ * only valid with signals. ++ * Prepare iovecs to copy the message into the target pool. We use the ++ * following iovecs: ++ * * iovec to copy "kdbus_msg.size" ++ * * iovec to copy "struct kdbus_msg" (minus size) plus items ++ * * iovec for possible padding after the items ++ * * iovec for metadata items ++ * * iovec for possible padding after the items ++ * ++ * Make sure to update @reserved_parts if you add more parts here. 
+ */ -+ if (is_signal ^ has_bloom) -+ return -EBADMSG; + -+ return 0; ++ kdbus_staging_reserve(staging); /* msg.size */ ++ kdbus_staging_reserve(staging); /* msg (minus msg.size) plus items */ ++ kdbus_staging_reserve(staging); /* msg padding */ ++ kdbus_staging_reserve(staging); /* meta */ ++ kdbus_staging_reserve(staging); /* meta padding */ ++ ++ return staging; ++ ++error: ++ kdbus_staging_free(staging); ++ return ERR_PTR(ret); +} + -+/** -+ * kdbus_kmsg_new_from_cmd() - create kernel message from send payload -+ * @conn: Connection -+ * @cmd_send: Payload of KDBUS_CMD_SEND -+ * -+ * Return: a new kdbus_kmsg on success, ERR_PTR on failure. -+ */ -+struct kdbus_kmsg *kdbus_kmsg_new_from_cmd(struct kdbus_conn *conn, -+ struct kdbus_cmd_send *cmd_send) ++struct kdbus_staging *kdbus_staging_new_kernel(struct kdbus_bus *bus, ++ u64 dst, u64 cookie_timeout, ++ size_t it_size, size_t it_type) +{ -+ struct kdbus_kmsg *m; -+ u64 size; ++ struct kdbus_staging *staging; ++ size_t size; ++ ++ size = offsetof(struct kdbus_msg, items) + ++ KDBUS_ITEM_HEADER_SIZE + it_size; ++ ++ staging = kdbus_staging_new(bus, 0, KDBUS_ALIGN8(size)); ++ if (IS_ERR(staging)) ++ return ERR_CAST(staging); ++ ++ staging->msg->size = size; ++ staging->msg->flags = (dst == KDBUS_DST_ID_BROADCAST) ? 
++ KDBUS_MSG_SIGNAL : 0; ++ staging->msg->dst_id = dst; ++ staging->msg->src_id = KDBUS_SRC_ID_KERNEL; ++ staging->msg->payload_type = KDBUS_PAYLOAD_KERNEL; ++ staging->msg->cookie_reply = cookie_timeout; ++ staging->notify = staging->msg->items; ++ staging->notify->size = KDBUS_ITEM_HEADER_SIZE + it_size; ++ staging->notify->type = it_type; ++ ++ return staging; ++} ++ ++struct kdbus_staging *kdbus_staging_new_user(struct kdbus_bus *bus, ++ struct kdbus_cmd_send *cmd, ++ struct kdbus_msg *msg) ++{ ++ const size_t reserved_parts = 1; /* see below for explanation */ ++ size_t n_memfds, n_fds, n_parts; ++ struct kdbus_staging *staging; + int ret; + -+ ret = kdbus_copy_from_user(&size, KDBUS_PTR(cmd_send->msg_address), -+ sizeof(size)); ++ /* ++ * Examine user-supplied message and figure out how many resources we ++ * need to allocate in our staging area. This requires us to iterate ++ * the message twice, but saves us from re-allocating our resources ++ * all the time. ++ */ ++ ++ ret = kdbus_msg_examine(msg, bus, cmd, &n_memfds, &n_fds, &n_parts); + if (ret < 0) + return ERR_PTR(ret); + -+ if (size < sizeof(struct kdbus_msg) || size > KDBUS_MSG_MAX_SIZE) -+ return ERR_PTR(-EINVAL); ++ n_parts += reserved_parts; + -+ m = kmalloc(size + KDBUS_KMSG_HEADER_SIZE, GFP_KERNEL); -+ if (!m) -+ return ERR_PTR(-ENOMEM); ++ /* ++ * Allocate staging area with the number of required resources. Make ++ * sure that we have enough iovecs for all required parts pre-allocated ++ * so this will hopefully be the only memory allocation for this ++ * message transaction. 
++ */ + -+ memset(m, 0, KDBUS_KMSG_HEADER_SIZE); -+ m->seq = atomic64_inc_return(&conn->ep->bus->domain->last_id); ++ staging = kdbus_staging_new(bus, n_parts, 0); ++ if (IS_ERR(staging)) ++ return ERR_CAST(staging); + -+ m->proc_meta = kdbus_meta_proc_new(); -+ if (IS_ERR(m->proc_meta)) { -+ ret = PTR_ERR(m->proc_meta); -+ m->proc_meta = NULL; -+ goto exit_free; -+ } ++ staging->msg = msg; + -+ m->conn_meta = kdbus_meta_conn_new(); -+ if (IS_ERR(m->conn_meta)) { -+ ret = PTR_ERR(m->conn_meta); -+ m->conn_meta = NULL; -+ goto exit_free; -+ } ++ /* ++ * If the message contains memfds or fd items, we need to remember some ++ * state so we can fill in the requested information at RECV time. ++ * File-descriptors cannot be passed at SEND time. Hence, allocate a ++ * gaps-object to remember that state. That gaps object is linked to ++ * from the staging area, but will also be linked to from the message ++ * queue of each peer. Hence, each receiver owns a reference to it, and ++ * it will later be used to fill the 'gaps' in message that couldn't be ++ * filled at SEND time. ++ * Note that the 'gaps' object is read-only once the staging-allocator ++ * returns. There might be connections receiving a queued message while ++ * the sender still broadcasts the message to other receivers. ++ */ + -+ if (copy_from_user(&m->msg, KDBUS_PTR(cmd_send->msg_address), size)) { -+ ret = -EFAULT; -+ goto exit_free; ++ if (n_memfds > 0 || n_fds > 0) { ++ staging->gaps = kdbus_gaps_new(n_memfds, n_fds); ++ if (IS_ERR(staging->gaps)) { ++ ret = PTR_ERR(staging->gaps); ++ staging->gaps = NULL; ++ kdbus_staging_free(staging); ++ return ERR_PTR(ret); ++ } + } + -+ if (m->msg.size != size) { -+ ret = -EINVAL; -+ goto exit_free; -+ } ++ /* ++ * kdbus_staging_new() already reserves parts for message setup. For ++ * user-supplied messages, we add the following iovecs: ++ * ... variable number of iovecs for payload ... 
++ * * final iovec for possible padding of payload ++ * ++ * Make sure to update @reserved_parts if you add more parts here. ++ */ + -+ if (m->msg.flags & ~(KDBUS_MSG_EXPECT_REPLY | -+ KDBUS_MSG_NO_AUTO_START | -+ KDBUS_MSG_SIGNAL)) { -+ ret = -EINVAL; -+ goto exit_free; ++ ret = kdbus_staging_import(staging); /* payload */ ++ kdbus_staging_reserve(staging); /* payload padding */ ++ ++ if (ret < 0) ++ goto error; ++ ++ return staging; ++ ++error: ++ kdbus_staging_free(staging); ++ return ERR_PTR(ret); ++} ++ ++struct kdbus_staging *kdbus_staging_free(struct kdbus_staging *staging) ++{ ++ if (!staging) ++ return NULL; ++ ++ kdbus_meta_conn_unref(staging->meta_conn); ++ kdbus_meta_proc_unref(staging->meta_proc); ++ kdbus_gaps_unref(staging->gaps); ++ kfree(staging); ++ ++ return NULL; ++} ++ ++static int kdbus_staging_collect_metadata(struct kdbus_staging *staging, ++ struct kdbus_conn *src, ++ struct kdbus_conn *dst, ++ u64 *out_attach) ++{ ++ u64 attach; ++ int ret; ++ ++ if (src) ++ attach = kdbus_meta_msg_mask(src, dst); ++ else ++ attach = KDBUS_ATTACH_TIMESTAMP; /* metadata for kernel msgs */ ++ ++ if (src && !src->meta_fake) { ++ ret = kdbus_meta_proc_collect(staging->meta_proc, attach); ++ if (ret < 0) ++ return ret; + } + -+ ret = kdbus_items_validate(m->msg.items, -+ KDBUS_ITEMS_SIZE(&m->msg, items)); ++ ret = kdbus_meta_conn_collect(staging->meta_conn, src, ++ staging->msg_seqnum, attach); + if (ret < 0) -+ goto exit_free; ++ return ret; + -+ m->res = kdbus_msg_resources_new(); -+ if (IS_ERR(m->res)) { -+ ret = PTR_ERR(m->res); -+ m->res = NULL; -+ goto exit_free; ++ *out_attach = attach; ++ return 0; ++} ++ ++/** ++ * kdbus_staging_emit() - emit linearized message in target pool ++ * @staging: staging object to create message from ++ * @src: sender of the message (or NULL) ++ * @dst: target connection to allocate message for ++ * ++ * This allocates a pool-slice for @dst and copies the message provided by ++ * @staging into it. 
The new slice is then returned to the caller for further ++ * processing. It's not linked into any queue, yet. ++ * ++ * Return: Newly allocated slice or ERR_PTR on failure. ++ */ ++struct kdbus_pool_slice *kdbus_staging_emit(struct kdbus_staging *staging, ++ struct kdbus_conn *src, ++ struct kdbus_conn *dst) ++{ ++ struct kdbus_item *item, *meta_items = NULL; ++ struct kdbus_pool_slice *slice = NULL; ++ size_t off, size, msg_size, meta_size; ++ struct iovec *v; ++ u64 attach; ++ int ret; ++ ++ /* ++ * Step 1: ++ * Collect metadata from @src depending on the attach-flags allowed for ++ * @dst. Translate it into the namespaces pinned by @dst. ++ */ ++ ++ ret = kdbus_staging_collect_metadata(staging, src, dst, &attach); ++ if (ret < 0) ++ goto error; ++ ++ ret = kdbus_meta_emit(staging->meta_proc, NULL, staging->meta_conn, ++ dst, attach, &meta_items, &meta_size); ++ if (ret < 0) ++ goto error; ++ ++ /* ++ * Step 2: ++ * Setup iovecs for the message. See kdbus_staging_new() for allocation ++ * of those iovecs. All reserved iovecs have been initialized with ++ * iov_len=0 + iov_base=zeros. Furthermore, the iovecs to copy the ++ * actual message payload have already been initialized and need not be ++ * touched. 
++ */ ++ ++ v = staging->parts; ++ msg_size = staging->msg->size; ++ ++ /* msg.size */ ++ v->iov_len = sizeof(msg_size); ++ v->iov_base = &msg_size; ++ ++v; ++ ++ /* msg (after msg.size) plus items */ ++ v->iov_len = staging->msg->size - sizeof(staging->msg->size); ++ v->iov_base = (void __user *)((u8 *)staging->msg + ++ sizeof(staging->msg->size)); ++ ++v; ++ ++ /* padding after msg */ ++ v->iov_len = KDBUS_ALIGN8(staging->msg->size) - staging->msg->size; ++ v->iov_base = (void __user *)zeros; ++ ++v; ++ ++ if (meta_size > 0) { ++ /* metadata items */ ++ v->iov_len = meta_size; ++ v->iov_base = meta_items; ++ ++v; ++ ++ /* padding after metadata */ ++ v->iov_len = KDBUS_ALIGN8(meta_size) - meta_size; ++ v->iov_base = (void __user *)zeros; ++ ++v; ++ ++ msg_size = KDBUS_ALIGN8(msg_size) + meta_size; ++ } else { ++ /* metadata items */ ++ v->iov_len = 0; ++ v->iov_base = (void __user *)zeros; ++ ++v; ++ ++ /* padding after metadata */ ++ v->iov_len = 0; ++ v->iov_base = (void __user *)zeros; ++ ++v; + } + -+ /* do not accept kernel-generated messages */ -+ if (m->msg.payload_type == KDBUS_PAYLOAD_KERNEL) { -+ ret = -EINVAL; -+ goto exit_free; ++ /* ... payload iovecs are already filled in ... */ ++ ++ /* compute overall size and fill in padding after payload */ ++ size = KDBUS_ALIGN8(msg_size); ++ ++ if (staging->n_payload > 0) { ++ size += staging->n_payload; ++ ++ v = &staging->parts[staging->n_parts - 1]; ++ v->iov_len = KDBUS_ALIGN8(size) - size; ++ v->iov_base = (void __user *)zeros; ++ ++ size = KDBUS_ALIGN8(size); + } + -+ if (m->msg.flags & KDBUS_MSG_EXPECT_REPLY) { -+ /* requests for replies need timeout and cookie */ -+ if (m->msg.timeout_ns == 0 || m->msg.cookie == 0) { -+ ret = -EINVAL; -+ goto exit_free; -+ } ++ /* ++ * Step 3: ++ * The PAYLOAD_OFF items in the message contain a relative 'offset' ++ * field that tells the receiver where to find the actual payload. 
This ++ * offset is relative to the start of the message, and as such depends ++ * on the size of the metadata items we inserted. This size is variable ++ * and changes for each peer we send the message to. Hence, we remember ++ * the last relative offset that was used to calculate the 'offset' ++ * fields. For each message, we re-calculate it and patch all items, in ++ * case it changed. ++ */ + -+ /* replies may not be expected for broadcasts */ -+ if (m->msg.dst_id == KDBUS_DST_ID_BROADCAST) { -+ ret = -ENOTUNIQ; -+ goto exit_free; -+ } ++ off = KDBUS_ALIGN8(msg_size); + -+ /* replies may not be expected for signals */ -+ if (m->msg.flags & KDBUS_MSG_SIGNAL) { -+ ret = -EINVAL; -+ goto exit_free; -+ } -+ } else { -+ /* -+ * KDBUS_SEND_SYNC_REPLY is only valid together with -+ * KDBUS_MSG_EXPECT_REPLY -+ */ -+ if (cmd_send->flags & KDBUS_SEND_SYNC_REPLY) { -+ ret = -EINVAL; -+ goto exit_free; -+ } ++ if (off != staging->i_payload) { ++ KDBUS_ITEMS_FOREACH(item, staging->msg->items, ++ KDBUS_ITEMS_SIZE(staging->msg, items)) { ++ if (item->type != KDBUS_ITEM_PAYLOAD_OFF) ++ continue; + -+ /* replies cannot be signals */ -+ if (m->msg.cookie_reply && (m->msg.flags & KDBUS_MSG_SIGNAL)) { -+ ret = -EINVAL; -+ goto exit_free; ++ item->vec.offset -= staging->i_payload; ++ item->vec.offset += off; + } ++ ++ staging->i_payload = off; + } + -+ ret = kdbus_msg_scan_items(m, conn->ep->bus); ++ /* ++ * Step 4: ++ * Allocate pool slice and copy over all data. Make sure to properly ++ * account on user quota. ++ */ ++ ++ ret = kdbus_conn_quota_inc(dst, src ? src->user : NULL, size, ++ staging->gaps ? 
staging->gaps->n_fds : 0); + if (ret < 0) -+ goto exit_free; ++ goto error; + -+ /* patch-in the source of this message */ -+ if (m->msg.src_id > 0 && m->msg.src_id != conn->id) { -+ ret = -EINVAL; -+ goto exit_free; ++ slice = kdbus_pool_slice_alloc(dst->pool, size, true); ++ if (IS_ERR(slice)) { ++ ret = PTR_ERR(slice); ++ slice = NULL; ++ goto error; + } -+ m->msg.src_id = conn->id; -+ -+ return m; + -+exit_free: -+ kdbus_kmsg_free(m); -+ return ERR_PTR(ret); -+} ++ WARN_ON(kdbus_pool_slice_size(slice) != size); + -+/** -+ * kdbus_kmsg_collect_metadata() - collect metadata -+ * @kmsg: message to collect metadata on -+ * @src: source connection of message -+ * @dst: destination connection of message -+ * -+ * Return: 0 on success, negative error code on failure. -+ */ -+int kdbus_kmsg_collect_metadata(struct kdbus_kmsg *kmsg, struct kdbus_conn *src, -+ struct kdbus_conn *dst) -+{ -+ u64 attach; -+ int ret; ++ ret = kdbus_pool_slice_copy_iovec(slice, 0, staging->parts, ++ staging->n_parts, size); ++ if (ret < 0) ++ goto error; + -+ attach = kdbus_meta_calc_attach_flags(src, dst); -+ if (!src->faked_meta) { -+ ret = kdbus_meta_proc_collect(kmsg->proc_meta, attach); -+ if (ret < 0) -+ return ret; -+ } ++ /* all done, return slice to caller */ ++ goto exit; + -+ return kdbus_meta_conn_collect(kmsg->conn_meta, kmsg, src, attach); ++error: ++ if (slice) ++ kdbus_conn_quota_dec(dst, src ? src->user : NULL, size, ++ staging->gaps ? 
staging->gaps->n_fds : 0); ++ kdbus_pool_slice_release(slice); ++ slice = ERR_PTR(ret); ++exit: ++ kfree(meta_items); ++ return slice; +} diff --git a/ipc/kdbus/message.h b/ipc/kdbus/message.h new file mode 100644 -index 0000000..cdaa65c +index 0000000..298f9c9 --- /dev/null +++ b/ipc/kdbus/message.h -@@ -0,0 +1,135 @@ +@@ -0,0 +1,120 @@ +/* + * Copyright (C) 2013-2015 Kay Sievers + * Copyright (C) 2013-2015 Greg Kroah-Hartman @@ -15623,131 +18073,116 @@ index 0000000..cdaa65c +#ifndef __KDBUS_MESSAGE_H +#define __KDBUS_MESSAGE_H + -+#include "util.h" -+#include "metadata.h" -+ -+/** -+ * enum kdbus_msg_data_type - Type of kdbus_msg_data payloads -+ * @KDBUS_MSG_DATA_VEC: Data vector provided by user-space -+ * @KDBUS_MSG_DATA_MEMFD: Memfd payload -+ */ -+enum kdbus_msg_data_type { -+ KDBUS_MSG_DATA_VEC, -+ KDBUS_MSG_DATA_MEMFD, -+}; -+ -+/** -+ * struct kdbus_msg_data - Data payload as stored by messages -+ * @type: Type of payload (KDBUS_MSG_DATA_*) -+ * @size: Size of the described payload -+ * @off: The offset, relative to the vec slice -+ * @start: Offset inside the memfd -+ * @file: Backing file referenced by the memfd -+ */ -+struct kdbus_msg_data { -+ unsigned int type; -+ u64 size; ++#include ++#include ++#include + -+ union { -+ struct { -+ u64 off; -+ } vec; -+ struct { -+ u64 start; -+ struct file *file; -+ } memfd; -+ }; -+}; ++struct kdbus_bus; ++struct kdbus_conn; ++struct kdbus_meta_conn; ++struct kdbus_meta_proc; ++struct kdbus_pool_slice; + +/** -+ * struct kdbus_kmsg_resources - resources of a message ++ * struct kdbus_gaps - gaps in message to be filled later + * @kref: Reference counter -+ * @dst_name: Short-cut to msg for faster lookup -+ * @fds: Array of file descriptors to pass -+ * @fds_count: Number of file descriptors to pass -+ * @data: Array of data payloads -+ * @vec_count: Number of VEC entries -+ * @memfd_count: Number of MEMFD entries in @data -+ * @data_count: Sum of @vec_count + @memfd_count -+ */ -+struct kdbus_msg_resources { ++ 
* @n_memfd_offs: Number of memfds ++ * @memfd_offs: Offsets of kdbus_memfd items in target slice ++ * @n_fds: Number of fds ++ * @fds: Array of sent fds ++ * @fds_offset: Offset of fd-array in target slice ++ * ++ * The 'gaps' object is used to track data that is needed to fill gaps in a ++ * message at RECV time. Usually, we try to compile the whole message at SEND ++ * time. This has the advantage, that we don't have to cache any information and ++ * can keep the memory consumption small. Furthermore, all copy operations can ++ * be combined into a single function call, which speeds up transactions ++ * considerably. ++ * However, things like file-descriptors can only be fully installed at RECV ++ * time. The gaps object tracks this data and pins it until a message is ++ * received. The gaps object is shared between all receivers of the same ++ * message. ++ */ ++struct kdbus_gaps { + struct kref kref; -+ const char *dst_name; + -+ struct file **fds; -+ unsigned int fds_count; ++ /* state tracking for KDBUS_ITEM_PAYLOAD_MEMFD entries */ ++ size_t n_memfds; ++ u64 *memfd_offsets; ++ struct file **memfd_files; + -+ struct kdbus_msg_data *data; -+ size_t vec_count; -+ size_t memfd_count; -+ size_t data_count; ++ /* state tracking for KDBUS_ITEM_FDS */ ++ size_t n_fds; ++ struct file **fd_files; ++ u64 fd_offset; +}; + -+struct kdbus_msg_resources * -+kdbus_msg_resources_ref(struct kdbus_msg_resources *r); -+struct kdbus_msg_resources * -+kdbus_msg_resources_unref(struct kdbus_msg_resources *r); ++struct kdbus_gaps *kdbus_gaps_ref(struct kdbus_gaps *gaps); ++struct kdbus_gaps *kdbus_gaps_unref(struct kdbus_gaps *gaps); ++int kdbus_gaps_install(struct kdbus_gaps *gaps, struct kdbus_pool_slice *slice, ++ bool *out_incomplete); + +/** -+ * struct kdbus_kmsg - internal message handling data -+ * @seq: Domain-global message sequence number -+ * @notify_type: Short-cut for faster lookup -+ * @notify_old_id: Short-cut for faster lookup -+ * @notify_new_id: Short-cut for 
faster lookup -+ * @notify_name: Short-cut for faster lookup -+ * @dst_name_id: Short-cut to msg for faster lookup -+ * @bloom_filter: Bloom filter to match message properties -+ * @bloom_generation: Generation of bloom element set -+ * @notify_entry: List of kernel-generated notifications -+ * @iov: Array of iovec, describing the payload to copy -+ * @iov_count: Number of array members in @iov -+ * @pool_size: Overall size of inlined data referenced by @iov -+ * @proc_meta: Appended SCM-like metadata of the sending process -+ * @conn_meta: Appended SCM-like metadata of the sending connection -+ * @res: Message resources -+ * @msg: Message from or to userspace -+ */ -+struct kdbus_kmsg { -+ u64 seq; -+ u64 notify_type; -+ u64 notify_old_id; -+ u64 notify_new_id; -+ const char *notify_name; -+ -+ u64 dst_name_id; -+ const struct kdbus_bloom_filter *bloom_filter; -+ u64 bloom_generation; ++ * struct kdbus_staging - staging area to import messages ++ * @msg: User-supplied message ++ * @gaps: Gaps-object created during import (or NULL if empty) ++ * @msg_seqnum: Message sequence number ++ * @notify_entry: Entry into list of kernel-generated notifications ++ * @i_payload: Current relative index of start of payload ++ * @n_payload: Total number of bytes needed for payload ++ * @n_parts: Number of parts ++ * @parts: Array of iovecs that make up the whole message ++ * @meta_proc: Process metadata of the sender (or NULL if empty) ++ * @meta_conn: Connection metadata of the sender (or NULL if empty) ++ * @bloom_filter: Pointer to the bloom-item in @msg, or NULL ++ * @dst_name: Pointer to the dst-name-item in @msg, or NULL ++ * @notify: Pointer to the notification item in @msg, or NULL ++ * ++ * The kdbus_staging object is a temporary staging area to import user-supplied ++ * messages into the kernel. It is only used during SEND and dropped once the ++ * message is queued. 
Any data that cannot be collected during SEND, is ++ * collected in a kdbus_gaps object and attached to the message queue. ++ */ ++struct kdbus_staging { ++ struct kdbus_msg *msg; ++ struct kdbus_gaps *gaps; ++ u64 msg_seqnum; + struct list_head notify_entry; + -+ struct iovec *iov; -+ size_t iov_count; -+ u64 pool_size; ++ /* crafted iovecs to copy the message */ ++ size_t i_payload; ++ size_t n_payload; ++ size_t n_parts; ++ struct iovec *parts; + -+ struct kdbus_meta_proc *proc_meta; -+ struct kdbus_meta_conn *conn_meta; -+ struct kdbus_msg_resources *res; ++ /* metadata state */ ++ struct kdbus_meta_proc *meta_proc; ++ struct kdbus_meta_conn *meta_conn; + -+ /* variable size, must be the last member */ -+ struct kdbus_msg msg; ++ /* cached pointers into @msg */ ++ const struct kdbus_bloom_filter *bloom_filter; ++ const char *dst_name; ++ struct kdbus_item *notify; +}; + -+struct kdbus_bus; -+struct kdbus_conn; -+ -+struct kdbus_kmsg *kdbus_kmsg_new(struct kdbus_bus *bus, size_t extra_size); -+struct kdbus_kmsg *kdbus_kmsg_new_from_cmd(struct kdbus_conn *conn, -+ struct kdbus_cmd_send *cmd_send); -+void kdbus_kmsg_free(struct kdbus_kmsg *kmsg); -+int kdbus_kmsg_collect_metadata(struct kdbus_kmsg *kmsg, struct kdbus_conn *src, -+ struct kdbus_conn *dst); ++struct kdbus_staging *kdbus_staging_new_kernel(struct kdbus_bus *bus, ++ u64 dst, u64 cookie_timeout, ++ size_t it_size, size_t it_type); ++struct kdbus_staging *kdbus_staging_new_user(struct kdbus_bus *bus, ++ struct kdbus_cmd_send *cmd, ++ struct kdbus_msg *msg); ++struct kdbus_staging *kdbus_staging_free(struct kdbus_staging *staging); ++struct kdbus_pool_slice *kdbus_staging_emit(struct kdbus_staging *staging, ++ struct kdbus_conn *src, ++ struct kdbus_conn *dst); + +#endif diff --git a/ipc/kdbus/metadata.c b/ipc/kdbus/metadata.c new file mode 100644 -index 0000000..c36b9cc +index 0000000..d4973a9 --- /dev/null +++ b/ipc/kdbus/metadata.c -@@ -0,0 +1,1184 @@ +@@ -0,0 +1,1342 @@ +/* + * Copyright (C) 
2013-2015 Kay Sievers + * Copyright (C) 2013-2015 Greg Kroah-Hartman @@ -15794,26 +18229,16 @@ index 0000000..c36b9cc + * @lock: Object lock + * @collected: Bitmask of collected items + * @valid: Bitmask of collected and valid items -+ * @uid: UID of process -+ * @euid: EUID of process -+ * @suid: SUID of process -+ * @fsuid: FSUID of process -+ * @gid: GID of process -+ * @egid: EGID of process -+ * @sgid: SGID of process -+ * @fsgid: FSGID of process ++ * @cred: Credentials + * @pid: PID of process + * @tgid: TGID of process + * @ppid: PPID of process -+ * @auxgrps: Auxiliary groups -+ * @n_auxgrps: Number of items in @auxgrps + * @tid_comm: TID comm line + * @pid_comm: PID comm line + * @exe_path: Executable path + * @root_path: Root-FS path + * @cmdline: Command-line + * @cgroup: Full cgroup path -+ * @cred: Credentials + * @seclabel: Seclabel + * @audit_loginuid: Audit login-UID + * @audit_sessionid: Audit session-ID @@ -15825,18 +18250,15 @@ index 0000000..c36b9cc + u64 valid; + + /* KDBUS_ITEM_CREDS */ -+ kuid_t uid, euid, suid, fsuid; -+ kgid_t gid, egid, sgid, fsgid; ++ /* KDBUS_ITEM_AUXGROUPS */ ++ /* KDBUS_ITEM_CAPS */ ++ const struct cred *cred; + + /* KDBUS_ITEM_PIDS */ + struct pid *pid; + struct pid *tgid; + struct pid *ppid; + -+ /* KDBUS_ITEM_AUXGROUPS */ -+ kgid_t *auxgrps; -+ size_t n_auxgrps; -+ + /* KDBUS_ITEM_TID_COMM */ + char tid_comm[TASK_COMM_LEN]; + /* KDBUS_ITEM_PID_COMM */ @@ -15852,9 +18274,6 @@ index 0000000..c36b9cc + /* KDBUS_ITEM_CGROUP */ + char *cgroup; + -+ /* KDBUS_ITEM_CAPS */ -+ const struct cred *cred; -+ + /* KDBUS_ITEM_SECLABEL */ + char *seclabel; + @@ -15932,7 +18351,6 @@ index 0000000..c36b9cc + put_pid(mp->pid); + + kfree(mp->seclabel); -+ kfree(mp->auxgrps); + kfree(mp->cmdline); + kfree(mp->cgroup); + kfree(mp); @@ -15964,21 +18382,6 @@ index 0000000..c36b9cc + return NULL; +} + -+static void kdbus_meta_proc_collect_creds(struct kdbus_meta_proc *mp) -+{ -+ mp->uid = current_uid(); -+ mp->euid = current_euid(); -+ 
mp->suid = current_suid(); -+ mp->fsuid = current_fsuid(); -+ -+ mp->gid = current_gid(); -+ mp->egid = current_egid(); -+ mp->sgid = current_sgid(); -+ mp->fsgid = current_fsgid(); -+ -+ mp->valid |= KDBUS_ATTACH_CREDS; -+} -+ +static void kdbus_meta_proc_collect_pids(struct kdbus_meta_proc *mp) +{ + struct task_struct *parent; @@ -15994,30 +18397,6 @@ index 0000000..c36b9cc + mp->valid |= KDBUS_ATTACH_PIDS; +} + -+static int kdbus_meta_proc_collect_auxgroups(struct kdbus_meta_proc *mp) -+{ -+ const struct group_info *info; -+ size_t i; -+ -+ /* no need to lock/ref, current creds cannot change */ -+ info = current_cred()->group_info; -+ -+ if (info->ngroups > 0) { -+ mp->auxgrps = kmalloc_array(info->ngroups, sizeof(kgid_t), -+ GFP_KERNEL); -+ if (!mp->auxgrps) -+ return -ENOMEM; -+ -+ for (i = 0; i < info->ngroups; i++) -+ mp->auxgrps[i] = GROUP_AT(info, i); -+ } -+ -+ mp->n_auxgrps = info->ngroups; -+ mp->valid |= KDBUS_ATTACH_AUXGROUPS; -+ -+ return 0; -+} -+ +static void kdbus_meta_proc_collect_tid_comm(struct kdbus_meta_proc *mp) +{ + get_task_comm(mp->tid_comm, current); @@ -16090,12 +18469,6 @@ index 0000000..c36b9cc + return 0; +} + -+static void kdbus_meta_proc_collect_caps(struct kdbus_meta_proc *mp) -+{ -+ mp->cred = get_current_cred(); -+ mp->valid |= KDBUS_ATTACH_CAPS; -+} -+ +static int kdbus_meta_proc_collect_seclabel(struct kdbus_meta_proc *mp) +{ +#ifdef CONFIG_SECURITY @@ -16162,10 +18535,17 @@ index 0000000..c36b9cc + + mutex_lock(&mp->lock); + -+ if ((what & KDBUS_ATTACH_CREDS) && -+ !(mp->collected & KDBUS_ATTACH_CREDS)) { -+ kdbus_meta_proc_collect_creds(mp); -+ mp->collected |= KDBUS_ATTACH_CREDS; ++ /* creds, auxgrps and caps share "struct cred" as context */ ++ { ++ const u64 m_cred = KDBUS_ATTACH_CREDS | ++ KDBUS_ATTACH_AUXGROUPS | ++ KDBUS_ATTACH_CAPS; ++ ++ if ((what & m_cred) && !(mp->collected & m_cred)) { ++ mp->cred = get_current_cred(); ++ mp->valid |= m_cred; ++ mp->collected |= m_cred; ++ } + } + + if ((what & KDBUS_ATTACH_PIDS) 
&& @@ -16174,14 +18554,6 @@ index 0000000..c36b9cc + mp->collected |= KDBUS_ATTACH_PIDS; + } + -+ if ((what & KDBUS_ATTACH_AUXGROUPS) && -+ !(mp->collected & KDBUS_ATTACH_AUXGROUPS)) { -+ ret = kdbus_meta_proc_collect_auxgroups(mp); -+ if (ret < 0) -+ goto exit_unlock; -+ mp->collected |= KDBUS_ATTACH_AUXGROUPS; -+ } -+ + if ((what & KDBUS_ATTACH_TID_COMM) && + !(mp->collected & KDBUS_ATTACH_TID_COMM)) { + kdbus_meta_proc_collect_tid_comm(mp); @@ -16216,12 +18588,6 @@ index 0000000..c36b9cc + mp->collected |= KDBUS_ATTACH_CGROUP; + } + -+ if ((what & KDBUS_ATTACH_CAPS) && -+ !(mp->collected & KDBUS_ATTACH_CAPS)) { -+ kdbus_meta_proc_collect_caps(mp); -+ mp->collected |= KDBUS_ATTACH_CAPS; -+ } -+ + if ((what & KDBUS_ATTACH_SECLABEL) && + !(mp->collected & KDBUS_ATTACH_SECLABEL)) { + ret = kdbus_meta_proc_collect_seclabel(mp); @@ -16244,101 +18610,116 @@ index 0000000..c36b9cc +} + +/** -+ * kdbus_meta_proc_fake() - Fill process metadata from faked credentials -+ * @mp: Metadata ++ * kdbus_meta_fake_new() - Create fake metadata object ++ * ++ * Return: Pointer to new object on success, ERR_PTR on failure. 
++ */ ++struct kdbus_meta_fake *kdbus_meta_fake_new(void) ++{ ++ struct kdbus_meta_fake *mf; ++ ++ mf = kzalloc(sizeof(*mf), GFP_KERNEL); ++ if (!mf) ++ return ERR_PTR(-ENOMEM); ++ ++ return mf; ++} ++ ++/** ++ * kdbus_meta_fake_free() - Free fake metadata object ++ * @mf: Fake metadata object ++ * ++ * Return: NULL ++ */ ++struct kdbus_meta_fake *kdbus_meta_fake_free(struct kdbus_meta_fake *mf) ++{ ++ if (mf) { ++ put_pid(mf->ppid); ++ put_pid(mf->tgid); ++ put_pid(mf->pid); ++ kfree(mf->seclabel); ++ kfree(mf); ++ } ++ ++ return NULL; ++} ++ ++/** ++ * kdbus_meta_fake_collect() - Fill fake metadata from faked credentials ++ * @mf: Fake metadata object + * @creds: Creds to set, may be %NULL + * @pids: PIDs to set, may be %NULL + * @seclabel: Seclabel to set, may be %NULL + * + * This function takes information stored in @creds, @pids and @seclabel and -+ * resolves them to kernel-representations, if possible. A call to this function -+ * is considered an alternative to calling kdbus_meta_add_current(), which -+ * derives the same information from the 'current' task. ++ * resolves them to kernel-representations, if possible. This call uses the ++ * current task's namespaces to resolve the given information. + * -+ * This call uses the current task's namespaces to resolve the given -+ * information. -+ * -+ * Return: 0 on success, negative error number otherwise. ++ * Return: 0 on success, negative error code on failure. 
+ */ -+int kdbus_meta_proc_fake(struct kdbus_meta_proc *mp, -+ const struct kdbus_creds *creds, -+ const struct kdbus_pids *pids, -+ const char *seclabel) ++int kdbus_meta_fake_collect(struct kdbus_meta_fake *mf, ++ const struct kdbus_creds *creds, ++ const struct kdbus_pids *pids, ++ const char *seclabel) +{ -+ int ret; -+ -+ if (!mp) -+ return 0; -+ -+ mutex_lock(&mp->lock); ++ if (mf->valid) ++ return -EALREADY; + -+ if (creds && !(mp->collected & KDBUS_ATTACH_CREDS)) { ++ if (creds) { + struct user_namespace *ns = current_user_ns(); + -+ mp->uid = make_kuid(ns, creds->uid); -+ mp->euid = make_kuid(ns, creds->euid); -+ mp->suid = make_kuid(ns, creds->suid); -+ mp->fsuid = make_kuid(ns, creds->fsuid); -+ -+ mp->gid = make_kgid(ns, creds->gid); -+ mp->egid = make_kgid(ns, creds->egid); -+ mp->sgid = make_kgid(ns, creds->sgid); -+ mp->fsgid = make_kgid(ns, creds->fsgid); -+ -+ if ((creds->uid != (uid_t)-1 && !uid_valid(mp->uid)) || -+ (creds->euid != (uid_t)-1 && !uid_valid(mp->euid)) || -+ (creds->suid != (uid_t)-1 && !uid_valid(mp->suid)) || -+ (creds->fsuid != (uid_t)-1 && !uid_valid(mp->fsuid)) || -+ (creds->gid != (gid_t)-1 && !gid_valid(mp->gid)) || -+ (creds->egid != (gid_t)-1 && !gid_valid(mp->egid)) || -+ (creds->sgid != (gid_t)-1 && !gid_valid(mp->sgid)) || -+ (creds->fsgid != (gid_t)-1 && !gid_valid(mp->fsgid))) { -+ ret = -EINVAL; -+ goto exit_unlock; -+ } ++ mf->uid = make_kuid(ns, creds->uid); ++ mf->euid = make_kuid(ns, creds->euid); ++ mf->suid = make_kuid(ns, creds->suid); ++ mf->fsuid = make_kuid(ns, creds->fsuid); ++ ++ mf->gid = make_kgid(ns, creds->gid); ++ mf->egid = make_kgid(ns, creds->egid); ++ mf->sgid = make_kgid(ns, creds->sgid); ++ mf->fsgid = make_kgid(ns, creds->fsgid); ++ ++ if ((creds->uid != (uid_t)-1 && !uid_valid(mf->uid)) || ++ (creds->euid != (uid_t)-1 && !uid_valid(mf->euid)) || ++ (creds->suid != (uid_t)-1 && !uid_valid(mf->suid)) || ++ (creds->fsuid != (uid_t)-1 && !uid_valid(mf->fsuid)) || ++ (creds->gid != (gid_t)-1 && 
!gid_valid(mf->gid)) || ++ (creds->egid != (gid_t)-1 && !gid_valid(mf->egid)) || ++ (creds->sgid != (gid_t)-1 && !gid_valid(mf->sgid)) || ++ (creds->fsgid != (gid_t)-1 && !gid_valid(mf->fsgid))) ++ return -EINVAL; + -+ mp->valid |= KDBUS_ATTACH_CREDS; -+ mp->collected |= KDBUS_ATTACH_CREDS; ++ mf->valid |= KDBUS_ATTACH_CREDS; + } + -+ if (pids && !(mp->collected & KDBUS_ATTACH_PIDS)) { -+ mp->pid = get_pid(find_vpid(pids->tid)); -+ mp->tgid = get_pid(find_vpid(pids->pid)); -+ mp->ppid = get_pid(find_vpid(pids->ppid)); ++ if (pids) { ++ mf->pid = get_pid(find_vpid(pids->tid)); ++ mf->tgid = get_pid(find_vpid(pids->pid)); ++ mf->ppid = get_pid(find_vpid(pids->ppid)); + -+ if ((pids->tid != 0 && !mp->pid) || -+ (pids->pid != 0 && !mp->tgid) || -+ (pids->ppid != 0 && !mp->ppid)) { -+ put_pid(mp->pid); -+ put_pid(mp->tgid); -+ put_pid(mp->ppid); -+ mp->pid = NULL; -+ mp->tgid = NULL; -+ mp->ppid = NULL; -+ ret = -EINVAL; -+ goto exit_unlock; ++ if ((pids->tid != 0 && !mf->pid) || ++ (pids->pid != 0 && !mf->tgid) || ++ (pids->ppid != 0 && !mf->ppid)) { ++ put_pid(mf->pid); ++ put_pid(mf->tgid); ++ put_pid(mf->ppid); ++ mf->pid = NULL; ++ mf->tgid = NULL; ++ mf->ppid = NULL; ++ return -EINVAL; + } + -+ mp->valid |= KDBUS_ATTACH_PIDS; -+ mp->collected |= KDBUS_ATTACH_PIDS; ++ mf->valid |= KDBUS_ATTACH_PIDS; + } + -+ if (seclabel && !(mp->collected & KDBUS_ATTACH_SECLABEL)) { -+ mp->seclabel = kstrdup(seclabel, GFP_KERNEL); -+ if (!mp->seclabel) { -+ ret = -ENOMEM; -+ goto exit_unlock; -+ } ++ if (seclabel) { ++ mf->seclabel = kstrdup(seclabel, GFP_KERNEL); ++ if (!mf->seclabel) ++ return -ENOMEM; + -+ mp->valid |= KDBUS_ATTACH_SECLABEL; -+ mp->collected |= KDBUS_ATTACH_SECLABEL; ++ mf->valid |= KDBUS_ATTACH_SECLABEL; + } + -+ ret = 0; -+ -+exit_unlock: -+ mutex_unlock(&mp->lock); -+ return ret; ++ return 0; +} + +/** @@ -16393,13 +18774,13 @@ index 0000000..c36b9cc +} + +static void kdbus_meta_conn_collect_timestamp(struct kdbus_meta_conn *mc, -+ struct kdbus_kmsg *kmsg) 
++ u64 msg_seqnum) +{ + mc->ts.monotonic_ns = ktime_get_ns(); + mc->ts.realtime_ns = ktime_get_real_ns(); + -+ if (kmsg) -+ mc->ts.seqnum = kmsg->seq; ++ if (msg_seqnum) ++ mc->ts.seqnum = msg_seqnum; + + mc->valid |= KDBUS_ATTACH_TIMESTAMP; +} @@ -16414,14 +18795,16 @@ index 0000000..c36b9cc + lockdep_assert_held(&conn->ep->bus->name_registry->rwlock); + + size = 0; ++ /* open-code length calculation to avoid final padding */ + list_for_each_entry(e, &conn->names_list, conn_entry) -+ size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_name) + -+ strlen(e->name) + 1); ++ size = KDBUS_ALIGN8(size) + KDBUS_ITEM_HEADER_SIZE + ++ sizeof(struct kdbus_name) + strlen(e->name) + 1; + + if (!size) + return 0; + -+ item = kmalloc(size, GFP_KERNEL); ++ /* make sure we include zeroed padding for convenience helpers */ ++ item = kmalloc(KDBUS_ALIGN8(size), GFP_KERNEL); + if (!item) + return -ENOMEM; + @@ -16438,7 +18821,8 @@ index 0000000..c36b9cc + } + + /* sanity check: the buffer should be completely written now */ -+ WARN_ON((u8 *)item != (u8 *)mc->owned_names_items + size); ++ WARN_ON((u8 *)item != ++ (u8 *)mc->owned_names_items + KDBUS_ALIGN8(size)); + + mc->valid |= KDBUS_ATTACH_NAMES; + return 0; @@ -16461,184 +18845,63 @@ index 0000000..c36b9cc +/** + * kdbus_meta_conn_collect() - Collect connection metadata + * @mc: Message metadata object -+ * @kmsg: Kmsg to collect data from + * @conn: Connection to collect data from ++ * @msg_seqnum: Sequence number of the message to send + * @what: Attach flags to collect + * -+ * This collects connection metadata from @kmsg and @conn and saves it in @mc. ++ * This collects connection metadata from @msg_seqnum and @conn and saves it ++ * in @mc. + * + * If KDBUS_ATTACH_NAMES is set in @what and @conn is non-NULL, the caller must + * hold the name-registry read-lock of conn->ep->bus->registry. + * -+ * Return: 0 on success, negative error code on failure. 
-+ */ -+int kdbus_meta_conn_collect(struct kdbus_meta_conn *mc, -+ struct kdbus_kmsg *kmsg, -+ struct kdbus_conn *conn, -+ u64 what) -+{ -+ int ret; -+ -+ if (!mc || !(what & (KDBUS_ATTACH_TIMESTAMP | -+ KDBUS_ATTACH_NAMES | -+ KDBUS_ATTACH_CONN_DESCRIPTION))) -+ return 0; -+ -+ mutex_lock(&mc->lock); -+ -+ if (kmsg && (what & KDBUS_ATTACH_TIMESTAMP) && -+ !(mc->collected & KDBUS_ATTACH_TIMESTAMP)) { -+ kdbus_meta_conn_collect_timestamp(mc, kmsg); -+ mc->collected |= KDBUS_ATTACH_TIMESTAMP; -+ } -+ -+ if (conn && (what & KDBUS_ATTACH_NAMES) && -+ !(mc->collected & KDBUS_ATTACH_NAMES)) { -+ ret = kdbus_meta_conn_collect_names(mc, conn); -+ if (ret < 0) -+ goto exit_unlock; -+ mc->collected |= KDBUS_ATTACH_NAMES; -+ } -+ -+ if (conn && (what & KDBUS_ATTACH_CONN_DESCRIPTION) && -+ !(mc->collected & KDBUS_ATTACH_CONN_DESCRIPTION)) { -+ ret = kdbus_meta_conn_collect_description(mc, conn); -+ if (ret < 0) -+ goto exit_unlock; -+ mc->collected |= KDBUS_ATTACH_CONN_DESCRIPTION; -+ } -+ -+ ret = 0; -+ -+exit_unlock: -+ mutex_unlock(&mc->lock); -+ return ret; -+} -+ -+/* -+ * kdbus_meta_export_prepare() - Prepare metadata for export -+ * @mp: Process metadata, or NULL -+ * @mc: Connection metadata, or NULL -+ * @mask: Pointer to mask of KDBUS_ATTACH_* flags to export -+ * @sz: Pointer to return the size needed by the metadata -+ * -+ * Does a conservative calculation of how much space metadata information -+ * will take up during export. It is 'conservative' because for string -+ * translations in namespaces, it will use the kernel namespaces, which is -+ * the longest possible version. -+ * -+ * The actual size consumed by kdbus_meta_export() may hence vary from the -+ * one reported here, but it is guaranteed never to be greater. -+ * -+ * Return: 0 on success, negative error number otherwise. 
-+ */ -+int kdbus_meta_export_prepare(struct kdbus_meta_proc *mp, -+ struct kdbus_meta_conn *mc, -+ u64 *mask, size_t *sz) -+{ -+ char *exe_pathname = NULL; -+ void *exe_page = NULL; -+ size_t size = 0; -+ u64 valid = 0; -+ int ret = 0; -+ -+ if (mp) { -+ mutex_lock(&mp->lock); -+ valid |= mp->valid; -+ mutex_unlock(&mp->lock); -+ } -+ -+ if (mc) { -+ mutex_lock(&mc->lock); -+ valid |= mc->valid; -+ mutex_unlock(&mc->lock); -+ } -+ -+ *mask &= valid; -+ -+ if (!*mask) -+ goto exit; -+ -+ /* process metadata */ -+ -+ if (mp && (*mask & KDBUS_ATTACH_CREDS)) -+ size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_creds)); -+ -+ if (mp && (*mask & KDBUS_ATTACH_PIDS)) -+ size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_pids)); -+ -+ if (mp && (*mask & KDBUS_ATTACH_AUXGROUPS)) -+ size += KDBUS_ITEM_SIZE(mp->n_auxgrps * sizeof(u64)); -+ -+ if (mp && (*mask & KDBUS_ATTACH_TID_COMM)) -+ size += KDBUS_ITEM_SIZE(strlen(mp->tid_comm) + 1); -+ -+ if (mp && (*mask & KDBUS_ATTACH_PID_COMM)) -+ size += KDBUS_ITEM_SIZE(strlen(mp->pid_comm) + 1); -+ -+ if (mp && (*mask & KDBUS_ATTACH_EXE)) { -+ exe_page = (void *)__get_free_page(GFP_TEMPORARY); -+ if (!exe_page) { -+ ret = -ENOMEM; -+ goto exit; -+ } -+ -+ exe_pathname = d_path(&mp->exe_path, exe_page, PAGE_SIZE); -+ if (IS_ERR(exe_pathname)) { -+ ret = PTR_ERR(exe_pathname); -+ goto exit; -+ } -+ -+ size += KDBUS_ITEM_SIZE(strlen(exe_pathname) + 1); -+ free_page((unsigned long)exe_page); -+ } -+ -+ if (mp && (*mask & KDBUS_ATTACH_CMDLINE)) -+ size += KDBUS_ITEM_SIZE(strlen(mp->cmdline) + 1); -+ -+ if (mp && (*mask & KDBUS_ATTACH_CGROUP)) -+ size += KDBUS_ITEM_SIZE(strlen(mp->cgroup) + 1); -+ -+ if (mp && (*mask & KDBUS_ATTACH_CAPS)) -+ size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_meta_caps)); -+ -+ if (mp && (*mask & KDBUS_ATTACH_SECLABEL)) -+ size += KDBUS_ITEM_SIZE(strlen(mp->seclabel) + 1); ++ * Return: 0 on success, negative error code on failure. 
++ */ ++int kdbus_meta_conn_collect(struct kdbus_meta_conn *mc, ++ struct kdbus_conn *conn, ++ u64 msg_seqnum, u64 what) ++{ ++ int ret; + -+ if (mp && (*mask & KDBUS_ATTACH_AUDIT)) -+ size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_audit)); ++ if (!mc || !(what & (KDBUS_ATTACH_TIMESTAMP | ++ KDBUS_ATTACH_NAMES | ++ KDBUS_ATTACH_CONN_DESCRIPTION))) ++ return 0; + -+ /* connection metadata */ ++ mutex_lock(&mc->lock); + -+ if (mc && (*mask & KDBUS_ATTACH_NAMES)) -+ size += mc->owned_names_size; ++ if (msg_seqnum && (what & KDBUS_ATTACH_TIMESTAMP) && ++ !(mc->collected & KDBUS_ATTACH_TIMESTAMP)) { ++ kdbus_meta_conn_collect_timestamp(mc, msg_seqnum); ++ mc->collected |= KDBUS_ATTACH_TIMESTAMP; ++ } + -+ if (mc && (*mask & KDBUS_ATTACH_CONN_DESCRIPTION)) -+ size += KDBUS_ITEM_SIZE(strlen(mc->conn_description) + 1); ++ if (conn && (what & KDBUS_ATTACH_NAMES) && ++ !(mc->collected & KDBUS_ATTACH_NAMES)) { ++ ret = kdbus_meta_conn_collect_names(mc, conn); ++ if (ret < 0) ++ goto exit_unlock; ++ mc->collected |= KDBUS_ATTACH_NAMES; ++ } + -+ if (mc && (*mask & KDBUS_ATTACH_TIMESTAMP)) -+ size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_timestamp)); ++ if (conn && (what & KDBUS_ATTACH_CONN_DESCRIPTION) && ++ !(mc->collected & KDBUS_ATTACH_CONN_DESCRIPTION)) { ++ ret = kdbus_meta_conn_collect_description(mc, conn); ++ if (ret < 0) ++ goto exit_unlock; ++ mc->collected |= KDBUS_ATTACH_CONN_DESCRIPTION; ++ } + -+exit: -+ *sz = size; ++ ret = 0; + ++exit_unlock: ++ mutex_unlock(&mc->lock); + return ret; +} + -+static int kdbus_meta_push_kvec(struct kvec *kvec, -+ struct kdbus_item_header *hdr, -+ u64 type, void *payload, -+ size_t payload_size, u64 *size) -+{ -+ hdr->type = type; -+ hdr->size = KDBUS_ITEM_HEADER_SIZE + payload_size; -+ kdbus_kvec_set(kvec++, hdr, sizeof(*hdr), size); -+ kdbus_kvec_set(kvec++, payload, payload_size, size); -+ return 2 + !!kdbus_kvec_pad(kvec++, size); -+} -+ +static void kdbus_meta_export_caps(struct kdbus_meta_caps *out, -+ struct kdbus_meta_proc *mp) 
++ const struct kdbus_meta_proc *mp, ++ struct user_namespace *user_ns) +{ + struct user_namespace *iter; + const struct cred *cred = mp->cred; @@ -16646,18 +18909,18 @@ index 0000000..c36b9cc + int i; + + /* -+ * This translates the effective capabilities of 'cred' into the current -+ * user-namespace. If the current user-namespace is a child-namespace of ++ * This translates the effective capabilities of 'cred' into the given ++ * user-namespace. If the given user-namespace is a child-namespace of + * the user-namespace of 'cred', the mask can be copied verbatim. If + * not, the mask is cleared. + * There's one exception: If 'cred' is the owner of any user-namespace -+ * in the path between the current user-namespace and the user-namespace ++ * in the path between the given user-namespace and the user-namespace + * of 'cred', then it has all effective capabilities set. This means, + * the user who created a user-namespace always has all effective + * capabilities in any child namespaces. Note that this is based on the + * uid of the namespace creator, not the task hierarchy. + */ -+ for (iter = current_user_ns(); iter; iter = iter->parent) { ++ for (iter = user_ns; iter; iter = iter->parent) { + if (iter == cred->user_ns) { + parent = true; + break; @@ -16701,126 +18964,327 @@ index 0000000..c36b9cc +} + +/* This is equivalent to from_kuid_munged(), but maps INVALID_UID to itself */ -+static uid_t kdbus_from_kuid_keep(kuid_t uid) ++static uid_t kdbus_from_kuid_keep(struct user_namespace *ns, kuid_t uid) +{ -+ return uid_valid(uid) ? -+ from_kuid_munged(current_user_ns(), uid) : ((uid_t)-1); ++ return uid_valid(uid) ? from_kuid_munged(ns, uid) : ((uid_t)-1); +} + +/* This is equivalent to from_kgid_munged(), but maps INVALID_GID to itself */ -+static gid_t kdbus_from_kgid_keep(kgid_t gid) ++static gid_t kdbus_from_kgid_keep(struct user_namespace *ns, kgid_t gid) +{ -+ return gid_valid(gid) ? 
-+ from_kgid_munged(current_user_ns(), gid) : ((gid_t)-1); ++ return gid_valid(gid) ? from_kgid_munged(ns, gid) : ((gid_t)-1); +} + -+/** -+ * kdbus_meta_export() - export information from metadata into a slice -+ * @mp: Process metadata, or NULL -+ * @mc: Connection metadata, or NULL -+ * @mask: Mask of KDBUS_ATTACH_* flags to export -+ * @slice: The slice to export to -+ * @offset: The offset inside @slice to write to -+ * @real_size: The real size the metadata consumed -+ * -+ * This function exports information from metadata into @slice at offset -+ * @offset inside that slice. Only information that is requested in @mask -+ * and that has been collected before is exported. -+ * -+ * In order to make sure not to write out of bounds, @mask must be the same -+ * value that was previously returned from kdbus_meta_export_prepare(). The -+ * function will, however, not necessarily write as many bytes as returned by -+ * kdbus_meta_export_prepare(); depending on the namespaces in question, it -+ * might use up less than that. -+ * -+ * All information will be translated using the current namespaces. -+ * -+ * Return: 0 on success, negative error number otherwise. 
-+ */ -+int kdbus_meta_export(struct kdbus_meta_proc *mp, -+ struct kdbus_meta_conn *mc, -+ u64 mask, -+ struct kdbus_pool_slice *slice, -+ off_t offset, -+ size_t *real_size) -+{ -+ struct user_namespace *user_ns = current_user_ns(); -+ struct kdbus_item_header item_hdr[13], *hdr; -+ char *exe_pathname = NULL; -+ struct kdbus_creds creds; -+ struct kdbus_pids pids; -+ void *exe_page = NULL; -+ struct kvec kvec[40]; -+ u64 *auxgrps = NULL; -+ size_t cnt = 0; -+ u64 size = 0; -+ int ret = 0; ++struct kdbus_meta_staging { ++ const struct kdbus_meta_proc *mp; ++ const struct kdbus_meta_fake *mf; ++ const struct kdbus_meta_conn *mc; ++ const struct kdbus_conn *conn; ++ u64 mask; + -+ hdr = &item_hdr[0]; ++ void *exe; ++ const char *exe_path; ++}; + -+ if (mask == 0) { -+ *real_size = 0; -+ return 0; -+ } ++static size_t kdbus_meta_measure(struct kdbus_meta_staging *staging) ++{ ++ const struct kdbus_meta_proc *mp = staging->mp; ++ const struct kdbus_meta_fake *mf = staging->mf; ++ const struct kdbus_meta_conn *mc = staging->mc; ++ const u64 mask = staging->mask; ++ size_t size = 0; ++ ++ /* process metadata */ ++ ++ if (mf && (mask & KDBUS_ATTACH_CREDS)) ++ size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_creds)); ++ else if (mp && (mask & KDBUS_ATTACH_CREDS)) ++ size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_creds)); ++ ++ if (mf && (mask & KDBUS_ATTACH_PIDS)) ++ size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_pids)); ++ else if (mp && (mask & KDBUS_ATTACH_PIDS)) ++ size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_pids)); ++ ++ if (mp && (mask & KDBUS_ATTACH_AUXGROUPS)) ++ size += KDBUS_ITEM_SIZE(mp->cred->group_info->ngroups * ++ sizeof(u64)); ++ ++ if (mp && (mask & KDBUS_ATTACH_TID_COMM)) ++ size += KDBUS_ITEM_SIZE(strlen(mp->tid_comm) + 1); ++ ++ if (mp && (mask & KDBUS_ATTACH_PID_COMM)) ++ size += KDBUS_ITEM_SIZE(strlen(mp->pid_comm) + 1); ++ ++ if (staging->exe_path && (mask & KDBUS_ATTACH_EXE)) ++ size += KDBUS_ITEM_SIZE(strlen(staging->exe_path) + 1); ++ ++ if (mp && (mask & 
KDBUS_ATTACH_CMDLINE)) ++ size += KDBUS_ITEM_SIZE(strlen(mp->cmdline) + 1); ++ ++ if (mp && (mask & KDBUS_ATTACH_CGROUP)) ++ size += KDBUS_ITEM_SIZE(strlen(mp->cgroup) + 1); ++ ++ if (mp && (mask & KDBUS_ATTACH_CAPS)) ++ size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_meta_caps)); ++ ++ if (mf && (mask & KDBUS_ATTACH_SECLABEL)) ++ size += KDBUS_ITEM_SIZE(strlen(mf->seclabel) + 1); ++ else if (mp && (mask & KDBUS_ATTACH_SECLABEL)) ++ size += KDBUS_ITEM_SIZE(strlen(mp->seclabel) + 1); ++ ++ if (mp && (mask & KDBUS_ATTACH_AUDIT)) ++ size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_audit)); ++ ++ /* connection metadata */ ++ ++ if (mc && (mask & KDBUS_ATTACH_NAMES)) ++ size += KDBUS_ALIGN8(mc->owned_names_size); ++ ++ if (mc && (mask & KDBUS_ATTACH_CONN_DESCRIPTION)) ++ size += KDBUS_ITEM_SIZE(strlen(mc->conn_description) + 1); ++ ++ if (mc && (mask & KDBUS_ATTACH_TIMESTAMP)) ++ size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_timestamp)); ++ ++ return size; ++} ++ ++static struct kdbus_item *kdbus_write_head(struct kdbus_item **iter, ++ u64 type, u64 size) ++{ ++ struct kdbus_item *item = *iter; ++ size_t padding; ++ ++ item->type = type; ++ item->size = KDBUS_ITEM_HEADER_SIZE + size; ++ ++ /* clear padding */ ++ padding = KDBUS_ALIGN8(item->size) - item->size; ++ if (padding) ++ memset(item->data + size, 0, padding); ++ ++ *iter = KDBUS_ITEM_NEXT(item); ++ return item; ++} ++ ++static struct kdbus_item *kdbus_write_full(struct kdbus_item **iter, ++ u64 type, u64 size, const void *data) ++{ ++ struct kdbus_item *item; ++ ++ item = kdbus_write_head(iter, type, size); ++ memcpy(item->data, data, size); ++ return item; ++} ++ ++static size_t kdbus_meta_write(struct kdbus_meta_staging *staging, void *mem, ++ size_t size) ++{ ++ struct user_namespace *user_ns = staging->conn->cred->user_ns; ++ struct pid_namespace *pid_ns = ns_of_pid(staging->conn->pid); ++ struct kdbus_item *item = NULL, *items = mem; ++ u8 *end, *owned_names_end = NULL; + + /* process metadata */ + -+ if (mp && (mask 
& KDBUS_ATTACH_CREDS)) { -+ creds.uid = kdbus_from_kuid_keep(mp->uid); -+ creds.euid = kdbus_from_kuid_keep(mp->euid); -+ creds.suid = kdbus_from_kuid_keep(mp->suid); -+ creds.fsuid = kdbus_from_kuid_keep(mp->fsuid); -+ creds.gid = kdbus_from_kgid_keep(mp->gid); -+ creds.egid = kdbus_from_kgid_keep(mp->egid); -+ creds.sgid = kdbus_from_kgid_keep(mp->sgid); -+ creds.fsgid = kdbus_from_kgid_keep(mp->fsgid); ++ if (staging->mf && (staging->mask & KDBUS_ATTACH_CREDS)) { ++ const struct kdbus_meta_fake *mf = staging->mf; ++ ++ item = kdbus_write_head(&items, KDBUS_ITEM_CREDS, ++ sizeof(struct kdbus_creds)); ++ item->creds = (struct kdbus_creds){ ++ .uid = kdbus_from_kuid_keep(user_ns, mf->uid), ++ .euid = kdbus_from_kuid_keep(user_ns, mf->euid), ++ .suid = kdbus_from_kuid_keep(user_ns, mf->suid), ++ .fsuid = kdbus_from_kuid_keep(user_ns, mf->fsuid), ++ .gid = kdbus_from_kgid_keep(user_ns, mf->gid), ++ .egid = kdbus_from_kgid_keep(user_ns, mf->egid), ++ .sgid = kdbus_from_kgid_keep(user_ns, mf->sgid), ++ .fsgid = kdbus_from_kgid_keep(user_ns, mf->fsgid), ++ }; ++ } else if (staging->mp && (staging->mask & KDBUS_ATTACH_CREDS)) { ++ const struct cred *c = staging->mp->cred; ++ ++ item = kdbus_write_head(&items, KDBUS_ITEM_CREDS, ++ sizeof(struct kdbus_creds)); ++ item->creds = (struct kdbus_creds){ ++ .uid = kdbus_from_kuid_keep(user_ns, c->uid), ++ .euid = kdbus_from_kuid_keep(user_ns, c->euid), ++ .suid = kdbus_from_kuid_keep(user_ns, c->suid), ++ .fsuid = kdbus_from_kuid_keep(user_ns, c->fsuid), ++ .gid = kdbus_from_kgid_keep(user_ns, c->gid), ++ .egid = kdbus_from_kgid_keep(user_ns, c->egid), ++ .sgid = kdbus_from_kgid_keep(user_ns, c->sgid), ++ .fsgid = kdbus_from_kgid_keep(user_ns, c->fsgid), ++ }; ++ } ++ ++ if (staging->mf && (staging->mask & KDBUS_ATTACH_PIDS)) { ++ item = kdbus_write_head(&items, KDBUS_ITEM_PIDS, ++ sizeof(struct kdbus_pids)); ++ item->pids = (struct kdbus_pids){ ++ .pid = pid_nr_ns(staging->mf->tgid, pid_ns), ++ .tid = 
pid_nr_ns(staging->mf->pid, pid_ns), ++ .ppid = pid_nr_ns(staging->mf->ppid, pid_ns), ++ }; ++ } else if (staging->mp && (staging->mask & KDBUS_ATTACH_PIDS)) { ++ item = kdbus_write_head(&items, KDBUS_ITEM_PIDS, ++ sizeof(struct kdbus_pids)); ++ item->pids = (struct kdbus_pids){ ++ .pid = pid_nr_ns(staging->mp->tgid, pid_ns), ++ .tid = pid_nr_ns(staging->mp->pid, pid_ns), ++ .ppid = pid_nr_ns(staging->mp->ppid, pid_ns), ++ }; ++ } + -+ cnt += kdbus_meta_push_kvec(kvec + cnt, hdr++, KDBUS_ITEM_CREDS, -+ &creds, sizeof(creds), &size); ++ if (staging->mp && (staging->mask & KDBUS_ATTACH_AUXGROUPS)) { ++ const struct group_info *info = staging->mp->cred->group_info; ++ size_t i; ++ ++ item = kdbus_write_head(&items, KDBUS_ITEM_AUXGROUPS, ++ info->ngroups * sizeof(u64)); ++ for (i = 0; i < info->ngroups; ++i) ++ item->data64[i] = from_kgid_munged(user_ns, ++ GROUP_AT(info, i)); ++ } ++ ++ if (staging->mp && (staging->mask & KDBUS_ATTACH_TID_COMM)) ++ item = kdbus_write_full(&items, KDBUS_ITEM_TID_COMM, ++ strlen(staging->mp->tid_comm) + 1, ++ staging->mp->tid_comm); ++ ++ if (staging->mp && (staging->mask & KDBUS_ATTACH_PID_COMM)) ++ item = kdbus_write_full(&items, KDBUS_ITEM_PID_COMM, ++ strlen(staging->mp->pid_comm) + 1, ++ staging->mp->pid_comm); ++ ++ if (staging->exe_path && (staging->mask & KDBUS_ATTACH_EXE)) ++ item = kdbus_write_full(&items, KDBUS_ITEM_EXE, ++ strlen(staging->exe_path) + 1, ++ staging->exe_path); ++ ++ if (staging->mp && (staging->mask & KDBUS_ATTACH_CMDLINE)) ++ item = kdbus_write_full(&items, KDBUS_ITEM_CMDLINE, ++ strlen(staging->mp->cmdline) + 1, ++ staging->mp->cmdline); ++ ++ if (staging->mp && (staging->mask & KDBUS_ATTACH_CGROUP)) ++ item = kdbus_write_full(&items, KDBUS_ITEM_CGROUP, ++ strlen(staging->mp->cgroup) + 1, ++ staging->mp->cgroup); ++ ++ if (staging->mp && (staging->mask & KDBUS_ATTACH_CAPS)) { ++ item = kdbus_write_head(&items, KDBUS_ITEM_CAPS, ++ sizeof(struct kdbus_meta_caps)); ++ kdbus_meta_export_caps((void*)&item->caps, 
staging->mp, ++ user_ns); ++ } ++ ++ if (staging->mf && (staging->mask & KDBUS_ATTACH_SECLABEL)) ++ item = kdbus_write_full(&items, KDBUS_ITEM_SECLABEL, ++ strlen(staging->mf->seclabel) + 1, ++ staging->mf->seclabel); ++ else if (staging->mp && (staging->mask & KDBUS_ATTACH_SECLABEL)) ++ item = kdbus_write_full(&items, KDBUS_ITEM_SECLABEL, ++ strlen(staging->mp->seclabel) + 1, ++ staging->mp->seclabel); ++ ++ if (staging->mp && (staging->mask & KDBUS_ATTACH_AUDIT)) { ++ item = kdbus_write_head(&items, KDBUS_ITEM_AUDIT, ++ sizeof(struct kdbus_audit)); ++ item->audit = (struct kdbus_audit){ ++ .loginuid = from_kuid(user_ns, ++ staging->mp->audit_loginuid), ++ .sessionid = staging->mp->audit_sessionid, ++ }; + } + -+ if (mp && (mask & KDBUS_ATTACH_PIDS)) { -+ pids.pid = pid_vnr(mp->tgid); -+ pids.tid = pid_vnr(mp->pid); -+ pids.ppid = pid_vnr(mp->ppid); ++ /* connection metadata */ + -+ cnt += kdbus_meta_push_kvec(kvec + cnt, hdr++, KDBUS_ITEM_PIDS, -+ &pids, sizeof(pids), &size); ++ if (staging->mc && (staging->mask & KDBUS_ATTACH_NAMES)) { ++ memcpy(items, staging->mc->owned_names_items, ++ KDBUS_ALIGN8(staging->mc->owned_names_size)); ++ owned_names_end = (u8 *)items + staging->mc->owned_names_size; ++ items = (void *)KDBUS_ALIGN8((unsigned long)owned_names_end); + } + -+ if (mp && (mask & KDBUS_ATTACH_AUXGROUPS)) { -+ size_t payload_size = mp->n_auxgrps * sizeof(u64); -+ int i; ++ if (staging->mc && (staging->mask & KDBUS_ATTACH_CONN_DESCRIPTION)) ++ item = kdbus_write_full(&items, KDBUS_ITEM_CONN_DESCRIPTION, ++ strlen(staging->mc->conn_description) + 1, ++ staging->mc->conn_description); + -+ auxgrps = kmalloc(payload_size, GFP_KERNEL); -+ if (!auxgrps) { -+ ret = -ENOMEM; -+ goto exit; -+ } ++ if (staging->mc && (staging->mask & KDBUS_ATTACH_TIMESTAMP)) ++ item = kdbus_write_full(&items, KDBUS_ITEM_TIMESTAMP, ++ sizeof(staging->mc->ts), ++ &staging->mc->ts); ++ ++ /* ++ * Return real size (minus trailing padding). 
In case of 'owned_names' ++ * we cannot deduce it from item->size, so treat it special. ++ */ ++ ++ if (items == (void *)KDBUS_ALIGN8((unsigned long)owned_names_end)) ++ end = owned_names_end; ++ else if (item) ++ end = (u8 *)item + item->size; ++ else ++ end = mem; ++ ++ WARN_ON((u8 *)items - (u8 *)mem != size); ++ WARN_ON((void *)KDBUS_ALIGN8((unsigned long)end) != (void *)items); ++ ++ return end - (u8 *)mem; ++} ++ ++int kdbus_meta_emit(struct kdbus_meta_proc *mp, ++ struct kdbus_meta_fake *mf, ++ struct kdbus_meta_conn *mc, ++ struct kdbus_conn *conn, ++ u64 mask, ++ struct kdbus_item **out_items, ++ size_t *out_size) ++{ ++ struct kdbus_meta_staging staging = {}; ++ struct kdbus_item *items = NULL; ++ size_t size = 0; ++ int ret; ++ ++ if (WARN_ON(mf && mp)) ++ mp = NULL; + -+ for (i = 0; i < mp->n_auxgrps; i++) -+ auxgrps[i] = from_kgid_munged(user_ns, mp->auxgrps[i]); ++ staging.mp = mp; ++ staging.mf = mf; ++ staging.mc = mc; ++ staging.conn = conn; + -+ cnt += kdbus_meta_push_kvec(kvec + cnt, hdr++, -+ KDBUS_ITEM_AUXGROUPS, -+ auxgrps, payload_size, &size); ++ /* get mask of valid items */ ++ if (mf) ++ staging.mask |= mf->valid; ++ if (mp) { ++ mutex_lock(&mp->lock); ++ staging.mask |= mp->valid; ++ mutex_unlock(&mp->lock); ++ } ++ if (mc) { ++ mutex_lock(&mc->lock); ++ staging.mask |= mc->valid; ++ mutex_unlock(&mc->lock); + } + -+ if (mp && (mask & KDBUS_ATTACH_TID_COMM)) -+ cnt += kdbus_meta_push_kvec(kvec + cnt, hdr++, -+ KDBUS_ITEM_TID_COMM, mp->tid_comm, -+ strlen(mp->tid_comm) + 1, &size); ++ staging.mask &= mask; + -+ if (mp && (mask & KDBUS_ATTACH_PID_COMM)) -+ cnt += kdbus_meta_push_kvec(kvec + cnt, hdr++, -+ KDBUS_ITEM_PID_COMM, mp->pid_comm, -+ strlen(mp->pid_comm) + 1, &size); ++ if (!staging.mask) { /* bail out if nothing to do */ ++ ret = 0; ++ goto exit; ++ } + -+ if (mp && (mask & KDBUS_ATTACH_EXE)) { ++ /* EXE is special as it needs a temporary page to assemble */ ++ if (mp && (staging.mask & KDBUS_ATTACH_EXE)) { + struct path p; + + /* 
-+ * TODO: We need access to __d_path() so we can write the path ++ * XXX: We need access to __d_path() so we can write the path + * relative to conn->root_path. Once upstream, we need + * EXPORT_SYMBOL(__d_path) or an equivalent of d_path() that + * takes the root path directly. Until then, we drop this item @@ -16828,116 +19292,245 @@ index 0000000..c36b9cc + */ + + get_fs_root(current->fs, &p); -+ if (path_equal(&p, &mp->root_path)) { -+ exe_page = (void *)__get_free_page(GFP_TEMPORARY); -+ if (!exe_page) { ++ if (path_equal(&p, &conn->root_path)) { ++ staging.exe = (void *)__get_free_page(GFP_TEMPORARY); ++ if (!staging.exe) { + path_put(&p); + ret = -ENOMEM; + goto exit; + } + -+ exe_pathname = d_path(&mp->exe_path, exe_page, -+ PAGE_SIZE); -+ if (IS_ERR(exe_pathname)) { ++ staging.exe_path = d_path(&mp->exe_path, staging.exe, ++ PAGE_SIZE); ++ if (IS_ERR(staging.exe_path)) { + path_put(&p); -+ ret = PTR_ERR(exe_pathname); ++ ret = PTR_ERR(staging.exe_path); + goto exit; + } -+ -+ cnt += kdbus_meta_push_kvec(kvec + cnt, hdr++, -+ KDBUS_ITEM_EXE, -+ exe_pathname, -+ strlen(exe_pathname) + 1, -+ &size); + } + path_put(&p); + } + -+ if (mp && (mask & KDBUS_ATTACH_CMDLINE)) -+ cnt += kdbus_meta_push_kvec(kvec + cnt, hdr++, -+ KDBUS_ITEM_CMDLINE, mp->cmdline, -+ strlen(mp->cmdline) + 1, &size); ++ size = kdbus_meta_measure(&staging); ++ if (!size) { /* bail out if nothing to do */ ++ ret = 0; ++ goto exit; ++ } + -+ if (mp && (mask & KDBUS_ATTACH_CGROUP)) -+ cnt += kdbus_meta_push_kvec(kvec + cnt, hdr++, -+ KDBUS_ITEM_CGROUP, mp->cgroup, -+ strlen(mp->cgroup) + 1, &size); ++ items = kmalloc(size, GFP_KERNEL); ++ if (!items) { ++ ret = -ENOMEM; ++ goto exit; ++ } ++ ++ size = kdbus_meta_write(&staging, items, size); ++ if (!size) { ++ kfree(items); ++ items = NULL; ++ } + -+ if (mp && (mask & KDBUS_ATTACH_CAPS)) { -+ struct kdbus_meta_caps caps = {}; ++ ret = 0; + -+ kdbus_meta_export_caps(&caps, mp); -+ cnt += kdbus_meta_push_kvec(kvec + cnt, hdr++, -+ 
KDBUS_ITEM_CAPS, &caps, -+ sizeof(caps), &size); ++exit: ++ if (staging.exe) ++ free_page((unsigned long)staging.exe); ++ if (ret >= 0) { ++ *out_items = items; ++ *out_size = size; + } ++ return ret; ++} + -+ if (mp && (mask & KDBUS_ATTACH_SECLABEL)) -+ cnt += kdbus_meta_push_kvec(kvec + cnt, hdr++, -+ KDBUS_ITEM_SECLABEL, mp->seclabel, -+ strlen(mp->seclabel) + 1, &size); ++enum { ++ KDBUS_META_PROC_NONE, ++ KDBUS_META_PROC_NORMAL, ++}; + -+ if (mp && (mask & KDBUS_ATTACH_AUDIT)) { -+ struct kdbus_audit a = { -+ .loginuid = from_kuid(user_ns, mp->audit_loginuid), -+ .sessionid = mp->audit_sessionid, -+ }; ++/** ++ * kdbus_proc_permission() - check /proc permissions on target pid ++ * @pid_ns: namespace we operate in ++ * @cred: credentials of requestor ++ * @target: target process ++ * ++ * This checks whether a process with credentials @cred can access information ++ * of @target in the namespace @pid_ns. This tries to follow /proc permissions, ++ * but is slightly more restrictive. ++ * ++ * Return: The /proc access level (KDBUS_META_PROC_*) is returned. ++ */ ++static unsigned int kdbus_proc_permission(const struct pid_namespace *pid_ns, ++ const struct cred *cred, ++ struct pid *target) ++{ ++ if (pid_ns->hide_pid < 1) ++ return KDBUS_META_PROC_NORMAL; + -+ cnt += kdbus_meta_push_kvec(kvec + cnt, hdr++, KDBUS_ITEM_AUDIT, -+ &a, sizeof(a), &size); -+ } ++ /* XXX: we need groups_search() exported for aux-groups */ ++ if (gid_eq(cred->egid, pid_ns->pid_gid)) ++ return KDBUS_META_PROC_NORMAL; + -+ /* connection metadata */ ++ /* ++ * XXX: If ptrace_may_access(PTRACE_MODE_READ) is granted, you can ++ * overwrite hide_pid. However, ptrace_may_access() only supports ++ * checking 'current', hence, we cannot use this here. But we ++ * simply decide to not support this override, so no need to worry. 
++ */
+
-+ if (mc && (mask & KDBUS_ATTACH_NAMES))
-+ kdbus_kvec_set(&kvec[cnt++], mc->owned_names_items,
-+ mc->owned_names_size, &size);
++ return KDBUS_META_PROC_NONE;
++}
+
-+ if (mc && (mask & KDBUS_ATTACH_CONN_DESCRIPTION))
-+ cnt += kdbus_meta_push_kvec(kvec + cnt, hdr++,
-+ KDBUS_ITEM_CONN_DESCRIPTION,
-+ mc->conn_description,
-+ strlen(mc->conn_description) + 1,
-+ &size);
++/**
++ * kdbus_meta_proc_mask() - calculate which metadata would be visible to
++ * a connection via /proc
++ * @prv_pid: pid of metadata provider
++ * @req_pid: pid of metadata requestor
++ * @req_cred: credentials of metadata requestor
++ * @wanted: metadata that is requested
++ *
++ * This checks which metadata items of @prv_pid can be read via /proc by the
++ * requestor @req_pid.
++ *
++ * Return: Set of metadata flags the requestor can see (limited by @wanted).
++ */
++static u64 kdbus_meta_proc_mask(struct pid *prv_pid,
++ struct pid *req_pid,
++ const struct cred *req_cred,
++ u64 wanted)
++{
++ struct pid_namespace *prv_ns, *req_ns;
++ unsigned int proc;
+
-+ if (mc && (mask & KDBUS_ATTACH_TIMESTAMP))
-+ cnt += kdbus_meta_push_kvec(kvec + cnt, hdr++,
-+ KDBUS_ITEM_TIMESTAMP, &mc->ts,
-+ sizeof(mc->ts), &size);
++ prv_ns = ns_of_pid(prv_pid);
++ req_ns = ns_of_pid(req_pid);
++
++ /*
++ * If the sender is not visible in the receiver namespace, then the
++ * receiver cannot access the sender via its own procfs. Hence, we do
++ * not attach any additional metadata.
++ */
++ if (!pid_nr_ns(prv_pid, req_ns))
++ return 0;
+
-+ ret = kdbus_pool_slice_copy_kvec(slice, offset, kvec, cnt, size);
-+ *real_size = size;
++ /*
++ * If the pid-namespace of the receiver has hide_pid set, it cannot see
++ * any process but its own. We shortcut this /proc permission check if
++ * provider and requestor are the same. If not, we perform rather
++ * expensive /proc permission checks. 
++ */
++ if (prv_pid == req_pid)
++ proc = KDBUS_META_PROC_NORMAL;
++ else
++ proc = kdbus_proc_permission(req_ns, req_cred, prv_pid);
++
++ /* you need /proc access to read standard process attributes */
++ if (proc < KDBUS_META_PROC_NORMAL)
++ wanted &= ~(KDBUS_ATTACH_TID_COMM |
++ KDBUS_ATTACH_PID_COMM |
++ KDBUS_ATTACH_SECLABEL |
++ KDBUS_ATTACH_CMDLINE |
++ KDBUS_ATTACH_CGROUP |
++ KDBUS_ATTACH_AUDIT |
++ KDBUS_ATTACH_CAPS |
++ KDBUS_ATTACH_EXE);
++
++ /* clear all non-/proc flags */
++ return wanted & (KDBUS_ATTACH_TID_COMM |
++ KDBUS_ATTACH_PID_COMM |
++ KDBUS_ATTACH_SECLABEL |
++ KDBUS_ATTACH_CMDLINE |
++ KDBUS_ATTACH_CGROUP |
++ KDBUS_ATTACH_AUDIT |
++ KDBUS_ATTACH_CAPS |
++ KDBUS_ATTACH_EXE);
++}
+
-+exit:
-+ kfree(auxgrps);
++/**
++ * kdbus_meta_get_mask() - calculate attach flags mask for metadata request
++ * @prv_pid: pid of metadata provider
++ * @prv_mask: mask of metadata the provider grants unchecked
++ * @req_pid: pid of metadata requestor
++ * @req_cred: credentials of metadata requestor
++ * @req_mask: mask of metadata that is requested
++ *
++ * This calculates the metadata items that the requestor @req_pid can access
++ * from the metadata provider @prv_pid. This permission check consists of
++ * several different parts:
++ * - Providers can grant metadata items unchecked. Regardless of their type,
++ * they're always granted to the requestor. This mask is passed as @prv_mask.
++ * - Basic items (credentials and connection metadata) are granted implicitly
++ * to everyone. They're publicly available to any bus-user that can see the
++ * provider.
++ * - Process credentials that are not granted implicitly follow the same
++ * permission checks as /proc. This means, we always assume a requestor
++ * process has access to their *own* /proc mount, if they have access to
++ * kdbusfs.
++ *
++ * Return: Mask of metadata that is granted. 
++ */ ++static u64 kdbus_meta_get_mask(struct pid *prv_pid, u64 prv_mask, ++ struct pid *req_pid, ++ const struct cred *req_cred, u64 req_mask) ++{ ++ u64 missing, impl_mask, proc_mask = 0; + -+ if (exe_page) -+ free_page((unsigned long)exe_page); ++ /* ++ * Connection metadata and basic unix process credentials are ++ * transmitted implicitly, and cannot be suppressed. Both are required ++ * to perform user-space policies on the receiver-side. Furthermore, ++ * connection metadata is public state, anyway, and unix credentials ++ * are needed for UDS-compatibility. We extend them slightly by ++ * auxiliary groups and additional uids/gids/pids. ++ */ ++ impl_mask = /* connection metadata */ ++ KDBUS_ATTACH_CONN_DESCRIPTION | ++ KDBUS_ATTACH_TIMESTAMP | ++ KDBUS_ATTACH_NAMES | ++ /* credentials and pids */ ++ KDBUS_ATTACH_AUXGROUPS | ++ KDBUS_ATTACH_CREDS | ++ KDBUS_ATTACH_PIDS; + -+ return ret; ++ /* ++ * Calculate the set of metadata that is not granted implicitly nor by ++ * the sender, but still requested by the receiver. If any are left, ++ * perform rather expensive /proc access checks for them. ++ */ ++ missing = req_mask & ~((prv_mask | impl_mask) & req_mask); ++ if (missing) ++ proc_mask = kdbus_meta_proc_mask(prv_pid, req_pid, req_cred, ++ missing); ++ ++ return (prv_mask | impl_mask | proc_mask) & req_mask; ++} ++ ++/** ++ */ ++u64 kdbus_meta_info_mask(const struct kdbus_conn *conn, u64 mask) ++{ ++ return kdbus_meta_get_mask(conn->pid, ++ atomic64_read(&conn->attach_flags_send), ++ task_pid(current), ++ current_cred(), ++ mask); +} + +/** -+ * kdbus_meta_calc_attach_flags() - calculate attach flags for a sender -+ * and a receiver -+ * @sender: Sending connection -+ * @receiver: Receiving connection -+ * -+ * Return: the attach flags both the sender and the receiver have opted-in -+ * for. 
+ */ -+u64 kdbus_meta_calc_attach_flags(const struct kdbus_conn *sender, -+ const struct kdbus_conn *receiver) ++u64 kdbus_meta_msg_mask(const struct kdbus_conn *snd, ++ const struct kdbus_conn *rcv) +{ -+ return atomic64_read(&sender->attach_flags_send) & -+ atomic64_read(&receiver->attach_flags_recv); ++ return kdbus_meta_get_mask(task_pid(current), ++ atomic64_read(&snd->attach_flags_send), ++ rcv->pid, ++ rcv->cred, ++ atomic64_read(&rcv->attach_flags_recv)); +} diff --git a/ipc/kdbus/metadata.h b/ipc/kdbus/metadata.h new file mode 100644 -index 0000000..79b6ac3 +index 0000000..dba7cc7 --- /dev/null +++ b/ipc/kdbus/metadata.h -@@ -0,0 +1,55 @@ +@@ -0,0 +1,86 @@ +/* + * Copyright (C) 2013-2015 Kay Sievers + * Copyright (C) 2013-2015 Greg Kroah-Hartman @@ -16958,44 +19551,75 @@ index 0000000..79b6ac3 +#include + +struct kdbus_conn; -+struct kdbus_kmsg; +struct kdbus_pool_slice; + +struct kdbus_meta_proc; +struct kdbus_meta_conn; + ++/** ++ * struct kdbus_meta_fake - Fake metadata ++ * @valid: Bitmask of collected and valid items ++ * @uid: UID of process ++ * @euid: EUID of process ++ * @suid: SUID of process ++ * @fsuid: FSUID of process ++ * @gid: GID of process ++ * @egid: EGID of process ++ * @sgid: SGID of process ++ * @fsgid: FSGID of process ++ * @pid: PID of process ++ * @tgid: TGID of process ++ * @ppid: PPID of process ++ * @seclabel: Seclabel ++ */ ++struct kdbus_meta_fake { ++ u64 valid; ++ ++ /* KDBUS_ITEM_CREDS */ ++ kuid_t uid, euid, suid, fsuid; ++ kgid_t gid, egid, sgid, fsgid; ++ ++ /* KDBUS_ITEM_PIDS */ ++ struct pid *pid, *tgid, *ppid; ++ ++ /* KDBUS_ITEM_SECLABEL */ ++ char *seclabel; ++}; ++ +struct kdbus_meta_proc *kdbus_meta_proc_new(void); +struct kdbus_meta_proc *kdbus_meta_proc_ref(struct kdbus_meta_proc *mp); +struct kdbus_meta_proc *kdbus_meta_proc_unref(struct kdbus_meta_proc *mp); +int kdbus_meta_proc_collect(struct kdbus_meta_proc *mp, u64 what); -+int kdbus_meta_proc_fake(struct kdbus_meta_proc *mp, -+ const struct kdbus_creds 
*creds, -+ const struct kdbus_pids *pids, -+ const char *seclabel); ++ ++struct kdbus_meta_fake *kdbus_meta_fake_new(void); ++struct kdbus_meta_fake *kdbus_meta_fake_free(struct kdbus_meta_fake *mf); ++int kdbus_meta_fake_collect(struct kdbus_meta_fake *mf, ++ const struct kdbus_creds *creds, ++ const struct kdbus_pids *pids, ++ const char *seclabel); + +struct kdbus_meta_conn *kdbus_meta_conn_new(void); +struct kdbus_meta_conn *kdbus_meta_conn_ref(struct kdbus_meta_conn *mc); +struct kdbus_meta_conn *kdbus_meta_conn_unref(struct kdbus_meta_conn *mc); +int kdbus_meta_conn_collect(struct kdbus_meta_conn *mc, -+ struct kdbus_kmsg *kmsg, + struct kdbus_conn *conn, -+ u64 what); -+ -+int kdbus_meta_export_prepare(struct kdbus_meta_proc *mp, -+ struct kdbus_meta_conn *mc, -+ u64 *mask, size_t *sz); -+int kdbus_meta_export(struct kdbus_meta_proc *mp, -+ struct kdbus_meta_conn *mc, -+ u64 mask, -+ struct kdbus_pool_slice *slice, -+ off_t offset, size_t *real_size); -+u64 kdbus_meta_calc_attach_flags(const struct kdbus_conn *sender, -+ const struct kdbus_conn *receiver); ++ u64 msg_seqnum, u64 what); ++ ++int kdbus_meta_emit(struct kdbus_meta_proc *mp, ++ struct kdbus_meta_fake *mf, ++ struct kdbus_meta_conn *mc, ++ struct kdbus_conn *conn, ++ u64 mask, ++ struct kdbus_item **out_items, ++ size_t *out_size); ++u64 kdbus_meta_info_mask(const struct kdbus_conn *conn, u64 mask); ++u64 kdbus_meta_msg_mask(const struct kdbus_conn *snd, ++ const struct kdbus_conn *rcv); + +#endif diff --git a/ipc/kdbus/names.c b/ipc/kdbus/names.c new file mode 100644 -index 0000000..d77ee08 +index 0000000..057f806 --- /dev/null +++ b/ipc/kdbus/names.c @@ -0,0 +1,770 @@ @@ -17445,7 +20069,7 @@ index 0000000..d77ee08 + + down_write(®->rwlock); + -+ if (kdbus_conn_is_activator(conn)) { ++ if (conn->activator_of) { + activator = conn->activator_of->activator; + conn->activator_of->activator = NULL; + } @@ -17851,7 +20475,7 @@ index 0000000..3dd2589 +#endif diff --git a/ipc/kdbus/node.c 
b/ipc/kdbus/node.c new file mode 100644 -index 0000000..0d65c65 +index 0000000..89f58bc --- /dev/null +++ b/ipc/kdbus/node.c @@ -0,0 +1,897 @@ @@ -17977,7 +20601,7 @@ index 0000000..0d65c65 + * new active references can be acquired. + * Once all active references are dropped, the node is considered 'drained'. Now + * kdbus_node_deactivate() is called on each child of the node before we -+ * continue deactvating our node. That is, once all children are entirely ++ * continue deactivating our node. That is, once all children are entirely + * deactivated, we call ->release_cb() of our node. ->release_cb() can release + * any resources on that node which are bound to the "active" state of a node. + * When done, we unlink the node from its parent rb-tree, mark it as @@ -18494,7 +21118,7 @@ index 0000000..0d65c65 + kdbus_fs_flush(pos); + + /* -+ * If the node was activated and somone subtracted BIAS ++ * If the node was activated and someone subtracted BIAS + * from it to deactivate it, we, and only us, are + * responsible to release the extra ref-count that was + * taken once in kdbus_node_activate(). 
@@ -18846,10 +21470,10 @@ index 0000000..970e02b +#endif diff --git a/ipc/kdbus/notify.c b/ipc/kdbus/notify.c new file mode 100644 -index 0000000..e4a4542 +index 0000000..375758c --- /dev/null +++ b/ipc/kdbus/notify.c -@@ -0,0 +1,248 @@ +@@ -0,0 +1,204 @@ +/* + * Copyright (C) 2013-2015 Kay Sievers + * Copyright (C) 2013-2015 Greg Kroah-Hartman @@ -18880,40 +21504,24 @@ index 0000000..e4a4542 +#include "message.h" +#include "notify.h" + -+static inline void kdbus_notify_add_tail(struct kdbus_kmsg *kmsg, ++static inline void kdbus_notify_add_tail(struct kdbus_staging *staging, + struct kdbus_bus *bus) +{ + spin_lock(&bus->notify_lock); -+ list_add_tail(&kmsg->notify_entry, &bus->notify_list); ++ list_add_tail(&staging->notify_entry, &bus->notify_list); + spin_unlock(&bus->notify_lock); +} + +static int kdbus_notify_reply(struct kdbus_bus *bus, u64 id, + u64 cookie, u64 msg_type) +{ -+ struct kdbus_kmsg *kmsg = NULL; -+ -+ WARN_ON(id == 0); -+ -+ kmsg = kdbus_kmsg_new(bus, 0); -+ if (IS_ERR(kmsg)) -+ return PTR_ERR(kmsg); ++ struct kdbus_staging *s; + -+ /* -+ * a kernel-generated notification can only contain one -+ * struct kdbus_item, so make a shortcut here for -+ * faster lookup in the match db. 
-+ */ -+ kmsg->notify_type = msg_type; -+ kmsg->msg.flags = KDBUS_MSG_SIGNAL; -+ kmsg->msg.dst_id = id; -+ kmsg->msg.src_id = KDBUS_SRC_ID_KERNEL; -+ kmsg->msg.payload_type = KDBUS_PAYLOAD_KERNEL; -+ kmsg->msg.cookie_reply = cookie; -+ kmsg->msg.items[0].type = msg_type; -+ -+ kdbus_notify_add_tail(kmsg, bus); ++ s = kdbus_staging_new_kernel(bus, id, cookie, 0, msg_type); ++ if (IS_ERR(s)) ++ return PTR_ERR(s); + ++ kdbus_notify_add_tail(s, bus); + return 0; +} + @@ -18967,78 +21575,52 @@ index 0000000..e4a4542 + u64 old_flags, u64 new_flags, + const char *name) +{ -+ struct kdbus_kmsg *kmsg = NULL; + size_t name_len, extra_size; ++ struct kdbus_staging *s; + + name_len = strlen(name) + 1; + extra_size = sizeof(struct kdbus_notify_name_change) + name_len; -+ kmsg = kdbus_kmsg_new(bus, extra_size); -+ if (IS_ERR(kmsg)) -+ return PTR_ERR(kmsg); -+ -+ kmsg->msg.flags = KDBUS_MSG_SIGNAL; -+ kmsg->msg.dst_id = KDBUS_DST_ID_BROADCAST; -+ kmsg->msg.src_id = KDBUS_SRC_ID_KERNEL; -+ kmsg->msg.payload_type = KDBUS_PAYLOAD_KERNEL; -+ kmsg->notify_type = type; -+ kmsg->notify_old_id = old_id; -+ kmsg->notify_new_id = new_id; -+ kmsg->msg.items[0].type = type; -+ kmsg->msg.items[0].name_change.old_id.id = old_id; -+ kmsg->msg.items[0].name_change.old_id.flags = old_flags; -+ kmsg->msg.items[0].name_change.new_id.id = new_id; -+ kmsg->msg.items[0].name_change.new_id.flags = new_flags; -+ memcpy(kmsg->msg.items[0].name_change.name, name, name_len); -+ kmsg->notify_name = kmsg->msg.items[0].name_change.name; -+ -+ kdbus_notify_add_tail(kmsg, bus); + ++ s = kdbus_staging_new_kernel(bus, KDBUS_DST_ID_BROADCAST, 0, ++ extra_size, type); ++ if (IS_ERR(s)) ++ return PTR_ERR(s); ++ ++ s->notify->name_change.old_id.id = old_id; ++ s->notify->name_change.old_id.flags = old_flags; ++ s->notify->name_change.new_id.id = new_id; ++ s->notify->name_change.new_id.flags = new_flags; ++ memcpy(s->notify->name_change.name, name, name_len); ++ ++ kdbus_notify_add_tail(s, bus); + return 0; +} + +/** 
-+ * kdbus_notify_id_change() - queue a notification about a unique ID change
-+ * @bus: Bus which queues the messages
-+ * @type: The type if the notification; KDBUS_ITEM_ID_ADD or
-+ * KDBUS_ITEM_ID_REMOVE
-+ * @id: The id of the connection that was added or removed
-+ * @flags: The flags to pass in the KDBUS_ITEM flags field
-+ *
-+ * Return: 0 on success, negative errno on failure.
-+ */
-+int kdbus_notify_id_change(struct kdbus_bus *bus, u64 type, u64 id, u64 flags)
-+{
-+ struct kdbus_kmsg *kmsg = NULL;
-+
-+ kmsg = kdbus_kmsg_new(bus, sizeof(struct kdbus_notify_id_change));
-+ if (IS_ERR(kmsg))
-+ return PTR_ERR(kmsg);
-+
-+ kmsg->msg.flags = KDBUS_MSG_SIGNAL;
-+ kmsg->msg.dst_id = KDBUS_DST_ID_BROADCAST;
-+ kmsg->msg.src_id = KDBUS_SRC_ID_KERNEL;
-+ kmsg->msg.payload_type = KDBUS_PAYLOAD_KERNEL;
-+ kmsg->notify_type = type;
-+
-+ switch (type) {
-+ case KDBUS_ITEM_ID_ADD:
-+ kmsg->notify_new_id = id;
-+ break;
-+
-+ case KDBUS_ITEM_ID_REMOVE:
-+ kmsg->notify_old_id = id;
-+ break;
-+
-+ default:
-+ BUG();
-+ }
++ * kdbus_notify_id_change() - queue a notification about a unique ID change
++ * @bus: Bus which queues the messages
++ * @type: The type of the notification; KDBUS_ITEM_ID_ADD or
++ * KDBUS_ITEM_ID_REMOVE
++ * @id: The id of the connection that was added or removed
++ * @flags: The flags to pass in the KDBUS_ITEM flags field
++ *
++ * Return: 0 on success, negative errno on failure. 
++ */ ++int kdbus_notify_id_change(struct kdbus_bus *bus, u64 type, u64 id, u64 flags) ++{ ++ struct kdbus_staging *s; ++ size_t extra_size; + -+ kmsg->msg.items[0].type = type; -+ kmsg->msg.items[0].id_change.id = id; -+ kmsg->msg.items[0].id_change.flags = flags; ++ extra_size = sizeof(struct kdbus_notify_id_change); ++ s = kdbus_staging_new_kernel(bus, KDBUS_DST_ID_BROADCAST, 0, ++ extra_size, type); ++ if (IS_ERR(s)) ++ return PTR_ERR(s); + -+ kdbus_notify_add_tail(kmsg, bus); ++ s->notify->id_change.id = id; ++ s->notify->id_change.flags = flags; + ++ kdbus_notify_add_tail(s, bus); + return 0; +} + @@ -19051,7 +21633,7 @@ index 0000000..e4a4542 +void kdbus_notify_flush(struct kdbus_bus *bus) +{ + LIST_HEAD(notify_list); -+ struct kdbus_kmsg *kmsg, *tmp; ++ struct kdbus_staging *s, *tmp; + + mutex_lock(&bus->notify_flush_lock); + down_read(&bus->name_registry->rwlock); + + spin_lock(&bus->notify_lock); + list_splice_init(&bus->notify_list, &notify_list); + spin_unlock(&bus->notify_lock); + -+ list_for_each_entry_safe(kmsg, tmp, &notify_list, notify_entry) { -+ kdbus_meta_conn_collect(kmsg->conn_meta, kmsg, NULL, -+ KDBUS_ATTACH_TIMESTAMP); -+ -+ if (kmsg->msg.dst_id != KDBUS_DST_ID_BROADCAST) { ++ list_for_each_entry_safe(s, tmp, &notify_list, notify_entry) { ++ if (s->msg->dst_id != KDBUS_DST_ID_BROADCAST) { + struct kdbus_conn *conn; + -+ conn = kdbus_bus_find_conn_by_id(bus, kmsg->msg.dst_id); ++ conn = kdbus_bus_find_conn_by_id(bus, s->msg->dst_id); + if (conn) { -+ kdbus_bus_eavesdrop(bus, NULL, kmsg); -+ kdbus_conn_entry_insert(NULL, conn, kmsg, NULL); ++ kdbus_bus_eavesdrop(bus, NULL, s); ++ kdbus_conn_entry_insert(NULL, conn, s, NULL, ++ NULL); + kdbus_conn_unref(conn); + } + } else { -+ kdbus_bus_broadcast(bus, NULL, kmsg); ++ kdbus_bus_broadcast(bus, NULL, s); + } + -+ list_del(&kmsg->notify_entry); -+ kdbus_kmsg_free(kmsg); ++ list_del(&s->notify_entry); ++ kdbus_staging_free(s); + } + + up_read(&bus->name_registry->rwlock); @@ -19091,11 
+21671,11 @@ index 0000000..e4a4542 + */ +void kdbus_notify_free(struct kdbus_bus *bus) +{ -+ struct kdbus_kmsg *kmsg, *tmp; ++ struct kdbus_staging *s, *tmp; + -+ list_for_each_entry_safe(kmsg, tmp, &bus->notify_list, notify_entry) { -+ list_del(&kmsg->notify_entry); -+ kdbus_kmsg_free(kmsg); ++ list_for_each_entry_safe(s, tmp, &bus->notify_list, notify_entry) { ++ list_del(&s->notify_entry); ++ kdbus_staging_free(s); + } +} diff --git a/ipc/kdbus/notify.h b/ipc/kdbus/notify.h @@ -19136,7 +21716,7 @@ index 0000000..03df464 +#endif diff --git a/ipc/kdbus/policy.c b/ipc/kdbus/policy.c new file mode 100644 -index 0000000..dd7fffa +index 0000000..f2618e15 --- /dev/null +++ b/ipc/kdbus/policy.c @@ -0,0 +1,489 @@ @@ -19486,7 +22066,7 @@ index 0000000..dd7fffa + * In order to allow atomic replacement of rules, the function first removes + * all entries that have been created for the given owner previously. + * -+ * Callers to this function must make sur that the owner is a custom ++ * Callers to this function must make sure that the owner is a custom + * endpoint, or if the endpoint is a default endpoint, then it must be + * either a policy holder or an activator. + * @@ -19688,7 +22268,7 @@ index 0000000..15dd7bc +#endif diff --git a/ipc/kdbus/pool.c b/ipc/kdbus/pool.c new file mode 100644 -index 0000000..45dcdea +index 0000000..63ccd55 --- /dev/null +++ b/ipc/kdbus/pool.c @@ -0,0 +1,728 @@ @@ -19738,7 +22318,7 @@ index 0000000..45dcdea + * The receiver's buffer, managed as a pool of allocated and free + * slices containing the queued messages. + * -+ * Messages sent with KDBUS_CMD_SEND are copied direcly by the ++ * Messages sent with KDBUS_CMD_SEND are copied directly by the + * sending process into the receiver's pool. 
+ * + * Messages received with KDBUS_CMD_RECV just return the offset @@ -20474,10 +23054,10 @@ index 0000000..a903821 +#endif diff --git a/ipc/kdbus/queue.c b/ipc/kdbus/queue.c new file mode 100644 -index 0000000..25bb3ad +index 0000000..f9c44d7 --- /dev/null +++ b/ipc/kdbus/queue.c -@@ -0,0 +1,678 @@ +@@ -0,0 +1,363 @@ +/* + * Copyright (C) 2013-2015 Kay Sievers + * Copyright (C) 2013-2015 Greg Kroah-Hartman @@ -20651,242 +23231,43 @@ index 0000000..25bb3ad + +/** + * kdbus_queue_entry_new() - allocate a queue entry -+ * @conn_dst: destination connection -+ * @kmsg: kmsg object the queue entry should track -+ * @user: user to account message on (or NULL for kernel messages) ++ * @src: source connection, or NULL ++ * @dst: destination connection ++ * @s: staging object carrying the message + * -+ * Allocates a queue entry based on a given kmsg and allocate space for ++ * Allocates a queue entry based on a given msg and allocate space for + * the message payload and the requested metadata in the connection's pool. + * The entry is not actually added to the queue's lists at this point. + * + * Return: the allocated entry on success, or an ERR_PTR on failures. 
+ */ -+struct kdbus_queue_entry *kdbus_queue_entry_new(struct kdbus_conn *conn_dst, -+ const struct kdbus_kmsg *kmsg, -+ struct kdbus_user *user) ++struct kdbus_queue_entry *kdbus_queue_entry_new(struct kdbus_conn *src, ++ struct kdbus_conn *dst, ++ struct kdbus_staging *s) +{ -+ struct kdbus_msg_resources *res = kmsg->res; -+ const struct kdbus_msg *msg = &kmsg->msg; + struct kdbus_queue_entry *entry; -+ size_t memfd_cnt = 0; -+ struct kvec kvec[2]; -+ size_t meta_size; -+ size_t msg_size; -+ u64 payload_off; -+ u64 size = 0; -+ int ret = 0; ++ int ret; + + entry = kzalloc(sizeof(*entry), GFP_KERNEL); + if (!entry) + return ERR_PTR(-ENOMEM); + + INIT_LIST_HEAD(&entry->entry); -+ entry->priority = msg->priority; -+ entry->dst_name_id = kmsg->dst_name_id; -+ entry->msg_res = kdbus_msg_resources_ref(res); -+ entry->proc_meta = kdbus_meta_proc_ref(kmsg->proc_meta); -+ entry->conn_meta = kdbus_meta_conn_ref(kmsg->conn_meta); -+ entry->conn = kdbus_conn_ref(conn_dst); -+ -+ if (kmsg->msg.src_id == KDBUS_SRC_ID_KERNEL) -+ msg_size = msg->size; -+ else -+ msg_size = offsetof(struct kdbus_msg, items); -+ -+ /* sum up the size of the needed slice */ -+ size = msg_size; -+ -+ if (res) { -+ size += res->vec_count * -+ KDBUS_ITEM_SIZE(sizeof(struct kdbus_vec)); -+ -+ if (res->memfd_count) { -+ entry->memfd_offset = -+ kcalloc(res->memfd_count, sizeof(size_t), -+ GFP_KERNEL); -+ if (!entry->memfd_offset) { -+ ret = -ENOMEM; -+ goto exit_free_entry; -+ } -+ -+ size += res->memfd_count * -+ KDBUS_ITEM_SIZE(sizeof(struct kdbus_memfd)); -+ } -+ -+ if (res->fds_count) -+ size += KDBUS_ITEM_SIZE(sizeof(int) * res->fds_count); -+ -+ if (res->dst_name) -+ size += KDBUS_ITEM_SIZE(strlen(res->dst_name) + 1); -+ } -+ -+ /* -+ * Remember the offset of the metadata part, so we can override -+ * this part later during kdbus_queue_entry_install(). 
-+ */ -+ entry->meta_offset = size; -+ -+ if (entry->proc_meta || entry->conn_meta) { -+ entry->attach_flags = -+ atomic64_read(&conn_dst->attach_flags_recv); -+ -+ ret = kdbus_meta_export_prepare(entry->proc_meta, -+ entry->conn_meta, -+ &entry->attach_flags, -+ &meta_size); -+ if (ret < 0) -+ goto exit_free_entry; -+ -+ size += meta_size; -+ } -+ -+ payload_off = size; -+ size += kmsg->pool_size; -+ size = KDBUS_ALIGN8(size); -+ -+ ret = kdbus_conn_quota_inc(conn_dst, user, size, -+ res ? res->fds_count : 0); -+ if (ret < 0) -+ goto exit_free_entry; ++ entry->priority = s->msg->priority; ++ entry->conn = kdbus_conn_ref(dst); ++ entry->gaps = kdbus_gaps_ref(s->gaps); + -+ entry->slice = kdbus_pool_slice_alloc(conn_dst->pool, size, true); ++ entry->slice = kdbus_staging_emit(s, src, dst); + if (IS_ERR(entry->slice)) { + ret = PTR_ERR(entry->slice); + entry->slice = NULL; -+ kdbus_conn_quota_dec(conn_dst, user, size, -+ res ? res->fds_count : 0); -+ goto exit_free_entry; -+ } -+ -+ /* we accounted for exactly 'size' bytes, make sure it didn't grow */ -+ WARN_ON(kdbus_pool_slice_size(entry->slice) != size); -+ entry->user = kdbus_user_ref(user); -+ -+ /* copy message header */ -+ kvec[0].iov_base = (char *)msg; -+ kvec[0].iov_len = msg_size; -+ -+ ret = kdbus_pool_slice_copy_kvec(entry->slice, 0, kvec, 1, msg_size); -+ if (ret < 0) -+ goto exit_free_entry; -+ -+ /* 'size' will now track the write position */ -+ size = msg_size; -+ -+ /* create message payload items */ -+ if (res) { -+ size_t dst_name_len = 0; -+ unsigned int i; -+ size_t sz = 0; -+ -+ if (res->dst_name) { -+ dst_name_len = strlen(res->dst_name) + 1; -+ sz += KDBUS_ITEM_SIZE(dst_name_len); -+ } -+ -+ for (i = 0; i < res->data_count; ++i) { -+ struct kdbus_vec v; -+ struct kdbus_memfd m; -+ -+ switch (res->data[i].type) { -+ case KDBUS_MSG_DATA_VEC: -+ sz += KDBUS_ITEM_SIZE(sizeof(v)); -+ break; -+ -+ case KDBUS_MSG_DATA_MEMFD: -+ sz += KDBUS_ITEM_SIZE(sizeof(m)); -+ break; -+ } -+ } -+ -+ if (sz) { -+ 
struct kdbus_item *items, *item; -+ -+ items = kmalloc(sz, GFP_KERNEL); -+ if (!items) { -+ ret = -ENOMEM; -+ goto exit_free_entry; -+ } -+ -+ item = items; -+ -+ if (res->dst_name) -+ item = kdbus_item_set(item, KDBUS_ITEM_DST_NAME, -+ res->dst_name, -+ dst_name_len); -+ -+ for (i = 0; i < res->data_count; ++i) { -+ struct kdbus_msg_data *d = res->data + i; -+ struct kdbus_memfd m = {}; -+ struct kdbus_vec v = {}; -+ -+ switch (d->type) { -+ case KDBUS_MSG_DATA_VEC: -+ v.size = d->size; -+ v.offset = d->vec.off; -+ if (v.offset != ~0ULL) -+ v.offset += payload_off; -+ -+ item = kdbus_item_set(item, -+ KDBUS_ITEM_PAYLOAD_OFF, -+ &v, sizeof(v)); -+ break; -+ -+ case KDBUS_MSG_DATA_MEMFD: -+ /* -+ * Remember the location of memfds, so -+ * we can override the content from -+ * kdbus_queue_entry_install(). -+ */ -+ entry->memfd_offset[memfd_cnt++] = -+ msg_size + -+ (char *)item - (char *)items + -+ offsetof(struct kdbus_item, -+ memfd); -+ -+ item = kdbus_item_set(item, -+ KDBUS_ITEM_PAYLOAD_MEMFD, -+ &m, sizeof(m)); -+ break; -+ } -+ } -+ -+ kvec[0].iov_base = items; -+ kvec[0].iov_len = sz; -+ -+ ret = kdbus_pool_slice_copy_kvec(entry->slice, size, -+ kvec, 1, sz); -+ kfree(items); -+ -+ if (ret < 0) -+ goto exit_free_entry; -+ -+ size += sz; -+ } -+ -+ /* -+ * Remember the location of the FD part, so we can override the -+ * content in kdbus_queue_entry_install(). -+ */ -+ if (res->fds_count) { -+ entry->fds_offset = size; -+ size += KDBUS_ITEM_SIZE(sizeof(int) * res->fds_count); -+ } -+ } -+ -+ /* finally, copy over the actual message payload */ -+ if (kmsg->iov_count) { -+ ret = kdbus_pool_slice_copy_iovec(entry->slice, payload_off, -+ kmsg->iov, -+ kmsg->iov_count, -+ kmsg->pool_size); -+ if (ret < 0) -+ goto exit_free_entry; ++ goto error; + } + ++ entry->user = src ? 
kdbus_user_ref(src->user) : NULL; + return entry; + -+exit_free_entry: ++error: + kdbus_queue_entry_free(entry); + return ERR_PTR(ret); +} @@ -20911,17 +23292,13 @@ index 0000000..25bb3ad + if (entry->slice) { + kdbus_conn_quota_dec(entry->conn, entry->user, + kdbus_pool_slice_size(entry->slice), -+ entry->msg_res ? -+ entry->msg_res->fds_count : 0); ++ entry->gaps ? entry->gaps->n_fds : 0); + kdbus_pool_slice_release(entry->slice); -+ kdbus_user_unref(entry->user); + } + -+ kdbus_msg_resources_unref(entry->msg_res); -+ kdbus_meta_conn_unref(entry->conn_meta); -+ kdbus_meta_proc_unref(entry->proc_meta); ++ kdbus_user_unref(entry->user); ++ kdbus_gaps_unref(entry->gaps); + kdbus_conn_unref(entry->conn); -+ kfree(entry->memfd_offset); + kfree(entry); +} + @@ -20932,134 +23309,22 @@ index 0000000..25bb3ad + * @return_flags: Pointer to store the return flags for userspace + * @install_fds: Whether or not to install associated file descriptors + * -+ * This function will create a slice to transport the message header, the -+ * metadata items and other items for information stored in @entry, and -+ * store it as entry->slice. -+ * -+ * If @install_fds is %true, file descriptors will as well be installed. -+ * This function must always be called from the task context of the receiver. -+ * + * Return: 0 on success. 
+ */ +int kdbus_queue_entry_install(struct kdbus_queue_entry *entry, + u64 *return_flags, bool install_fds) +{ -+ u64 msg_size = entry->meta_offset; -+ struct kdbus_conn *conn_dst = entry->conn; -+ struct kdbus_msg_resources *res; + bool incomplete_fds = false; -+ struct kvec kvec[2]; -+ size_t memfds = 0; -+ int i, ret; -+ -+ lockdep_assert_held(&conn_dst->lock); -+ -+ if (entry->proc_meta || entry->conn_meta) { -+ size_t meta_size; -+ -+ ret = kdbus_meta_export(entry->proc_meta, -+ entry->conn_meta, -+ entry->attach_flags, -+ entry->slice, -+ entry->meta_offset, -+ &meta_size); -+ if (ret < 0) -+ return ret; -+ -+ msg_size += meta_size; -+ } ++ int ret; + -+ /* Update message size at offset 0 */ -+ kvec[0].iov_base = &msg_size; -+ kvec[0].iov_len = sizeof(msg_size); ++ lockdep_assert_held(&entry->conn->lock); + -+ ret = kdbus_pool_slice_copy_kvec(entry->slice, 0, kvec, 1, -+ sizeof(msg_size)); ++ ret = kdbus_gaps_install(entry->gaps, entry->slice, &incomplete_fds); + if (ret < 0) + return ret; + -+ res = entry->msg_res; -+ -+ if (!res) -+ return 0; -+ -+ if (res->fds_count) { -+ struct kdbus_item_header hdr; -+ size_t off; -+ int *fds; -+ -+ fds = kmalloc_array(res->fds_count, sizeof(int), GFP_KERNEL); -+ if (!fds) -+ return -ENOMEM; -+ -+ for (i = 0; i < res->fds_count; i++) { -+ if (install_fds) { -+ fds[i] = get_unused_fd_flags(O_CLOEXEC); -+ if (fds[i] >= 0) -+ fd_install(fds[i], -+ get_file(res->fds[i])); -+ else -+ incomplete_fds = true; -+ } else { -+ fds[i] = -1; -+ } -+ } -+ -+ off = entry->fds_offset; -+ -+ hdr.type = KDBUS_ITEM_FDS; -+ hdr.size = KDBUS_ITEM_HEADER_SIZE + -+ sizeof(int) * res->fds_count; -+ -+ kvec[0].iov_base = &hdr; -+ kvec[0].iov_len = sizeof(hdr); -+ -+ kvec[1].iov_base = fds; -+ kvec[1].iov_len = sizeof(int) * res->fds_count; -+ -+ ret = kdbus_pool_slice_copy_kvec(entry->slice, off, -+ kvec, 2, hdr.size); -+ kfree(fds); -+ -+ if (ret < 0) -+ return ret; -+ } -+ -+ for (i = 0; i < res->data_count; ++i) { -+ struct kdbus_msg_data *d 
= res->data + i; -+ struct kdbus_memfd m; -+ -+ if (d->type != KDBUS_MSG_DATA_MEMFD) -+ continue; -+ -+ m.start = d->memfd.start; -+ m.size = d->size; -+ m.fd = -1; -+ -+ if (install_fds) { -+ m.fd = get_unused_fd_flags(O_CLOEXEC); -+ if (m.fd < 0) { -+ m.fd = -1; -+ incomplete_fds = true; -+ } else { -+ fd_install(m.fd, -+ get_file(d->memfd.file)); -+ } -+ } -+ -+ kvec[0].iov_base = &m; -+ kvec[0].iov_len = sizeof(m); -+ -+ ret = kdbus_pool_slice_copy_kvec(entry->slice, -+ entry->memfd_offset[memfds++], -+ kvec, 1, sizeof(m)); -+ if (ret < 0) -+ return ret; -+ } -+ + if (incomplete_fds) + *return_flags |= KDBUS_RECV_RETURN_INCOMPLETE_FDS; -+ + return 0; +} + @@ -21123,7 +23388,7 @@ index 0000000..25bb3ad + return 0; + + size = kdbus_pool_slice_size(e->slice); -+ fds = e->msg_res ? e->msg_res->fds_count : 0; ++ fds = e->gaps ? e->gaps->n_fds : 0; + + ret = kdbus_conn_quota_inc(dst, e->user, size, fds); + if (ret < 0) @@ -21158,10 +23423,10 @@ index 0000000..25bb3ad +} diff --git a/ipc/kdbus/queue.h b/ipc/kdbus/queue.h new file mode 100644 -index 0000000..7f2db96 +index 0000000..bf686d1 --- /dev/null +++ b/ipc/kdbus/queue.h -@@ -0,0 +1,92 @@ +@@ -0,0 +1,84 @@ +/* + * Copyright (C) 2013-2015 Kay Sievers + * Copyright (C) 2013-2015 Greg Kroah-Hartman @@ -21179,6 +23444,13 @@ index 0000000..7f2db96 +#ifndef __KDBUS_QUEUE_H +#define __KDBUS_QUEUE_H + ++#include ++#include ++ ++struct kdbus_conn; ++struct kdbus_pool_slice; ++struct kdbus_reply; ++struct kdbus_staging; +struct kdbus_user; + +/** @@ -21199,52 +23471,37 @@ index 0000000..7f2db96 + * @entry: Entry in the connection's list + * @prio_node: Entry in the priority queue tree + * @prio_entry: Queue tree node entry in the list of one priority -+ * @slice: Slice in the receiver's pool for the message -+ * @attach_flags: Attach flags used during slice allocation -+ * @meta_offset: Offset of first metadata item in slice -+ * @fds_offset: Offset of FD item in slice -+ * @memfd_offset: Array of slice-offsets for all 
memfd items + * @priority: Message priority + * @dst_name_id: The sequence number of the name this message is + * addressed to, 0 for messages sent to an ID -+ * @msg_res: Message resources -+ * @proc_meta: Process metadata, captured at message arrival -+ * @conn_meta: Connection metadata, captured at message arrival -+ * @reply: The reply block if a reply to this message is expected ++ * @conn: Connection this entry is queued on ++ * @gaps: Gaps object to fill message gaps at RECV time + * @user: User used for accounting ++ * @slice: Slice in the receiver's pool for the message ++ * @reply: The reply block if a reply to this message is expected + */ +struct kdbus_queue_entry { + struct list_head entry; + struct rb_node prio_node; + struct list_head prio_entry; + -+ struct kdbus_pool_slice *slice; -+ -+ u64 attach_flags; -+ size_t meta_offset; -+ size_t fds_offset; -+ size_t *memfd_offset; -+ + s64 priority; + u64 dst_name_id; + -+ struct kdbus_msg_resources *msg_res; -+ struct kdbus_meta_proc *proc_meta; -+ struct kdbus_meta_conn *conn_meta; -+ struct kdbus_reply *reply; + struct kdbus_conn *conn; ++ struct kdbus_gaps *gaps; + struct kdbus_user *user; ++ struct kdbus_pool_slice *slice; ++ struct kdbus_reply *reply; +}; + -+struct kdbus_kmsg; -+ +void kdbus_queue_init(struct kdbus_queue *queue); +struct kdbus_queue_entry *kdbus_queue_peek(struct kdbus_queue *queue, + s64 priority, bool use_priority); + -+struct kdbus_queue_entry *kdbus_queue_entry_new(struct kdbus_conn *conn_dst, -+ const struct kdbus_kmsg *kmsg, -+ struct kdbus_user *user); ++struct kdbus_queue_entry *kdbus_queue_entry_new(struct kdbus_conn *src, ++ struct kdbus_conn *dst, ++ struct kdbus_staging *s); +void kdbus_queue_entry_free(struct kdbus_queue_entry *entry); +int kdbus_queue_entry_install(struct kdbus_queue_entry *entry, + u64 *return_flags, bool install_fds); @@ -21827,6 +24084,463 @@ index 0000000..5297166 +size_t kdbus_kvec_pad(struct kvec *kvec, u64 *len); + +#endif +diff --git 
a/kernel/events/core.c b/kernel/events/core.c +index 0ceb386..eddf1ed 100644 +--- a/kernel/events/core.c ++++ b/kernel/events/core.c +@@ -4331,20 +4331,20 @@ static void ring_buffer_attach(struct perf_event *event, + WARN_ON_ONCE(event->rcu_pending); + + old_rb = event->rb; ++ event->rcu_batches = get_state_synchronize_rcu(); ++ event->rcu_pending = 1; ++ + spin_lock_irqsave(&old_rb->event_lock, flags); + list_del_rcu(&event->rb_entry); + spin_unlock_irqrestore(&old_rb->event_lock, flags); ++ } + +- event->rcu_batches = get_state_synchronize_rcu(); +- event->rcu_pending = 1; ++ if (event->rcu_pending && rb) { ++ cond_synchronize_rcu(event->rcu_batches); ++ event->rcu_pending = 0; + } + + if (rb) { +- if (event->rcu_pending) { +- cond_synchronize_rcu(event->rcu_batches); +- event->rcu_pending = 0; +- } +- + spin_lock_irqsave(&rb->event_lock, flags); + list_add_rcu(&event->rb_entry, &rb->event_list); + spin_unlock_irqrestore(&rb->event_lock, flags); +diff --git a/net/bridge/br_ioctl.c b/net/bridge/br_ioctl.c +index 8d423bc..a9a4a1b 100644 +--- a/net/bridge/br_ioctl.c ++++ b/net/bridge/br_ioctl.c +@@ -247,7 +247,9 @@ static int old_dev_ioctl(struct net_device *dev, struct ifreq *rq, int cmd) + if (!ns_capable(dev_net(dev)->user_ns, CAP_NET_ADMIN)) + return -EPERM; + ++ spin_lock_bh(&br->lock); + br_stp_set_bridge_priority(br, args[1]); ++ spin_unlock_bh(&br->lock); + return 0; + + case BRCTL_SET_PORT_PRIORITY: +diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c +index 7832d07..4114687 100644 +--- a/net/bridge/br_stp_if.c ++++ b/net/bridge/br_stp_if.c +@@ -243,13 +243,12 @@ bool br_stp_recalculate_bridge_id(struct net_bridge *br) + return true; + } + +-/* Acquires and releases bridge lock */ ++/* called under bridge lock */ + void br_stp_set_bridge_priority(struct net_bridge *br, u16 newprio) + { + struct net_bridge_port *p; + int wasroot; + +- spin_lock_bh(&br->lock); + wasroot = br_is_root_bridge(br); + + list_for_each_entry(p, &br->port_list, list) { +@@ 
-267,7 +266,6 @@ void br_stp_set_bridge_priority(struct net_bridge *br, u16 newprio) + br_port_state_selection(br); + if (br_is_root_bridge(br) && !wasroot) + br_become_root_bridge(br); +- spin_unlock_bh(&br->lock); + } + + /* called under bridge lock */ +diff --git a/net/can/af_can.c b/net/can/af_can.c +index 689c818..32d710e 100644 +--- a/net/can/af_can.c ++++ b/net/can/af_can.c +@@ -310,12 +310,8 @@ int can_send(struct sk_buff *skb, int loop) + return err; + } + +- if (newskb) { +- if (!(newskb->tstamp.tv64)) +- __net_timestamp(newskb); +- ++ if (newskb) + netif_rx_ni(newskb); +- } + + /* update statistics */ + can_stats.tx_frames++; +diff --git a/net/core/neighbour.c b/net/core/neighbour.c +index 2237c1b..3de6542 100644 +--- a/net/core/neighbour.c ++++ b/net/core/neighbour.c +@@ -957,8 +957,6 @@ int __neigh_event_send(struct neighbour *neigh, struct sk_buff *skb) + rc = 0; + if (neigh->nud_state & (NUD_CONNECTED | NUD_DELAY | NUD_PROBE)) + goto out_unlock_bh; +- if (neigh->dead) +- goto out_dead; + + if (!(neigh->nud_state & (NUD_STALE | NUD_INCOMPLETE))) { + if (NEIGH_VAR(neigh->parms, MCAST_PROBES) + +@@ -1015,13 +1013,6 @@ out_unlock_bh: + write_unlock(&neigh->lock); + local_bh_enable(); + return rc; +- +-out_dead: +- if (neigh->nud_state & NUD_STALE) +- goto out_unlock_bh; +- write_unlock_bh(&neigh->lock); +- kfree_skb(skb); +- return 1; + } + EXPORT_SYMBOL(__neigh_event_send); + +@@ -1085,8 +1076,6 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new, + if (!(flags & NEIGH_UPDATE_F_ADMIN) && + (old & (NUD_NOARP | NUD_PERMANENT))) + goto out; +- if (neigh->dead) +- goto out; + + if (!(new & NUD_VALID)) { + neigh_del_timer(neigh); +@@ -1236,8 +1225,6 @@ EXPORT_SYMBOL(neigh_update); + */ + void __neigh_set_probe_once(struct neighbour *neigh) + { +- if (neigh->dead) +- return; + neigh->updated = jiffies; + if (!(neigh->nud_state & NUD_FAILED)) + return; +diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c +index a5aa54e..8b47a4d 100644 +--- 
a/net/ipv4/af_inet.c ++++ b/net/ipv4/af_inet.c +@@ -228,8 +228,6 @@ int inet_listen(struct socket *sock, int backlog) + err = 0; + if (err) + goto out; +- +- tcp_fastopen_init_key_once(true); + } + err = inet_csk_listen_start(sk, backlog); + if (err) +diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c +index 6ddde89..7cfb089 100644 +--- a/net/ipv4/ip_sockglue.c ++++ b/net/ipv4/ip_sockglue.c +@@ -432,15 +432,6 @@ void ip_local_error(struct sock *sk, int err, __be32 daddr, __be16 port, u32 inf + kfree_skb(skb); + } + +-/* For some errors we have valid addr_offset even with zero payload and +- * zero port. Also, addr_offset should be supported if port is set. +- */ +-static inline bool ipv4_datagram_support_addr(struct sock_exterr_skb *serr) +-{ +- return serr->ee.ee_origin == SO_EE_ORIGIN_ICMP || +- serr->ee.ee_origin == SO_EE_ORIGIN_LOCAL || serr->port; +-} +- + /* IPv4 supports cmsg on all imcp errors and some timestamps + * + * Timestamp code paths do not initialize the fields expected by cmsg: +@@ -507,7 +498,7 @@ int ip_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len) + + serr = SKB_EXT_ERR(skb); + +- if (sin && ipv4_datagram_support_addr(serr)) { ++ if (sin && serr->port) { + sin->sin_family = AF_INET; + sin->sin_addr.s_addr = *(__be32 *)(skb_network_header(skb) + + serr->addr_offset); +diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c +index bb2ce74..f1377f2 100644 +--- a/net/ipv4/tcp.c ++++ b/net/ipv4/tcp.c +@@ -2545,13 +2545,10 @@ static int do_tcp_setsockopt(struct sock *sk, int level, + + case TCP_FASTOPEN: + if (val >= 0 && ((1 << sk->sk_state) & (TCPF_CLOSE | +- TCPF_LISTEN))) { +- tcp_fastopen_init_key_once(true); +- ++ TCPF_LISTEN))) + err = fastopen_init_queue(sk, val); +- } else { ++ else + err = -EINVAL; +- } + break; + case TCP_TIMESTAMP: + if (!tp->repair) +diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c +index f9c0fb8..46b087a 100644 +--- a/net/ipv4/tcp_fastopen.c ++++ b/net/ipv4/tcp_fastopen.c +@@ -78,6 
+78,8 @@ static bool __tcp_fastopen_cookie_gen(const void *path, + struct tcp_fastopen_context *ctx; + bool ok = false; + ++ tcp_fastopen_init_key_once(true); ++ + rcu_read_lock(); + ctx = rcu_dereference(tcp_fastopen_ctx); + if (ctx) { +diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c +index 62d908e..762a58c 100644 +--- a/net/ipv6/datagram.c ++++ b/net/ipv6/datagram.c +@@ -325,16 +325,6 @@ void ipv6_local_rxpmtu(struct sock *sk, struct flowi6 *fl6, u32 mtu) + kfree_skb(skb); + } + +-/* For some errors we have valid addr_offset even with zero payload and +- * zero port. Also, addr_offset should be supported if port is set. +- */ +-static inline bool ipv6_datagram_support_addr(struct sock_exterr_skb *serr) +-{ +- return serr->ee.ee_origin == SO_EE_ORIGIN_ICMP6 || +- serr->ee.ee_origin == SO_EE_ORIGIN_ICMP || +- serr->ee.ee_origin == SO_EE_ORIGIN_LOCAL || serr->port; +-} +- + /* IPv6 supports cmsg on all origins aside from SO_EE_ORIGIN_LOCAL. + * + * At one point, excluding local errors was a quick test to identify icmp/icmp6 +@@ -399,7 +389,7 @@ int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len) + + serr = SKB_EXT_ERR(skb); + +- if (sin && ipv6_datagram_support_addr(serr)) { ++ if (sin && serr->port) { + const unsigned char *nh = skb_network_header(skb); + sin->sin6_family = AF_INET6; + sin->sin6_flowinfo = 0; +diff --git a/net/mac80211/key.c b/net/mac80211/key.c +index 81e9785..a907f2d 100644 +--- a/net/mac80211/key.c ++++ b/net/mac80211/key.c +@@ -66,15 +66,12 @@ update_vlan_tailroom_need_count(struct ieee80211_sub_if_data *sdata, int delta) + if (sdata->vif.type != NL80211_IFTYPE_AP) + return; + +- /* crypto_tx_tailroom_needed_cnt is protected by this */ +- assert_key_lock(sdata->local); +- +- rcu_read_lock(); ++ mutex_lock(&sdata->local->mtx); + +- list_for_each_entry_rcu(vlan, &sdata->u.ap.vlans, u.vlan.list) ++ list_for_each_entry(vlan, &sdata->u.ap.vlans, u.vlan.list) + vlan->crypto_tx_tailroom_needed_cnt += delta; + +- 
rcu_read_unlock(); ++ mutex_unlock(&sdata->local->mtx); + } + + static void increment_tailroom_need_count(struct ieee80211_sub_if_data *sdata) +@@ -98,8 +95,6 @@ static void increment_tailroom_need_count(struct ieee80211_sub_if_data *sdata) + * http://mid.gmane.org/1308590980.4322.19.camel@jlt3.sipsolutions.net + */ + +- assert_key_lock(sdata->local); +- + update_vlan_tailroom_need_count(sdata, 1); + + if (!sdata->crypto_tx_tailroom_needed_cnt++) { +@@ -114,8 +109,6 @@ static void increment_tailroom_need_count(struct ieee80211_sub_if_data *sdata) + static void decrease_tailroom_need_count(struct ieee80211_sub_if_data *sdata, + int delta) + { +- assert_key_lock(sdata->local); +- + WARN_ON_ONCE(sdata->crypto_tx_tailroom_needed_cnt < delta); + + update_vlan_tailroom_need_count(sdata, -delta); +diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c +index fe1610d..b5989c6 100644 +--- a/net/packet/af_packet.c ++++ b/net/packet/af_packet.c +@@ -1272,6 +1272,16 @@ static void packet_sock_destruct(struct sock *sk) + sk_refcnt_debug_dec(sk); + } + ++static int fanout_rr_next(struct packet_fanout *f, unsigned int num) ++{ ++ int x = atomic_read(&f->rr_cur) + 1; ++ ++ if (x >= num) ++ x = 0; ++ ++ return x; ++} ++ + static unsigned int fanout_demux_hash(struct packet_fanout *f, + struct sk_buff *skb, + unsigned int num) +@@ -1283,9 +1293,13 @@ static unsigned int fanout_demux_lb(struct packet_fanout *f, + struct sk_buff *skb, + unsigned int num) + { +- unsigned int val = atomic_inc_return(&f->rr_cur); ++ int cur, old; + +- return val % num; ++ cur = atomic_read(&f->rr_cur); ++ while ((old = atomic_cmpxchg(&f->rr_cur, cur, ++ fanout_rr_next(f, num))) != cur) ++ cur = old; ++ return cur; + } + + static unsigned int fanout_demux_cpu(struct packet_fanout *f, +@@ -1339,7 +1353,7 @@ static int packet_rcv_fanout(struct sk_buff *skb, struct net_device *dev, + struct packet_type *pt, struct net_device *orig_dev) + { + struct packet_fanout *f = pt->af_packet_priv; +- unsigned int 
num = READ_ONCE(f->num_members); ++ unsigned int num = f->num_members; + struct packet_sock *po; + unsigned int idx; + +diff --git a/net/sctp/output.c b/net/sctp/output.c +index abe7c2d..fc5e45b 100644 +--- a/net/sctp/output.c ++++ b/net/sctp/output.c +@@ -599,9 +599,7 @@ out: + return err; + no_route: + kfree_skb(nskb); +- +- if (asoc) +- IP_INC_STATS(sock_net(asoc->base.sk), IPSTATS_MIB_OUTNOROUTES); ++ IP_INC_STATS(sock_net(asoc->base.sk), IPSTATS_MIB_OUTNOROUTES); + + /* FIXME: Returning the 'err' will effect all the associations + * associated with a socket, although only one of the paths of the +diff --git a/net/sctp/socket.c b/net/sctp/socket.c +index 5f6c4e6..f09de7f 100644 +--- a/net/sctp/socket.c ++++ b/net/sctp/socket.c +@@ -1528,10 +1528,8 @@ static void sctp_close(struct sock *sk, long timeout) + + /* Supposedly, no process has access to the socket, but + * the net layers still may. +- * Also, sctp_destroy_sock() needs to be called with addr_wq_lock +- * held and that should be grabbed before socket lock. 
+ */ +- spin_lock_bh(&net->sctp.addr_wq_lock); ++ local_bh_disable(); + bh_lock_sock(sk); + + /* Hold the sock, since sk_common_release() will put sock_put() +@@ -1541,7 +1539,7 @@ static void sctp_close(struct sock *sk, long timeout) + sk_common_release(sk); + + bh_unlock_sock(sk); +- spin_unlock_bh(&net->sctp.addr_wq_lock); ++ local_bh_enable(); + + sock_put(sk); + +@@ -3582,7 +3580,6 @@ static int sctp_setsockopt_auto_asconf(struct sock *sk, char __user *optval, + if ((val && sp->do_auto_asconf) || (!val && !sp->do_auto_asconf)) + return 0; + +- spin_lock_bh(&sock_net(sk)->sctp.addr_wq_lock); + if (val == 0 && sp->do_auto_asconf) { + list_del(&sp->auto_asconf_list); + sp->do_auto_asconf = 0; +@@ -3591,7 +3588,6 @@ static int sctp_setsockopt_auto_asconf(struct sock *sk, char __user *optval, + &sock_net(sk)->sctp.auto_asconf_splist); + sp->do_auto_asconf = 1; + } +- spin_unlock_bh(&sock_net(sk)->sctp.addr_wq_lock); + return 0; + } + +@@ -4125,28 +4121,18 @@ static int sctp_init_sock(struct sock *sk) + local_bh_disable(); + percpu_counter_inc(&sctp_sockets_allocated); + sock_prot_inuse_add(net, sk->sk_prot, 1); +- +- /* Nothing can fail after this block, otherwise +- * sctp_destroy_sock() will be called without addr_wq_lock held +- */ + if (net->sctp.default_auto_asconf) { +- spin_lock(&sock_net(sk)->sctp.addr_wq_lock); + list_add_tail(&sp->auto_asconf_list, + &net->sctp.auto_asconf_splist); + sp->do_auto_asconf = 1; +- spin_unlock(&sock_net(sk)->sctp.addr_wq_lock); +- } else { ++ } else + sp->do_auto_asconf = 0; +- } +- + local_bh_enable(); + + return 0; + } + +-/* Cleanup any SCTP per socket resources. Must be called with +- * sock_net(sk)->sctp.addr_wq_lock held if sp->do_auto_asconf is true +- */ ++/* Cleanup any SCTP per socket resources. 
*/ + static void sctp_destroy_sock(struct sock *sk) + { + struct sctp_sock *sp; +@@ -7209,19 +7195,6 @@ void sctp_copy_sock(struct sock *newsk, struct sock *sk, + newinet->mc_list = NULL; + } + +-static inline void sctp_copy_descendant(struct sock *sk_to, +- const struct sock *sk_from) +-{ +- int ancestor_size = sizeof(struct inet_sock) + +- sizeof(struct sctp_sock) - +- offsetof(struct sctp_sock, auto_asconf_list); +- +- if (sk_from->sk_family == PF_INET6) +- ancestor_size += sizeof(struct ipv6_pinfo); +- +- __inet_sk_copy_descendant(sk_to, sk_from, ancestor_size); +-} +- + /* Populate the fields of the newsk from the oldsk and migrate the assoc + * and its messages to the newsk. + */ +@@ -7236,6 +7209,7 @@ static void sctp_sock_migrate(struct sock *oldsk, struct sock *newsk, + struct sk_buff *skb, *tmp; + struct sctp_ulpevent *event; + struct sctp_bind_hashbucket *head; ++ struct list_head tmplist; + + /* Migrate socket buffer sizes and all the socket level options to the + * new socket. +@@ -7243,7 +7217,12 @@ static void sctp_sock_migrate(struct sock *oldsk, struct sock *newsk, + newsk->sk_sndbuf = oldsk->sk_sndbuf; + newsk->sk_rcvbuf = oldsk->sk_rcvbuf; + /* Brute force copy old sctp opt. */ +- sctp_copy_descendant(newsk, oldsk); ++ if (oldsp->do_auto_asconf) { ++ memcpy(&tmplist, &newsp->auto_asconf_list, sizeof(tmplist)); ++ inet_sk_copy_descendant(newsk, oldsk); ++ memcpy(&newsp->auto_asconf_list, &tmplist, sizeof(tmplist)); ++ } else ++ inet_sk_copy_descendant(newsk, oldsk); + + /* Restore the ep value that was overwritten with the above structure + * copy. 
diff --git a/samples/Kconfig b/samples/Kconfig index 224ebb4..a4c6b2f 100644 --- a/samples/Kconfig @@ -23349,6 +26063,39 @@ index 0000000..c3ba958 +} + +#endif /* libc sanity check */ +diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c +index 212070e..7dade28 100644 +--- a/security/selinux/hooks.c ++++ b/security/selinux/hooks.c +@@ -403,7 +403,6 @@ static int selinux_is_sblabel_mnt(struct super_block *sb) + return sbsec->behavior == SECURITY_FS_USE_XATTR || + sbsec->behavior == SECURITY_FS_USE_TRANS || + sbsec->behavior == SECURITY_FS_USE_TASK || +- sbsec->behavior == SECURITY_FS_USE_NATIVE || + /* Special handling. Genfs but also in-core setxattr handler */ + !strcmp(sb->s_type->name, "sysfs") || + !strcmp(sb->s_type->name, "pstore") || +diff --git a/tools/build/Makefile.build b/tools/build/Makefile.build +index 98cfc38..10df572 100644 +--- a/tools/build/Makefile.build ++++ b/tools/build/Makefile.build +@@ -94,12 +94,12 @@ obj-y := $(patsubst %/, %/$(obj)-in.o, $(obj-y)) + subdir-obj-y := $(filter %/$(obj)-in.o, $(obj-y)) + + # '$(OUTPUT)/dir' prefix to all objects +-objprefix := $(subst ./,,$(OUTPUT)$(dir)/) +-obj-y := $(addprefix $(objprefix),$(obj-y)) +-subdir-obj-y := $(addprefix $(objprefix),$(subdir-obj-y)) ++prefix := $(subst ./,,$(OUTPUT)$(dir)/) ++obj-y := $(addprefix $(prefix),$(obj-y)) ++subdir-obj-y := $(addprefix $(prefix),$(subdir-obj-y)) + + # Final '$(obj)-in.o' object +-in-target := $(objprefix)$(obj)-in.o ++in-target := $(prefix)$(obj)-in.o + + PHONY += $(subdir-y) + diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile index 95abddc..b57100c 100644 --- a/tools/testing/selftests/Makefile @@ -23546,10 +26293,10 @@ index 0000000..ed28cca +const char *enum_PAYLOAD(long long id); diff --git a/tools/testing/selftests/kdbus/kdbus-test.c b/tools/testing/selftests/kdbus/kdbus-test.c new file mode 100644 -index 0000000..294e82a +index 0000000..db732e5 --- /dev/null +++ b/tools/testing/selftests/kdbus/kdbus-test.c 
-@@ -0,0 +1,900 @@ +@@ -0,0 +1,899 @@ +#include +#include +#include @@ -23851,7 +26598,6 @@ index 0000000..294e82a + + ret = kdbus_create_bus(env->control_fd, + args->busname ?: n, -+ _KDBUS_ATTACH_ALL, + _KDBUS_ATTACH_ALL, &s); + free(n); + ASSERT_RETURN(ret == 0); @@ -24541,10 +27287,10 @@ index 0000000..a5c6ae8 +#endif /* _TEST_KDBUS_H_ */ diff --git a/tools/testing/selftests/kdbus/kdbus-util.c b/tools/testing/selftests/kdbus/kdbus-util.c new file mode 100644 -index 0000000..29a0cb1 +index 0000000..a5e54ca --- /dev/null +++ b/tools/testing/selftests/kdbus/kdbus-util.c -@@ -0,0 +1,1617 @@ +@@ -0,0 +1,1611 @@ +/* + * Copyright (C) 2013-2015 Daniel Mack + * Copyright (C) 2013-2015 Kay Sievers @@ -24661,8 +27407,7 @@ index 0000000..29a0cb1 +} + +int kdbus_create_bus(int control_fd, const char *name, -+ uint64_t req_meta, uint64_t owner_meta, -+ char **path) ++ uint64_t owner_meta, char **path) +{ + struct { + struct kdbus_cmd cmd; @@ -24674,12 +27419,12 @@ index 0000000..29a0cb1 + struct kdbus_bloom_parameter bloom; + } bp; + -+ /* required and owner metadata items */ ++ /* owner metadata items */ + struct { + uint64_t size; + uint64_t type; + uint64_t flags; -+ } attach[2]; ++ } attach; + + /* name item */ + struct { @@ -24699,13 +27444,9 @@ index 0000000..29a0cb1 + snprintf(bus_make.name.str, sizeof(bus_make.name.str), + "%u-%s", getuid(), name); + -+ bus_make.attach[0].type = KDBUS_ITEM_ATTACH_FLAGS_RECV; -+ bus_make.attach[0].size = sizeof(bus_make.attach[0]); -+ bus_make.attach[0].flags = req_meta; -+ -+ bus_make.attach[1].type = KDBUS_ITEM_ATTACH_FLAGS_SEND; -+ bus_make.attach[1].size = sizeof(bus_make.attach[0]); -+ bus_make.attach[1].flags = owner_meta; ++ bus_make.attach.type = KDBUS_ITEM_ATTACH_FLAGS_SEND; ++ bus_make.attach.size = sizeof(bus_make.attach); ++ bus_make.attach.flags = owner_meta; + + bus_make.name.type = KDBUS_ITEM_MAKE_NAME; + bus_make.name.size = KDBUS_ITEM_HEADER_SIZE + @@ -24714,8 +27455,7 @@ index 0000000..29a0cb1 + bus_make.cmd.flags = 
KDBUS_MAKE_ACCESS_WORLD; + bus_make.cmd.size = sizeof(bus_make.cmd) + + bus_make.bp.size + -+ bus_make.attach[0].size + -+ bus_make.attach[1].size + ++ bus_make.attach.size + + bus_make.name.size; + + kdbus_printf("Creating bus with name >%s< on control fd %d ...\n", @@ -26164,10 +28904,10 @@ index 0000000..29a0cb1 +} diff --git a/tools/testing/selftests/kdbus/kdbus-util.h b/tools/testing/selftests/kdbus/kdbus-util.h new file mode 100644 -index 0000000..d1a0f1b +index 0000000..e1e18b9 --- /dev/null +++ b/tools/testing/selftests/kdbus/kdbus-util.h -@@ -0,0 +1,219 @@ +@@ -0,0 +1,218 @@ +/* + * Copyright (C) 2013-2015 Kay Sievers + * Copyright (C) 2013-2015 Daniel Mack @@ -26338,8 +29078,7 @@ index 0000000..d1a0f1b +int kdbus_msg_dump(const struct kdbus_conn *conn, + const struct kdbus_msg *msg); +int kdbus_create_bus(int control_fd, const char *name, -+ uint64_t req_meta, uint64_t owner_meta, -+ char **path); ++ uint64_t owner_meta, char **path); +int kdbus_msg_send(const struct kdbus_conn *conn, const char *name, + uint64_t cookie, uint64_t flags, uint64_t timeout, + int64_t priority, uint64_t dst_id); @@ -27479,10 +30218,10 @@ index 0000000..71a92d8 +} diff --git a/tools/testing/selftests/kdbus/test-connection.c b/tools/testing/selftests/kdbus/test-connection.c new file mode 100644 -index 0000000..e7c4866 +index 0000000..4688ce8 --- /dev/null +++ b/tools/testing/selftests/kdbus/test-connection.c -@@ -0,0 +1,606 @@ +@@ -0,0 +1,597 @@ +#include +#include +#include @@ -27555,15 +30294,6 @@ index 0000000..e7c4866 + + hello.pool_size = POOL_SIZE; + -+ /* -+ * The connection created by the core requires ALL meta flags -+ * to be sent. An attempt to send less than that should result in -+ * -ECONNREFUSED. 
-+ */ -+ hello.attach_flags_send = _KDBUS_ATTACH_ALL & ~KDBUS_ATTACH_TIMESTAMP; -+ ret = kdbus_cmd_hello(fd, &hello); -+ ASSERT_RETURN(ret == -ECONNREFUSED); -+ + hello.attach_flags_send = _KDBUS_ATTACH_ALL; + hello.offset = (__u64)-1; + @@ -29832,10 +32562,10 @@ index 0000000..2360dc1 +} diff --git a/tools/testing/selftests/kdbus/test-message.c b/tools/testing/selftests/kdbus/test-message.c new file mode 100644 -index 0000000..f1615da +index 0000000..ddc1e0a --- /dev/null +++ b/tools/testing/selftests/kdbus/test-message.c -@@ -0,0 +1,731 @@ +@@ -0,0 +1,734 @@ +#include +#include +#include @@ -29892,9 +32622,12 @@ index 0000000..f1615da + KDBUS_DST_ID_BROADCAST); + ASSERT_RETURN(ret == 0); + -+ /* Make sure that we do not get our own broadcasts */ -+ ret = kdbus_msg_recv(sender, NULL, NULL); -+ ASSERT_RETURN(ret == -EAGAIN); ++ /* Make sure that we do get our own broadcasts */ ++ ret = kdbus_msg_recv(sender, &msg, &offset); ++ ASSERT_RETURN(ret == 0); ++ ASSERT_RETURN(msg->cookie == cookie); ++ ++ kdbus_msg_free(msg); + + /* ... and receive on the 2nd */ + ret = kdbus_msg_recv_poll(conn, 100, &msg, &offset); @@ -30569,10 +33302,10 @@ index 0000000..f1615da +} diff --git a/tools/testing/selftests/kdbus/test-metadata-ns.c b/tools/testing/selftests/kdbus/test-metadata-ns.c new file mode 100644 -index 0000000..ccdfae0 +index 0000000..1f6edc0 --- /dev/null +++ b/tools/testing/selftests/kdbus/test-metadata-ns.c -@@ -0,0 +1,503 @@ +@@ -0,0 +1,500 @@ +/* + * Test metadata in new namespaces. 
Even if our tests can run + * in a namespaced setup, this test is necessary so we can inspect @@ -30743,9 +33476,8 @@ index 0000000..ccdfae0 + ASSERT_EXIT(ret == 0); + ASSERT_EXIT(msg->dst_id == userns_conn->id); + -+ /* Different namespaces no CAPS */ + item = kdbus_get_item(msg, KDBUS_ITEM_CAPS); -+ ASSERT_EXIT(item == NULL); ++ ASSERT_EXIT(item); + + /* uid/gid not mapped, so we have unpriv cached creds */ + ret = kdbus_match_kdbus_creds(msg, &unmapped_creds); @@ -30771,9 +33503,8 @@ index 0000000..ccdfae0 + ASSERT_EXIT(ret == 0); + ASSERT_EXIT(msg->dst_id == KDBUS_DST_ID_BROADCAST); + -+ /* Different namespaces no CAPS */ + item = kdbus_get_item(msg, KDBUS_ITEM_CAPS); -+ ASSERT_EXIT(item == NULL); ++ ASSERT_EXIT(item); + + /* uid/gid not mapped, so we have unpriv cached creds */ + ret = kdbus_match_kdbus_creds(msg, &unmapped_creds); @@ -30933,9 +33664,8 @@ index 0000000..ccdfae0 + + userns_conn_id = msg->src_id; + -+ /* We do not share the userns, os no KDBUS_ITEM_CAPS */ + item = kdbus_get_item(msg, KDBUS_ITEM_CAPS); -+ ASSERT_RETURN(item == NULL); ++ ASSERT_RETURN(item); + + /* + * Compare received items, creds must be translated into @@ -32098,10 +34828,10 @@ index 0000000..3437012 +} diff --git a/tools/testing/selftests/kdbus/test-policy-priv.c b/tools/testing/selftests/kdbus/test-policy-priv.c new file mode 100644 -index 0000000..a318ccc +index 0000000..0208638 --- /dev/null +++ b/tools/testing/selftests/kdbus/test-policy-priv.c -@@ -0,0 +1,1269 @@ +@@ -0,0 +1,1285 @@ +#include +#include +#include @@ -32214,6 +34944,12 @@ index 0000000..a318ccc + KDBUS_DST_ID_BROADCAST); + ASSERT_RETURN(ret == 0); + ++ /* drop own broadcast */ ++ ret = kdbus_msg_recv(child_2, &msg, NULL); ++ ASSERT_RETURN(ret == 0); ++ ASSERT_RETURN(msg->src_id == child_2->id); ++ kdbus_msg_free(msg); ++ + /* Use a little bit high time */ + ret = kdbus_msg_recv_poll(child_2, 1000, + &msg, NULL); @@ -32249,6 +34985,12 @@ index 0000000..a318ccc + KDBUS_DST_ID_BROADCAST); + ASSERT_EXIT(ret == 
0); + ++ /* drop own broadcast */ ++ ret = kdbus_msg_recv(child_2, &msg, NULL); ++ ASSERT_RETURN(ret == 0); ++ ASSERT_RETURN(msg->src_id == child_2->id); ++ kdbus_msg_free(msg); ++ + /* Use a little bit high time */ + ret = kdbus_msg_recv_poll(child_2, 1000, + &msg, NULL); @@ -32417,11 +35159,6 @@ index 0000000..a318ccc + * receiver is not able to TALK to that name. + */ + -+ ret = test_policy_priv_by_broadcast(env->buspath, owner_a, -+ DO_NOT_DROP, -+ -ETIMEDOUT, -ETIMEDOUT); -+ ASSERT_RETURN(ret == 0); -+ + /* Activate matching for a privileged connection */ + ret = kdbus_add_match_empty(owner_a); + ASSERT_RETURN(ret == 0); @@ -32512,6 +35249,15 @@ index 0000000..a318ccc + 0, 0, KDBUS_DST_ID_BROADCAST); + ASSERT_RETURN(ret == 0); + ++ ret = kdbus_msg_recv_poll(owner_a, 100, &msg, NULL); ++ ASSERT_RETURN(ret == 0); ++ ASSERT_RETURN(msg->cookie == expected_cookie); ++ ++ /* Check src ID */ ++ ASSERT_RETURN(msg->src_id == owner_a->id); ++ ++ kdbus_msg_free(msg); ++ + ret = kdbus_msg_recv_poll(owner_b, 100, &msg, NULL); + ASSERT_RETURN(ret == 0); + ASSERT_RETURN(msg->cookie == expected_cookie); @@ -33937,3 +36683,27 @@ index 0000000..cfd1930 + + return TEST_OK; +} +diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c +index 950064a..78fb820 100644 +--- a/virt/kvm/arm/vgic.c ++++ b/virt/kvm/arm/vgic.c +@@ -1561,7 +1561,7 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num, + goto out; + } + +- if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020)) ++ if (irq_num >= kvm->arch.vgic.nr_irqs) + return -EINVAL; + + vcpu_id = vgic_update_irq_pending(kvm, cpuid, irq_num, level); +@@ -2161,7 +2161,10 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, + + BUG_ON(!vgic_initialized(kvm)); + ++ if (spi > kvm->arch.vgic.nr_irqs) ++ return -EINVAL; + return kvm_vgic_inject_irq(kvm, 0, spi, level); ++ + } + + /* MSI not implemented yet */