From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Michał Górny"
To: gentoo-commits@lists.gentoo.org
Reply-To: gentoo-dev@lists.gentoo.org, "Michał Górny"
Message-ID: <1511462694.d39f865f5bbad9523ad6c2cfd06af95d9fa7d402.mgorny@gentoo>
Subject: [gentoo-commits] data/glep:glep-manifest commit in: /
Date: Thu, 23 Nov 2017 18:45:24 +0000 (UTC)
Content-Transfer-Encoding: 8bit
Content-type: text/plain; charset=UTF-8
X-VCS-Repository: data/glep
X-VCS-Files: glep-0074.rst
X-VCS-Directories: /
X-VCS-Committer: mgorny
X-VCS-Committer-Name: Michał Górny
X-VCS-Revision: d39f865f5bbad9523ad6c2cfd06af95d9fa7d402
X-VCS-Branch: glep-manifest

commit:     d39f865f5bbad9523ad6c2cfd06af95d9fa7d402
Author:     Michał Górny
AuthorDate: Thu Nov 23 18:44:54 2017 +0000
Commit:     Michał Górny
CommitDate: Thu Nov 23 18:44:54 2017 +0000
URL:        https://gitweb.gentoo.org/data/glep.git/commit/?id=d39f865f

glep-0074: Make extended filename encoding optional

 glep-0074.rst | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/glep-0074.rst b/glep-0074.rst
index 6db6caa..5270b7a 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -142,8 +142,15 @@ corresponding to valid UTF-8 code points excluding the backwards
 slash (``\``) and characters classified as control characters and
 whitespace in the current version of the Unicode standard [#UNICODE]_.
 
-Any of the excluded characters that are present in path must be encoded
-using one of the following escape sequences:
+The implementation can optionally support extended filename encoding
+to support those paths. If the encoding is not supported,
+the implementation must reject directories containing any files using
+non-compliant names, as well as Manifest files whose filename field
+contains such filenames.
+
+If the encoding is supported, then all of the excluded characters that
+are present in path must be encoded using one of the following escape
+sequences:
 
 - characters in the ``U+0000`` to ``U+007F`` range can be encoded
   as ``\xHH`` where ``HH`` specifies the zero-padded, hexadecimal
@@ -615,6 +622,13 @@ by attempting to locate the size field and take everything before it
 as filename. This was terribly fragile and even if it worked, it would
 solve the problem only partially.
 
+To preserve compatibility with the current implementations and given
+that all of the listed characters are not allowed for the foreseeable
+Gentoo uses, the extended encoding support is optional. If such support
+is not provided, the implementation must unconditionally reject any
+such files. Ignoring them implicitly would be confusing, and it is
+not possible to use them in explicit ``IGNORE`` entries.
+
 The character encoding method provides means to overcome the character
 restrictions to extend the tool usability beyond immediate Gentoo uses.
 The backslash escape form based on Python unicode strings is used
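
Editorial note: the rejection rule added by this change (an implementation without extended encoding support must refuse non-compliant names) can be sketched roughly as below. This is only an illustration, not code from the GLEP or from any Gentoo tool. The names `is_compliant_name` and `check_names` are hypothetical, and the use of Unicode general category `Cc` plus `str.isspace()` is an assumed approximation of "control characters and whitespace in the current version of the Unicode standard".

```python
import unicodedata


def is_compliant_name(path: str) -> bool:
    """Return True if no character of path is in the excluded set.

    Assumption: "control character" is approximated by general
    category Cc, and "whitespace" by str.isspace().
    """
    for ch in path:
        if ch == "\\":
            return False
        if unicodedata.category(ch) == "Cc" or ch.isspace():
            return False
    return True


def check_names(paths):
    """Reject the whole set as soon as one name is non-compliant.

    Models an implementation that does NOT support the extended
    filename encoding: it raises instead of silently skipping.
    """
    for path in paths:
        if not is_compliant_name(path):
            raise ValueError(f"non-compliant filename: {path!r}")
```

For example, `check_names(["glep-0074.rst", "bad name.txt"])` would raise `ValueError` because of the space in the second name, rather than quietly ignoring that file.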