From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from smtp.gentoo.org (woodpecker.gentoo.org [140.211.166.183])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (4096 bits))
	(No client certificate requested)
	by finch.gentoo.org (Postfix) with ESMTPS id AE3BB15808A
	for ; Sat, 09 Aug 2025 19:42:28 +0000 (UTC)
Received: from lists.gentoo.org (bobolink.gentoo.org [140.211.166.189])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange x25519)
	(No client certificate requested)
	(Authenticated sender: relay-lists.gentoo.org@gentoo.org)
	by smtp.gentoo.org (Postfix) with ESMTPSA id 72F6D3420E5
	for ; Sat, 09 Aug 2025 19:42:28 +0000 (UTC)
Received: from bobolink.gentoo.org (localhost [127.0.0.1])
	by bobolink.gentoo.org (Postfix) with ESMTP id B315D110560;
	Sat, 09 Aug 2025 19:42:25 +0000 (UTC)
Received: from smtp.gentoo.org (dev.gentoo.org [IPv6:2001:470:ea4a:1:5054:ff:fec7:86e4])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange x25519 server-signature RSA-PSS (4096 bits) server-digest SHA256)
	(No client certificate requested)
	by bobolink.gentoo.org (Postfix) with ESMTPS id A87AF11055F
	for ; Sat, 09 Aug 2025 19:42:25 +0000 (UTC)
Received: from oystercatcher.gentoo.org (oystercatcher.gentoo.org [148.251.78.52])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange x25519 server-signature RSA-PSS (4096 bits) server-digest SHA256)
	(No client certificate requested)
	by smtp.gentoo.org (Postfix) with ESMTPS id 4C5CE342006
	for ; Sat, 09 Aug 2025 19:42:25 +0000 (UTC)
Received: from localhost.localdomain (localhost [IPv6:::1])
	by oystercatcher.gentoo.org (Postfix) with ESMTP id 81B6232A1
	for ; Sat, 09 Aug 2025 19:42:23 +0000 (UTC)
From: "Kerin Millar"
To: gentoo-commits@lists.gentoo.org
Content-Transfer-Encoding: 8bit
Content-type: text/plain; charset=UTF-8
Reply-To: gentoo-dev@lists.gentoo.org, "Kerin Millar"
Message-ID: <1754768261.b54f286110c85c5808594b0ced0cb8a5f08994b5.kfm@gentoo>
Subject: [gentoo-commits] proj/locale-gen:master commit in: /
X-VCS-Repository: proj/locale-gen
X-VCS-Files: locale-gen
X-VCS-Directories: /
X-VCS-Committer: kfm
X-VCS-Committer-Name: Kerin Millar
X-VCS-Revision: b54f286110c85c5808594b0ced0cb8a5f08994b5
X-VCS-Branch: master
Date: Sat, 09 Aug 2025 19:42:23 +0000 (UTC)
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-commits@lists.gentoo.org
X-Auto-Response-Suppress: DR, RN, NRN, OOF, AutoReply
X-Archives-Salt: db8ed510-22fd-4388-a4d9-bec3e5185051
X-Archives-Hash: 50b2a465564c7731dd2461891ce7de99

commit:     b54f286110c85c5808594b0ced0cb8a5f08994b5
Author:     Kerin Millar plushkava net>
AuthorDate: Sat Aug  9 19:37:41 2025 +0000
Commit:     Kerin Millar plushkava net>
CommitDate: Sat Aug  9 19:37:41 2025 +0000
URL:        https://gitweb.gentoo.org/proj/locale-gen.git/commit/?id=b54f2861

Disambiguate a mismatching codeset/charmap as a class of error

Consider the following two locale definitions, both of which are
erroneous.

# Invalid because the codeset part mismatches the charmap
zh_CN.GB18030 UTF-8

# Invalid because the charmap does not exist
zh_CN GB2319

Presently, locale-gen(8) will raise the same diagnostic for both cases,
complaining of an "invalid/mismatching charmap". Render the
parse_config() subroutine able to distinguish between both cases so that
the diagnostics can be improved.

Mismatching codeset/charmap at /etc/locale.gen[2]: "zh_CN.GB18030 UTF-8"
Invalid charmap at /etc/locale.gen[5]: "zh_CN GB2319"

Signed-off-by: Kerin Millar plushkava.net>

 locale-gen | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/locale-gen b/locale-gen
index cb54ed5..fb1a9bf 100755
--- a/locale-gen
+++ b/locale-gen
@@ -328,13 +328,15 @@ sub parse_config ($fh, $path, $locale_by, $charmap_by) {
 
 		# Extract the specified locale and character map. Upon success,
 		# a canonicalised representation of the locale is also returned.
-		my ($locale, $charmap, $canonical) = parse_entry(@fields);
+		my ($locale, $codeset, $charmap, $canonical) = parse_entry(@fields);
 
 		# Validate both locale and character map before accepting.
 		if (! $locale_by->{$locale}) {
 			$thrower->('Invalid locale', $line);
+		} elsif (defined $codeset && $codeset ne $charmap) {
+			$thrower->('Mismatching codeset/charmap', $line);
 		} elsif (! $charmap_by->{$charmap}) {
-			$thrower->('Invalid/mismatching charmap', $line);
+			$thrower->('Invalid charmap', $line);
 		} else {
 			push @locales, [ $locale, $charmap, $canonical ];
 		}
@@ -345,21 +347,19 @@ sub parse_config ($fh, $path, $locale_by, $charmap_by) {
 
 sub parse_entry ($locale, $charmap) {
 	my $canonical;
+	my $codeset;
 	if (2 == (my @fields = split /@/, $locale, 3)) {
 		# de_DE@euro ISO-8859-15 => de_DE.ISO-8859-15@euro
 		$canonical = sprintf '%s.%s@%s', $fields[0], $charmap, $fields[1];
 	} elsif (2 == (@fields = split /\./, $locale, 3)) {
 		# en_US.UTF-8 UTF-8 => en_US.UTF-8
-		$locale = $fields[0];
-		$canonical = "$locale.$charmap";
-		if ($fields[1] ne $charmap) {
-			$charmap = '';
-		}
+		($locale, $codeset) = @fields;
+		$canonical = "$locale.$codeset";
 	} elsif (1 == @fields) {
 		# en_US ISO-8859-1 => en_US.ISO-8859-1
 		$canonical = "$locale.$charmap";
 	}
-	return $locale, $charmap, $canonical;
+	return $locale, $codeset, $charmap, $canonical;
 }
 
 sub check_archive_dir ($prefix, $locale_dir) {
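
For illustration only, the patched control flow can be exercised outside of locale-gen(8) with a minimal sketch along the following lines. The parse_entry() body follows the patched version above, written without subroutine signatures for portability. The %locale_by and %charmap_by tables and the classify() helper are hypothetical stand-ins that are not part of the commit; the real script builds its lookup tables elsewhere and raises diagnostics through $thrower rather than returning strings.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-ins for the lookup tables passed to parse_config().
my %locale_by  = map { $_ => 1 } qw(zh_CN en_US de_DE@euro);
my %charmap_by = map { $_ => 1 } qw(UTF-8 GB18030 ISO-8859-1 ISO-8859-15);

# Follows the patched parse_entry(): the codeset is returned separately
# and remains undefined unless the locale contains a "." part.
sub parse_entry {
	my ($locale, $charmap) = @_;
	my ($canonical, $codeset);
	if (2 == (my @fields = split /@/, $locale, 3)) {
		# de_DE@euro ISO-8859-15 => de_DE.ISO-8859-15@euro
		$canonical = sprintf '%s.%s@%s', $fields[0], $charmap, $fields[1];
	} elsif (2 == (@fields = split /\./, $locale, 3)) {
		# en_US.UTF-8 UTF-8 => en_US.UTF-8
		($locale, $codeset) = @fields;
		$canonical = "$locale.$codeset";
	} elsif (1 == @fields) {
		# en_US ISO-8859-1 => en_US.ISO-8859-1
		$canonical = "$locale.$charmap";
	}
	return $locale, $codeset, $charmap, $canonical;
}

# Hypothetical helper: returns the diagnostic that the patched
# parse_config() would raise for an entry, or "OK" if it is accepted.
sub classify {
	my ($locale, $codeset, $charmap) = parse_entry(@_);
	return 'Invalid locale' if ! $locale_by{$locale};
	return 'Mismatching codeset/charmap'
		if defined $codeset && $codeset ne $charmap;
	return 'Invalid charmap' if ! $charmap_by{$charmap};
	return 'OK';
}

for my $entry ('zh_CN.GB18030 UTF-8', 'zh_CN GB2319', 'en_US.UTF-8 UTF-8') {
	printf "%s: \"%s\"\n", classify(split ' ', $entry), $entry;
}
```

Under these assumed tables, the first entry is classified as a mismatching codeset/charmap, the second as an invalid charmap, and the third is accepted. Because the codeset check precedes the charmap lookup, an entry whose codeset disagrees with its charmap is reported as a mismatch even if the charmap itself does not exist.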