public inbox for gentoo-dev@lists.gentoo.org
* [gentoo-dev] Defining TZ in the base system profile?
@ 2023-01-19  1:48 Joshua Kinard
  2023-01-19  5:47 ` Michał Górny
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Joshua Kinard @ 2023-01-19  1:48 UTC
  To: gentoo-dev


So this article[1] from 2017 popped up again on the tech radar via Hacker News[2] and a few other sites[3].  It
describes how, if the envvar TZ is undefined on a Linux system, glibc generates a number of
additional syscalls, mainly stat-related calls (in my tests, newfstatat()).  If defined to an actual value, 
such as ":/etc/localtime" (or even an empty string), glibc will instead generate far fewer, if any at all, of 
these stat-related syscalls.

Apparently, TZ is accessed quite frequently, so according to the article the effect compounds, with glibc
making thousands of unnecessary stat-related syscalls to /etc/localtime (which must be hard-coded somewhere in
glibc for this case).  Given the article's age (five years old), I tested the example C program, and it
still appears to be accurate on a modern glibc-based system.  When TZ is undefined, I get exactly nine
newfstatat calls on /etc/localtime.  If I define TZ to ":/etc/localtime", I do not get any of these newfstatat
calls, and if I set TZ to an empty string, glibc will call openat() against "/usr/share/zoneinfo/Universal"
and then generate exactly two newfstatat syscalls on that handle to read it.
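
The three-way comparison above can be scripted.  The following is a minimal Python sketch, not the
procedure actually used for these tests: it assumes strace is installed and that the compiled test
program sits at ./tz_test (the path is an assumption), and the helper names are made up for illustration.

```python
import os
import subprocess

# The three cases compared above; None means "TZ is absent from the environment".
CASES = {"unset": None, "file": ":/etc/localtime", "empty": ""}


def env_for(tz, base=None):
    """Return a copy of the environment with TZ removed (None) or set to tz."""
    env = dict(os.environ if base is None else base)
    env.pop("TZ", None)
    if tz is not None:
        env["TZ"] = tz
    return env


def count_localtime_stats(trace):
    """Count stat-family syscalls in strace output that name /etc/localtime."""
    return sum(
        1
        for line in trace.splitlines()
        if "stat" in line.split("(", 1)[0] and '"/etc/localtime"' in line
    )


def trace_case(tz, program="./tz_test"):
    """Run the (assumed) test program under strace and return the stat count."""
    # strace writes its trace to stderr; %file selects path-based syscalls.
    result = subprocess.run(
        ["strace", "-e", "trace=%file", program],
        env=env_for(tz),
        capture_output=True,
        text=True,
        check=True,
    )
    return count_localtime_stats(result.stderr)
```

Calling trace_case(tz) for each value in CASES should reproduce the comparison, provided strace and the
test binary are available; the counts will depend on the glibc version in use.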

I ran strace against the undefined TZ case and the ":/etc/localtime" case, normalized the hex addresses to
get a clean diff, and this is what it looks like:

     --- a   2023-01-18 20:30:36.826805343 -0500
     +++ b   2023-01-18 20:30:45.106983600 -0500
     @@ -1,4 +1,4 @@
     -# strace ./tz_test
     +# TZ=":/etc/localtime" strace ./tz_test
      execve("./tz_test", ["./tz_test"], 0xhhhhhhhhhhhh /* XX vars */) = 0
      brk(NULL)                               = 0xhhhhhhhhhhhh
      mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xhhhhhhhhhhhh
     @@ -61,15 +61,6 @@ read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0
      lseek(3, -2260, SEEK_CUR)               = 1292
      read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\6\0\0\0\6\0\0\0\0"..., 3584) = 2260
      close(3)                                = 0
     -newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3552, ...}, 0) = 0
     -newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3552, ...}, 0) = 0
     -newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3552, ...}, 0) = 0
     -newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3552, ...}, 0) = 0
     -newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3552, ...}, 0) = 0
     -newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3552, ...}, 0) = 0
     -newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3552, ...}, 0) = 0
     -newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3552, ...}, 0) = 0
     -newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3552, ...}, 0) = 0
      write(1, "Godspeed, dear friend!\n", 23Godspeed, dear friend!
      ) = 23
      exit_group(0)                           = ?
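
The normalization step can be approximated as follows.  This is a Python sketch under assumptions, not
the commands actually used to produce the diff above: the placeholder strings are copied from the diff,
while the function names and the choice of difflib are illustrative.

```python
import difflib
import re

# Values that differ between runs and would otherwise pollute the diff.
HEX_ADDR = re.compile(r"0x[0-9a-fA-F]+")
ENV_COUNT = re.compile(r"/\* \d+ vars \*/")


def normalize(line):
    """Mask addresses and the environment-variable count in one strace line."""
    line = HEX_ADDR.sub("0xhhhhhhhhhhhh", line)
    return ENV_COUNT.sub("/* XX vars */", line)


def trace_diff(trace_a, trace_b):
    """Return a unified diff (list of lines) of two normalized strace logs."""
    a = [normalize(line) for line in trace_a.splitlines()]
    b = [normalize(line) for line in trace_b.splitlines()]
    return list(difflib.unified_diff(a, b, "a", "b", lineterm=""))
```

Feeding the two captured logs to trace_diff() should leave only genuine differences, such as the
newfstatat lines, in the output.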

For comparison, I tested the same program on FreeBSD and it does not exhibit this behavior at all, regardless 
of whether TZ is undefined, a value, or an empty string.  I have yet to make a similar test on a mips/musl 
chroot to see how musl handles this.

There is a rather old (2010) Stack Overflow question[4] about it as well, and someone left an answer in March
2022 about the specific code in glibc that handles TZ if it is set or is an empty string.

So is adding a default definition of TZ to our base system /etc/profile something we want to look at?  I 
haven't tried any other methods of benchmarking to see if not making those additional syscalls is just placebo 
or if there are actual impacts.  Given how long this oddity has been around, I can't tell if it's a genuine 
bug in glibc, an unoptimized corner case, or just a big nothingburger.


1. https://blog.packagecloud.io/set-environment-variable-save-thousands-of-system-calls/
2. https://news.ycombinator.com/item?id=34346346
3. https://vermaden.wordpress.com/posts/
4. https://stackoverflow.com/questions/4554271/how-to-avoid-excessive-stat-etc-localtime-calls-in-strftime-on-linux


Thoughts?

-- 
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
rsa6144/5C63F4E3F5C6C943 2015-04-27
177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943

"The past tempts us, the present confuses us, the future frightens us.  And our lives slip away, moment by 
moment, lost in that vast, terrible in-between."

         --Emperor Turhan, Centauri Republic



* Re: [gentoo-dev] Defining TZ in the base system profile?
  2023-01-19  1:48 [gentoo-dev] Defining TZ in the base system profile? Joshua Kinard
@ 2023-01-19  5:47 ` Michał Górny
  2023-01-19 12:11   ` Arsen Arsenović
  2023-01-19  6:04 ` Ionen Wolkens
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 6+ messages in thread
From: Michał Górny @ 2023-01-19  5:47 UTC
  To: gentoo-dev

On Wed, 2023-01-18 at 20:48 -0500, Joshua Kinard wrote:
> So this article[1] from 2017 popped up again on the tech radar via Hacker News[2] and a few other sites[3].  It
> describes how, if the envvar TZ is undefined on a Linux system, glibc generates a number of
> additional syscalls, mainly stat-related calls (in my tests, newfstatat()).  If defined to an actual value, 
> such as ":/etc/localtime" (or even an empty string), glibc will instead generate far fewer, if any at all, of 
> these stat-related syscalls.
> 
> [...]
> So is adding a default definition of TZ to our base system /etc/profile something we want to look at?  I 
> haven't tried any other methods of benchmarking to see if not making those additional syscalls is just placebo 
> or if there are actual impacts.  Given how long this oddity has been around, I can't tell if it's a genuine 
> bug in glibc, an unoptimized corner case, or just a big nothingburger.
> 

Am I correct that there's no real difference between setting it to
":/etc/localtime" and the actual timezone?

I suppose it would make sense to default it.

-- 
Best regards,
Michał Górny




* Re: [gentoo-dev] Defining TZ in the base system profile?
  2023-01-19  1:48 [gentoo-dev] Defining TZ in the base system profile? Joshua Kinard
  2023-01-19  5:47 ` Michał Górny
@ 2023-01-19  6:04 ` Ionen Wolkens
  2023-01-19 14:42 ` Michael Orlitzky
  2023-02-14 12:44 ` Haelwenn (lanodan) Monnier
  3 siblings, 0 replies; 6+ messages in thread
From: Ionen Wolkens @ 2023-01-19  6:04 UTC
  To: gentoo-dev

On Wed, Jan 18, 2023 at 08:48:56PM -0500, Joshua Kinard wrote:
> 
> So this article[1] from 2017 popped up again on the tech radar via Hacker News[2] and a few other sites[3].  It
> describes how, if the envvar TZ is undefined on a Linux system, glibc generates a number of
> additional syscalls, mainly stat-related calls (in my tests, newfstatat()).  If defined to an actual value, 
> such as ":/etc/localtime" (or even an empty string), glibc will instead generate far fewer, if any at all, of 
> these stat-related syscalls.
[...]
> 
> Thoughts?

Sounds good to me from the little I know of it, although I imagine it
could cause issues with some packages that try to handle TZ themselves,
and there's no telling what obscure thing this is going to break.

exa[1][2] is one example that sam mentioned, but I imagine there are
more to find.

I've added it to /etc/env.d locally anyway, and will see what comes of it
for the things I use, not that this covers much at all :)

[1] https://github.com/ogham/exa/issues/856
[2] https://github.com/ogham/exa/pull/867
-- 
ionen


* Re: [gentoo-dev] Defining TZ in the base system profile?
  2023-01-19  5:47 ` Michał Górny
@ 2023-01-19 12:11   ` Arsen Arsenović
  0 siblings, 0 replies; 6+ messages in thread
From: Arsen Arsenović @ 2023-01-19 12:11 UTC
  To: gentoo-dev; +Cc: Michał Górny

Michał Górny <mgorny@gentoo.org> writes:

> On Wed, 2023-01-18 at 20:48 -0500, Joshua Kinard wrote:
>> So this article[1] from 2017 popped up again on the tech radar via Hacker News[2] and a few other sites[3].  It
>> describes how, if the envvar TZ is undefined on a Linux system, glibc generates a number of
>> additional syscalls, mainly stat-related calls (in my tests, newfstatat()).  If defined to an actual value, 
>> such as ":/etc/localtime" (or even an empty string), glibc will instead generate far fewer, if any at all, of 
>> these stat-related syscalls.
>> 
>> [...]
>> So is adding a default definition of TZ to our base system /etc/profile something we want to look at?  I 
>> haven't tried any other methods of benchmarking to see if not making those additional syscalls is just placebo 
>> or if there are actual impacts.  Given how long this oddity has been around, I can't tell if it's a genuine 
>> bug in glibc, an unoptimized corner case, or just a big nothingburger.
>> 
>
> Am I correct that there's no real difference between setting it to
> ":/etc/localtime" and the actual timezone?
>
> I suppose it would make sense to default it.

Correct, from ``(libc)TZ Variable'':

   If the ‘TZ’ environment variable does not have a value, the operation
chooses a time zone by default.  In the GNU C Library, the default time
zone is like the specification ‘TZ=:/etc/localtime’ (or
‘TZ=:/usr/local/etc/localtime’, depending on how the GNU C Library was
configured; *note Installation::).  Other C libraries use their own rule
for choosing the default time zone, so there is little we can say about
them.

I don't suspect any downside to this approach.
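
The rule quoted above, together with the empty-string behaviour observed in the original post, can be
summarised as a small model.  This is a simplified Python sketch, not glibc's code: the function name is
made up, /usr/share/zoneinfo is assumed as the zoneinfo directory (it is the path seen in the original
post), and POSIX-style rule strings, the TZDIR variable and glibc's path checks are deliberately ignored.

```python
import os

TZDEFAULT = "/etc/localtime"    # default named in the glibc manual excerpt
TZDIR = "/usr/share/zoneinfo"   # assumed zoneinfo directory


def zone_file_for(tz):
    """Model which zone file glibc would consult for a given TZ value.

    tz is None when TZ is not set at all.
    """
    if tz is None:
        # Unset: behaves like TZ=:/etc/localtime, per the manual.
        return TZDEFAULT
    if tz == "":
        # Empty string: the original post saw "Universal" being opened.
        return os.path.join(TZDIR, "Universal")
    # Assumption: a leading colon is dropped and the rest is a file name,
    # taken relative to TZDIR unless it is an absolute path.
    name = tz[1:] if tz.startswith(":") else tz
    return name if os.path.isabs(name) else os.path.join(TZDIR, name)
```

Under this model, an unset TZ and TZ=":/etc/localtime" resolve to the same file, which matches the
answer given above; the model says nothing about how often that file is re-checked.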
-- 
Arsen Arsenović


* Re: [gentoo-dev] Defining TZ in the base system profile?
  2023-01-19  1:48 [gentoo-dev] Defining TZ in the base system profile? Joshua Kinard
  2023-01-19  5:47 ` Michał Górny
  2023-01-19  6:04 ` Ionen Wolkens
@ 2023-01-19 14:42 ` Michael Orlitzky
  2023-02-14 12:44 ` Haelwenn (lanodan) Monnier
  3 siblings, 0 replies; 6+ messages in thread
From: Michael Orlitzky @ 2023-01-19 14:42 UTC
  To: gentoo-dev

On Wed, 2023-01-18 at 20:48 -0500, Joshua Kinard wrote:
> 
> So is adding a default definition of TZ to our base system /etc/profile
> something we want to look at?  I haven't tried any other methods of
> benchmarking to see if not making those additional syscalls is just
> placebo or if there are actual impacts.  Given how long this oddity has
> been around, I can't tell if it's a genuine bug in glibc, an unoptimized
> corner case, or just a big nothingburger.
> 

I thought about doing this on my laptop, and talked myself out of it.
The main counter-arguments are:

  1. ICU doesn't handle the :/etc/localtime format at the moment,

       * https://unicode-org.atlassian.net/browse/ICU-13694
       * https://github.com/nodejs/node/issues/37271

     You could readlink() it or whatever at boot, but that will cause
     changes to /etc/localtime to be mysteriously ignored.

  2. The stats are there for a "good" reason, namely to let glibc
     know if the timezone has changed on the fly.

The first one is only a temporary deal-breaker, but the second is a
tradeoff involving how often your timezone changes (user-dependent) and
what the real performance impact is (probably not much).
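
The readlink() workaround from point 1 might look like the sketch below.  It is a hypothetical Python
illustration, not something shipped anywhere: it assumes /etc/localtime is a symlink into a directory
whose path contains "zoneinfo/", which is not guaranteed (the file may be a plain copy), and the
function names are invented.

```python
import os


def zone_name_from_link(target, marker="zoneinfo/"):
    """Extract a zone name such as 'Europe/Paris' from a symlink target."""
    idx = target.find(marker)
    if idx == -1:
        return None
    # Everything after the marker is taken as the zone name.
    return target[idx + len(marker):] or None


def current_zone_name(path="/etc/localtime"):
    """Return the zone name /etc/localtime points at, or None if unknown."""
    if not os.path.islink(path):
        # A regular file carries no zone name to recover.
        return None
    return zone_name_from_link(os.readlink(path))
```

A value derived this way and exported as TZ at boot is a snapshot, so later changes to /etc/localtime
would go unnoticed until the next boot, which is exactly the drawback described above.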




* Re: [gentoo-dev] Defining TZ in the base system profile?
  2023-01-19  1:48 [gentoo-dev] Defining TZ in the base system profile? Joshua Kinard
                   ` (2 preceding siblings ...)
  2023-01-19 14:42 ` Michael Orlitzky
@ 2023-02-14 12:44 ` Haelwenn (lanodan) Monnier
  3 siblings, 0 replies; 6+ messages in thread
From: Haelwenn (lanodan) Monnier @ 2023-02-14 12:44 UTC
  To: gentoo-dev

[2023-01-18 20:48:56-0500] Joshua Kinard:
>So is adding a default definition of TZ to our base system /etc/profile something we want to look at?  I
>haven't tried any other methods of benchmarking to see if not making those additional syscalls is just placebo
>or if there are actual impacts.  Given how long this oddity has been around, I can't tell if it's a genuine
>bug in glibc, an unoptimized corner case, or just a big nothingburger.

I would take it as a glibc bug or a missed optimisation.  At the very least,
the fault definitely lies in glibc, given that you showed another libc
(FreeBSD's) not exhibiting this behaviour.

And given that POSIX leaves TZ values beginning with a colon, such as
":/etc/localtime", implementation-defined[1], I think we should avoid it;
glibc isn't alone in dealing with timezones.

1: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03

