Re: [PATCH] fs: switch timespec64 fields in inode to discrete integers
From: David Sterba
Date: Tue May 21 2024 - 12:08:28 EST
On Sat, May 18, 2024 at 06:48:30AM -0400, Jeff Layton wrote:
> On Sat, 2024-05-18 at 06:23 +0100, Matthew Wilcox wrote:
> > On Fri, May 17, 2024 at 08:08:40PM -0400, Jeff Layton wrote:
> > > For reference (according to pahole):
> > >
> > > Before: /* size: 624, cachelines: 10, members: 53 */
> > > After: /* size: 616, cachelines: 10, members: 56 */
> >
> > Smaller is always better, but for a meaningful improvement, we'd want
> > to see more. On my laptop running a Debian 6.6.15 kernel, I see:
> >
> > inode_cache 11398 11475 640 25 4 : tunables 0 0 0 : slabdata 459 459 0
> >
> > so there's 25 inodes per 4 pages. Going down to 632 is still 25 per 4
> > pages. At 628 bytes, we get 26 per 4 pages. At 604 bytes, we're at 27.
> > And at 584 bytes, we get 28.
> >
> > Of course, struct inode gets embedded in a lot of filesystem inodes.
> > xfs_inode 142562 142720 1024 32 8 : tunables 0 0 0 : slabdata 4460 4460 0
> > ext4_inode_cache 81 81 1184 27 8 : tunables 0 0 0 : slabdata 3 3 0
> > sock_inode_cache 2123 2223 832 39 8 : tunables 0 0 0 : slabdata 57 57 0
> >
> > So any of them might cross a magic boundary where we suddenly get more
> > objects per slab.
> >
> > Not trying to diss the work you've done here, just pointing out the
> > limits for anyone who's trying to do something similar. Or maybe
> > inspire someone to do more reductions ;-)
>
> Way to bust my bubble, Willy. ;-)
>
> On a more serious note, I may be able to squeeze out another 4 bytes by
> moving __i_ctime to a single 8 byte word. It's never settable from
> userland, so we probably don't need the full range of times that a
> timespec64 gives us there. Shrinking that may also make the multigrain
> time rework simpler.
>
> David Howells was also looking at removing the i_private field.
> Since these structs are usually embedded in a larger structure, it's
> not clear that we need that field. If we can make that work, it'll mean
> another 8 bytes goes away on 64-bit arches.
>
> IOW, I think there may be some other opportunities for shrinkage in the
> future.
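For anyone who wants to check the objects-per-slab numbers quoted above,
here is a minimal sketch of the arithmetic. It assumes 4 KiB pages and
ignores any per-slab metadata, alignment padding or debug options, so it
only approximates what the allocator does. The names PAGE_SIZE_BYTES and
objs_per_slab() are made up for illustration; they are not kernel symbols.

```c
#include <stddef.h>

/* Assumption: 4 KiB pages. */
#define PAGE_SIZE_BYTES 4096u

/*
 * Number of objects of obj_size bytes that fit in a slab spanning
 * `pages` pages. Simplified: no per-slab metadata, no alignment
 * rounding, no debug redzones.
 */
static inline unsigned int objs_per_slab(unsigned int obj_size,
					 unsigned int pages)
{
	return (pages * PAGE_SIZE_BYTES) / obj_size;
}
```

Calling objs_per_slab(640, 4) and objs_per_slab(632, 4) both give 25,
while 628, 604 and 584 give 26, 27 and 28, which is the step behaviour
described in the quoted text.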
Incremental shrinking works well. We recently got btrfs_inode under
1024 bytes, but it took several releases of removing, reordering or
otherwise optimizing members. It's easier to focus on what's left at
each step than to take notes up front and make assumptions about all
the other struct members that could eventually be optimized.
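
As a rough illustration of how splitting and regrouping members can
save space, the sketch below compares three timespec-style pairs with
discrete 64-bit seconds and 32-bit nanoseconds fields. All struct and
field names are hypothetical stand-ins and do not reproduce the real
struct inode layout. On an LP64 target the first layout takes 48 bytes
and the second 40, which is consistent with the 8-byte difference in
the pahole figures quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a timespec64-like pair. */
struct ts64_demo {
	int64_t tv_sec;
	long    tv_nsec;	/* 8 bytes on LP64, so the pair is 16 bytes */
};

/* Three embedded pairs: 3 * 16 = 48 bytes on LP64. */
struct times_before {
	struct ts64_demo atime;
	struct ts64_demo mtime;
	struct ts64_demo ctime;
};

/*
 * Discrete integers: 3 * 8 + 3 * 4 = 36 bytes of data, padded to 40
 * on LP64. In a larger struct, the 4-byte tail could hold another
 * 32-bit member instead of padding.
 */
struct times_after {
	int64_t  atime_sec;
	int64_t  mtime_sec;
	int64_t  ctime_sec;
	uint32_t atime_nsec;
	uint32_t mtime_nsec;
	uint32_t ctime_nsec;
};

/* Bytes saved by the second layout on the target being compiled for. */
static inline size_t times_bytes_saved(void)
{
	return sizeof(struct times_before) - sizeof(struct times_after);
}
```

On 32-bit targets the sizes differ and the saving may be zero, so the
result should be checked per architecture, for example with pahole.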