Skip to content
Permalink

Comparing changes

Choose two branches to see what’s changed or to start a new pull request. If you need to, you can also or learn more about diff comparisons.

Open a pull request

Create a new pull request by comparing changes across two branches. If you need to, you can also . Learn more about diff comparisons here.
base repository: NixOS/nixpkgs
base: 214af51225d1
Choose a base ref
...
head repository: NixOS/nixpkgs
compare: b34cf366aa4c
Choose a head ref
  • 2 commits
  • 1 file changed
  • 1 contributor

Commits on Oct 30, 2020

  1. stage-1: modprobe ext{2,3,4} before resizing

    I noticed booting a system with an ext4 root which expanded to 5T took
    quite a long time (12 minutes in some cases, 43(!) in others.)
    
    I changed stage-1 to run `resize2fs -d 62` for extra debug output and
    timing information. It revealed the adjust_superblock step taking
    almost all of the time:
    
        [Fri Oct 30 11:10:15 UTC 2020] zero_high_bits_in_metadata: Memory used: 132k/0k (63k/70k), time:  0.00/ 0.00/ 0.00
        [Fri Oct 30 11:21:09 UTC 2020] adjust_superblock: Memory used: 396k/4556k (295k/102k), time: 654.21/ 0.59/ 5.13
    
    but when I ran resize2fs on a disk with the identical content growing
    to the identical target size, it would only take about 30 seconds. I
    looked at what happened between those two steps in the fast case with
    strace and found:
    
    ```
       235	getrusage(RUSAGE_SELF, {ru_utime={tv_sec=0, tv_usec=1795}, ru_stime={tv_sec=0, tv_usec=3590}, ...}) = 0
       236	write(1, "zero_high_bits_in_metadata: Memo"..., 84zero_high_bits_in_metadata: Memory used: 132k/0k (72k/61k), time:  0.00/ 0.00/ 0.00
       237	) = 84
       238	gettimeofday({tv_sec=1604061278, tv_usec=480147}, NULL) = 0
       239	getrusage(RUSAGE_SELF, {ru_utime={tv_sec=0, tv_usec=1802}, ru_stime={tv_sec=0, tv_usec=3603}, ...}) = 0
       240	gettimeofday({tv_sec=1604061278, tv_usec=480192}, NULL) = 0
       241	mmap(NULL, 2564096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa3c7355000
       242	access("/sys/fs/ext4/features/lazy_itable_init", F_OK) = 0
       243	brk(0xf85000)                           = 0xf85000
       244	brk(0xfa6000)                           = 0xfa6000
       245	gettimeofday({tv_sec=1604061278, tv_usec=538828}, NULL) = 0
       246	getrusage(RUSAGE_SELF, {ru_utime={tv_sec=0, tv_usec=58720}, ru_stime={tv_sec=0, tv_usec=3603}, ...}) = 0
       247	write(1, "adjust_superblock: Memory used: "..., 79adjust_superblock: Memory used: 396k/2504k (305k/92k), time:  0.06/ 0.06/ 0.00
       248	) = 79
       249	gettimeofday({tv_sec=1604061278, tv_usec=539119}, NULL) = 0
       250	getrusage(RUSAGE_SELF, {ru_utime={tv_sec=0, tv_usec=58812}, ru_stime={tv_sec=0, tv_usec=3603}, ...}) = 0
       251	gettimeofday({tv_sec=1604061279, tv_usec=939}, NULL) = 0
       252	getrusage(RUSAGE_SELF, {ru_utime={tv_sec=0, tv_usec=520411}, ru_stime={tv_sec=0, tv_usec=3603}, ...}) = 0
       253	write(1, "fix_uninit_block_bitmaps 2: Memo"..., 88fix_uninit_block_bitmaps 2: Memory used: 396k/2504k (305k/92k), time:  0.46/ 0.46/ 0.00
       254	) = 88
    ```
    
    In particular the access to /sys/fs seemed interesting. Looking
    at the source of resize2fs:
    
    ```
    [root@ip-172-31-22-182:~/e2fsprogs-1.45.5]# rg -B2 -A1 /sys/fs/ext4/features/lazy_itable_init .
    ./resize/resize2fs.c
    923-	if (getenv("RESIZE2FS_FORCE_LAZY_ITABLE_INIT") ||
    924-	    (!getenv("RESIZE2FS_FORCE_ITABLE_INIT") &&
    925:	     access("/sys/fs/ext4/features/lazy_itable_init", F_OK) == 0))
    926-		lazy_itable_init = 1;
    ```
    
    I confirmed /sys is mounted, and then found a bug suggesting the
    ext4 module is maybe not loaded:
    https://bugzilla.redhat.com/show_bug.cgi?id=1071909
    
    My home server doesn't have ext4 loaded and had 3T to play with, so
    I tried (and succeeded with) replicating the issue locally:
    
    ```
    [root@kif:/scratch]# lsmod | grep -i ext
    
    [root@kif:/scratch]# zfs create -V 3G rpool/scratch/ext4
    
    [root@kif:/scratch]# time mkfs.ext4 /dev/zvol/rpool/scratch/ext4
    mke2fs 1.45.5 (07-Jan-2020)
    Discarding device blocks: done
    Creating filesystem with 786432 4k blocks and 196608 inodes
    Filesystem UUID: 560a4a8f-93dc-40cc-97a5-f10049bf801f
    Superblock backups stored on blocks:
    	32768, 98304, 163840, 229376, 294912
    
    Allocating group tables: done
    Writing inode tables: done
    Creating journal (16384 blocks): done
    Writing superblocks and filesystem accounting information: done
    
    real	0m2.261s
    user	0m0.000s
    sys	0m0.025s
    
    [root@kif:/scratch]# zfs set volsize=3T rpool/scratch/ext4
    
    [root@kif:/scratch]# time resize2fs -d 62 /dev/zvol/rpool/scratch/ext4
    resize2fs 1.45.5 (07-Jan-2020)
    fs has 11 inodes, 1 groups required.
    fs requires 16390 data blocks.
    With 1 group(s), we have 22234 blocks available.
    Last group's overhead is 10534
    Need 16390 data blocks in last group
    Final size of last group is 26924
    Estimated blocks needed: 26924
    Extents safety margin: 49
    Resizing the filesystem on /dev/zvol/rpool/scratch/ext4 to 805306368 (4k) blocks.
    read_bitmaps: Memory used: 132k/0k (63k/70k), time:  0.00/ 0.00/ 0.00
    read_bitmaps: I/O read: 1MB, write: 0MB, rate: 3802.28MB/s
    fix_uninit_block_bitmaps 1: Memory used: 132k/0k (63k/70k), time:  0.00/ 0.00/ 0.00
    resize_group_descriptors: Memory used: 132k/0k (68k/65k), time:  0.00/ 0.00/ 0.00
    move_bg_metadata: Memory used: 132k/0k (68k/65k), time:  0.00/ 0.00/ 0.00
    zero_high_bits_in_metadata: Memory used: 132k/0k (68k/65k), time:  0.00/ 0.00/ 0.00
    ```
    
    here it got stuck for quite some time ... straceing this 20 minutes in revealed this in a tight loop:
    
    ```
    getuid()                                = 0
    geteuid()                               = 0
    getgid()                                = 0
    getegid()                               = 0
    prctl(PR_GET_DUMPABLE)                  = 1 (SUID_DUMP_USER)
    fallocate(3, FALLOC_FL_ZERO_RANGE, 2222649901056, 2097152) = 0
    fsync(3)                                = 0
    ```
    
    it finally ended 43(!) minutes later:
    
    ```
    adjust_superblock: Memory used: 264k/3592k (210k/55k), time: 2554.03/ 0.16/15.07
    fix_uninit_block_bitmaps 2: Memory used: 264k/3592k (210k/55k), time:  0.16/ 0.16/ 0.00
    blocks_to_move: Memory used: 264k/3592k (211k/54k), time:  0.00/ 0.00/ 0.00
    Number of free blocks: 755396/780023556, Needed: 0
    block_mover: Memory used: 264k/3592k (216k/49k), time:  0.05/ 0.01/ 0.00
    block_mover: I/O read: 1MB, write: 0MB, rate: 18.68MB/s
    inode_scan_and_fix: Memory used: 264k/3592k (216k/49k), time:  0.00/ 0.00/ 0.00
    inode_ref_fix: Memory used: 264k/3592k (216k/49k), time:  0.00/ 0.00/ 0.00
    move_itables: Memory used: 264k/3592k (216k/49k), time:  0.00/ 0.00/ 0.00
    calculate_summary_stats: Memory used: 264k/3592k (216k/49k), time: 16.35/16.35/ 0.00
    fix_resize_inode: Memory used: 264k/3592k (222k/43k), time:  0.04/ 0.00/ 0.00
    fix_resize_inode: I/O read: 1MB, write: 0MB, rate: 22.80MB/s
    fix_sb_journal_backup: Memory used: 264k/3592k (222k/43k), time:  0.00/ 0.00/ 0.00
    overall resize2fs: Memory used: 264k/3592k (222k/43k), time: 2570.90/16.68/15.07
    overall resize2fs: I/O read: 1MB, write: 1MB, rate: 0.00MB/s
    The filesystem on /dev/zvol/rpool/scratch/ext4 is now 805306368 (4k) blocks long.
    
    real	43m1.943s
    user	0m16.761s
    sys	0m15.069s
    ```
    
    I then cleaned up and recreated the zvol, loaded the ext4 module, created the ext4 fs,
    resized the volume, and resize2fs'd and it went quite quickly:
    
    ```
    [root@kif:/scratch]# zfs destroy rpool/scratch/ext4
    
    [root@kif:/scratch]# zfs create -V 3G rpool/scratch/ext4
    
    [root@kif:/scratch]# modprobe ext4
    
    [root@kif:/scratch]# time resize2fs -d 62 /dev/zvol/rpool/scratch/ext4
    
    [root@kif:/scratch]# time mkfs.ext4 /dev/zvol/rpool/scratch/ext4
    mke2fs 1.45.5 (07-Jan-2020)
    Discarding device blocks: done
    Creating filesystem with 786432 4k blocks and 196608 inodes
    Filesystem UUID: 5b415f2f-a8c4-4ba0-ac1d-78860de77610
    Superblock backups stored on blocks:
    	32768, 98304, 163840, 229376, 294912
    
    Allocating group tables: done
    Writing inode tables: done
    Creating journal (16384 blocks): done
    Writing superblocks and filesystem accounting information: done
    
    real	0m1.013s
    user	0m0.001s
    sys	0m0.023s
    
    [root@kif:/scratch]# zfs set volsize=3T rpool/scratch/ext4
    
    [root@kif:/scratch]# time resize2fs -d 62 /dev/zvol/rpool/scratch/ext4
    resize2fs 1.45.5 (07-Jan-2020)
    fs has 11 inodes, 1 groups required.
    fs requires 16390 data blocks.
    With 1 group(s), we have 22234 blocks available.
    Last group's overhead is 10534
    Need 16390 data blocks in last group
    Final size of last group is 26924
    Estimated blocks needed: 26924
    Extents safety margin: 49
    Resizing the filesystem on /dev/zvol/rpool/scratch/ext4 to 805306368 (4k) blocks.
    read_bitmaps: Memory used: 132k/0k (63k/70k), time:  0.00/ 0.00/ 0.00
    read_bitmaps: I/O read: 1MB, write: 0MB, rate: 3389.83MB/s
    fix_uninit_block_bitmaps 1: Memory used: 132k/0k (63k/70k), time:  0.00/ 0.00/ 0.00
    resize_group_descriptors: Memory used: 132k/0k (68k/65k), time:  0.00/ 0.00/ 0.00
    move_bg_metadata: Memory used: 132k/0k (68k/65k), time:  0.00/ 0.00/ 0.00
    zero_high_bits_in_metadata: Memory used: 132k/0k (68k/65k), time:  0.00/ 0.00/ 0.00
    adjust_superblock: Memory used: 264k/1540k (210k/55k), time:  0.02/ 0.02/ 0.00
    fix_uninit_block_bitmaps 2: Memory used: 264k/1540k (210k/55k), time:  0.15/ 0.15/ 0.00
    blocks_to_move: Memory used: 264k/1540k (211k/54k), time:  0.00/ 0.00/ 0.00
    Number of free blocks: 755396/780023556, Needed: 0
    block_mover: Memory used: 264k/3592k (216k/49k), time:  0.01/ 0.01/ 0.00
    block_mover: I/O read: 1MB, write: 0MB, rate: 157.11MB/s
    inode_scan_and_fix: Memory used: 264k/3592k (216k/49k), time:  0.00/ 0.00/ 0.00
    inode_ref_fix: Memory used: 264k/3592k (216k/49k), time:  0.00/ 0.00/ 0.00
    move_itables: Memory used: 264k/3592k (216k/49k), time:  0.00/ 0.00/ 0.00
    
    calculate_summary_stats: Memory used: 264k/3592k (216k/49k), time: 16.20/16.20/ 0.00
    fix_resize_inode: Memory used: 264k/3592k (222k/43k), time:  0.00/ 0.00/ 0.00
    fix_resize_inode: I/O read: 1MB, write: 0MB, rate: 5319.15MB/s
    fix_sb_journal_backup: Memory used: 264k/3592k (222k/43k), time:  0.00/ 0.00/ 0.00
    overall resize2fs: Memory used: 264k/3592k (222k/43k), time: 16.45/16.38/ 0.00
    overall resize2fs: I/O read: 1MB, write: 1MB, rate: 0.06MB/s
    The filesystem on /dev/zvol/rpool/scratch/ext4 is now 805306368 (4k) blocks long.
    
    real	0m17.908s
    user	0m16.386s
    sys	0m0.079s
    ```
    
    Success!
    grahamc committed Oct 30, 2020
    Copy the full SHA
    ece5c0f View commit details
    Browse the repository at this point in the history
  2. Merge pull request #102171 from grahamc/faster-ext-resize

    stage-1: modprobe ext{2,3,4} before resizing (so resizing takes less than 45 minutes)
    grahamc committed Oct 30, 2020
    Copy the full SHA
    b34cf36 View commit details
    Browse the repository at this point in the history