CI out of disk space #1578

Closed
acomodi opened this issue Feb 4, 2021 · 10 comments

@acomodi
Contributor

acomodi commented Feb 4, 2021

The last merge caused the Zynq CI to fail due to insufficient disk space (link to CI).

++ export FUZDIR=/tmpfs/src/github/symbiflow-prjxray-continuous-db-zynq7/fuzzers/031-cmt-mmcm
++ FUZDIR=/tmpfs/src/github/symbiflow-prjxray-continuous-db-zynq7/fuzzers/031-cmt-mmcm
++ test 1 -ge 1
++ test '!' -e ''
++ export SPECDIR=build/specimen_008
++ SPECDIR=build/specimen_008
++ mkdir -p build/specimen_008
mkdir: cannot create directory 'build/specimen_008': No space left on device
make[3]: *** [build/specimen_008/OK] Error 1
ERROR: [Common 17-49] Internal Data Exception: HDDMProto::writeMessage failed
@litghost
Contributor

litghost commented Feb 4, 2021

There has been an intermittent kokoro failure where the /tmp disk is not mounted. I don't believe this is something that is fixable from the outside.

@dnltz
Contributor

dnltz commented Mar 19, 2021

dmesg from a failed container. Looks like no partition was found on sdb:

[    7.730380] tsc: Refined TSC clocksource calibration: 1999.804 MHz
[    7.731697] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x39a6ea8bb3d, max_idle_ns: 881590680816 ns
[    7.732629] sd 0:0:1:0: Attached scsi generic sg0 type 0
[    7.732701] sd 0:0:1:0: [sda] 209715200 512-byte logical blocks: (107 GB/100 GiB)
[    7.732703] sd 0:0:1:0: [sda] 4096-byte physical blocks
[    7.732820] sd 0:0:2:0: Attached scsi generic sg1 type 0
[    7.732884] sd 0:0:1:0: [sda] Write Protect is off
[    7.732886] sd 0:0:1:0: [sda] Mode Sense: 1f 00 00 08
[    7.732920] sd 0:0:2:0: [sdb] 8589934592 512-byte logical blocks: (4.40 TB/4.00 TiB)
[    7.732921] sd 0:0:2:0: [sdb] 4096-byte physical blocks
[    7.732947] sd 0:0:1:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    7.733001] sd 0:0:3:0: Attached scsi generic sg2 type 0
[    7.733168] sd 0:0:4:0: Attached scsi generic sg3 type 0
[    7.733273] sd 0:0:3:0: [sdc] 524288000 512-byte logical blocks: (268 GB/250 GiB)
[    7.733275] sd 0:0:3:0: [sdc] 4096-byte physical blocks
[    7.733312] sd 0:0:4:0: [sdd] 524288000 512-byte logical blocks: (268 GB/250 GiB)
[    7.733314] sd 0:0:4:0: [sdd] 4096-byte physical blocks
[    7.733332] sd 0:0:5:0: Attached scsi generic sg4 type 0
[    7.733346] sd 0:0:2:0: [sdb] Write Protect is off
[    7.733348] sd 0:0:2:0: [sdb] Mode Sense: 1f 00 00 08
[    7.733411] sd 0:0:2:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    7.733429] sd 0:0:5:0: [sde] 209715200 512-byte logical blocks: (107 GB/100 GiB)
[    7.733431] sd 0:0:5:0: [sde] 4096-byte physical blocks
[    7.733583] sd 0:0:3:0: [sdc] Write Protect is off
[    7.733585] sd 0:0:3:0: [sdc] Mode Sense: 1f 00 00 08
[    7.733637] sd 0:0:4:0: [sdd] Write Protect is off
[    7.733639] sd 0:0:4:0: [sdd] Mode Sense: 1f 00 00 08
[    7.733653] sd 0:0:3:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    7.733742] sd 0:0:4:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    7.733762] sd 0:0:5:0: [sde] Write Protect is off
[    7.733764] sd 0:0:5:0: [sde] Mode Sense: 1f 00 00 08
[    7.733882] sd 0:0:5:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    7.734916]  sda: sda1
[    7.735614] sd 0:0:1:0: [sda] Attached SCSI disk
[    7.735801] sd 0:0:2:0: [sdb] Attached SCSI disk
[    7.746552] sd 0:0:4:0: [sdd] Attached SCSI disk
[    7.746655] sd 0:0:5:0: [sde] Attached SCSI disk
[    7.766597]  sdc: sdc1
[    7.767440] sd 0:0:3:0: [sdc] Attached SCSI disk
[    8.411753] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input4
[   10.570444] floppy0: no floppy controllers found

@litghost
Contributor

Ya. It looks like the disk is there, but just needs to be partitioned and formatted. Huh. Maybe a race condition?

@mithro
Contributor

mithro commented Mar 19, 2021

Sounds like some type of race-condition....

The following debug information could be useful:

sudo partprobe -s || true
sudo dmesg | tail -n 30 || true
sudo cat /proc/partitions || true
sudo cat /etc/fstab || true
sudo cat /etc/mtab || true
sudo lsblk --list --output 'NAME,KNAME,FSTYPE,MOUNTPOINT,LABEL,UUID,PARTTYPE,PARTLABEL,PARTUUID' || true
sudo sfdisk --list || true
sudo systemctl | grep mount || true
sudo systemctl | grep dev || true

@dnltz
Contributor

dnltz commented Mar 20, 2021

Ah, the sdb1 partition only appears about 15 seconds later. I think my recently added check runs too early.

[   19.612977] floppy0: no floppy controllers found
[   20.263614] aufs 4.x-rcN-20160111
[   20.453046] bridge: automatic filtering via arp/ip/ip6tables has been deprecated. Update your scripts to load br_netfilter if you need this.
[   20.465988] Bridge firewalling registered
[   20.474843] nf_conntrack version 0.5.0 (65536 buckets, 262144 max)
[   20.501845] ip_tables: (C) 2000-2006 Netfilter Core Team
[   20.579205] Initializing XFRM netlink socket
[   20.587496] Netfilter messages via NETLINK v0.30.
[   20.592694] ctnetlink v0.93: registering with nfnetlink.
[   20.627454] IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready
[   20.752740] aufs au_opts_verify:1597:dockerd[2066]: dirperm1 breaks the protection by the permission bits on the lower branch
[   22.850914]  sdb:
[   22.877975]  sdb: sdb1
[   22.920217] format_tmpfs.sh (2775): drop_caches: 3


[ID: 9721149] Build finished after 1 secs, exit value: 1


Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
[09:28:42] Collecting build artifacts from build VM
Build script failed with exit code: 1

@mithro
Contributor

mithro commented Mar 20, 2021

@dnltz -- This is super interesting! We should probably make sure the system waits until /tmpfs is mounted / appears before it does anything.
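
A minimal sketch of such a wait, assuming the `mountpoint` utility is available on the build image. The helper name `wait_for_mount` and the timeout values are hypothetical and not taken from the kokoro scripts:

```shell
# Hypothetical helper: block until a path is a mount point, or give up
# after a timeout (in seconds, default 60).
wait_for_mount() {
    target="$1"
    timeout="${2:-60}"
    waited=0
    # mountpoint -q exits 0 only when the path is currently a mount point.
    until mountpoint -q "$target"; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out after ${timeout}s waiting for $target" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

A build script could then call something like `wait_for_mount /tmpfs 120 || exit 1` before touching the disk, so a late-appearing partition delays the build instead of failing it.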

@litghost
Contributor

litghost commented Mar 26, 2021

Another instance of /tmp failure with logging: https://source.cloud.google.com/results/invocations/4cf2fe46-8e69-4325-8bf7-b4fa264637ac/log


@mithro
Contributor

mithro commented Mar 26, 2021

@litghost - Any idea why /tmpfs doesn't appear in /etc/fstab?
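
For reference, a quick way to check this from a script might look like the sketch below. The helper name `fstab_has_mountpoint` is hypothetical; it only inspects the second field (the mount point) of each non-comment line:

```shell
# Hypothetical helper: succeed if an fstab-style file has an entry
# whose mount point (field 2) matches the given path.
fstab_has_mountpoint() {
    fstab="$1"
    target="$2"
    # Skip comment lines; exit 0 if a matching entry was seen, 1 otherwise.
    awk -v t="$target" '$1 !~ /^#/ && $2 == t { found = 1 } END { exit !found }' "$fstab"
}
```

For example, `fstab_has_mountpoint /etc/fstab /tmpfs || echo "/tmpfs is not in fstab"` would flag an image where the mount is set up by a separate script rather than by fstab.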

@mithro
Contributor

mithro commented Mar 30, 2021

I have made some modifications to the kokoro base image that is used here to hopefully fix the issue. Please keep an eye out for this problem continuing to appear.

@umarcor
Contributor

umarcor commented Aug 18, 2022

Closing since kokoro is not used anymore.

umarcor closed this as completed Aug 18, 2022