Hi all,
I'm fighting a random problem that shows up now and then, in the vast majority of cases during mass updates, on a random VM in a random Proxmox cluster. The guests are Debian (11 and 12, but it has been appearing since Debian 10); I have no other Linux distros here to compare against. After a reset the VM boots up without any problem.
systemctl cat mnt-storage-barman.mount
# /run/systemd/generator/mnt-storage-barman.mount
# Automatically generated by systemd-fstab-generator
[Unit]
Documentation=man:fstab(5) man:systemd-fstab-generator(8)
SourcePath=/etc/fstab
Before=local-fs.target
After=blockdev@dev-mapper-vg1\x2dbarman.target
[Mount]
What=/dev/mapper/vg1-barman
Where=/mnt/storage/barman
Type=ext4
Options=errors=remount-ro
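For completeness, the /etc/fstab line behind this generated unit should look roughly like this (reconstructed from the What/Where/Type/Options above; the dump/pass fields are my guess):

/dev/mapper/vg1-barman  /mnt/storage/barman  ext4  errors=remount-ro  0  2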
journalctl -u mnt-storage-barman.mount
Jun 14 10:44:42 HOSTNAME systemd[1]: mnt-storage-barman.mount: Deactivated successfully.
Jun 14 10:44:42 HOSTNAME systemd[1]: Unmounted mnt-storage-barman.mount - /mnt/storage/barman.
-- Boot d7bd76fb9d1a490b944d74ea43fe2516 --
Jun 14 10:46:31 HOSTNAME systemd[1]: Dependency failed for mnt-storage-barman.mount - /mnt/storage/barman.
Jun 14 10:46:31 HOSTNAME systemd[1]: mnt-storage-barman.mount: Job mnt-storage-barman.mount/start failed with result 'dependency'.
-- Boot e2b2bb006de241a69feb33b8f15b33cf --
Jun 14 10:49:00 HOSTNAME systemd[1]: mnt-storage-barman.mount: Directory /mnt/storage/barman to mount over is not empty, mounting anyway.
Jun 14 10:49:00 HOSTNAME systemd[1]: Mounting mnt-storage-barman.mount - /mnt/storage/barman...
Jun 14 10:49:00 HOSTNAME systemd[1]: Mounted mnt-storage-barman.mount - /mnt/storage/barman.
Jun 14 10:45:02 HOSTNAME systemd[1]: Reached target network.target - Network.
Jun 14 10:45:02 HOSTNAME systemd[1]: Starting systemd-networkd-wait-online.service - Wait for Network to be Configured...
Jun 14 10:45:02 HOSTNAME systemd[1]: systemd-pstore.service - Platform Persistent Storage Archival was skipped because of an unmet condition check (ConditionDirectoryNotEmpty=/sys/fs/pstore).
Jun 14 10:45:02 HOSTNAME systemd[1]: systemd-repart.service - Repartition Root Disk was skipped because no trigger condition checks were met.
Jun 14 10:45:02 HOSTNAME lvm[446]: PV /dev/sdb1 online, VG vg0 is complete.
Jun 14 10:45:02 HOSTNAME lvm[447]: PV /dev/sda1 online, VG vg1 is complete.
Jun 14 10:45:02 HOSTNAME lvm[446]: VG vg0 finished
Jun 14 10:45:02 HOSTNAME lvm[447]: VG vg1 finished
Jun 14 10:45:02 HOSTNAME systemd-networkd[312]: ens18: Link UP
Jun 14 10:45:02 HOSTNAME systemd-networkd[312]: ens18: Gained carrier
Jun 14 10:45:02 HOSTNAME systemd[1]: Finished systemd-journal-flush.service - Flush Journal to Persistent Storage.
Jun 14 10:45:02 HOSTNAME systemd[1]: Mounting sys-fs-fuse-connections.mount - FUSE Control File System...
Jun 14 10:45:02 HOSTNAME systemd[1]: Mounted sys-fs-fuse-connections.mount - FUSE Control File System.
Jun 14 10:45:03 HOSTNAME systemd-networkd[312]: ens18: Gained IPv6LL
Jun 14 10:45:03 HOSTNAME systemd-networkd[312]: ens19: Gained IPv6LL
Jun 14 10:45:15 HOSTNAME systemd[1]: Finished systemd-networkd-wait-online.service - Wait for Network to be Configured.
Jun 14 10:45:15 HOSTNAME systemd[1]: Reached target network-online.target - Network is Online.
Jun 14 10:46:31 HOSTNAME systemd[1]: dev-mapper-vg1\x2dbarman.device: Job dev-mapper-vg1\x2dbarman.device/start timed out.
Jun 14 10:46:31 HOSTNAME systemd[1]: Timed out waiting for device dev-mapper-vg1\x2dbarman.device - /dev/mapper/vg1-barman.
Jun 14 10:46:31 HOSTNAME systemd[1]: Dependency failed for mnt-storage-barman.mount - /mnt/storage/barman.
Jun 14 10:46:31 HOSTNAME systemd[1]: Dependency failed for local-fs.target - Local File Systems.
Jun 14 10:46:31 HOSTNAME systemd[1]: local-fs.target: Job local-fs.target/start failed with result 'dependency'.
Jun 14 10:46:31 HOSTNAME systemd[1]: local-fs.target: Triggering OnFailure= dependencies.
Jun 14 10:46:31 HOSTNAME systemd[1]: mnt-storage-barman.mount: Job mnt-storage-barman.mount/start failed with result 'dependency'.
Jun 14 10:46:31 HOSTNAME systemd[1]: dev-mapper-vg1\x2dbarman.device: Job dev-mapper-vg1\x2dbarman.device/start failed with result 'timeout'.
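When a VM ends up in this state it presumably sits in the emergency shell (local-fs.target triggering OnFailure= above). Instead of a full reset, something like the following might bring it up by hand - a hypothetical, untested sequence, with the unit and VG names taken from the log:

systemctl list-jobs                        # confirm which start jobs hung or failed
vgchange -ay vg1                           # (re)activate the VG; it may report it is already active
udevadm trigger --settle /dev/dm-*         # re-run udev so systemd sees the device unit as plugged
systemctl start mnt-storage-barman.mount   # then retry the mount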
To sum up the log above: on a reboot initiated from inside the VM (i.e. not from Proxmox), the mount of the disk partition times out (the default 90 s). The full journal dump is in the attachment.
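That 90 s is the device unit's job timeout (JobRunningTimeoutSec, falling back to the systemd default). Assuming a stock configuration, it can be confirmed with:

systemctl show -p JobRunningTimeoutUSec 'dev-mapper-vg1\x2dbarman.device'
# expected on defaults: JobRunningTimeoutUSec=1min 30s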
I've found a pile of similar write-ups (e.g. https://www.suse.com/support/kb/doc/?id=000020331 ), but I have no access to the Red Hat knowledge base. Has anyone run into this and had it disappear after deploying some change? I'm considering raising the timeout, but I have serious doubts about it. Switching to UUIDs doesn't suit me much because of the Ansible configuration (even though I format the disks manually...), and whether dropping the LV would make any difference I don't know. In production this is quite a problem when a VM restarts unplanned and the "bug" shows up. Realistically it hits me during updates (batches of 10 VMs), where one random VM drops out on exactly this - on average 1-2 VMs per update run of roughly 150 VMs. And in 99 % of cases it is a data LV; the root LV is never affected.
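If I did go the timeout route: per systemd.mount(5) it can be raised per mount directly in fstab via x-systemd.device-timeout, and nofail would additionally keep local-fs.target (and thus the boot) from failing, at the cost of the mount no longer blocking boot. A sketch with arbitrarily picked values:

/dev/mapper/vg1-barman  /mnt/storage/barman  ext4  errors=remount-ro,x-systemd.device-timeout=5min,nofail  0  2

Whether that merely papers over the underlying udev/LVM race is exactly where my doubts are, though.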
Thanks.