[netctl] interface-dependencies time out when fsck takes too long (?)
Task Info (Flyspray) | |
---|---|
Opened By | Erich Eckner (deepthought) |
Task ID | 67963 |
Type | Bug Report |
Project | Arch Linux |
Category | Arch Projects |
Version | None |
OS | All |
Opened | 2020-09-22 07:19:56 UTC |
Status | Assigned |
Assignee | Jouke Witteveen (jouke) |
Details
Description:
This morning, my server did not bring up all its networks properly after a reboot.
There is lag0, which merges eno3 and eno4. On top of that are the vlans lag0.1, lag0.3 and lag0.4. All these failed to come up properly, but luckily the backup-networks on eno1 and eno2 did come up (so I was able to hot-fix the problem from remote).
Looking at the logs (see below), it appears, like some fsck took some time and delayed the startup of the network interfaces. Therefore, lag0 timed out (and all of its dependencies), but the profiles for eno1 and eno2 were started correctly at the end.
I think, there are two solutions to this issue: 1: like eno1 and eno2, the waiting dependency could later start lag0 and its dependent profiles (is this possible with systemd directives?) 2 (imho the "right" solution): make the network interfaces not dependent on non-vital file systems (the checked file system has "0 2" in /etc/fstab, which should make its check non-fatal and non-blocking if I understand correctly).
Additional info:
- package version(s) netctl 1.23-1
- config and/or log files etc.
Sep 22 06:03:08 kernel: audit: initializing netlink subsys (disabled) Sep 22 06:03:08 kernel: drop_monitor: Initializing network drop monitor service Sep 22 06:03:08 systemd[1]: Created slice system-netctl.slice. Sep 22 06:03:08 systemd[1]: Created slice system-systemd\x2dfsck.slice. Sep 22 06:03:08 systemd-sysctl[310]: Not setting net/ipv4/conf/all/rp_filter (explicit setting exists). Sep 22 06:03:08 systemd-sysctl[310]: Not setting net/ipv4/conf/default/rp_filter (explicit setting exists). Sep 22 06:03:08 systemd-sysctl[310]: Not setting net/ipv4/conf/all/accept_source_route (explicit setting exists). Sep 22 06:03:08 systemd-sysctl[310]: Not setting net/ipv4/conf/default/accept_source_route (explicit setting exists). Sep 22 06:03:08 systemd-sysctl[310]: Not setting net/ipv4/conf/all/promote_secondaries (explicit setting exists). Sep 22 06:03:08 systemd-sysctl[310]: Not setting net/ipv4/conf/default/promote_secondaries (explicit setting exists). Sep 22 06:03:11 systemd-fsck[381]: fsck.fat 4.1 (2017-01-24) Sep 22 06:03:11 systemd-fsck[381]: /dev/sda1: 16 files, 12227/261373 clusters Sep 22 06:03:11 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-fsck@dev-disk-by\x2duuid-3C11\x2dFE0D comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Sep 22 06:03:11 kernel: audit: type=1130 audit(1600747391.041:10): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-fsck@dev-disk-by\x2duuid-3C11\x2dFE0D comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Sep 22 06:03:11 systemd-fsck[382]: 4TB wurde 91 Tage ohne Überprüfung genutzt, Prüfung erzwungen. Sep 22 06:03:11 systemd-fsck[379]: data wurde 91 Tage ohne Überprüfung genutzt, Prüfung erzwungen. Sep 22 06:03:11 systemd[1]: Found device NetXtreme II BCM57800 1/10 Gigabit Ethernet (BCM57800 10-Gigabit Ethernet). Sep 22 06:03:11 systemd[1]: Found device NetXtreme II BCM57800 1/10 Gigabit Ethernet (BCM57800 1-Gigabit Ethernet). Sep 22 06:03:11 systemd[1]: Found device NetXtreme II BCM57800 1/10 Gigabit Ethernet (BCM57800 10-Gigabit Ethernet). Sep 22 06:03:11 systemd[1]: Found device NetXtreme II BCM57800 1/10 Gigabit Ethernet (BCM57800 1-Gigabit Ethernet). Sep 22 06:03:22 systemd-fsck[379]: data: Der Erweiterungsbaum von Inode 153190575 (auf Ebene 1) könnte kürzer sein. IGNORIERT. Sep 22 06:03:22 systemd-fsck[379]: data: Der Erweiterungsbaum von Inode 153191418 (auf Ebene 1) könnte kürzer sein. IGNORIERT. Sep 22 06:03:28 systemd-fsck[379]: data: Der Erweiterungsbaum von Inode 165708004 (auf Ebene 1) könnte kürzer sein. IGNORIERT. Sep 22 06:03:31 systemd-fsck[379]: data: Der Erweiterungsbaum von Inode 168330955 (auf Ebene 1) könnte kürzer sein. IGNORIERT. Sep 22 06:03:31 systemd-fsck[379]: data: Der Erweiterungsbaum von Inode 169705652 (auf Ebene 1) könnte kürzer sein. IGNORIERT. Sep 22 06:03:31 systemd-fsck[379]: data: Der Erweiterungsbaum von Inode 169706536 (auf Ebene 1) könnte kürzer sein. IGNORIERT. Sep 22 06:03:35 systemd-fsck[379]: data: Der Erweiterungsbaum von Inode 214564868 (auf Ebene 2) könnte schmaler sein. IGNORIERT. Sep 22 06:03:35 systemd-fsck[379]: data: Der Erweiterungsbaum von Inode 214564877 (auf Ebene 2) könnte schmaler sein. IGNORIERT. Sep 22 06:03:35 systemd-fsck[379]: data: Der Erweiterungsbaum von Inode 214564885 (auf Ebene 2) könnte schmaler sein. IGNORIERT. Sep 22 06:03:35 systemd-fsck[379]: data: Der Erweiterungsbaum von Inode 214564886 (auf Ebene 1) könnte schmaler sein. IGNORIERT. Sep 22 06:03:35 systemd-fsck[379]: data: Der Erweiterungsbaum von Inode 214564888 (auf Ebene 2) könnte schmaler sein. IGNORIERT. Sep 22 06:03:35 systemd-fsck[379]: data: Der Erweiterungsbaum von Inode 214564897 (auf Ebene 1) könnte schmaler sein. IGNORIERT. Sep 22 06:03:35 systemd-fsck[379]: data: Der Erweiterungsbaum von Inode 214564898 (auf Ebene 2) könnte schmaler sein. IGNORIERT. Sep 22 06:03:35 systemd-fsck[379]: data: Der Erweiterungsbaum von Inode 214564899 (auf Ebene 1) könnte schmaler sein. IGNORIERT. Sep 22 06:03:35 systemd-fsck[379]: data: Der Erweiterungsbaum von Inode 214564900 (auf Ebene 2) könnte schmaler sein. IGNORIERT. Sep 22 06:03:36 systemd-fsck[379]: data: Der Erweiterungsbaum von Inode 214564904 (auf Ebene 2) könnte schmaler sein. IGNORIERT. Sep 22 06:03:36 systemd-fsck[379]: data: Der Erweiterungsbaum von Inode 214564917 (auf Ebene 1) könnte schmaler sein. IGNORIERT. Sep 22 06:03:36 systemd-fsck[379]: data: Der Erweiterungsbaum von Inode 214564919 (auf Ebene 1) könnte schmaler sein. IGNORIERT. Sep 22 06:04:37 systemd-fsck[379]: data: 2182653/366231552 Dateien (2.8% nicht zusammenhängend), 2330371692/5859704832 Blöcke Sep 22 06:04:37 systemd[1]: sys-subsystem-net-devices-lag0.device: Job sys-subsystem-net-devices-lag0.device/start timed out. Sep 22 06:04:37 systemd[1]: Timed out waiting for device /sys/subsystem/net/devices/lag0. Sep 22 06:04:37 systemd[1]: netctl@laser_control\x2dvlan\x2dstatic\x2dlag0.service: Job netctl@laser_control\x2dvlan\x2dstatic\x2dlag0.service/start failed with result 'dependency'. Sep 22 06:04:37 systemd[1]: netctl@laser_amplitude\x2dvlan\x2dstatic\x2dlag0.service: Job netctl@laser_amplitude\x2dvlan\x2dstatic\x2dlag0.service/start failed with result 'dependency'. Sep 22 06:04:37 systemd[1]: netctl@lab\x2dvlan\x2dstatic\x2dlag0.service: Job netctl@lab\x2dvlan\x2dstatic\x2dlag0.service/start failed with result 'dependency'. Sep 22 06:04:37 systemd[1]: sys-subsystem-net-devices-lag0.device: Job sys-subsystem-net-devices-lag0.device/start failed with result 'timeout'. Sep 22 06:04:38 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-fsck@dev-disk-by\x2duuid-8d3979cd\x2de152\x2d45e7\x2d848a\x2dce6dd0390d42 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Sep 22 06:04:38 kernel: audit: type=1130 audit(1600747478.004:11): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-fsck@dev-disk-by\x2duuid-8d3979cd\x2de152\x2d45e7\x2d848a\x2dce6dd0390d42 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Sep 22 06:54:09 systemd-fsck[382]: 4TB: 39710120/244154368 Dateien (0.1% nicht zusammenhängend), 496370243/976617472 Blöcke Sep 22 06:54:14 kernel: audit: type=1130 audit(1600750454.644:12): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-fsck@dev-disk-by\x2duuid-9379dd3c\x2dfafc\x2d4174\x2dba5b\x2dadbb1add7267 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Sep 22 06:54:14 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-fsck@dev-disk-by\x2duuid-9379dd3c\x2dfafc\x2d4174\x2dba5b\x2dadbb1add7267 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Sep 22 06:54:15 systemd[1]: Starting Static ethernet connection to the containers... Sep 22 06:54:15 systemd[1]: Starting A static IP for the Hinterzimmer-net... Sep 22 06:54:15 network[464]: Starting network profile 'containers-static'... Sep 22 06:54:15 network[466]: Starting network profile 'lag0profile'... Sep 22 06:54:15 network[468]: Starting network profile 'uni-dhcp'... Sep 22 06:54:15 network[465]: Starting network profile 'hinterzimmer-static'... Sep 22 06:54:18 kernel: bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this. Sep 22 06:54:18 systemd[1]: Started Static ethernet connection to the containers. Sep 22 06:54:18 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=netctl@containers\x2dstatic comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Sep 22 06:54:18 network[464]: Started network profile 'containers-static' Sep 22 06:54:19 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=netctl@lag0profile comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Sep 22 06:54:19 network[466]: Started network profile 'lag0profile' Sep 22 06:54:19 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=netctl@uni\x2ddhcp comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Sep 22 06:54:21 systemd[1]: Started A static IP for the Hinterzimmer-net. Sep 22 06:54:21 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=netctl@hinterzimmer\x2dstatic comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Sep 22 06:54:21 kernel: audit: type=1130 audit(1600750461.431:31): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=netctl@hinterzimmer\x2dstatic comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Sep 22 06:54:21 systemd[1]: Started Internet domain name server. Sep 22 06:54:21 dhcpd[645]: Internet Systems Consortium DHCP Server 4.4.2 Sep 22 06:54:21 dhcpd[645]: Copyright 2004-2020 Internet Systems Consortium. Sep 22 06:54:21 dhcpd[645]: No subnet declaration for lag0.1 (no IPv4 addresses). Sep 22 06:54:21 dhcpd[645]: you want, please write a subnet declaration Sep 22 06:54:21 dhcpd[645]: in your dhcpd.conf file for the network segment Sep 22 06:54:21 systemd[1]: Started vnStat network traffic monitor. Sep 22 06:54:21 network[465]: Started network profile 'hinterzimmer-static' Sep 22 06:54:21 kernel: NFSD: starting 90-second grace period (net f00000a0) Sep 22 06:54:25 kernel: NFSD: all clients done reclaiming, ending NFSv4 grace period (net f00000a0) Sep 22 06:54:26 network[468]: Started network profile 'uni-dhcp' Sep 22 06:54:32 named[643]: BIND 9 is maintained by Internet Systems Consortium,
for sake of completeness, here's the netctl configuration:
cat lag0profile
Description='bonding eno3 and eno4 to lag0' Interface=lag0 Connection=bond
The variable name is plural, but needs precisely one interface
BindsToInterfaces=(eno3 eno4) IP=static Adress=('127.0.0.1')
cat lab-vlan-static-lag0
Description='Virtual LAN 1 (labor default in switch M4100) on interface lag0' Interface=lag0.1 Connection=vlan
netctl@lag0profile.service", ignoring: Invalid argument"
This just produces "Failed to add dependency on "After=('lag0profile')
BindsToInterfaces=(lag0) VLANID=1 IP=static Address=('192.168.4.200/23')
cat laser_amplitude-vlan-static-lag0
Description='Virtual LAN 3 (laser_amplitude in switch M4100) on interface lag0' Interface=lag0.3 Connection=vlan
netctl@lag0profile.service", ignoring: Invalid argument"
This just produces "Failed to add dependency on "After=('lag0profile')
BindsToInterfaces=(lag0) VLANID=3 IP=static Address=('192.168.0.200/24')
cat laser_control-vlan-static-lag0
Description='Virtual LAN 4 (laser_control in switch M4100) on interface lag0' Interface=lag0.4 Connection=vlan
netctl@lag0profile.service", ignoring: Invalid argument"
This just produces "Failed to add dependency on "After=('lag0profile')
BindsToInterfaces=(lag0) VLANID=4 IP=static Address=('192.168.1.200/24')
(the following interfaces came up properly after fsck finished)
cat uni-dhcp
Description='Connection to the Uni' Interface=eno2 Connection=ethernet IP=dhcp
cat hinterzimmer-static
Description='A static IP for the Hinterzimmer-net' Interface=eno1 Connection=ethernet IP=static Address=('192.168.8.200/24')
Steps to reproduce: maybe this is a minimal working example (I did not try): create a lag of two interfaces, then force invocation of fsck on a big device on next reboot, then reboot.