Hi,
I'm dealing with a technical problem on a Dell T320 server that has, among other things, two 8 TB SATA disks in an mdadm RAID1. I tried to measure the speed of the disks with this command:
fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=4k --size=4g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
After a while, the following set of error messages appears in the log, the server's load climbs, and the machine is unusable until it is rebooted. I have been trying to solve this, but I can't work out what the problem is.
I tried replacing the disks and the result is still the same. Where would you recommend I send a bug report? To the kernel, mdadm, or XFS developers? Or do you have any idea what else I could try?
May 31 22:01:31 datel kernel: INFO: task kworker/u96:1:12 blocked for more than 122 seconds.
May 31 22:01:31 datel kernel: Tainted: G OE -------- --- 5.14.0-284.11.1.el9_2.x86_64 #1
May 31 22:01:31 datel kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 31 22:01:31 datel kernel: task:kworker/u96:1 state:D stack: 0 pid: 12 ppid: 2 flags:0x00004000
May 31 22:01:31 datel kernel: Workqueue: xfs-cil/md0 xlog_cil_push_work [xfs]
May 31 22:01:31 datel kernel: Call Trace:
May 31 22:01:31 datel kernel: <TASK>
May 31 22:01:31 datel kernel: __schedule+0x248/0x620
May 31 22:01:31 datel kernel: schedule+0x5a/0xc0
May 31 22:01:31 datel kernel: xlog_state_get_iclog_space+0x10d/0x340 [xfs]
May 31 22:01:31 datel kernel: ? wake_up_q+0x90/0x90
May 31 22:01:31 datel kernel: xlog_write+0x144/0x6f0 [xfs]
May 31 22:01:31 datel kernel: xlog_cil_push_work+0x3b5/0x5a0 [xfs]
May 31 22:01:31 datel kernel: ? pick_next_task_fair+0x41/0x490
May 31 22:01:31 datel kernel: ? xfs_swap_extents+0x630/0x630 [xfs]
May 31 22:01:31 datel kernel: process_one_work+0x1e8/0x3c0
May 31 22:01:31 datel kernel: ? rescuer_thread+0x3a0/0x3a0
May 31 22:01:31 datel kernel: worker_thread+0x50/0x3b0
May 31 22:01:31 datel kernel: ? rescuer_thread+0x3a0/0x3a0
May 31 22:01:31 datel kernel: kthread+0xd9/0x100
May 31 22:01:31 datel kernel: ? kthread_complete_and_exit+0x20/0x20
May 31 22:01:31 datel kernel: ret_from_fork+0x22/0x30
May 31 22:01:31 datel kernel: </TASK>
May 31 22:01:31 datel kernel: INFO: task kworker/u96:6:121 blocked for more than 122 seconds.
May 31 22:01:31 datel kernel: Tainted: G OE -------- --- 5.14.0-284.11.1.el9_2.x86_64 #1
May 31 22:01:31 datel kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 31 22:01:31 datel kernel: task:kworker/u96:6 state:D stack: 0 pid: 121 ppid: 2 flags:0x00004000
May 31 22:01:31 datel kernel: Workqueue: xfs-cil/md0 xlog_cil_push_work [xfs]
May 31 22:01:31 datel kernel: Call Trace:
May 31 22:01:31 datel kernel: <TASK>
May 31 22:01:31 datel kernel: __schedule+0x248/0x620
May 31 22:01:31 datel kernel: schedule+0x5a/0xc0
May 31 22:01:31 datel kernel: xlog_cil_order_write+0x19b/0x1c0 [xfs]
May 31 22:01:31 datel kernel: ? wake_up_q+0x90/0x90
May 31 22:01:31 datel kernel: xlog_cil_push_work+0x325/0x5a0 [xfs]
May 31 22:01:31 datel kernel: ? pick_next_task_fair+0x41/0x490
May 31 22:01:31 datel kernel: ? xfs_swap_extents+0x630/0x630 [xfs]
May 31 22:01:31 datel kernel: process_one_work+0x1e8/0x3c0
May 31 22:01:31 datel kernel: worker_thread+0x50/0x3b0
May 31 22:01:31 datel kernel: ? rescuer_thread+0x3a0/0x3a0
May 31 22:01:31 datel kernel: kthread+0xd9/0x100
May 31 22:01:31 datel kernel: ? kthread_complete_and_exit+0x20/0x20
May 31 22:01:31 datel kernel: ret_from_fork+0x22/0x30
May 31 22:01:31 datel kernel: </TASK>
May 31 22:01:31 datel kernel: INFO: task kworker/u96:8:123 blocked for more than 122 seconds.
May 31 22:01:31 datel kernel: Tainted: G OE -------- --- 5.14.0-284.11.1.el9_2.x86_64 #1
May 31 22:01:31 datel kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 31 22:01:31 datel kernel: task:kworker/u96:8 state:D stack: 0 pid: 123 ppid: 2 flags:0x00004000
May 31 22:01:31 datel kernel: Workqueue: xfs-cil/md0 xlog_cil_push_work [xfs]
May 31 22:01:31 datel kernel: Call Trace:
May 31 22:01:31 datel kernel: <TASK>
May 31 22:01:31 datel kernel: __schedule+0x248/0x620
May 31 22:01:31 datel kernel: schedule+0x5a/0xc0
May 31 22:01:31 datel kernel: xlog_cil_order_write+0x19b/0x1c0 [xfs]
May 31 22:01:31 datel kernel: ? wake_up_q+0x90/0x90
May 31 22:01:31 datel kernel: xlog_cil_push_work+0x325/0x5a0 [xfs]
May 31 22:01:31 datel kernel: ? pick_next_task_fair+0x41/0x490
May 31 22:01:31 datel kernel: ? xfs_swap_extents+0x630/0x630 [xfs]
May 31 22:01:31 datel kernel: process_one_work+0x1e8/0x3c0
May 31 22:01:31 datel kernel: worker_thread+0x50/0x3b0
May 31 22:01:31 datel kernel: ? rescuer_thread+0x3a0/0x3a0
May 31 22:01:31 datel kernel: kthread+0xd9/0x100
May 31 22:01:31 datel kernel: ? kthread_complete_and_exit+0x20/0x20
May 31 22:01:31 datel kernel: ret_from_fork+0x22/0x30
May 31 22:01:31 datel kernel: </TASK>
May 31 22:01:31 datel kernel: INFO: task kworker/u96:13:128 blocked for more than 122 seconds.
May 31 22:01:31 datel kernel: Tainted: G OE -------- --- 5.14.0-284.11.1.el9_2.x86_64 #1
May 31 22:01:31 datel kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 31 22:01:31 datel kernel: task:kworker/u96:13 state:D stack: 0 pid: 128 ppid: 2 flags:0x00004000
May 31 22:01:31 datel kernel: Workqueue: xfs-cil/md0 xlog_cil_push_work [xfs]
May 31 22:01:31 datel kernel: Call Trace:
May 31 22:01:31 datel kernel: <TASK>
May 31 22:01:31 datel kernel: __schedule+0x248/0x620
May 31 22:01:31 datel kernel: schedule+0x5a/0xc0
May 31 22:01:31 datel kernel: xlog_cil_order_write+0x19b/0x1c0 [xfs]
May 31 22:01:31 datel kernel: ? wake_up_q+0x90/0x90
May 31 22:01:31 datel kernel: xlog_cil_push_work+0x325/0x5a0 [xfs]
May 31 22:01:31 datel kernel: ? pick_next_task_fair+0x41/0x490
May 31 22:01:31 datel kernel: ? xfs_swap_extents+0x630/0x630 [xfs]
May 31 22:01:31 datel kernel: process_one_work+0x1e8/0x3c0
May 31 22:01:31 datel kernel: worker_thread+0x50/0x3b0
May 31 22:01:31 datel kernel: ? rescuer_thread+0x3a0/0x3a0
May 31 22:01:31 datel kernel: kthread+0xd9/0x100
May 31 22:01:31 datel kernel: ? kthread_complete_and_exit+0x20/0x20
May 31 22:01:31 datel kernel: ret_from_fork+0x22/0x30
May 31 22:01:31 datel kernel: </TASK>
May 31 22:01:31 datel kernel: INFO: task kworker/5:1:330 blocked for more than 122 seconds.
May 31 22:01:31 datel kernel: Tainted: G OE -------- --- 5.14.0-284.11.1.el9_2.x86_64 #1
May 31 22:01:31 datel kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 31 22:01:31 datel kernel: task:kworker/5:1 state:D stack: 0 pid: 330 ppid: 2 flags:0x00004000
May 31 22:01:31 datel kernel: Workqueue: xfs-sync/md0 xfs_log_worker [xfs]
May 31 22:01:31 datel kernel: Call Trace:
May 31 22:01:31 datel kernel: <TASK>
May 31 22:01:31 datel kernel: __schedule+0x248/0x620
May 31 22:01:31 datel kernel: schedule+0x5a/0xc0
May 31 22:01:31 datel kernel: schedule_timeout+0x11d/0x160
May 31 22:01:31 datel kernel: ? load_balance+0x492/0x6d0
May 31 22:01:31 datel kernel: __wait_for_common+0x93/0x1d0
May 31 22:01:31 datel kernel: ? usleep_range_state+0x90/0x90
May 31 22:01:31 datel kernel: __flush_workqueue+0x13a/0x3f0
May 31 22:01:31 datel kernel: xlog_cil_push_now.isra.0+0x63/0xa0 [xfs]
May 31 22:01:31 datel kernel: xlog_cil_force_seq+0x73/0x240 [xfs]
May 31 22:01:31 datel kernel: ? pick_next_task+0x854/0x940
May 31 22:01:31 datel kernel: ? __update_idle_core+0x1b/0xb0
May 31 22:01:31 datel kernel: xfs_log_force+0x7a/0x230 [xfs]
May 31 22:01:31 datel kernel: xfs_log_worker+0x35/0x80 [xfs]
May 31 22:01:31 datel kernel: process_one_work+0x1e8/0x3c0
May 31 22:01:31 datel kernel: ? rescuer_thread+0x3a0/0x3a0
May 31 22:01:31 datel kernel: worker_thread+0x50/0x3b0
May 31 22:01:31 datel kernel: ? rescuer_thread+0x3a0/0x3a0
May 31 22:01:31 datel kernel: kthread+0xd9/0x100
May 31 22:01:31 datel kernel: ? kthread_complete_and_exit+0x20/0x20
May 31 22:01:31 datel kernel: ret_from_fork+0x22/0x30
May 31 22:01:31 datel kernel: </TASK>
May 31 22:01:31 datel kernel: INFO: task fio:7198 blocked for more than 122 seconds.
May 31 22:01:31 datel kernel: Tainted: G OE -------- --- 5.14.0-284.11.1.el9_2.x86_64 #1
May 31 22:01:31 datel kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 31 22:01:31 datel kernel: task:fio state:D stack: 0 pid: 7198 ppid: 7101 flags:0x00000002
May 31 22:01:31 datel kernel: Call Trace:
May 31 22:01:31 datel kernel: <TASK>
May 31 22:01:31 datel kernel: __schedule+0x248/0x620
May 31 22:01:31 datel kernel: schedule+0x5a/0xc0
May 31 22:01:31 datel kernel: io_schedule+0x42/0x70
May 31 22:01:31 datel kernel: folio_wait_bit_common+0x147/0x380
May 31 22:01:31 datel kernel: ? find_get_pages_range_tag+0x19e/0x220
May 31 22:01:31 datel kernel: ? file_check_and_advance_wb_err+0xd0/0xd0
May 31 22:01:31 datel kernel: folio_wait_writeback+0x28/0x80
May 31 22:01:31 datel kernel: write_cache_pages+0x12a/0x4c0
May 31 22:01:31 datel kernel: ? iomap_writepage_map+0x530/0x530
May 31 22:01:31 datel kernel: iomap_writepages+0x1c/0x40
May 31 22:01:31 datel kernel: xfs_vm_writepages+0x7a/0xb0 [xfs]
May 31 22:01:31 datel kernel: do_writepages+0xcf/0x1d0
May 31 22:01:31 datel kernel: ? __schedule+0x250/0x620
May 31 22:01:31 datel kernel: ? timerqueue_del+0x2a/0x50
May 31 22:01:31 datel kernel: filemap_fdatawrite_wbc+0x66/0x90
May 31 22:01:31 datel kernel: file_write_and_wait_range+0x9c/0x100
May 31 22:01:31 datel kernel: xfs_file_fsync+0x5a/0x220 [xfs]
May 31 22:01:31 datel kernel: ? __x64_sys_futex+0x73/0x1d0
May 31 22:01:31 datel kernel: __x64_sys_fsync+0x33/0x60
May 31 22:01:31 datel kernel: do_syscall_64+0x5c/0x90
May 31 22:01:31 datel kernel: ? do_syscall_64+0x69/0x90
May 31 22:01:31 datel kernel: ? __rseq_handle_notify_resume+0x32/0x50
May 31 22:01:31 datel kernel: ? exit_to_user_mode_loop+0xd0/0x130
May 31 22:01:31 datel kernel: ? exit_to_user_mode_prepare+0xec/0x100
May 31 22:01:31 datel kernel: ? syscall_exit_to_user_mode+0x12/0x30
May 31 22:01:31 datel kernel: ? do_syscall_64+0x69/0x90
May 31 22:01:31 datel kernel: ? do_syscall_64+0x69/0x90
May 31 22:01:31 datel kernel: ? do_syscall_64+0x69/0x90
May 31 22:01:31 datel kernel: ? do_syscall_64+0x69/0x90
May 31 22:01:31 datel kernel: ? do_syscall_64+0x69/0x90
May 31 22:01:31 datel kernel: ? sysvec_apic_timer_interrupt+0x3c/0x90
May 31 22:01:31 datel kernel: entry_SYSCALL_64_after_hwframe+0x63/0xcd
May 31 22:01:31 datel kernel: RIP: 0033:0x7fb3d5d452db
May 31 22:01:31 datel kernel: RSP: 002b:00007fb3d5f02e80 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
May 31 22:01:31 datel kernel: RAX: ffffffffffffffda RBX: 0000555e78ef2f28 RCX: 00007fb3d5d452db
May 31 22:01:31 datel kernel: RDX: 0000000000000002 RSI: 0000000000000002 RDI: 0000000000000006
May 31 22:01:31 datel kernel: RBP: 0000000000000006 R08: 0000000000000000 R09: 00007fb3d5e00ba8
May 31 22:01:31 datel kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 00007fb3d5dfa2c0
May 31 22:01:31 datel kernel: R13: 00007fb3d5f03740 R14: 00007fb3d5f02eb8 R15: 0000555e78ecf340
May 31 22:01:31 datel kernel: </TASK>
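
For a bug report, a short summary of which tasks were reported as hung may be easier to read than the full traces. The following is only a minimal sketch: `list_hung_tasks` is a hypothetical helper name, not an existing tool, and it assumes the messages have the same "INFO: task ... blocked for more than ..." wording as in the log above.

```shell
# Hypothetical helper: print one "<task>:<pid>" line per hung-task report,
# with duplicates removed. Reads kernel log text on stdin.
list_hung_tasks() {
  # Keep only the "INFO: task ... blocked for more than ..." lines and
  # strip everything except the task name and pid.
  sed -n 's/.*INFO: task \(.*\) blocked for more than .*/\1/p' | sort -u
}

# Example usage (assumes a persistent systemd journal, so that the kernel
# messages from the boot before the forced reboot are still available):
#   journalctl -k -b -1 | list_hung_tasks
```

Run against the log above, this should list six tasks: the four `kworker/u96` workers on the `xfs-cil/md0` workqueue, `kworker/5:1` on `xfs-sync/md0`, and `fio` itself.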