NVMe RAID locking up

I have an HP DL380 Gen 9 with a RAID5 array built from 6 INTEL SSDPE2MX020T4
devices. That RAID device makes up a volume group with a couple of logical
volumes, each holding an XFS filesystem that backs VM storage. Twice now in
2 months the RAID array has become mostly unresponsive, with hung-task
reports like these in the kernel log:

May 08 03:33:21 host kernel: INFO: task worker:1798511 blocked for more than 120 seconds.
May 08 03:33:21 host kernel:       Not tainted 4.18.0-348.23.1.el8_5.x86_64 #1
May 08 03:33:21 host kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 08 03:33:21 host kernel: task:worker          state:D stack:    0 pid:1798511 ppid:     1 flags:0x000043a0
May 08 03:33:21 host kernel: Call Trace:
May 08 03:33:21 host kernel:  __schedule+0x2bd/0x760
May 08 03:33:21 host kernel:  ? finish_wait+0x80/0x80
May 08 03:33:21 host kernel:  schedule+0x37/0xa0
May 08 03:33:21 host kernel:  md_bitmap_startwrite+0x16f/0x1e0
May 08 03:33:21 host kernel:  ? finish_wait+0x80/0x80
May 08 03:33:21 host kernel:  add_stripe_bio+0x4a3/0x7c0 [raid456]
May 08 03:33:21 host kernel:  raid5_make_request+0x1bf/0xb60 [raid456]
May 08 03:33:21 host kernel:  ? finish_wait+0x80/0x80
May 08 03:33:21 host kernel:  ? blk_queue_split+0xd4/0x660
May 08 03:33:21 host kernel:  ? finish_wait+0x80/0x80
May 08 03:33:21 host kernel:  md_handle_request+0x119/0x190
May 08 03:33:21 host kernel:  md_make_request+0x84/0x160
May 08 03:33:21 host kernel:  generic_make_request+0x25b/0x350
May 08 03:33:21 host kernel:  submit_bio+0x3c/0x160
May 08 03:33:21 host kernel:  iomap_submit_ioend.isra.38+0x4a/0x70
May 08 03:33:21 host kernel:  iomap_writepage_map+0x422/0x670
May 08 03:33:21 host kernel:  write_cache_pages+0x197/0x420
May 08 03:33:21 host kernel:  ? iomap_invalidatepage+0xe0/0xe0
May 08 03:33:21 host kernel:  iomap_writepages+0x1c/0x40
May 08 03:33:21 host kernel:  xfs_vm_writepages+0x64/0x90 [xfs]
May 08 03:33:21 host kernel:  do_writepages+0x41/0xd0
May 08 03:33:21 host kernel:  __filemap_fdatawrite_range+0xcb/0x100
May 08 03:33:21 host kernel:  file_write_and_wait_range+0x4c/0xa0
May 08 03:33:21 host kernel:  xfs_file_fsync+0x69/0x200 [xfs]
May 08 03:33:21 host kernel:  do_fsync+0x38/0x70
May 08 03:33:21 host kernel:  __x64_sys_fdatasync+0x13/0x20
May 08 03:33:21 host kernel:  do_syscall_64+0x5b/0x1a0
May 08 03:33:21 host kernel:  entry_SYSCALL_64_after_hwframe+0x65/0xca
May 08 03:33:21 host kernel: RIP: 0033:0x7f969efb858f
May 08 03:33:21 host kernel: Code: Unable to access opcode bytes at RIP 0x7f969efb8565.
May 08 03:33:21 host kernel: RSP: 002b:00007f94b3ffe6b0 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
May 08 03:33:21 host kernel: RAX: ffffffffffffffda RBX: 000000000000000e RCX: 00007f969efb858f
May 08 03:33:21 host kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000000000e
May 08 03:33:21 host kernel: RBP: 0000563f940b5b20 R08: 0000000000000000 R09: 0000000032f01b0c
May 08 03:33:21 host kernel: R10: 0000000e171e5000 R11: 0000000000000293 R12: 0000563f92a73bb4
May 08 03:33:21 host kernel: R13: 0000563f940b5b88 R14: 0000563f94097eb0 R15: 00007f94b3ffe800
May 08 03:33:21 host kernel: INFO: task worker:1799573 blocked for more than 120 seconds.
May 08 03:33:21 host kernel:       Not tainted 4.18.0-348.23.1.el8_5.x86_64 #1
May 08 03:33:21 host kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 08 03:33:21 host kernel: task:worker          state:D stack:    0 pid:1799573 ppid:     1 flags:0x000043a0
May 08 03:33:21 host kernel: Call Trace:
May 08 03:33:21 host kernel:  __schedule+0x2bd/0x760
May 08 03:33:21 host kernel:  schedule+0x37/0xa0
May 08 03:33:21 host kernel:  io_schedule+0x12/0x40
May 08 03:33:21 host kernel:  wait_on_page_bit+0x137/0x230
May 08 03:33:21 host kernel:  ? file_fdatawait_range+0x20/0x20
May 08 03:33:21 host kernel:  __filemap_fdatawait_range+0x88/0xe0
May 08 03:33:21 host kernel:  file_write_and_wait_range+0x76/0xa0
May 08 03:33:21 host kernel:  xfs_file_fsync+0x69/0x200 [xfs]
May 08 03:33:21 host kernel:  do_fsync+0x38/0x70
May 08 03:33:21 host kernel:  __x64_sys_fdatasync+0x13/0x20
May 08 03:33:21 host kernel:  do_syscall_64+0x5b/0x1a0
May 08 03:33:21 host kernel:  entry_SYSCALL_64_after_hwframe+0x65/0xca
May 08 03:33:21 host kernel: RIP: 0033:0x7f20c514c58f
May 08 03:33:21 host kernel: Code: Unable to access opcode bytes at RIP 0x7f20c514c565.
May 08 03:33:21 host kernel: RSP: 002b:00007f1ef4ff86b0 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
May 08 03:33:21 host kernel: RAX: ffffffffffffffda RBX: 000000000000001b RCX: 00007f20c514c58f
May 08 03:33:21 host kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000000001b
May 08 03:33:21 host kernel: RBP: 00005594bed1f120 R08: 0000000000000000 R09: 00000000ffffffff
May 08 03:33:21 host kernel: R10: 00007f1ef4ff86a0 R11: 0000000000000293 R12: 00005594bd72ebb4
May 08 03:33:21 host kernel: R13: 00005594bed1f188 R14: 00005594bed31c30 R15: 00007f1ef4ff8800
May 08 03:33:21 host kernel: INFO: task worker:871154 blocked for more than 120 seconds.
May 08 03:33:21 host kernel:       Not tainted 4.18.0-348.23.1.el8_5.x86_64 #1
May 08 03:33:21 host kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 08 03:33:21 host kernel: task:worker          state:D stack:    0 pid:871154 ppid:     1 flags:0x000043a0
May 08 03:33:21 host kernel: Call Trace:
May 08 03:33:21 host kernel:  __schedule+0x2bd/0x760
May 08 03:33:21 host kernel:  schedule+0x37/0xa0
May 08 03:33:21 host kernel:  io_schedule+0x12/0x40
May 08 03:33:21 host kernel:  wait_on_page_bit+0x137/0x230
May 08 03:33:21 host kernel:  ? file_fdatawait_range+0x20/0x20
May 08 03:33:21 host kernel:  __filemap_fdatawait_range+0x88/0xe0
May 08 03:33:21 host kernel:  file_write_and_wait_range+0x76/0xa0
May 08 03:33:21 host kernel:  xfs_file_fsync+0x69/0x200 [xfs]
May 08 03:33:21 host kernel:  do_fsync+0x38/0x70
May 08 03:33:21 host kernel:  __x64_sys_fdatasync+0x13/0x20
May 08 03:33:21 host kernel:  do_syscall_64+0x5b/0x1a0
May 08 03:33:21 host kernel:  entry_SYSCALL_64_after_hwframe+0x65/0xca
May 08 03:33:21 host kernel: RIP: 0033:0x7f13d27fd58f
May 08 03:33:21 host kernel: Code: Unable to access opcode bytes at RIP 0x7f13d27fd565.
May 08 03:33:21 host kernel: RSP: 002b:00007f0f697f96b0 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
May 08 03:33:21 host kernel: RAX: ffffffffffffffda RBX: 000000000000000e RCX: 00007f13d27fd58f
May 08 03:33:21 host kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000000000e
May 08 03:33:21 host kernel: RBP: 00005594f48b9010 R08: 0000000000000000 R09: 00000000ffffffff
May 08 03:33:21 host kernel: R10: 00007f0f697f96a0 R11: 0000000000000293 R12: 00005594f2222bb4
May 08 03:33:21 host kernel: R13: 00005594f48b9078 R14: 00005594f4e8ee50 R15: 00007f0f697f9800
May 08 03:33:21 host kernel: INFO: task kworker/u97:2:1790841 blocked for more than 120 seconds.
May 08 03:33:21 host kernel:       Not tainted 4.18.0-348.23.1.el8_5.x86_64 #1
May 08 03:33:21 host kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 08 03:33:21 host kernel: task:kworker/u97:2   state:D stack:    0 pid:1790841 ppid:     2 flags:0x80004080
May 08 03:33:21 host kernel: Workqueue: writeback wb_workfn (flush-253:3)
May 08 03:33:21 host kernel: Call Trace:
May 08 03:33:21 host kernel:  __schedule+0x2bd/0x760
May 08 03:33:21 host kernel:  ? blk_flush_plug_list+0xc2/0x100
May 08 03:33:21 host kernel:  ? finish_wait+0x80/0x80
May 08 03:33:21 host kernel:  schedule+0x37/0xa0
May 08 03:33:21 host kernel:  md_bitmap_startwrite+0x16f/0x1e0
May 08 03:33:21 host kernel:  ? finish_wait+0x80/0x80
May 08 03:33:21 host kernel:  add_stripe_bio+0x4a3/0x7c0 [raid456]
May 08 03:33:21 host kernel:  raid5_make_request+0x1bf/0xb60 [raid456]
May 08 03:33:21 host kernel:  ? finish_wait+0x80/0x80
May 08 03:33:21 host kernel:  ? blk_queue_split+0xd4/0x660
May 08 03:33:21 host kernel:  ? finish_wait+0x80/0x80
May 08 03:33:21 host kernel:  md_handle_request+0x119/0x190
May 08 03:33:21 host kernel:  md_make_request+0x84/0x160
May 08 03:33:21 host kernel:  generic_make_request+0x25b/0x350
May 08 03:33:21 host kernel:  submit_bio+0x3c/0x160
May 08 03:33:21 host kernel:  iomap_submit_ioend.isra.38+0x4a/0x70
May 08 03:33:21 host kernel:  iomap_writepage_map+0x422/0x670
May 08 03:33:21 host kernel:  write_cache_pages+0x197/0x420
May 08 03:33:21 host kernel:  ? iomap_invalidatepage+0xe0/0xe0
May 08 03:33:21 host kernel:  iomap_writepages+0x1c/0x40
May 08 03:33:21 host kernel:  xfs_vm_writepages+0x64/0x90 [xfs]
May 08 03:33:21 host kernel:  do_writepages+0x41/0xd0
May 08 03:33:21 host kernel:  __writeback_single_inode+0x39/0x2f0
May 08 03:33:21 host kernel:  writeback_sb_inodes+0x1e6/0x450
May 08 03:33:21 host kernel:  __writeback_inodes_wb+0x5f/0xc0
May 08 03:33:21 host kernel:  wb_writeback+0x25b/0x2f0
May 08 03:33:21 host kernel:  wb_workfn+0x344/0x4c0
May 08 03:33:21 host kernel:  ? __switch_to_asm+0x35/0x70
May 08 03:33:21 host kernel:  ? __switch_to_asm+0x41/0x70
May 08 03:33:21 host kernel:  ? __switch_to_asm+0x35/0x70
May 08 03:33:21 host kernel:  ? __switch_to_asm+0x41/0x70
May 08 03:33:21 host kernel:  ? __switch_to_asm+0x35/0x70
May 08 03:33:21 host kernel:  ? __switch_to_asm+0x41/0x70
May 08 03:33:21 host kernel:  ? __switch_to_asm+0x35/0x70
May 08 03:33:21 host kernel:  ? __switch_to_asm+0x41/0x70
May 08 03:33:21 host kernel:  process_one_work+0x1a7/0x360
May 08 03:33:21 host kernel:  worker_thread+0x30/0x390
May 08 03:33:21 host kernel:  ? create_worker+0x1a0/0x1a0
May 08 03:33:21 host kernel:  kthread+0x116/0x130
May 08 03:33:21 host kernel:  ? kthread_flush_work_fn+0x10/0x10
May 08 03:33:21 host kernel:  ret_from_fork+0x35/0x40
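
To make the traces above easier to compare, here is a minimal sketch that
reduces each hung-task report to the first reliable frame below the scheduler
calls. It is only an illustration: the function name, the regular expressions,
and the choice to skip "?" frames and the __schedule/schedule/io_schedule
frames are my own assumptions, not an existing tool.

```python
import re

# Header line of a hung-task report; the task name may itself contain colons.
TASK_RE = re.compile(r"INFO: task (.+):(\d+) blocked for")
# A stack frame as printed in the journal; a leading "? " marks an unreliable frame.
FRAME_RE = re.compile(r"kernel:\s+(\? )?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")
# Frames that only say "the task is sleeping", not what it is waiting for.
SCHED_FRAMES = {"__schedule", "schedule", "io_schedule"}


def summarize_hung_tasks(lines):
    """Return one dict per hung-task report: task name, pid, and wait site."""
    results = []
    current = None
    for line in lines:
        header = TASK_RE.search(line)
        if header:
            current = {
                "task": header.group(1),
                "pid": int(header.group(2)),
                "wait_site": None,
            }
            results.append(current)
            continue
        # Ignore lines before the first report or after the wait site is found.
        if current is None or current["wait_site"] is not None:
            continue
        frame = FRAME_RE.search(line)
        if frame and not frame.group(1) and frame.group(2) not in SCHED_FRAMES:
            current["wait_site"] = frame.group(2)
    return results


if __name__ == "__main__":
    import sys

    for report in summarize_hung_tasks(sys.stdin):
        print(f'{report["task"]} (pid {report["pid"]}): {report["wait_site"]}')
```

Applied to the log above, this reports md_bitmap_startwrite as the wait site
for worker:1798511 and kworker/u97:2:1790841, and wait_on_page_bit for
worker:1799573 and worker:871154.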

I have another nearly identical system that has run without trouble, though
it does not see as much IO load as this one. Is there anything else I can
check to determine whether this is a hardware issue or an issue with the
Linux RAID system?
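
In case it is useful, below is a rough sketch of what I could capture the next
time the array hangs. It reads a few md sysfs attributes for the array. The
device name md0 and the helper name are placeholders (assumptions, not taken
from the system above), and I have not verified that these reads return while
the array is wedged.

```python
from pathlib import Path

# md sysfs attributes to record; the stripe_cache_* files exist for raid4/5/6.
MD_ATTRS = [
    "array_state",
    "sync_action",
    "degraded",
    "stripe_cache_size",
    "stripe_cache_active",
]


def snapshot_md(md_name="md0", sysfs_root="/sys/block"):
    """Read selected md attributes; md_name is a placeholder for the real array."""
    md_dir = Path(sysfs_root) / md_name / "md"
    snapshot = {}
    for attr in MD_ATTRS:
        try:
            snapshot[attr] = (md_dir / attr).read_text().strip()
        except OSError:
            # Attribute missing or unreadable: record that rather than fail.
            snapshot[attr] = None
    return snapshot


if __name__ == "__main__":
    for name, value in snapshot_md().items():
        print(f"{name}: {value}")
```

Running this periodically and once more during a hang would at least show
whether stripe_cache_active is pinned at stripe_cache_size when writes stall.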