Help enabling T10-DIX on a SAS disk

Hey everyone. I’ve been down an extended rabbit hole regarding T10-PI (protection information) for SCSI and NVMe devices under Linux. Basically, a set of extensions to the SCSI and NVMe specifications, referred to variously as T10-PI, T10-DIF, and T10-DIX, provides end-to-end data integrity verification: a checksum travels with each block and is checked at several points along the path from the OS to the disk.

If my understanding is correct, in the context of SAS drives:

  • T10-DIF (data integrity field) is a setup where the HBA computes a checksum for every block and transmits it to the disk along with the data. The disk is supposed to verify and store the checksum, then return it when the block is read back so the HBA can check it.
  • T10-DIX takes this a step further and allows the OS to compute the checksum. I think the HBA is supposed to verify it, and pass it along to the disk.
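To make the "checksum per block" idea concrete, here is a minimal Python sketch of the 8 bytes of protection information that accompany each block under Type 1 protection. It assumes the usual layout (a 2-byte guard tag holding a CRC-16 of the data, a 2-byte application tag, and a 4-byte reference tag holding the low 32 bits of the LBA, all big-endian) and the CRC-16/T10-DIF polynomial 0x8BB7. The function names are made up for illustration, and this only models the data format, not what any particular HBA or drive does.

```python
import struct


def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16/T10-DIF: polynomial 0x8BB7, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def type1_pi_tuple(block: bytes, lba: int, app_tag: int = 0) -> bytes:
    """Build the 8-byte PI tuple for one block (hypothetical helper).

    Assumed layout: guard tag (CRC of the data), application tag,
    reference tag (low 32 bits of the LBA), packed big-endian.
    """
    guard = crc16_t10dif(block)
    ref_tag = lba & 0xFFFFFFFF
    return struct.pack(">HHI", guard, app_tag, ref_tag)


if __name__ == "__main__":
    block = bytes(4096)  # one all-zero 4096-byte logical block
    print(type1_pi_tuple(block, lba=5).hex())
```

In this picture, with DIF the tuple is generated and checked by the HBA, and with DIX the OS would generate it and hand it to the HBA alongside the data.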

Anyways, I’m trying to get this working on real hardware. I have:

  • A Seagate Exos 20TB SAS drive
  • An LSI card based on the SAS2008 chip, flashed in IT mode to act as an HBA
  • An x86_64 Linux host running kernel 6.1.0-17 (Debian)

I loaded the mpt3sas driver with the option prot_mask=0x7f, since according to the driver documentation (https://docs.broadcom.com/doc/Linux_Driver-RHEL7-8_SLES12-15_PCIe_P18_0.pdf) this will enable DIF and DIX.
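For reference, a small sketch of what the bits in a mask like 0x7f would mean. It assumes prot_mask uses the same bit layout as the SCSI host protection flags in the kernel's include/scsi/scsi_host.h (SHOST_DIF_TYPE1_PROTECTION and friends). Whether the driver actually honors every requested bit on a given controller is a separate question that this does not answer.

```python
from typing import List

# Assumed bit layout, following the SHOST_*_PROTECTION flags in
# include/scsi/scsi_host.h (bit position -> capability).
SHOST_PROT_BITS = {
    0: "DIF Type 1",
    1: "DIF Type 2",
    2: "DIF Type 3",
    3: "DIX Type 0",
    4: "DIX Type 1",
    5: "DIX Type 2",
    6: "DIX Type 3",
}


def decode_prot_mask(mask: int) -> List[str]:
    """Return the capability names whose bits are set in the mask."""
    return [name for bit, name in SHOST_PROT_BITS.items() if mask & (1 << bit)]


if __name__ == "__main__":
    for mask in (0x7F, 0x07):
        print(hex(mask), decode_prot_mask(mask))
```

The mask the SCSI host ended up with should, I believe, be readable from /sys/class/scsi_host/hostN/prot_capabilities, and can be decoded the same way.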

I formatted the disk. Since (I think) DIF works by using larger blocks to store the checksum in addition to the data, at first I tried formatting it with larger sectors (4160 bytes) via sg_format --format --size=4160 --fmtpinfo=2 --pfu=0 --long /dev/sdg. According to the sg_format man page, --fmtpinfo=2 (with --pfu=0) is supposed to enable Type 1 DIF. But after that the detected capacity was zero due to an unsupported block size. So I tried again with sg_format --format --size=4096 --fmtpinfo=2 --pfu=0 --long /dev/sdg, and now I have a drive that shows up with a usable capacity.
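A rough sketch of the arithmetic behind those two attempts. It assumes the FMTPINFO/PFU combinations map to protection types the way the SBC FORMAT UNIT tables describe them, and that the size given to sg_format is the logical block size with the 8 bytes of PI stored on top of it rather than counted inside it. Both are my reading of the documentation, not something checked against this drive.

```python
PI_BYTES = 8  # protection information per logical block

# Assumed mapping of (fmtpinfo, pfu) to protection type.
_PI_TYPE = {
    (0, 0): 0,  # no protection
    (2, 0): 1,  # Type 1
    (3, 0): 2,  # Type 2
    (3, 1): 3,  # Type 3
}


def protection_type(fmtpinfo: int, pfu: int) -> int:
    """Return the protection type selected by a FMTPINFO/PFU pair."""
    try:
        return _PI_TYPE[(fmtpinfo, pfu)]
    except KeyError:
        raise ValueError(f"unsupported combination: fmtpinfo={fmtpinfo}, pfu={pfu}")


def on_media_bytes(logical_block_size: int, fmtpinfo: int, pfu: int) -> int:
    """Bytes stored per block: the logical block plus PI when protection is on."""
    pi = PI_BYTES if protection_type(fmtpinfo, pfu) != 0 else 0
    return logical_block_size + pi


if __name__ == "__main__":
    # --size=4096 --fmtpinfo=2 --pfu=0: 4096 visible bytes plus 8 bytes of PI
    print(protection_type(2, 0), on_media_bytes(4096, 2, 0))
```

Under that reading, --size=4160 asks for a 4160-byte logical block, which the OS cannot use, while --size=4096 with PI enabled keeps the visible block at 4096 bytes and stores the extra 8 bytes alongside it.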

I see a note in dmesg, [sdg] Enabling DIF Type 1 protection, which sounds good, and lsscsi -p shows DIF/Type1 beside the drive. But when I cat the various parameters in /sys/block/sdg/integrity/, it seems like integrity is not actually enabled:

  • device_is_integrity_capable = 0
  • format = none

I’ll note that I did all this without a machine restart. I could restart it but the host is in use so it’s inconvenient.

My questions:

  • I gather DIX is not enabled, but I’m not clear on whether DIF is even working properly. Is it? How do I know?
  • How would I enable DIX if I wanted to?
  • Does anyone have any other general advice or knowledge to share?

Have you restarted it yet? :thinking:

No! :grin:

I think what’s going on is that Type1 DIF is working, but the HBA doesn’t support DIX. Here’s why I think this:

  • lsscsi and dmesg say/said that Type 1 DIF is enabled
  • smartctl -a says:
      Formatted with type 1 protection
      8 bytes of protection information per logical block

I started poking around in the Linux source code as well, and it looks like /sys/block/sdg/integrity/device_is_integrity_capable reports on DIX, not DIF. It reflects the blk_integrity struct, which is only registered when DIX is in use, so a value of 0 there doesn’t rule out DIF.
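The reasoning above can be written down as a tiny decision table. This is only a sketch of that interpretation, not something taken from the kernel: the function name and labels are made up, the disk's protection type is whatever lsscsi -p or smartctl reports, and integrity_capable is the value of device_is_integrity_capable.

```python
def classify_protection(disk_pi_type: int, integrity_capable: bool) -> str:
    """Combine the two signals under the interpretation described above.

    disk_pi_type: 0 for none, 1 to 3 for the type the disk is formatted with.
    integrity_capable: True if a block-layer integrity profile is registered,
    which is assumed here to happen only when DIX is in use.
    """
    if disk_pi_type and integrity_capable:
        return "DIF + DIX (OS <-> HBA <-> disk)"
    if disk_pi_type:
        return "DIF only (HBA <-> disk)"
    if integrity_capable:
        return "DIX only (OS <-> HBA)"
    return "no protection"


if __name__ == "__main__":
    # Values from this thread: Type 1 on the disk, device_is_integrity_capable = 0
    print(classify_protection(1, False))
```

I believe the sd driver also exposes protection_type and protection_mode under /sys/class/scsi_disk/, which might be a more direct way to confirm the same thing.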