On Thu, May 23, 2024 at 09:50:09AM +0800, yangxingui wrote:
I found that lsscsi traverses the dev nodes and obtains each device's major and minor numbers. If no matching dev node is found, it displays "-".
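The lookup described here can be approximated in userspace. This is a minimal sketch of the idea only, not lsscsi's actual code: the function names `find_block_node` and `display_name` are hypothetical, and the scan covers one flat directory.

```python
import os
import stat


def find_block_node(major, minor, dev_dir="/dev"):
    """Return the path of a block device node in dev_dir whose device
    numbers match (major, minor), or None if there is none."""
    try:
        names = sorted(os.listdir(dev_dir))
    except OSError:
        return None
    for name in names:
        path = os.path.join(dev_dir, name)
        try:
            st = os.lstat(path)
        except OSError:
            continue  # entry vanished or is unreadable
        # Regular files, directories and symlinks are skipped: only a
        # block special file carries device numbers in st_rdev.
        if not stat.S_ISBLK(st.st_mode):
            continue
        if os.major(st.st_rdev) == major and os.minor(st.st_rdev) == minor:
            return path
    return None


def display_name(major, minor, dev_dir="/dev"):
    """Return the node path, or "-" when no matching node is found."""
    return find_block_node(major, minor, dev_dir) or "-"
```

With a lookup of this shape, a regular file that happens to be named like a disk never matches, because it is not a block special file, so the disk would be reported as "-".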
Hi, Greg
On 2024/5/22 20:23, Greg KH wrote:
On Wed, May 22, 2024 at 11:43:46AM +0000, Xingui Yang wrote:
Currently, no exception information is output when devtmpfs fails to create
a node, so add log info for it.
Why? Who is going to do something with this?
When we run the lsscsi command after a disk is connected, we occasionally
find that some disks do not have dev nodes and cannot be used.
Ok, but why do you think that devtmpfs create failed?
My subsequent reply touches on these points.
However, there is no abnormal log output during disk scanning. Our analysis
suggests that devtmpfs may have failed to create the dev node, so the log
is added here.
But is that the case? Why is devtmpfs failing? Shouldn't we fix that
instead?
The lsscsi command query results and kernel logs are as follows:
[root@localhost]# lsscsi
[9:0:4:0] disk ATA ST10000NM0086-2A SN05 -
kernel: [586669.541218] hisi_sas_v3_hw 0000:b4:04.0: phyup: phy0 link_rate=10(sata)
kernel: [586669.541341] sas: phy-9:0 added to port-9:0, phy_mask:0x1 (5000000000000900)
kernel: [586669.541511] sas: DOING DISCOVERY on port 0, pid:2330731
kernel: [586669.541518] hisi_sas_v3_hw 0000:b4:04.0: dev[4:5] found
kernel: [586669.630816] sas: Enter sas_scsi_recover_host busy: 0 failed: 0
kernel: [586669.665960] hisi_sas_v3_hw 0000:b4:04.0: phydown: phy0 phy_state=0xe
kernel: [586669.665964] hisi_sas_v3_hw 0000:b4:04.0: ignore flutter phy0 down
kernel: [586669.863360] hisi_sas_v3_hw 0000:b4:04.0: phyup: phy0 link_rate=10(sata)
kernel: [586670.024482] ata19.00: ATA-10: ST10000NM0086-2AA101, SN05, max UDMA/133
kernel: [586670.024487] ata19.00: 19532873728 sectors, multi 16: LBA48 NCQ (depth 32), AA
kernel: [586670.027471] ata19.00: configured for UDMA/133
kernel: [586670.027490] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
kernel: [586670.037541] sas: ata19: end_device-9:0: model:ST10000NM0086-2AA101 serial: ZA2B3PR2
kernel: [586670.100856] scsi 9:0:4:0: Direct-Access ATA ST10000NM0086-2A SN05 PQ: 0 ANSI: 5
kernel: [586670.101114] sd 9:0:4:0: [sdk] 19532873728 512-byte logical blocks: (10.0 TB/9.10 TiB)
kernel: [586670.101116] sd 9:0:4:0: [sdk] 4096-byte physical blocks
kernel: [586670.101125] sd 9:0:4:0: [sdk] Write Protect is off
kernel: [586670.101137] sd 9:0:4:0: [sdk] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
kernel: [586670.101620] sd 9:0:4:0: Attached scsi generic sg10 type 0
kernel: [586670.101714] sas: DONE DISCOVERY on port 0, pid:2330731, result:0
kernel: [586670.101731] sas: sas_form_port: phy0 belongs to port0 already(1)!
kernel: [586670.152512] sd 9:0:4:0: [sdk] Attached SCSI disk
Looks like sdk was found properly, what's the problem?
Ok.
Do you recommend using pr_err()?
Signed-off-by: Xingui Yang <yangxingui@xxxxxxxxxx>
---
drivers/base/core.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 5f4e03336e68..32a41e0472b2 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -3691,7 +3691,10 @@ int device_add(struct device *dev)
 		if (error)
 			goto SysEntryError;
-		devtmpfs_create_node(dev);
+		error = devtmpfs_create_node(dev);
+		if (error)
+			pr_info("devtmpfs create node for %s failed: %d\n",
+				dev_name(dev), error);
Why is an error message pr_info()?
Do not print errors at the information level :)
Yes, /dev/sdk is a regular file that may have been created by fio. I tried doing the same and could reproduce the problem; with the added log, it shows that the file already exists.
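The "file already exists" failure has a simple userspace analogue. The sketch below is an illustration under stated assumptions, not the devtmpfs code path: it uses a temporary directory instead of /dev, a FIFO instead of a block node (so no root privileges are needed), and the hypothetical helper names `try_create_node` and `demo`.

```python
import errno
import os
import stat
import tempfile


def try_create_node(path, mode=stat.S_IFIFO | 0o600):
    """Try to create a special file at path. Return 0 on success or a
    negative errno value, in the style of kernel return codes."""
    try:
        os.mknod(path, mode)
    except OSError as exc:
        return -exc.errno
    return 0


def demo():
    """Create a regular file first, then attempt to create a node with
    the same name, and return the result of the second step."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sdk")
        # Stand-in for a tool writing to /dev/sdk before the node exists.
        with open(path, "wb") as f:
            f.write(b"\0" * 4096)
        return try_create_node(path)
```

Calling `demo()` returns `-errno.EEXIST`: the earlier regular file occupies the name, so the node cannot be created on top of it.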
And again, why is this needed? If this needs to be checked, why are you
now checking it but ignoring the error?
What would this help with?
As above, we want to get the error info when the dev node fails to be
created. We currently haven't figured out how to handle this exception well.
But judging from the problems we are currently encountering, some may be
because the corresponding dev node already exists, causing the creation to
fail, but the node information is incorrect and the device cannot be used,
as follows:
[root@localhost]# ll /dev/sdk
-rw-------. 1 root root 5368709120 Jul 8 09:51 /dev/sdk
Looks like the device node is created to me. What is incorrect about
it, the values? What is 'll' an alias for? And are you sure that other
tools aren't getting the device node creation uevent and doing something
with it in userspace? How do you know this is the kernel failing?
Wait, is /dev/sdk really a device node and not a file? Perhaps
something else wrote to it first, before it was created? And that's why
devtmpfs couldn't create it. That sounds like a userspace error,
nothing the kernel can do about it.
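Whether a path is a real device node or a plain file can be checked directly. In the `ll` output above, the leading `-` in `-rw-------` marks a regular file; a block device node would start with `b`. A minimal sketch of that check follows, with the hypothetical helper names `describe_mode` and `describe_path`:

```python
import os
import stat


def describe_mode(mode):
    """Classify an st_mode value into a short human-readable kind."""
    if stat.S_ISBLK(mode):
        return "block device"
    if stat.S_ISCHR(mode):
        return "character device"
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISLNK(mode):
        return "symlink"
    return "other"


def describe_path(path):
    """Report what kind of object lives at path, without following
    symlinks. Device nodes include major:minor, files include size."""
    try:
        st = os.lstat(path)
    except FileNotFoundError:
        return "missing"
    kind = describe_mode(st.st_mode)
    if kind in ("block device", "character device"):
        return f"{kind} {os.major(st.st_rdev)}:{os.minor(st.st_rdev)}"
    if kind == "regular file":
        return f"regular file, {st.st_size} bytes"
    return kind
```

For the /dev/sdk shown in this thread, `describe_path("/dev/sdk")` would be expected to report a regular file with its size rather than a block device, which points at something in userspace having written to that path before the node was created.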