zfs changes back to /dev/sdx device names upon reboot

Created on 1 Dec 2014 · 33 comments · Source: openzfs/zfs

I use /dev/disk/by-id/ device names, but every time I reboot the devices are reimported as /dev/sdx names. I export the tank, then force an import with the desired device names like this:

zpool import -d /dev/disk/by-id tank

and then zpool status shows the devices imported with the /dev/disk/by-id names. I can then export the tank and reimport with no -d argument, and the devices come in with the /dev/disk/by-id names as they should. However, as soon as I reboot, the tank comes in with the /dev/sdx names that I don't want. I suspect this has something to do with systemd and associated scripts, which is why I am reporting this issue here rather than to zfsonlinux. Prior to upgrading to 14.04 I did not have this problem (though before the upgrade I did have problems getting the zfs tank to automount on boot, despite putting an extra delay in a systemd script; I finally had to edit /etc/default/zfs to force mounting at boot, but those options were removed from /etc/default/zfs at the last upgrade and no longer seem to be necessary).

I have tried exporting, removing zpool.cache, reimporting with desired device names, and then rebooting, and the tank comes in with the undesired /dev/sdx names.
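For reference, the full workaround sequence described above looks roughly like this (a sketch, using the pool name tank from the report):

zpool export tank
rm -f /etc/zfs/zpool.cache            # optional: force a fresh scan on the next import
zpool import -d /dev/disk/by-id tank  # import using the by-id device names
zpool status tank                     # vdevs should now be listed with /dev/disk/by-id names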

Thanks for the excellent work on zfs. It has worked great for me for years.

Version info below.

Thanks,
Rich

root@main:~# dpkg -l | grep zfs
ii dkms 2.2.0.3-1.1ubuntu5+zfs9~trusty all Dynamic Kernel Module Support Framework
ii libzfs2 0.6.3-3~trusty amd64 Native ZFS filesystem library for Linux
ii mountall 2.53-zfs1 amd64 filesystem mounting tool
ii ubuntu-zfs 8~trusty amd64 Native ZFS filesystem metapackage for Ubuntu.
ii zfs-dkms 0.6.3-3~trusty amd64 Native ZFS filesystem kernel modules for Linux
ii zfs-doc 0.6.3-3~trusty amd64 Native ZFS filesystem documentation and examples.
ii zfsutils 0.6.3-3~trusty amd64 Native ZFS management utilities for Linux
root@main:~# uname -a
Linux main 3.13.0-40-generic #69-Ubuntu SMP Thu Nov 13 17:53:56 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Most helpful comment

This was recently fixed in master with commit 325414e483a7858c1d10fb30cefe5749207098f4, the plan is to backport it for the next point release.

All 33 comments

To clarify my statement "I suspect this has something to do with systemd and associated scripts, which is why I am reporting this issue here rather than to zfsonlinux": I now realize this is in fact the zfsonlinux bug tracker and not specifically for the Ubuntu PPA packaging of zfs, as I thought at first. I suspect the problem I am seeing has to do with how systemd or other Ubuntu-specific scripts are handling the zfs tank on shutdown and/or restart, so if this is not the right forum for this report, my apologies.

Just in case someone here has a clue, I do want to emphasize that merely exporting and reimporting the tank on a running system does _not_ cause reversion to the /dev/sdx device names. Only a reboot causes the desired device names to be forgotten (does the Ubuntu startup/shutdown delete zpool.cache?).

One additional clue: as I mentioned, before upgrading to 14.04 I had difficulty getting the tank to automount via systemd. Based on some research, I think this might have been related to my LSI SATA controller not initializing fast enough. Putting long delays into systemd scripts did not help. Perhaps the normal zfs mount is still failing, causing deletion of zpool.cache and loss of the desired device names, and some new scanning process is taking over, finding the tank, and mounting it with the raw /dev/sdx device names.

Rich

I suspect the problem I am seeing has to do with how systemd or other Ubuntu-specific scripts are handling the zfs tank on shutdown and/or restart, so if this is not the right forum for this report, my apologies.

@rpdrewes, this is zfsonlinux/pkg-zfs#132.

The systemd packages published for Ubuntu 14.04 are incompatible with the PPA and unlikely to be updated. Ubuntu is tentatively planning a full systemd transition for the Ubuntu 15.10 release, which will happen in the W series next year.

does the Ubuntu startup/shutdown delete zpool.cache?

No, and the /etc/zfs/zpool.cache file must be persistent to get the desired behavior.
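One way to check that the cache file really does survive a reboot (a sketch; the pool name and paths are the ones from the report above):

ls -l /etc/zfs/zpool.cache   # compare the timestamp before and after a reboot
zdb                          # with no arguments, zdb should dump the cached pool
                             # configuration, including the recorded vdev paths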

Perhaps the normal zfs mount is still failing, causing deletion of zpool.cache and loss of the desired device names, and some new scanning process is taking over and finding the tank and mounting it with the raw /dev/sdx device names.

Any automatic deletion of the /etc/zfs/zpool.cache file would be a defect.

Thanks for your response. I think what you are saying is that the reversion to /dev/sdx device names upon reboot is some problem with Ubuntu's systemd integration with ZFS and that should get fixed in a future Ubuntu release when Ubuntu gets systemd figured out.

I'm surprised, though, that more Ubuntu+zfsonlinux users aren't reporting the problem of device names reverting to /dev/sdx upon reboot. I wonder if for most use cases this problem does not happen, and it is only because of some aspect of my system (using the LSI SATA controller, for example) that I have the problem.

Thanks again,
Rich

You could run "zpool history" to see what import command is currently being run on your pool at boot.

Maybe you can just find the script that does your import on startup and, for now, change it to import with the by-id names.
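A minimal sketch of the first suggestion, assuming the pool is named tank as above:

zpool history tank | grep import   # shows the import commands recorded against the pool,
                                   # including whatever ran at boot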

More information: the device names revert upon reboot to /dev/sdx from /dev/disk/by-id/ _only_ for devices connected to the LSI SATA controller! I know this sounds improbable but the behavior is easy to reproduce and selectively happens only to disks attached to that controller and not to the other SATA ports on the motherboard. Based on some references I have found around the web, there seem to have been issues with related controllers initializing too slowly and breaking automounting with systemd. Failure to automount was a problem I had seen on an earlier release of zfs/Ubuntu, but with recent updates to zfs/Ubuntu the mounting is happening but the device names are getting changed without explanation _just for disks on the LSI controller_.

The LSI SATA controller in question is this one, which is built into the Supermicro X10SL7-F motherboard:
description: Serial Attached SCSI controller
product: SAS2308 PCI-Express Fusion-MPT SAS-2
vendor: LSI Logic / Symbios Logic

I am using this LSI controller with the default firmware that supports onboard RAID, though of course I am not using the onboard RAID features and I'm just accessing the disks directly for ZFS and raidz2. There is a special firmware load for the controller that leaves out the onboard RAID capabilities and just supports JBOD. I have seen suggestions around the web that the automounting problems mentioned above might have gone away by installing the JBOD firmware, and I wouldn't be surprised if that JBOD firmware might fix this device name reversion problem.

Rich

I would believe that has something to do with it.

I use that motherboard in my server and have all 12 of my disks connected to the LSI controller via a SAS expander.

I have not ever had this problem, but I never used the LSI controller in RAID mode. I flashed the IT firmware for passthrough mode before I ever used ZFS with it.

RAID mode is just not good for ZFS.

I upgraded the LSI SATA controller firmware to IT (JBOD) mode, and after several reboots the device names for the disks in ZFS are not reverting to /dev/sdx (from the desired /dev/disk/by-id/x) as they were previously. So I think the firmware upgrade fixed the issue, but I'm not sure why. It seems that the IR-mode firmware made initialization of the disks on the LSI SATA controller happen in a way that did not inform systemd as it expects, so the mounting of the zfs pool was picked up later by some other process that does not respect the proper /dev/disk/by-id device names from zpool.cache.

Here is a link to a post that is possibly related: https://github.com/zfsonlinux/pkg-zfs/issues/55

I wonder if simply putting mpt2sas into the initrd would also have fixed the problem . . .
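On Ubuntu that experiment would look roughly like the following (a sketch using the standard initramfs-tools mechanism; untested in this setup):

echo mpt2sas >> /etc/initramfs-tools/modules   # ask initramfs-tools to include the HBA driver
update-initramfs -u                            # rebuild the initrd for the running kernel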

Thanks,
Rich

One more thing, since someone will eventually have this problem and want more details. This is a good description of how to do the firmware upgrade:

http://blog.widodh.nl/2014/10/flash-lsi-2308-to-it-mode-on-a-supermicro-x10sl7-f-mainboard/

Rich

Just a note on those instructions.

It recommends flashing the latest firmware, which is P19. However, the mpt2sas driver in the Linux kernel is P16. LSI recommends matching the driver and firmware versions for best operation.

I would either re-flash to P16 or install the P19 driver from LSI on your system.
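A quick way to compare the two versions (a sketch; the exact dmesg wording depends on the driver release):

modinfo mpt2sas | grep -i '^version'   # driver version shipped with the running kernel
dmesg | grep -i mpt2sas | grep -i fw   # firmware version the HBA reported at boot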

Good idea.

Rich

I have a ZFS box also suffering from this issue, but the other way around: we initially used /dev/sd* names, but after an unanticipated reboot and import we found some /dev/sd? names had been replaced with the by-id names.

This is CentOS 6.4, not a Debian-like distribution.

zpool status

pool: zpool
state: ONLINE
scan: none requested
config:

NAME                                         STATE     READ WRITE CKSUM
zpool                                        ONLINE       0     0     0
  raidz1-0                                   ONLINE       0     0     0
    ata-HGST_HUS724040ALA640_PN1338P4G39LAB  ONLINE       0     0     0
    ata-HGST_HUS724040ALA640_PN1338P4G4EXLB  ONLINE       0     0     0
    ata-HGST_HUS724040ALA640_PN1334PCGYUHWS  ONLINE       0     0     0
    ata-HGST_HUS724040ALA640_PN1338P4G4HY7B  ONLINE       0     0     0
    sdg                                      ONLINE       0     0     0
    sdh                                      ONLINE       0     0     0
    sdi                                      ONLINE       0     0     0
    sdj                                      ONLINE       0     0     0
    sdk                                      ONLINE       0     0     0
    sdl                                      ONLINE       0     0     0
    sdm                                      ONLINE       0     0     0
    sdf                                      ONLINE       0     0     0

errors: No known data errors

So the issue is the opposite of the one you reported, but it may be the same underlying problem.

inevity,

Is it the case that all the drives that had their names changed from sdx to ata-x are wired to one disk controller, and all the drives that didn't have their names changed are wired to a different controller?

Rich
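One way to answer that from a running system (a sketch; either listing shows which controller each sdX hangs off):

ls -l /dev/disk/by-path/   # by-path names encode the PCI address of the owning controller
ls -l /sys/block/sd*       # symlink targets also show the full PCI/SAS path for each disk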

(By the way, in some of my earlier posts I referred to systemd when I actually meant upstart. I am not using the systemd PPA on my Ubuntu systems. The problem I reported initially was apparently related to an interaction between upstart and the slow? initialization of the RAID firmware on my LSI disk controller, which somehow caused the devices in zpools to be permanently renamed from /dev/disk/by-id/x to /dev/sdx. This problem went away when I changed to JBOD firmware on the LSI disk controller.)

Rich

@rpdrewes, the four disks named ata-HGST_HUS724040ALA640* all belong to the LSI SAS2308 HBA. The other sd* disks belong to a different controller.

Maybe I should try updating the firmware of the HBA.

I have the same problem; my disks are attached to the built-in LSI 2308 HBA on a Supermicro X9DA7 motherboard. I'm running ArchLinux with the mainline kernel 3.18.1 (no patches) and zfs 0.6.3 (no patches either), and zfs is the root filesystem, mirrored on 2 devices. I tried to fix the problem by temporarily attaching more devices to the mirror, selecting the new disks by id (i.e. temporarily giving it more redundancy than required). However, at the last stage, when I remove the one mirrored device that was sdx, leave only the 2 other mirrored devices selected by id, and then restart the system, it won't boot anymore. Zfs complains that there are no labels on any device in the pool and there is nothing I can do to fix it. I can remove zpool.cache, unload the zfs module, load it again, successfully import my zpool and then export it again; however, the "solution" is not persisted no matter how I try to store it in a persistent zpool.cache on the mounted zfs filesystem - on the next boot I encounter exactly the same problem.

@Bronek, did you try changing the firmware on the LSI controller from RAID mode to JBOD mode, as described in the earlier posts? For me, doing that mostly (but not completely) eliminated the problem.

Yes I did, and I still have the problem. However, I also have an actual LSI MegaRAID card installed in this machine which is not meant to be used by Linux at all, yet its driver gets loaded. I will report again after I have blacklisted the megaraid_sas module and tested zfs behaviour then. Also, I'm looking for a way to add a delay before the zfs module is loaded in mkinitcpio (I'm using ArchLinux), hoping this might help as well.

I blacklisted megaraid and still have this problem, now using 0.6.3-1.2 plus a small selection of patches merged from zfs master. I think this might be related to errors I often receive when running zpool import -d /dev/disk/by-id (it's a dual-boot machine, with both installations running exactly the same kernel, zfs, and spl binaries).

Hi,
I am having the same issue on our 14.04 server. We have 15 drives in a pool, and every time I reboot a different number of drives is labelled with either the /dev/sdX name or the /dev/disk/by-id/ name. The actual drives that are labelled correctly by-id differ every time I reboot. Most drives are connected via LSI SAS 9211-8i cards that are in JBOD mode (flashed with IT firmware), using the P20 version and associated Linux drivers. The remaining drives are connected directly to an Intel SP2600COE motherboard in RSTe (pass-through) mode.

Also, about 3/10 boots the system hangs shortly after the ZFS module is loaded, just after it mounts the root (mdadm) partition. I presume it is trying to mount the zfs pool. If I export the pool and disconnect the drives it boots successfully 100% of the time.

I'm guessing it's an issue related to the order in which everything is loaded. Any help or suggestions would be much appreciated. Not sure if this is the best place to ask for help; however, it seems to be related to the issues others have been having. I'm happy to email the boot logs if it helps.
Cheers,
Dave

I've followed the instructions to add a devicewait script to mountall.conf. So far it appears to have fixed the device renaming; however, it still hangs on mount about 3/10 times (even though the devicewait script successfully finds all drives in the pool).

The only other clue is that when I hit ctrl-alt-delete after it hangs, it outputs a message to the screen "An error occurred mounting /sys/firmware/efi/efivars". I'm not sure if this is related, however it's possible that my problem is not a zfsonlinux issue.

I can reproduce this bug using Ubuntu 14.04 on AWS EC2 with EBS volumes.

Steps:

  1. Spin up an Ubuntu 14.04 instance in EC2 and make sure it's fully patched
  2. Attach 4 EBS SSD volumes (gp2) to sdf, sdg, sdh & sdi
  3. Run these commands to set up ZFS the way I have
apt-add-repository -y ppa:zfs-native/stable
apt-get update && apt-get -y install ubuntu-zfs
modprobe zfs
for d in /dev/xvdf /dev/xvdg /dev/xvdh /dev/xvdi; do
  parted -s $d unit s mklabel gpt mkpart primary zfs 2048s 100%
done
zpool create bug2944 /dev/disk/by-partuuid/*
zpool status bug2944

Confirm that the zpool was created with devices from /dev/disk/by-partuuid/* and not from /dev/xvd*.

Finally, reboot and check zpool status again. When I do this, the zpool is made up of /dev/xvd* devices rather than the /dev/disk/by-partuuid/* devices. Of course, I can always export and import, but that would be quite awkward.
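For completeness, that manual fix after each boot would be something like this (a sketch using the pool name from the steps above):

zpool export bug2944
zpool import -d /dev/disk/by-partuuid bug2944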

Any suggestions?

Same issue here as described above, where devices imported or created with /dev/disk/by-id go to /dev/sdX after a reboot. This is on Ubuntu 14.04 LTS. If it makes any difference, the hardware is an SSG-6048R-E1CR24L, which has an LSI 3008 (IT mode). Interestingly, the log and cache devices show up fine with /dev/disk/by-id and are on a different controller on the motherboard.

Same here. I installed a new Ubuntu 14.04 LTS server because my boot drive crashed. I installed zfs and imported the existing pools by id. Zpool status showed the pools correctly, with disks by id. After a reboot, zpool status showed the disks as sdx.
I tried zpool export/import several times. Same result on every try :(

Just wanted to chime in that this has been an issue with my Ubuntu 14.04.3 LTS and PERC H310/LSI SAS 9211-8i cards that have been flashed to IT mode. If a fix is coming for Ubuntu 15, I suppose I can hang on... has anyone else had success with the device wait script?

Also just met this problem (IBM M1015 reflashed to P16 IT 9211-8i) on a Debian Wheezy derivative (OpenMediaVault 2.1.15). Using the ZPOOL_IMPORT_PATH option in /etc/default/zfs as advised in this comment solved it for me, but I'm not sure that it doesn't just "hide" the real issue.
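For anyone hitting this later, the setting in question looks roughly like this (a sketch of an /etc/default/zfs entry; the search path shown is just an example):

# /etc/default/zfs
# Tell the boot-time import where to look for device nodes first
ZPOOL_IMPORT_PATH="/dev/disk/by-id"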

root@joel-server:~# dpkg -l | grep -i zfs
ii  debian-zfs                         7~wheezy                         amd64        Native ZFS filesystem metapackage for Debian.
ii  dkms                               2.2.0.3-1.2+zfs6                 all          Dynamic Kernel Module Support Framework
ii  libzfs2                            0.6.5.2-1-wheezy                 amd64        Native ZFS filesystem library for Linux
ii  libzpool2                          0.6.5.2-1-wheezy                 amd64        Native ZFS pool library for Linux
ii  openmediavault-zfs                 0.6.4.2                          amd64        OpenMediaVault plugin for ZFS
ii  zfs-dkms                           0.6.5.2-1-wheezy                 all          Native ZFS filesystem kernel modules for Linux
ii  zfsutils                           0.6.5.2-1-wheezy                 amd64        command-line tools to manage ZFS filesystems
root@joel-server:~# uname -a
Linux joel-server 3.2.0-4-amd64 #1 SMP Debian 3.2.68-1+deb7u4 x86_64 GNU/Linux

By the way, I tried to use P19 on my IBM M1015/LSI 9211-8i using Ubuntu 14.04 (Linux Mint 17 actually). It resulted in tons of UDMA CRC errors, so I went back to P16. It was about 1 year ago and I didn't touch the drivers (to be honest, I didn't know there were newer drivers available).

Same for me: stock Ubuntu 16.04 on a Dell T110 II. The drives are attached to the Dell PERC S110, which might actually be an LSI card.

This was recently fixed in master with commit 325414e483a7858c1d10fb30cefe5749207098f4, the plan is to backport it for the next point release.

I just switched from Ubuntu 14.04 LTS to Arch and ran into this bug also, I too have an LSI HBA flashed to IT mode. Deleting /etc/zfs/zpool.cache and rebooting allows everything to be mounted again.

Behlendorf - Is this fix backported? How can I tell?

@Ralithune yes, currently it's available as part of PR #4605 from @nedbass. It will be merged into the release branch with perhaps a few small tweaks. But if you want to test it sooner, it's there.

@behlendorf I'm totally new to using git to deploy major software like this - is there a how-to on installing the release candidate? I'm also using Ubuntu 14.04.X, and the release candidate appears to be failing the checks for that?

Edit: Apparently it's failing the (TEST) build of Ubuntu 14.04, but passing the other Ubuntus (???). Thanks for your help on this.

@Ralithune there's not as much documentation as I'd like but we're working on it.

For testing purposes you should be able to follow the guide on the wiki for building ZFS with a few small modifications.

  • Add @nedbass as a Git remote and check out his branch from the PR for the build.
  • After building, you can use the zfs.sh script to unload the existing kernel modules and load the ones you just built.
  • Run zpool export -a and then zpool import <pool> from the cmd/zpool directory to verify the vdev names are preserved as expected (a rough sketch of these steps follows this list).
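Very roughly, and treating the remote URL and branch name as placeholders (they come from @nedbass's fork and PR #4605, which are not spelled out here), the sequence might look like:

git clone https://github.com/zfsonlinux/zfs.git && cd zfs
git remote add nedbass https://github.com/nedbass/zfs.git   # hypothetical fork URL
git fetch nedbass
git checkout nedbass/<branch-from-PR-4605>                  # placeholder branch name
sh autogen.sh && ./configure && make -s -j$(nproc)          # assumes SPL is already built per the wiki guide
sudo ./scripts/zfs.sh -u               # unload the currently loaded zfs/spl modules
sudo ./scripts/zfs.sh                  # load the freshly built ones
sudo ./cmd/zpool/zpool export -a
sudo ./cmd/zpool/zpool import <pool>   # check that the vdev names are preserved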

Alternately, if you can wait a few days we should have it tagged.

Closing. Addressed in the 0.6.5.7 release which was just tagged.
