Hard disk failures are inevitable. There are many ways to provide resiliency against hard disk failure, and the feature built into Windows Server 2012 and Windows Server 2012 R2 for this is Storage Spaces.
A hard disk failed inside my Storage Pool, so let's switch over to PowerShell to get this resolved.
- Diagnosis
- Retiring the Failed Disk
- Adding a New Disk
- Repairing the Volumes
- Remove the Lost VirtualDisks
- Removing the Failed Disk from the Pool
- Summary
Diagnosis
First, open an administrative PowerShell prompt. To get the status of my Storage Pool (which I called pool), I run the command:
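A minimal sketch of that check, assuming the Storage module that ships with Windows Server 2012 and later. Get-StoragePool lists every pool (including the built-in Primordial pool) with its OperationalStatus and HealthStatus:

```powershell
# List all storage pools with their operational and health status.
Get-StoragePool
```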
I can see that my Storage Pool named pool is in a degraded state.
To check the health of the volumes sitting inside the Storage Pool, use the command:
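The volumes here are Storage Spaces virtual disks, so a likely form of this command is Get-VirtualDisk, which reports the resiliency setting, OperationalStatus and HealthStatus of each one:

```powershell
# List every virtual disk with its resiliency setting and status.
Get-VirtualDisk
```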
We can see that the Media, Software and DocumentsPhotos volumes have Degraded as their OperationalStatus. This means that they are still attached and accessible, but their reliability cannot be ensured should there be another drive failure. These volumes use either a Parity or Mirror resiliency setting, which has allowed Storage Spaces to preserve my data despite the drive failure.
The Backups and VMTemplates volumes have a Detached operational status. I was not using any resiliency mode on this data as it is easily replaced, so it looks like I have lost the data on these volumes.
To get an idea of what is happening at the physical disk layer, I run the command:
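A sketch of the physical-layer check, assuming the standard Get-PhysicalDisk cmdlet; it lists each disk visible to the Storage subsystem with its FriendlyName, OperationalStatus and HealthStatus:

```powershell
# List every physical disk known to the Storage subsystem.
Get-PhysicalDisk
```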
We can see that PhysicalDisk1 is in a failed state. As the HP N40L has a 4-bay enclosure populated with 4TB hard disks, it is easy to determine that PhysicalDisk1 is in the first bay of the enclosure.
Retiring the Failed Disk
Now that I had determined which disk had failed, I shut down the server and replaced the failed disk in the first bay with a spare 4TB hard disk.
With the server back online, open PowerShell back up with administrative permissions and check what the physical disks look like now:
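One way to make the comparison easier to read is to show only the columns that matter for this step. This is a sketch; the column selection is my own choice rather than something the procedure requires:

```powershell
# Show the status-related properties of each physical disk in a compact table.
Get-PhysicalDisk |
    Sort-Object FriendlyName |
    Format-Table FriendlyName, OperationalStatus, HealthStatus, Usage, Size -AutoSize
```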
We can see that the newly installed disk has taken the FriendlyName of PhysicalDisk1 and has a HealthStatus of Healthy. The failed disk has lost its FriendlyName and its OperationalStatus has changed to Lost Communication.
First, let's single out the missing disk:
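A sketch of that filter, assuming the failed disk is the only one reporting an OperationalStatus of Lost Communication:

```powershell
# Show only the disks that the pool can no longer communicate with.
Get-PhysicalDisk | Where-Object { $_.OperationalStatus -eq 'Lost Communication' }
```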
Assign the missing disk to a variable:
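The same filter can be captured in a variable for the later steps. The name $missingDisk is my own choice for illustration:

```powershell
# Keep a reference to the failed disk so it can be retired and later removed.
$missingDisk = Get-PhysicalDisk |
    Where-Object { $_.OperationalStatus -eq 'Lost Communication' }
```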
Next we need to tell the storage pool that the disk has been retired:
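A likely form of the retirement step uses Set-PhysicalDisk with the Retired usage. This sketch assumes $missingDisk (an illustrative name) already holds the failed disk from the previous step:

```powershell
# Assumption: $missingDisk was populated by the Lost Communication filter above.
# Mark the disk as Retired so Storage Spaces stops allocating data to it.
$missingDisk | Set-PhysicalDisk -Usage Retired
```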
Adding a New Disk
To add the replacement disk into the Storage Pool:
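A sketch using Add-PhysicalDisk, assuming the replacement disk still carries the FriendlyName PhysicalDisk1 and the pool is named pool, as described above. $replacementDisk is an illustrative variable name:

```powershell
# Look up the new disk by the friendly name it was given after installation.
$replacementDisk = Get-PhysicalDisk -FriendlyName 'PhysicalDisk1'

# Add it to the existing storage pool.
Add-PhysicalDisk -PhysicalDisks $replacementDisk -StoragePoolFriendlyName 'pool'
```

If the friendly name differs on your system, Get-PhysicalDisk -CanPool $true is another way to find disks that are eligible to join a pool.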
Repairing the Volumes
The next step after adding the new disk to the Storage Pool is to repair each of the Virtual Disks residing on it.
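A sketch of the repair, assuming the three degraded virtual disks are the ones named earlier (Media, Software and DocumentsPhotos). The -AsJob switch is used here so each repair starts in the background rather than tying up the prompt:

```powershell
# Start a repair for each degraded virtual disk.
foreach ($name in 'Media', 'Software', 'DocumentsPhotos') {
    Repair-VirtualDisk -FriendlyName $name -AsJob
}
```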
We can see the repair running by entering:
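Listing the virtual disks again is enough to see this. The sketch below narrows the output to the status columns, which is my own choice of formatting:

```powershell
# Check the status of each virtual disk while the repairs run.
Get-VirtualDisk |
    Format-Table FriendlyName, ResiliencySettingName, OperationalStatus, HealthStatus -AutoSize
```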
The OperationalStatus of InService lets us know the volume is currently being repaired. The percentage completion of the repair can be found by running:
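A likely candidate for this command is Get-StorageJob, which reports running storage jobs such as repairs along with their PercentComplete:

```powershell
# Show running storage jobs (for example, repairs) and their progress.
Get-StorageJob
```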
Remove the Lost VirtualDisks
Since there was no resiliency on the VMTemplates and Backups volumes, they can be deleted with the following command:
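A sketch using Remove-VirtualDisk with the two volume names from above. The cmdlet asks for confirmation before deleting, and the deletion cannot be undone:

```powershell
# Delete the two virtual disks whose data was lost with the failed drive.
Remove-VirtualDisk -FriendlyName 'VMTemplates'
Remove-VirtualDisk -FriendlyName 'Backups'
```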
Removing the Failed Disk from the Pool
This step will not work if you still have Degraded virtual disks in the Storage Pool, so make sure all repairs have completed first.
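Once every repair has finished, the retired disk can be dropped from the pool. This is a sketch, assuming $missingDisk (an illustrative name) still holds the failed disk from the retirement step and that the pool is named pool:

```powershell
# Assumption: $missingDisk still references the retired, failed disk.
# Remove it from the storage pool now that no virtual disk depends on it.
Remove-PhysicalDisk -PhysicalDisks $missingDisk -StoragePoolFriendlyName 'pool'
```

If the PowerShell session was closed in the meantime, $missingDisk needs to be repopulated with the Lost Communication filter first.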
Summary
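To wrap up, to replace a failed disk in a storage pool:
- Diagnose which physical disk has failed and which volumes are affected
- Retire the failed disk
- Add the new disk to the Storage Pool
- Repair the degraded volumes
- Remove any lost virtual disks
- Remove the failed disk from the pool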
The full list of Windows Server Storage Spaces cmdlets can be found on TechNet: http://technet.microsoft.com/en-us/library/hh848705.aspx.