Solutions for Solaris SVM Needs Maintenance and Last Erred status

I wrote this post after patching a Solaris 10 server from patch level Generic_142900-02 to Generic_142900-13, when a “Needs maintenance” error appeared on one of the submirrors.

When a slice in a mirror or RAID5 metadevice device experiences errors,
DiskSuite puts the slice in the “Maintenance” state. No further reads or
writes are performed to a slice in the “Maintenance” state. Subsequent
errors on other slices in the same metadevice are handled differently,
depending on the type of the metadevice.

A mirror may be able to tolerate many slices in the “Maintenance” state and still be read from and written to. A RAID5 metadevice, by definition, can only tolerate a single slice in the “Maintenance” state. When either a mirror or RAID5 metadevice has a slice in the “Last Erred” state, I/O is still attempted to the slice marked “Last Erred”. This is because a “Last Erred” slice contains the last good copy of data from DiskSuite’s point of view.


With a slice in the “Last Erred” state, the metadevice behaves like a normal
device (disk) and returns I/O errors to the application. Usually, at this
point some data has been lost.

Always replace slices in the “Maintenance” state first, followed by those in the
“Last Erred” state. After a slice is replaced and resynced, use the metastat
command to verify its state, then validate the data to make sure it is good.

Here are the specifics for mirrors and RAID5 devices:
1. Mirrors
If slices are in the “Maintenance” state, no data has been lost. You can
safely replace or enable the slices in any order. If a slice is in the “Last
Erred” state, you cannot replace it until you first replace all the other
mirrored slices in the “Maintenance” state. Replacing or enabling a slice in
the “Last Erred” state usually means that some data has been lost. Be sure
to validate the data on the mirror after repairing it.
2. RAID5 metadevices
A RAID5 metadevice can tolerate a single slice failure. You can safely
replace a single slice in the “Maintenance” state without losing data. If
an error on another slice occurs, it is put into the “Last Erred” state. At
this point, the RAID5 metadevice is a read-only device; you need to perform
some type of error recovery so that the state of the RAID5 metadevice is
non-errored and the possibility of data loss is reduced. If a RAID5
metadevice reaches a “Last Erred” state, there is a good chance it has
lost data. Be sure to validate the data on the RAID5 metadevice after
repairing it.
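
The ordering rule above (“Maintenance” first, then “Last Erred”) can be sketched as a small filter over metastat output. This is only an illustration: the function name replacement_order is made up here, and it assumes that component lines start with a cNtNdNsN device name and carry the state text on the same line, as in typical metastat listings.

```shell
#!/bin/sh
# replacement_order (hypothetical helper): read metastat output on stdin and
# list errored slices in the recommended repair order:
# "Maintenance" slices first, "Last Erred" slices last.
replacement_order() {
    awk '
        BEGIN { nm = 0; nl = 0 }
        # Component lines start with a device name such as c1t0d0s0.
        $1 ~ /^c[0-9]+t[0-9]+d[0-9]+s[0-9]+$/ {
            if ($0 ~ /Last Erred/)       lasterr[++nl] = $1
            else if ($0 ~ /Maintenance/) maint[++nm] = $1
        }
        END {
            for (i = 1; i <= nm; i++) print maint[i], "Maintenance"
            for (i = 1; i <= nl; i++) print lasterr[i], "Last Erred"
        }
    '
}
```

On a live system it would be fed with something like `metastat | replacement_order`; slices in the “Okay” state are simply not listed.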
How to clear the “Maintenance” and “Last Erred” states, by example. Pay attention at this point:
in some cases the disk holding the “Last Erred” submirrors also has slices in the “Maintenance” and “Okay” states.

metastat gives:
d0: Mirror
Submirror 0: d10
State: Needs maintenance
Submirror 1: d20
State: Needs maintenance
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 8395200 blocks
d10: Submirror of d0
State: Needs maintenance
Invoke: after replacing “Maintenance” components:
metareplace d0 c1t0d0s0
Size: 8395200 blocks
Stripe 0:
Device Start Block Dbase State Hot Spare
c1t0d0s0 0 No Last Erred
d20: Submirror of d0
State: Needs maintenance
Invoke: metareplace d0 c1t1d0s0
Size: 8395200 blocks
Stripe 0:
Device Start Block Dbase State Hot Spare
c1t1d0s0 0 No Maintenance
d1: Mirror
Submirror 0: d11
State: Okay
Submirror 1: d21
State: Okay
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 8395200 blocks
d11: Submirror of d1
State: Okay
Size: 8395200 blocks
Stripe 0:
Device Start Block Dbase State Hot Spare
c1t0d0s1 0 No Okay
d21: Submirror of d1
State: Okay
Size: 8395200 blocks
Stripe 0:
Device Start Block Dbase State Hot Spare
c1t1d0s1 0 No Okay
d4: Mirror
Submirror 0: d14
State: Needs maintenance
Submirror 1: d24
State: Needs maintenance
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 4202688 blocks
d14: Submirror of d4
State: Needs maintenance
Invoke: after replacing “Maintenance” components:
metareplace d4 c1t0d0s4
Size: 4202688 blocks
Stripe 0:
Device Start Block Dbase State Hot Spare
c1t0d0s4 0 No Last Erred
d24: Submirror of d4
State: Needs maintenance
Invoke: metareplace d4 c1t1d0s4
Size: 4202688 blocks
Stripe 0:
Device Start Block Dbase State Hot Spare
c1t1d0s4 0 No Maintenance
d5: Mirror
Submirror 0: d15
State: Okay
Submirror 1: d25
State: Okay
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 62918208 blocks
d15: Submirror of d5
State: Okay
Size: 62918208 blocks
Stripe 0:
Device Start Block Dbase State Hot Spare
c1t0d0s5 0 No Okay
d25: Submirror of d5
State: Okay
Size: 62918208 blocks
Stripe 0:
Device Start Block Dbase State Hot Spare
c1t1d0s5 0 No Okay
d6: Mirror
Submirror 0: d16
State: Needs maintenance
Submirror 1: d26
State: Okay
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 52436928 blocks
d16: Submirror of d6
State: Needs maintenance
Invoke: metareplace d6 c1t0d0s6
Size: 52436928 blocks
Stripe 0:
Device Start Block Dbase State Hot Spare
c1t0d0s6 0 No Maintenance
d26: Submirror of d6
State: Okay
Size: 52436928 blocks
Stripe 0:
Device Start Block Dbase State Hot Spare
c1t1d0s6 0 No Okay
d7: Mirror
Submirror 0: d17
State: Okay
Submirror 1: d27
State: Needs maintenance
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 6970560 blocks
d17: Submirror of d7
State: Okay
Size: 6970560 blocks
Stripe 0:
Device Start Block Dbase State Hot Spare
c1t0d0s7 0 No Okay
d27: Submirror of d7
State: Needs maintenance
Invoke: metareplace d7 c1t1d0s7
Size: 6970560 blocks
Stripe 0:
Device Start Block Dbase State Hot Spare
c1t1d0s7 0 No Maintenance
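
To see at a glance which physical disk carries which kind of error, the component lines can be grouped by disk. The sketch below is not part of the original procedure; errors_by_disk is a made-up name, and it assumes the same component line layout as the listing above (device name first, state text on the same line).

```shell
#!/bin/sh
# errors_by_disk (hypothetical helper): read metastat output on stdin and
# count "Maintenance" and "Last Erred" slices per physical disk.
errors_by_disk() {
    awk '
        $1 ~ /^c[0-9]+t[0-9]+d[0-9]+s[0-9]+$/ {
            disk = $1
            sub(/s[0-9]+$/, "", disk)      # c1t0d0s6 -> c1t0d0
            if ($0 ~ /Last Erred/)       { lasterr[disk]++; seen[disk] = 1 }
            else if ($0 ~ /Maintenance/) { maint[disk]++;   seen[disk] = 1 }
        }
        END {
            # Only disks with at least one errored slice are reported.
            for (d in seen)
                printf "%s maintenance=%d last_erred=%d\n", d, maint[d], lasterr[d]
        }
    ' | sort
}
```

For the listing above, this would report one “Maintenance” and two “Last Erred” slices on c1t0d0, and three “Maintenance” slices on c1t1d0. This is why the master disk cannot simply be pulled first.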
- First, detach and delete the master-side slice 6 submirror (d16), then re-create it and resync it from the mirror-side slice 6 (d26), because for d6 the only intact copy of the data is on the mirror side:
# metadetach d6 d16
# metaclear d16
# metainit d16 1 1 c1t0d0s6
# metattach d6 d16
- Wait until the master-side slice 6 (d16) shows the “Okay” state.
- Then replace the mirror disk (c1t1d0), whose slices are in the “Maintenance” state, before the “Last Erred” master disk.
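
Waiting for the “Okay” state can be scripted as a simple polling loop. The following is a rough sketch, not the method used in the original incident. The name wait_for_resync and the METASTAT override variable are hypothetical (the variable exists only so the loop can be tried without SVM). It also assumes that metastat mentions “resync” somewhere in its output for as long as a resync is running.

```shell
#!/bin/sh
# wait_for_resync MIRROR [INTERVAL_SECONDS]   (hypothetical helper)
# Poll metastat for MIRROR until no line mentions a resync, then check
# that every "State:" line reads "Okay".
wait_for_resync() {
    mirror=$1
    interval=${2:-60}
    # METASTAT is an assumed hook; by default the real metastat is called.
    while ${METASTAT:-metastat} "$mirror" | grep -i 'resync' >/dev/null; do
        sleep "$interval"
    done
    if ${METASTAT:-metastat} "$mirror" | grep 'State:' | grep -v 'Okay' >/dev/null; then
        echo "$mirror: not all submirrors are Okay" >&2
        return 1
    fi
    echo "$mirror: all submirrors Okay"
}
```

For the d6 mirror above, `wait_for_resync d6 120` would poll every two minutes and return non-zero if a submirror still needs maintenance once the resync has ended.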
Run “metadetach” to detach all the submirrors on the mirror disk from
their respective mirrors (see the following):
# metadetach -f mirror submirror
NOTE: The “-f” option is not required if the metadevice is in an “Okay”
state.
Then run metaclear to remove the configuration from the disk:
# metaclear submirror
# metadb -d c1t1d0s3
# luxadm remove_device /dev/rdsk/c1t1d0s2 (physically remove the disk when prompted)
# devfsadm -C -c disk -v
# luxadm insert_device /dev/rdsk/c1t1d0s2 (physically insert the new disk when prompted)
# prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2
# metadb -afc 3 c1t1d0s3
Use “metainit” and “metattach” to re-create those submirrors and attach them to
the mirrors, which starts the resync:
# metainit submirror 1 1 c#t#d#s#
# metattach mirror submirror
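
Because the same detach, clear, re-create and attach steps repeat for every submirror on the disk, the command list can be generated from metastat output as a dry run. This is a sketch under assumptions: plan_disk_swap is a made-up name, it only prints commands and never runs them, it always prints “-f”, and it assumes each submirror is a one-slice concat (the “1 1” form used above).

```shell
#!/bin/sh
# plan_disk_swap DISK   (hypothetical helper, dry run only)
# Read metastat output on stdin and print the SVM commands for every
# submirror that has a slice on DISK (for example c1t1d0).
plan_disk_swap() {
    awk -v disk="$1" '
        # "d20: Submirror of d0" starts a submirror section.
        /^d[0-9]+: Submirror of / {
            submirror = $1; sub(/:$/, "", submirror)
            mirror = $NF
            next
        }
        # Any other top-level metadevice header ends the section.
        /^d[0-9]+: / { submirror = ""; next }
        # Component line on the requested disk inside a submirror section.
        submirror != "" && $1 ~ ("^" disk "s[0-9]+$") {
            n++
            m[n] = mirror; s[n] = submirror; slice[n] = $1
        }
        END {
            for (i = 1; i <= n; i++) {
                print "metadetach -f", m[i], s[i]
                print "metaclear", s[i]
            }
            if (n > 0)
                print "# replace the disk, restore its VTOC and replicas, then:"
            for (i = 1; i <= n; i++) {
                print "metainit", s[i], "1 1", slice[i]
                print "metattach", m[i], s[i]
            }
        }
    '
}
```

A typical call would be `metastat | plan_disk_swap c1t1d0`. Review the printed commands by hand before running any of them.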
When all slices have finished resyncing, begin replacing the “Last Erred” master disk.
Run “metadetach” to detach all the submirrors on the master disk from
their respective mirrors (see the following):
# metadetach -f mirror submirror
NOTE: The “-f” option is not required if the metadevice is in an “Okay”
state.
Then run metaclear to remove the configuration from the disk:
# metaclear submirror
# metadb -d c1t0d0s3
# luxadm remove_device /dev/rdsk/c1t0d0s2 (physically remove the disk when prompted)
# devfsadm -C -c disk -v
# luxadm insert_device /dev/rdsk/c1t0d0s2 (physically insert the new disk when prompted)
# prtvtoc /dev/rdsk/c1t1d0s2 | fmthard -s - /dev/rdsk/c1t0d0s2
# metadb -afc 3 c1t0d0s3
Use “metainit” and “metattach” to re-create those submirrors and attach them to
the mirrors, which starts the resync:
# metainit submirror 1 1 c#t#d#s#
# metattach mirror submirror
When all slices have finished resyncing, both submirrors are back to normal.
