www.centos.org Forum Index CentOS 5 - General Support mismatch_cnt
|
Re: mismatch_cnt | #2 |
|
|---|---|---|---|
|
Newbie
Joined: 2009/11/8
From
Posts: 6
|
Any ideas? |
||
Posted on: 2009/11/9 9:35
|
|||
|
Re: mismatch_cnt | #3 |
|
|---|---|---|---|
|
Peeking in the Member Window
Joined: 2007/5/6
From
Posts: 19
|
5.4 added a raid check script. Before 5.4, mismatch_cnt was never checked...
How big a problem this is, I don't know. I have read that a small mismatch_cnt is nothing to worry about, since it usually occurs on an unused part of the filesystem. But a repair basically just copies the block to the other disk, without actually knowing whether you're copying good over bad or the other way around... |
||
Posted on: 2009/11/9 11:30
|
|||
|
Re: mismatch_cnt | #4 |
|
|---|---|---|---|
|
Newbie
Joined: 2009/11/8
From
Posts: 6
|
Thanks a lot for your reply. I'm getting a bit confused now about repairing the raid. If I do so, might it replace the "good" data with the corrupted data?
|
||
Posted on: 2009/11/9 13:42
|
|||
|
Re: mismatch_cnt | #5 |
|
|---|---|---|---|
|
Jr Board Member
Joined: 2008/8/1
From
Posts: 28
|
Quote:
Also getting this when running fdisk -l :

Usually the md device maps to a partition table on the hard drive (/dev/sda4+/dev/sdb4 = /dev/md3) - fdisk -l /dev/sd?

Quote:
5.4 added a raid check script. Before 5.4, mismatch_cnt was never checked...

I don't think the raid was checked at all. Debian has had a job to check md arrays for a while, but I've never seen any such scripts on RedHat/Fedora/CentOS systems until now. I've been using software mdadm raid1 for a while and have always done my own monthly checks, but mismatch_cnt is new to me... There's some stuff in Debian's bug tracker and on mailing lists that seems to downplay it a bit, but I'm still fuzzy.

I added a second job yesterday to check for a non-zero mismatch_cnt and issue a repair if found. I've heard that smaller mismatches (<=128) may be alright, but I don't know if that's fact, so I'm repairing anything > 0.

You can issue a 'repair' on the affected md device with "echo repair > /sys/block/md6/md/sync_action". This will attempt to fix the problem, but won't actually update /sys/block/md*/md/mismatch_cnt. A followup 'check' will, and you could just let that run at the normal interval from cron. |
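For anyone wanting to try the "check for a non-zero mismatch_cnt and issue a repair" job described above, here is a minimal sketch. It is not the poster's actual script. The function names and the SYSFS_BLOCK override are made up for illustration; only the two sysfs files (mismatch_cnt and sync_action) come from the thread.

```shell
#!/bin/sh
# Sketch: request a 'repair' on any md array whose mismatch_cnt is non-zero.
# SYSFS_BLOCK is a hypothetical override so the logic can be tried against a
# scratch directory; on a real system it stays at /sys/block.
SYSFS_BLOCK="${SYSFS_BLOCK:-/sys/block}"

# Print the mismatch count recorded for one array, e.g. "md6".
mismatch_count() {
    cat "$SYSFS_BLOCK/$1/md/mismatch_cnt"
}

# Request a repair on one array if its recorded count is above zero.
# Returns 0 if a repair was requested, 1 if nothing was needed,
# 2 if the count could not be read.
repair_if_mismatched() {
    dev="$1"
    count=$(mismatch_count "$dev") || return 2
    if [ "$count" -gt 0 ]; then
        echo repair > "$SYSFS_BLOCK/$dev/md/sync_action"
        echo "$dev: mismatch_cnt=$count, repair requested"
        return 0
    fi
    echo "$dev: mismatch_cnt=0, nothing to do"
    return 1
}

# Walk every md array that exposes the md/ attributes.
repair_all_mismatched() {
    for f in "$SYSFS_BLOCK"/md*/md/mismatch_cnt; do
        [ -e "$f" ] || continue        # glob did not match: no md arrays
        dev=${f#"$SYSFS_BLOCK"/}       # strip prefix, leaving md6/md/mismatch_cnt
        dev=${dev%%/*}                 # keep only the array name, md6
        repair_if_mismatched "$dev" || true
    done
}
```

Nothing runs until you call repair_all_mismatched yourself (as root, for example from a cron script). As noted above, the counter will still show the old value until the next 'check' pass has finished.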
||
Posted on: 2009/11/11 19:21
|
|||
|
Re: mismatch_cnt | #6 |
|
|---|---|---|---|
|
Peeking in the Member Window
Joined: 2008/7/11
From
Posts: 13
|
As far as I know, if there is swap on the md device you will potentially see this issue (also if swap is part of an LVM on the md device).
We may also have seen this with a VMware VM on an md device (maybe because its swap lives inside the VM), but I can't totally confirm that. Repair then check is, I believe, the best option, but the paranoid part of me wanted to fsck the filesystems after they were repaired to help make sure everything was happy (touching /forcefsck and rebooting). |
||
Posted on: 2009/11/13 16:33
|
|||
|
Re: mismatch_cnt | #7 |
|
|---|---|---|---|
|
Newbie
Joined: 2009/11/20
From
Posts: 2
|
We also have problems with mismatch_cnt on our servers. One is a VMware server (raid10), and another is a mail server, which is also used for some backups (raid1).
These mismatches appear every week, and it sounds like running a repair is not a good idea (md cannot tell which copy of a block is the "good" one). So the question is, how serious is this? (We already asked on the linux-raid list: someone says it is corrupting files, while others say that nobody will "feel" these mismatches at the filesystem level.) |
||
Posted on: 2009/11/20 9:23
|
|||
|
Re: mismatch_cnt | #8 |
|
|---|---|---|---|
|
Newbie
Joined: 2009/11/20
From
Posts: 2
|
Any updates on this issue?
|
||
Posted on: 2009/11/24 13:13
|
|||
|
Re: mismatch_cnt | #9 |
|
|---|---|---|---|
|
Moderator
Joined: 2006/12/13
From Tidewater, Virginia, North America
Posts: 7185
|
Seems to be a feature rather than an issue. Have you tried the repair and check procedure?
http://lists.centos.org/pipermail/centos/2009-October/084510.html Might want to follow with |
||
|
_________________
Phil Required reading: FAQ & Readme first ; Search hint: google "your topic site:centos.org"; Smart Questions |
|||
Posted on: 2009/11/24 15:15
|
|||
|
Re: mismatch_cnt | #10 |
|
|---|---|---|---|
|
Newbie
Joined: 2009/11/8
From
Posts: 6
|
Quote:
Well, I did repair the raid and checked it after that... and it returned 0 unsynchronized blocks. Every Sunday, just after the weekly cron job, I'm getting 128 unsynchronized blocks. Not more, not less... every week the same thing. |
||
Posted on: 2009/11/26 7:28
|
|||
|
Re: mismatch_cnt | #11 |
|
|---|---|---|---|
|
Moderator
Joined: 2006/12/13
From Tidewater, Virginia, North America
Posts: 7185
|
That does sound problematic. At least it is deterministic. What is running weekly, or do you just mean that 99-raid-check is regularly finding the problem? A bug report seems in order unless you can pin it on something in cron.weekly.
|
||
|
|||
Posted on: 2009/11/26 17:54
|
|||
|
Re: mismatch_cnt | #12 |
|
|---|---|---|---|
|
Newbie
Joined: 2009/11/8
From
Posts: 6
|
Quote:
cat /sys/block/md6/md/mismatch_cnt always reports 128 after the weekly 99-raid-check check. I repaired the raid twice and checked it right afterwards, and the result was 0. I also ran a check last Saturday, 2 hours before the 99-raid-check check was due to start, and the result was still 0. I said OK, looks fine, but on Sunday morning again: WARNING: mismatch_cnt is not 0 on /dev/md6, and again /sys/block/md6/md/mismatch_cnt reports 128 unsynchronized blocks.

I'm a bit confused. I'm not going to do any check or repair this week, just to see what the output of the weekly cron will be this Sunday morning... I bet it will be 128 again.

Anyway, my server works fine, but it would be good to find out what is causing this (if there is something that's causing it). Sorry for my poor English. |
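To keep an eye on all arrays at once rather than cat-ing one file, something like the sketch below might help. The function name and the SYSFS_BLOCK override are hypothetical; the script only reads the mismatch_cnt and sync_action files mentioned in this thread. Keep in mind that the counter reflects the last check or repair pass, so the sync_action column is there to show whether a pass is currently running.

```shell
#!/bin/sh
# Sketch: print "<array> <sync_action> <mismatch_cnt>" for every md array.
# SYSFS_BLOCK is a hypothetical override for trying this on a scratch
# directory; on a real system it stays at /sys/block.
SYSFS_BLOCK="${SYSFS_BLOCK:-/sys/block}"

report_mismatches() {
    for d in "$SYSFS_BLOCK"/md*/md; do
        [ -r "$d/mismatch_cnt" ] || continue    # skip if glob did not match
        dev=${d%/md}                            # drop the trailing /md
        dev=${dev##*/}                          # keep only the array name
        action=$(cat "$d/sync_action" 2>/dev/null || echo unknown)
        echo "$dev $action $(cat "$d/mismatch_cnt")"
    done
}
```

Calling report_mismatches before and after the weekly run (and appending the output to a log with a timestamp) would show whether the 128 appears only once the cron check has completed.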
||
Posted on: 2009/11/27 2:41
|
|||
|
Re: mismatch_cnt | #13 |
|
|---|---|---|---|
|
Peeking in the Member Window
Joined: 2007/3/3
From
Posts: 18
|
I have the problem as well: after the script has run, the array is rebuilt.
It only happens on the first of 3 RAID 1 arrays. All the arrays are built on the same disks. |
||
Posted on: 2009/12/7 7:13
|
|||
|
Re: mismatch_cnt | #14 |
|
|---|---|---|---|
|
Professional Board Member
Joined: 2007/1/7
From Central IL USA
Posts: 2195
|
What is the rebuild command that you are running? The 99-raid-check check does not rebuild, it just reports. You could (and *SHOULD*) run something like:
|
||
Posted on: 2009/12/7 15:04
|
|||
|
Re: mismatch_cnt | #15 |
|
|---|---|---|---|
|
Moderator
Joined: 2006/12/13
From Tidewater, Virginia, North America
Posts: 7185
|
Looks to me like the action of 99-raid-check depends on what's defined in /etc/sysconfig/raid-check and that it should check weekly if configured to do so. Am I missing something?
|
||
|
|||
Posted on: 2009/12/7 16:01
|
|||
|
Re: mismatch_cnt | #16 |
|
|---|---|---|---|
|
Professional Board Member
Joined: 2007/1/7
From Central IL USA
Posts: 2195
|
Quote:
Learned something new... "REPAIR_DEVS a space delimited list of devs that the user specifically wants to run a repair on." in the /etc/sysconfig/raid-check as @pschaff says. |
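As a rough illustration of what such a setting might look like, here is a sketch of /etc/sysconfig/raid-check. Only REPAIR_DEVS and CHECK are named in this thread. ENABLED and CHECK_DEVS, and the behaviour of an empty CHECK_DEVS, are assumptions from memory of the stock file, so compare against the comments in your own copy before relying on them. md6 is just the array from the earlier posts.

```shell
# /etc/sysconfig/raid-check (sketch; example values, not recommendations)

# Assumed switch that lets the weekly script run at all.
ENABLED=yes

# Operation to run on the arrays being checked: "check" or "repair".
CHECK=check

# Assumed: space delimited list of arrays to check; empty is assumed to
# mean all arrays.
CHECK_DEVS=""

# Space delimited list of arrays that should get a repair pass.
REPAIR_DEVS="md6"
```

With a setup along these lines the weekly job would only report on most arrays and run a repair on the one listed in REPAIR_DEVS.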
||
Posted on: 2009/12/7 16:41
|
|||
|
Re: mismatch_cnt | #17 |
|
|---|---|---|---|
|
Moderator
Joined: 2006/12/13
From Tidewater, Virginia, North America
Posts: 7185
|
Just looked back at the comments in /etc/sysconfig/raid-check and it looks like CHECK will actually repair things it finds automatically, so now I'm a bit confused as to the utility of the REPAIR option, as check seems to do both. Found a FAQ, but it is still not totally clear to me what is meant by "personalities" being "taught" about check. Seems others have suffered similar confusion. Any enlightenment welcome.
|
||
|
|||
Posted on: 2009/12/7 18:30
|
|||
|
Re: mismatch_cnt | #18 |
|
|---|---|---|---|
|
Professional Board Member
Joined: 2007/1/7
From Central IL USA
Posts: 2195
|
If the check is supposed to also repair, it did not work that way for me on the box with that message. I needed to run the sync operation (or maybe the explicit REPAIR_DEVS) in order for the message to stop after the upgrade from 5.3 -> 5.4.
|
||
Posted on: 2009/12/8 13:29
|
|||
|
Re: mismatch_cnt | #19 |
|
|---|---|---|---|
|
Professional Board Member
Joined: 2007/1/7
From Central IL USA
Posts: 2195
|
Just to make sure everyone who has or has had this issue understands:
http://lists.centos.org/pipermail/centos/2009-December/086667.html and to quote the part that makes me sleep better: Quote: On 12/1/2009 8:05 AM, Paul Bijnens wrote: |
||
Posted on: 2009/12/17 17:11
|
|||
|
Re: mismatch_cnt | #20 |
|
|---|---|---|---|
|
Peeking in the Member Window
Joined: 2008/7/11
From
Posts: 13
|
Sorry to resurrect an old thread, but just to update: this has now become a bugzilla, with an updated mdadm package already in the "pending" state on Fedora 12. There is also a separate RH5 bugzilla, which I'd hope has mdadm updates pending too.
The patch basically stops the script checking mismatch_cnt on RAID 1 devices, where it isn't really meaningful, but it still runs the check itself, which is essential for all RAIDs. Fedora 12 Bugzilla Entry RH5 Bugzilla Entry |
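The idea behind that change can be sketched roughly as follows. This is not the actual patch, just an illustration of skipping the warning for RAID 1. The function name and the SYSFS_BLOCK override are hypothetical, and the sketch assumes the array's level can be read from the md/level file next to mismatch_cnt.

```shell
#!/bin/sh
# Sketch: decide whether a non-zero mismatch_cnt is worth a warning.
# SYSFS_BLOCK is a hypothetical override for trying this on a scratch
# directory; on a real system it stays at /sys/block.
SYSFS_BLOCK="${SYSFS_BLOCK:-/sys/block}"

# Exit status 0 means "warn about this array", non-zero means "stay quiet".
should_warn() {
    dev="$1"
    level=$(cat "$SYSFS_BLOCK/$dev/md/level") || return 2
    count=$(cat "$SYSFS_BLOCK/$dev/md/mismatch_cnt") || return 2
    case "$level" in
        raid1) return 1 ;;    # count not treated as meaningful on RAID 1
    esac
    [ "$count" -gt 0 ]
}
```

A reporting loop could then call should_warn md6 and only print the WARNING line when it succeeds, while the check pass itself still runs on every array.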
||
Posted on: 2010/2/20 0:51
|
|||