Re: [Hampshire] ext4 and dd disc cloning

Author: John Cooper
Date:  
To: Hampshire LUG Discussion List
Subject: Re: [Hampshire] ext4 and dd disc cloning
On 08/07/10 23:42, James Courtier-Dutton wrote:
> On 8 July 2010 22:10, John Cooper <lug@???> wrote:
>> My normal backup routine is to tar up my home directory files daily and
>> then every month or so do a full disc clone using DD
>>
>> dd if=/dev/sda of=/dev/sdb bs=32M
>>
>> This has worked for years but now fails to copy all files on Fedora 13.
>> Some directories like /sys have permissions ???? . I also tried
>> clonezilla using partclone which took 24 hours and then crashed. I think
>> this is the first ext4 backup I've done. The only other change is my
>> removable caddies which are now cheaper ones.
>>
>> Any ideas?
>>
> Maybe the hard disk is failing.
> What does "smartctl -a /dev/sda" output in the reallocated sectors row?
>
> Kind Regards
>
> James
>
> --
> Please post to: Hampshire@???
> Web Interface: https://mailman.lug.org.uk/mailman/listinfo/hampshire
> LUG URL: http://www.hantslug.org.uk
> --------------------------------------------------------------
>


James, it was nothing to do with ext4. The extended offline test showed
"Completed: read failure" and Current_Pending_Sector was 3. Once I had
configured the smartd service to run automatically, it also confirmed
3 unreadable sectors. So I'm rebuilding onto a new drive and using g4l
to backup/restore that partition.

Thanks for pointing me in the right direction.

Full howto http://smartmontools.sourceforge.net/badblockhowto.html

Run tests

sudo smartctl -t short /dev/sda
sudo smartctl -t long /dev/sda
sudo smartctl -t conveyance /dev/sda

Once all the tests have finished, check the self-test log

sudo smartctl -l selftest /dev/sda

smartctl 5.39.1 2010-01-28 r3054 [i386-redhat-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Conveyance offline  Completed without error       00%      3182         -
# 2  Extended offline    Completed: read failure       50%      3175         572267884
# 3  Short offline       Completed without error       00%      3174         -
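
The failing LBA can also be pulled out of that log with a one-line filter
instead of reading it by eye. This is only a sketch: it assumes each test
is printed on a single line with the LBA as the last field, as smartctl
does on a wide terminal, and first_error_lbas is a made-up helper name.

```shell
# Hypothetical helper: read `smartctl -l selftest` output on stdin and print
# the LBA_of_first_error of every test that ended in a read failure.
first_error_lbas() {
    awk '/^#/ && /read failure/ { print $NF }'
}

# Usage (needs root and a real drive):
#   sudo smartctl -l selftest /dev/sda | first_error_lbas
```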


sudo smartctl -A /dev/sda

smartctl 5.39.1 2010-01-28 r3054 [i386-redhat-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   141   140   021    Pre-fail  Always       -       3925
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       36
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       3196
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       34
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       19
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       16
194 Temperature_Celsius     0x0022   102   095   000    Old_age   Always       -       41
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       2
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       1
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       6
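
The rows worth watching here are 5, 197 and 198. As a rough sketch, the
raw value of one attribute can be picked out by its ID. It assumes the
usual smartctl -A layout with one attribute per line and RAW_VALUE as the
tenth column; smart_raw is a made-up helper name.

```shell
# Hypothetical helper: read `smartctl -A` output on stdin and print the
# RAW_VALUE column (10th field) of the attribute whose ID# matches $1.
smart_raw() {
    awk -v id="$1" '$1 == id { print $10 }'
}

# Usage (needs root and a real drive):
#   sudo smartctl -A /dev/sda | smart_raw 197
```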



Note: to fix it you will need to use the LVM path if it is an LVM partition.

sudo fdisk -lu /dev/sda

/dev/sda1   *          63      401624      200781   83  Linux
/dev/sda2          401625   976768064   488183220   8e  Linux LVM


My fault is at LBA 572267884, which falls inside /dev/sda2, the LVM partition.
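
That check is just a range comparison against the Start and End columns
from fdisk -lu. A minimal sketch, with a made-up helper name
(lba_in_partition) and the numbers taken from the listing above:

```shell
# Hypothetical helper: succeed if LBA $1 lies between start sector $2 and
# end sector $3 (inclusive), as listed by `fdisk -lu`.
lba_in_partition() {
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# /dev/sda1 spans 63..401624, /dev/sda2 spans 401625..976768064
if lba_in_partition 572267884 401625 976768064; then
    echo "LBA 572267884 is in /dev/sda2"
fi
```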

sudo tune2fs -l /dev/mapper/VolGroup00-LogVol00 | grep Block

Block count:              121004032
Block size:               4096
Blocks per group:         32768


So the block size is 4096 bytes.

b = (int)((L-S)*512/B)
where:
b = File System block number
B = File system block size in bytes
L = LBA of bad sector
S = Starting sector of partition as shown by fdisk -lu
and (int) denotes the integer part.

L=572267884, S=401625, B=4096
b=(int)((572267884-401625)*512/4096)=71483282
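
The same arithmetic can be done in the shell, where integer division
already drops the fraction. This is a sketch only; fs_block is a made-up
name. Note that the result is relative to the start of the partition. On
LVM the logical volume starts some way into the partition, so the block
number seen by the filesystem on the logical volume may differ.

```shell
# Hypothetical helper implementing b = (int)((L - S) * 512 / B)
#   $1 = L, LBA of the bad sector
#   $2 = S, starting sector of the partition (fdisk -lu)
#   $3 = B, file system block size in bytes (tune2fs -l)
fs_block() {
    echo "$(( ($1 - $2) * 512 / $3 ))"
}

fs_block 572267884 401625 4096   # prints 71483282
```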

I ran debugfs on the mounted, running system. Each command can take a
while to run (500GB disc).

sudo debugfs

debugfs 1.41.10 (10-Feb-2009)
debugfs: open /dev/mapper/VolGroup00-LogVol00

debugfs: testb 71483282

Block 71483282 marked in use

debugfs: icheck 71483282

Block    Inode number
71483282    17867467


debugfs:  ncheck 17867467
Inode    Pathname
17867467    /home/user/.cxoffice/Other Application/drive_c/windows/system32/winecfg.exe


sudo dd if=/dev/zero of=/dev/sda2 bs=4096 count=1 seek=71483282

1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.00148771 s, 2.8 MB/s

ls -altr /home/user/.cxoffice/Other Application/drive_c/windows/system32/winecfg.exe

ls: cannot access /home/user/.cxoffice/Other: No such file or directory

Re-check

sudo smartctl -t long /dev/sda

90 minutes later:

sudo smartctl -l selftest /dev/sda

smartctl 5.39.1 2010-01-28 r3054 [i386-redhat-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       50%      3198         572267890
# 2  Conveyance offline  Completed without error       00%      3182         -
# 3  Extended offline    Completed: read failure       50%      3175         572267884
# 4  Short offline       Completed without error       00%      3174         -


So I will repeat the process to fix the new first error at 572267890.
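
Working the formula backwards shows why the re-check moved on to
572267890: one 4096-byte block covers 8 sectors, and the block zeroed
above stops just short of that LBA. A sketch with made-up helper names
(fs_block, block_lba_range) and the partition start and block size from
earlier in this message:

```shell
# Hypothetical helpers. fs_block is the forward formula
# b = (int)((L - S) * 512 / B); block_lba_range is its inverse.
fs_block() {
    echo "$(( ($1 - $2) * 512 / $3 ))"
}

# Print the first and last LBA covered by block $1, given partition start
# sector $2 and block size in bytes $3.
block_lba_range() {
    first=$(( $2 + $1 * $3 / 512 ))
    last=$(( first + $3 / 512 - 1 ))
    echo "$first $last"
}

block_lba_range 71483282 401625 4096   # prints 572267881 572267888
fs_block 572267890 401625 4096         # prints 71483283
```

So 572267890 sits in the next block along, 71483283, which the first dd
did not touch.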

John.
--
--------------------------------------------------------------
Discover Linux - Open Source Solutions to Business and Schools
http://discoverlinux.co.uk
--------------------------------------------------------------