Revision of Case Study: Recover Pages documents from failed Macbook from Wed, 12/02/2009 - 12:23
A Macbook became unbootable and the owner contacted a telephone support service. After being guided over the phone through attempts to repair the filesystem, the owner was told that the hard disk needed to be reformatted and the OS reinstalled.
The owner needed several Pages documents to be recovered before the drive was to be wiped.
The drive was imaged and the laptop returned to the owner. The owner used a proprietary program, DiskWarrior, to repair the filesystem, but the Macbook was still unbootable. The /home folder was restored and the Pages documents were found and backed up. Subsequently, the OS was reinstalled.
Data recovery was attempted on the image using Ubuntu-rescue-remix and the same level of success was ultimately achieved.
The Macbook was plugged in and Ubuntu-rescue-remix was booted from DVD by inserting the disc and holding down the "c" key while the machine was powered on.
An external drive was connected via USB and mounted. The internal drive was imaged using GNU ddrescue.
$ sudo ddrescue /dev/sda image log
The drive was imaged without any errors.
----Initial State of the filesystem----
GNU Parted was used to display the partition table:
$ sudo parted image unit b print
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Number  Start       End            Size           File system  Name                  Flags
 1      20480B      209735679B     209715200B     fat32        EFI System Partition  boot
 2      209735680B  119899885567B  119690149888B  hfs+         Apple_HFS_Untitled_1
The HFS+ partition starts at byte offset 209735680 within the image. It was mounted as a loop device using that offset:
$ sudo mount -o loop,offset=209735680 image mnt
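The offset passed to mount must be in bytes and should fall on a sector boundary. The following is a minimal sketch of that arithmetic, useful when a partition table is printed in sectors rather than bytes. The variable names are illustrative only, and the sector number is derived here rather than taken from the parted listing.

```shell
# Values from the parted listing: 512-byte sectors, partition 2 starts at
# byte 209735680. The variable names are hypothetical.
sector_size=512
start_bytes=209735680

# Derive the start sector and confirm the offset is sector-aligned.
start_sector=$((start_bytes / sector_size))
remainder=$((start_bytes % sector_size))

# Converting back from sectors gives the byte offset used with mount.
offset=$((start_sector * sector_size))

echo "start sector: $start_sector, remainder: $remainder, offset: $offset"
# prints: start sector: 409640, remainder: 0, offset: 209735680
```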
Listing the contents of the filesystem revealed only a few folders. Notably absent was the /home folder containing all the owner's data.
The filesystem was unmounted.
$ sudo umount mnt
File carving was attempted using Photorec.
It is important to note the characteristics of Pages files.
"Pages is both a streamlined word processor and an easy-to-use page layout tool. It allows you to be a writer one minute and a designer the next, always with a perfect document in the works."
Pages documents are not single files; they are folders. Each folder named "filename.pages" contains a gzipped XML file (index.xml.gz) which holds all of the contents and layout information. Additionally, it contains Contents, Quicklook and Thumbs folders. The Contents folder contains a file named "PkgInfo" which presumably contains the Pages version format information. The Quicklook folder contains a JPEG file which is an image of the first page of the document. The Thumbs folder contains TIFF images of the document.
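Given that layout, a recovered folder can be checked for a usable index.xml.gz before any time is spent on it. The following is a minimal sketch, assuming only the structure described above. The function names is_pages_bundle and preview_pages are hypothetical helpers, not part of any package.

```shell
# is_pages_bundle DIR: succeeds if DIR is a directory holding an index.xml.gz
# that passes gzip's integrity test. (Hypothetical helper.)
is_pages_bundle() {
    [ -d "$1" ] && [ -f "$1/index.xml.gz" ] && gzip -t "$1/index.xml.gz" 2>/dev/null
}

# preview_pages DIR: print the first 400 bytes of the decompressed XML so
# the document can be identified by its text. (Hypothetical helper.)
preview_pages() {
    gzip -dc "$1/index.xml.gz" | head -c 400
}
```

For example, `is_pages_bundle report.pages && preview_pages report.pages` would print the start of the XML only if the archive is intact; "report.pages" is a placeholder name.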
None of the data from any of the Pages documents was found using file carving. Only two jpegs were found which were assumed to be from Quicklook Pages folders.
In light of this, the owner was told that it was likely that the data was lost due to the corruption and subsequent repair attempts. The owner decided to try DiskWarrior which was able to repair the filesystem enough to allow the Pages documents to be recovered.
In light of that success, Free-Libre tools were used on the image to try to achieve the same results.
Ubuntu-rescue-remix provides the "hfsplus" package which in turn provides the hpfsck tool.
The "hfsprogs" package provides a better tool:
"Apple provides mkfs and fsck for HFS+ with the Unix core of their operating system, Darwin.
This package is a port of Apple's tools for HFS+ filesystems."
The image was copied. The partition on the image file was attached to a loop device.
$ sudo losetup -o 209735680 /dev/loop0 image
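Because the repair writes to the filesystem, the copy step matters: the original image stays untouched if the repair makes things worse. The following is a minimal sketch of making and verifying such a working copy. The function name make_working_copy and the file name image.work are assumptions for illustration.

```shell
# make_working_copy SRC DST: copy SRC to DST, then compare the two byte for
# byte. Succeeds only if the copy is identical. (Hypothetical helper.)
make_working_copy() {
    cp -- "$1" "$2" && cmp -s -- "$1" "$2"
}

# Usage (file names are placeholders):
#   make_working_copy image image.work && echo "working copy verified"
```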
The filesystem on the loop device was then repaired. Upon the first attempt, fsck.hfsplus exited unsuccessfully.
$ sudo fsck.hfsplus /dev/loop0
However, after repeated attempts, the filesystem was partially repaired. The Catalog file was also rebuilt using the "-r" option. Even after many attempts, the filesystem could not be completely repaired.
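The repeated attempts can be scripted so that the repair stops as soon as the tool reports success or a limit is reached. The following is a minimal sketch of such a loop. The function name retry_until_clean and the limit of 5 are assumptions, and it is not a record of the commands actually run in this case.

```shell
# retry_until_clean MAX CMD [ARGS...]: run CMD up to MAX times, stopping at
# the first attempt that exits with status 0. Returns 1 if none succeed.
# (Hypothetical helper.)
retry_until_clean() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        echo "attempt $attempt: $*" >&2
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Usage, assuming the loop device from the text:
#   retry_until_clean 5 sudo fsck.hfsplus /dev/loop0
```

A bounded loop is used because, as in this case, the filesystem may never come back fully clean.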
The loop device was mounted and many more folders were present in the listing.
All of the pertinent missing data was present in subfolders of the "lost+found" folder. The subfolders are named according to inode number and the documents were found by browsing the folders.
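Since the recovered folders carry numeric names rather than their original names, browsing can be narrowed by searching for the index.xml.gz file that every Pages document folder contains. The following is a minimal sketch. The function name find_pages_candidates is hypothetical and the mount point mnt is taken from the earlier mount command.

```shell
# find_pages_candidates ROOT: print every directory under ROOT that holds an
# index.xml.gz, i.e. every folder that may be a Pages document.
# (Hypothetical helper.)
find_pages_candidates() {
    find "$1" -type f -name index.xml.gz | while IFS= read -r f; do
        dirname "$f"
    done
}

# Usage:
#   find_pages_candidates mnt/lost+found
```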
Accessing certain files within these folders caused the loop device to hang. Subsequent attempts to access the loop device, including attempts to detach it, resulted in unkillable hung processes. Additional loop devices could still be attached to the image, though, and the data was copied carefully, avoiding the files which caused this problem.
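One way to copy "with care" is to keep a list of the files known to cause a hang and skip them on every later pass. The following is a minimal sketch of that idea. The function name copy_except and the skip-list format (one relative path per line) are assumptions. Note that find still walks the directory tree, so this only helps when the hang is triggered by reading file contents rather than by listing.

```shell
# copy_except SRC DST SKIPLIST: copy regular files from SRC to DST, skipping
# any relative path listed (one per line) in SKIPLIST. (Hypothetical helper.)
copy_except() {
    src=$1
    dst=$2
    skiplist=$3
    (cd "$src" && find . -type f) | while IFS= read -r rel; do
        rel=${rel#./}
        # Skip paths that exactly match a line of the skip list.
        if grep -Fxq -- "$rel" "$skiplist"; then
            echo "skipping $rel" >&2
            continue
        fi
        mkdir -p "$dst/$(dirname "$rel")"
        cp -p -- "$src/$rel" "$dst/$rel"
    done
}

# Usage (paths are placeholders):
#   copy_except mnt/lost+found /media/backup/recovered skip.txt
```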
This case provided a head-to-head comparison of a proprietary application (DiskWarrior) and Free-Libre, Open Source tools. Although the proprietary application more accurately restored the /home folder and the names of the subfolders within it, the amount of important data recovered was the same. The proprietary application could not completely repair the system and restore the Macbook to a bootable state, so the difference in success is negligible in this case.
This case also brought to light the shortcomings of the hpfsck tool. Subsequent versions of Ubuntu-rescue-remix will include the hfsprogs package.