A critical bug was discovered and fixed in the Linux kernel’s HFS+ implementation and assigned CVE-2021-46989. The flaw could trigger file system corruption and data loss when files on HFS+ partitions were shrunk with truncate() in a particular pattern. In this post, you'll learn what happened, why it’s dangerous, how it can be triggered, and how the fix works, all explained in plain American English with code snippets for clarity.

What’s HFS+ and Why Does Truncate Matter?

HFS+ is the “Hierarchical File System Plus,” used mostly on macOS disks and removable drives; Linux can read and write HFS+ partitions via its kernel driver. Every file’s data is described by “extents”—contiguous runs of disk blocks—grouped into extent records. Each extent record holds exactly 8 extent slots; when a file needs more, additional records are stored in what’s called the “extents overflow file.”
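
For reference, here is how those structures look in the Linux driver, lightly paraphrased from fs/hfsplus/hfsplus_raw.h:

/* One extent is a contiguous run of allocation blocks; a record is a
 * fixed array of 8 of them. Unused slots are simply zeroed. */
struct hfsplus_extent {
    __be32 start_block;   /* first allocation block of the run */
    __be32 block_count;   /* number of blocks in the run */
} __packed;

typedef struct hfsplus_extent hfsplus_extent_rec[8];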

The truncate() operation shrinks (or extends) a file to a given size. It’s a common operation, but a bug in it can damage both file data and on-disk file system structures.
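
As a minimal userspace sketch, here is the same shrink the shell’s truncate -s command performs, done directly with the ftruncate(2) system call (the path is just a placeholder):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Placeholder path: any writable file on an HFS+ mount would do. */
    int fd = open("/mnt/hfsplus/testfile", O_WRONLY);
    if (fd < 0) { perror("open"); return 1; }
    /* Shrink the file to 25 MiB; blocks past the new end are freed. */
    if (ftruncate(fd, 25L * 1024 * 1024) != 0)
        perror("ftruncate");
    close(fd);
    return 0;
}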

The Bug: Caught During Truncate

In 2021, it was found that commit 31651c607151 ("hfsplus: avoid deadlock on file truncation") had fixed a deadlock during file truncation in HFS+ but introduced another risky bug in the process.

What Actually Broke?

When you shrink a file whose extents span more than one extent record—that is, a file with more than 8 extents—the truncate code no longer checked where the new end of the file fell. If the cutoff landed *inside* the last extent record, rather than at or before its start, the code removed the entire record instead of only releasing the blocks beyond the cutoff. In plain words: it threw away more of your data than it should have, causing corruption and lost file contents.
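
To make the arithmetic concrete, here is a toy userspace model of the decision the kernel has to make; the variable names mirror hfsplus_file_truncate(), but the numbers are invented for illustration:

#include <stdio.h>

/* start   = first file block covered by the cached extent record
 * blk_cnt = new file size in blocks after truncation
 * Correct rule: the whole record may be removed only when the new end
 * sits at or before the record's first block. */
static const char *truncate_action(unsigned int start, unsigned int blk_cnt)
{
    return blk_cnt <= start ? "remove the whole extent record"
                            : "free only the tail blocks, keep the record";
}

int main(void)
{
    /* The record covers file blocks [400, 600). Shrinking to 500 blocks
     * cuts through the middle of the record, so it must be kept. */
    printf("truncate to 500: %s\n", truncate_action(400, 500));
    /* Shrinking to 300 blocks leaves the record entirely past the end. */
    printf("truncate to 300: %s\n", truncate_action(400, 300));
    return 0;
}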

Original snippet (with the bug)

// Inside hfsplus_file_truncate() in fs/hfsplus/extents.c (simplified)
res = __hfsplus_ext_cache_extent(&fd, inode, alloc_cnt);
if (res)
    break;
hfs_brec_remove(&fd); /* <-- unconditional: deletes the whole extent record */
mutex_unlock(&fd.tree->tree_lock);
start = hip->cached_start;
hfsplus_free_extents(sb, hip->cached_extents,
                     alloc_cnt - start, alloc_cnt - blk_cnt);

The deadlock fix had moved hfs_brec_remove() ahead of the check on blk_cnt, so it ran unconditionally and could delete entire extent records that were still partly in use.

How To Trigger (Exploit!) The Bug

This isn’t a security bug in the sense of gaining privileges, but a “security” bug in the sense of *breaking your data* through ordinary file manipulation. Triggering it takes two steps:

1. Create a file with more than 8 extents on an HFS+ partition, for example by using dd on a nearly full disk or by writing non-contiguously so the file ends up fragmented.

2. Shrink the file with truncate or ftruncate, specifically so the new end falls in the *middle* of the last extent record.

Example sequence (assume the disk is already fragmented for demonstration):

dd if=/dev/zero of=/mnt/hfsplus/testfile bs=1M count=50
# Force an extra extent record by writing past the current end of file;
# conv=notrunc stops dd from truncating the file first:
dd if=/dev/zero of=/mnt/hfsplus/testfile bs=1M seek=60 count=1 conv=notrunc
# Now shrink so the new end falls inside the last extent record:
truncate -s 25M /mnt/hfsplus/testfile
# On an unpatched kernel, the file's extents are now likely corrupted.

The last extent record (with its 8 slots) is supposed to keep the extents that are still in use, but instead the whole record is deleted, and live data referenced by its other extents is lost with it.

The Fix: Restore The Guard, Respect Locks

The fix restores the check on where the truncation point falls *before* the record-removal call, while keeping the locking changes that resolved the original deadlock: the B-tree edit happens under the tree lock, and the lock is dropped before the data blocks are freed.

Corrected snippet

/* Inside hfsplus_file_truncate(), simplified from the fix */
start = hip->cached_start;
if (blk_cnt <= start)
    hfs_brec_remove(&fd); /* nothing in this record survives: drop it */
mutex_unlock(&fd.tree->tree_lock);
hfsplus_free_extents(sb, hip->cached_extents,
                     alloc_cnt - start, alloc_cnt - blk_cnt); /* shrink */

This logic removes the whole record only when nothing in it remains in use, performs the B-tree edit while the tree lock is still held, and releases the lock before freeing blocks—avoiding both the corruption and the original deadlock.

Why Is Locking Important?

In filesystems, “locks” are used to prevent two threads from changing shared data at the same time. If you drop a lock too soon, hold it while re-entering code that needs it again, or unlock twice, data can be trashed without warning—and the damage may not show up until much later. The original commit moved operations around to avoid a deadlock, but in doing so it let the record removal run outside its guard: bad both for data and for debugging.
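
Here is a hedged userspace sketch of the pattern involved, with a pthreads mutex standing in for the kernel’s extents tree lock (the names are illustrative, not the kernel’s):

#include <pthread.h>

/* Illustrative stand-in for the extents tree lock. */
static pthread_mutex_t tree_lock = PTHREAD_MUTEX_INITIALIZER;

static void free_blocks(void)
{
    /* In the kernel, freeing blocks can need the extents tree again,
     * which is why it must not run while tree_lock is already held. */
    pthread_mutex_lock(&tree_lock);
    /* ... look up and free allocation blocks ... */
    pthread_mutex_unlock(&tree_lock);
}

void shrink_record(void)
{
    pthread_mutex_lock(&tree_lock);
    /* ... edit the B-tree record (hfs_brec_remove in the kernel) ... */
    pthread_mutex_unlock(&tree_lock); /* must drop before free_blocks(), */
    free_blocks();                    /* or we self-deadlock on the mutex */
}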

Real-World Risk and Mitigation

- Who is at risk? Anyone writing to HFS+ disks from Linux, especially with large or fragmented files.
- What’s the harm? Silent data loss: files vanish, get corrupted, or, in the worst case, the entire file system may be damaged.
- Has it been fixed? Yes, the fix was merged into the mainline Linux kernel in 2021 and backported to the stable series; the CVE entry itself was published in 2024.
- How to protect yourself: Update your kernel! If you cannot update, do NOT truncate large or fragmented HFS+ files, or do so only on backup copies.

References (Original and Technical)

- CVE-2021-46989 NVD Entry
- Linux Kernel Official Commit (31651c607151)
- Fix Commit
- Linux Kernel Mailing List report
- Filesystem, Extents, and HFS+ Format Reference

In Short

CVE-2021-46989 is a classic example of how a fix for one problem (deadlock) can cause a much nastier one—silent data loss—if not meticulously tested. If you use HFS+ on Linux, apply security updates and try not to truncate large files until you do. As always: back up your irreplaceable data!


*This explanation aims to make a subtle file system bug understandable for everyone, not only kernel hackers. Share and stay safe!*

Timeline

Published on: 02/28/2024 09:15:37 UTC
Last modified on: 11/04/2024 17:35:01 UTC