Ntfs2btrfs does in-place conversion of NTFS filesystem to the open-source Btrfs (github.com)
159 points by Sami_Lehtinen 3 hours ago | 62 comments

The degree of hold-my-beer here is off the charts.

It's not quite as dangerous as you'd think.

The standard technique is to reserve a big file on the old filesystem for the new filesystem metadata, and then walk all files on the old filesystem and use fiemap() to create new extents that point to the existing data - only writing to the space you reserved.

You only overwrite the superblock at the very end, and you can verify that the old and new filesystems have the same contents before you do.
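
For the curious, here's roughly what that extent walk looks like. A minimal sketch using the Linux FIEMAP ioctl (this is the mechanism an ext4-style converter can use; ntfs2btrfs itself parses the NTFS structures directly, and the file path here is whatever you pass in):

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/ioctl.h>
    #include <linux/fs.h>
    #include <linux/fiemap.h>

    int main(int argc, char **argv)
    {
        if (argc != 2) { fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }
        int fd = open(argv[1], O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        /* Room for 32 extents; a real converter would loop until it
         * sees an extent flagged FIEMAP_EXTENT_LAST. */
        struct fiemap *fm = calloc(1, sizeof(*fm) + 32 * sizeof(struct fiemap_extent));
        fm->fm_start = 0;
        fm->fm_length = ~0ULL;            /* map the whole file */
        fm->fm_flags = FIEMAP_FLAG_SYNC;  /* flush so all extents are allocated */
        fm->fm_extent_count = 32;

        if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) { perror("FIEMAP"); return 1; }

        /* These physical offsets are what the new filesystem's extent
         * tree would point at, without copying any file data. */
        for (unsigned i = 0; i < fm->fm_mapped_extents; i++)
            printf("logical %llu -> physical %llu (%llu bytes)\n",
                   (unsigned long long)fm->fm_extents[i].fe_logical,
                   (unsigned long long)fm->fm_extents[i].fe_physical,
                   (unsigned long long)fm->fm_extents[i].fe_length);
        free(fm);
        return 0;
    }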


I believe that is also the method [btrfs-convert](https://btrfs.readthedocs.io/en/latest/Convert.html) uses. A cool trick that tool uses is to keep the ext4 structures on disk (as a subvolume), which allows reverting to ext4 if the conversion didn't go as planned (as long as you don't do anything to mess with the ext4 extents, such as defragmenting or balancing the filesystem, and you can't revert after deleting the subvolume of course).
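
For reference, the documented flow is one command each way (unmounted device; the device name is hypothetical), and the old filesystem image lives in a subvolume named ext2_saved until you delete it:

    btrfs-convert /dev/sdx1       # ext2/3/4 -> btrfs, in place
    btrfs-convert -r /dev/sdx1    # roll back, while ext2_saved still exists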

I believe you are right. You can only roll back to the state from before the conversion, so any files added or changed (different extents) since then will be lost or corrupted.

So it's best to mount read-only while you're considering a rollback. Otherwise it's pretty risky.


No, it also covers the data. As long as you don't delete the rollback subvolume, all the original data should still be there, uncorrupted.

Even if you disable copy-on-write, as long as the rollback subvolume is there to lay claim to the old data, it's considered immutable and any modification will still have to copy it.


A couple of years ago it was more like juggling chainsaws: https://github.com/maharmstone/ntfs2btrfs/issues/9

I tracked down a couple of nasty bugs at that time while playing around with it; hopefully it's more stable now.


Apple did something like this with a billion live OS X/iOS deployments (HFS+ -> APFS). It can be done methodically at scale, as other commenters point out, but it obviously needs care.

Note this is not the Linux btrfs:

"WinBtrfs is a Windows driver for the next-generation Linux filesystem Btrfs. A reimplementation from scratch, it contains no code from the Linux kernel, and should work on any version from Windows XP onwards. It is also included as part of the free operating system ReactOS."

This is from the ntfs2btrfs maintainer's page.

https://github.com/maharmstone/btrfs


It's the same file system, with two different drivers for two different operating systems.

The metadata is adjusted for Windows in a way that is foreign to Linux.

Do Linux NTFS drivers deal with alternate streams?

"Getting and setting of Access Control Lists (ACLs), using the xattr security.NTACL"

"Alternate Data Streams (e.g. :Zone.Identifier is stored as the xattr user.Zone.Identifier)"


Not sure what point you're making here. WinBtrfs is a driver for the same btrfs filesystem that Linux uses. Its most common use case is reading Linux partitions from Windows on machines that dual-boot both operating systems.

What? Why would you need a Linux NTFS driver to read a btrfs filesystem? That makes no sense.

Storing Windows ACLs in xattrs is also pretty common (Samba does the same)


I'd delete my comment if I could at this point.

Yes it is?

As someone who has witnessed Windows explode twice from in-place upgrades, I would just buy a new disk or computer and start over. I get that this is different, but the time that went into that data is worth way more than a new disk. It's just not worth the risk IMO. Maybe if you don't care about the data, or have good backups and wish to help shake bugs out, go for it I guess.

And the new disk is also likely to have more longevity left in it, wouldn't it?

I would be very surprised if it supported files that are under LZX compression.

(Not to be confused with Windows 2000-era file compression, this is something you need to activate with "compact.exe /C /EXE:LZX (filename)")


It seems to contain code that handles LZX, among other formats:

https://github.com/search?q=repo%3Amaharmstone%2Fntfs2btrfs%...


Conversion from one bloatware filesystem to another...

btrfs isn't that terrible for desktop use right now. I mean, I wouldn't personally use it (I lost data on it a couple of times four-plus years ago), but it's come a long way since then. My preference is to keep everything I care about on a fileserver like truenas running zfs, with proper snapshotting, replication and backup, and to live dangerously on the desktop testing out bcachefs. But I recognize not everyone can live my life, and some people just want a laptop with a reasonable filesystem resistant to bit rot.

I recently found out Fedora defaults to btrfs with zstd compression enabled. Seems to work well enough.

On my personal devices I prefer BTRFS' snapshotting ability over the risk of having to restore from backup at some point.
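
For anyone who hasn't tried it, a pre-update safety net is a one-liner (paths hypothetical, assuming the root is a subvolume; tools like snapper just automate this):

    btrfs subvolume snapshot -r / /.snapshots/pre-update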


Is there decent encryption support, or are we stuck using full disk encryption at the block level?

How is bcachefs for personal use these days?

The political situation for bcachefs is far from good, with pressure from Linus and a CoC violation.

The net effect will likely be to delay stabilization.

https://www.phoronix.com/news/Bcachefs-Fixes-Two-Choices

https://www.phoronix.com/news/Linux-CoC-Bcachefs-6.13


Honestly the political situation will probably be a /good/ thing for long term stability, because I get a few months without any stupid arguments with upstream and finally get to write code in peace :)

It sucks for users though, because now you have to get my tree if you want the latest fixes, and there's some stuff that should be backported for forwards compatibility with the scalability improvements coming soon [1].

[1]: https://www.patreon.com/posts/more-expensive-116975457


Don't worry about the users, we'll manage somehow, it's such a tiny burden compared to the actual development. I'm just really happy to see you're not discouraged by the petty political mess and keep pushing through. Thank you!

I'm hoping to use your filesystem when it's ready.

Everyone wishes that this were easier for you.


I switched back to the Arch default kernel for my 32TB home media server; would you recommend going back to compiling your kernel for the time being?

I personally stopped compiling your code in my personal repo when bcachefs was upstreamed. It was often a pain to rebase against the latest hardened code, and I've been happier since it went upstream. I've been using your fs for 7-8 years now and I hope your latest changes to the disk format will actually improve mount performance (yes, I'm one of the silent "victims" you were talking about). I hope nothing breaks...

Anyway thank you for your work and I wish you all the best on the lkml and your work.


Echoing the sibling comment Kent, bcachefs is a really wonderful and important project. The whole world wants your filesystem to become the de-facto standard Linux filesystem for the next decade. One more month of LKML drama is a small price for that (at LKML prices).

I haven't lost any data yet. It did something stupid on my laptop a few months ago that looked like it was about to repeat btrfs's treatment of my data, but 15 minutes of googling on my phone and I figured out the right commands to get it to fix whatever was broken and get to a bootable state. I'm a decade away from considering it for a file server holding data I actually care about, but as my main desktop and laptop filesystem (with dotfiles backed up to my git instance via yadm, and everything I care about NFS-mounted in from my fileservers) it's totally fine.

Been using it since 6.7 on my root partition. Around 6.9 there were issues that needed fsck. Now on 6.12 it is pretty stable already. And fast: it's easy to run thousands of Postgres tests on it, which isn't something zfs or btrfs could really do without tuning...

So if you're a cowboy, now is a good time to test. If not, wait one more year.


I believe the main bcachefs maintainer does not advocate production use yet.

We're still six months or so from taking the experimental label off, yeah. Getting close, though: filesystem-is-offline bugs have slowed to a trickle, and it's starting to be performance issues that people are complaining about.

Hoping to get online fsck and erasure coding finished before taking off experimental, and I want to see us scaling to petabyte sized filesystems as well.


Oh wow... That's actually really fast progress all things considered. Well done! I really hope all the... umm... misunderstandings get worked out because you're doing great work.

After a bit of peer pressure from a friend, I ended up using btrfs on my laptop about three months ago. It's been fine thus far.

Emphasis on 'thus far'…

To me that kind of experiment is the equivalent of swapping your car's brakes for something 'open source': maybe better, maybe not.

But when you need them, you're going to want to be sure they're working.


What filesystem would you suggest that has data checksums, efficient snapshots, and doesn't require compiling an out of tree kernel module?

If you artificially tailor your criteria such that the only answer to your question is btrfs, then that is the answer you will get.

There's nothing "artificial" in his requirements. Data checksums and efficient snapshots are required for some workloads (for example, we use them for end-to-end testing on copies of the real production database that are created and thrown away in seconds), and building your own kernel modules is a stupid idea in many cases outside the two extremes of the home desktop and a well-funded behemoth like Facebook.

ZFS… using a BSD kernel :)

There is an OpenZFS port to Windows, but I'm not sure how to find it from here:

https://github.com/openzfsonwindows/ZFSin

There is also Microsoft's own ReFS:

https://en.m.wikipedia.org/wiki/ReFS


Zfs is in tree if you use a different kernel. :p

No, that is not in-tree; it's in-build. Big difference.

zfs is licensed too freely to be in-tree, but it’s still an excellent choice.

I may be wrong, but I don't think it's just "excessive freeness": the CDDL also has restrictions the GPL does not have (stuff about patents). It's mutual incompatibility.

Apache v2 and GPLv3 were made explicitly compatible while providing different kinds of freedom.


The CDDL is more permissive: it's a weak copyleft license while the GPL is strong copyleft, and that makes the two incompatible. Calling it "excessive freeness" is inflammatory, but they're broadly correct.

> The CDDL is more permissive: it's a weak copyleft license while the GPL is strong copyleft, and that makes the two incompatible. Calling it "excessive freeness" is inflammatory, but they're broadly correct.

It's not really. Many aspects of the license are free-er, but that's not what causes the incompatibility. The GPL does not have any kind of clause saying code distributed under too permissive a license may not be incorporated into a derived/combined work. It's not that the CDDL is weak copyleft; it's that it contains particular restrictions that make it incompatible with the GPL's restrictions.

BSD licenses do not have that incompatible restriction (i.e. they are freer than the CDDL in that respect) and can be compatible with the GPL.


In the Linux tree.

In Windows, Satya would need to write Larry a check. It would probably be hefty.

Edit: there was a time that this was planned for MacOS.

https://arstechnica.com/gadgets/2016/06/zfs-the-other-new-ap...


> Edit: there was a time that this was planned for MacOS.

That was a joyous prospect. A single volume manager/filesystem across all UNIX platforms would be wonderful.

We had the UNIX wars of the 1990s. Since Linux won, they have been replaced by the filesystem wars.


Yeah the situation is unfortunate. There's a decent chance I'd be using ZFS if not for the licensing issues, but as a practical matter I'm getting too old to be futzing with kernel modules on my daily driver.

DKMS solved these "licensing issues." Dell is mum on the official motivation, but it provides a licensing demarcation point and a way for kernels to update without breaking modules, so it's easier for companies to develop for Linux.

_Windows Drivers work the same way and nobody huffs and puffs about that_

I'd love to have an intelligent discussion on how one person's opinion on licensing issues stacks up against the legal teams of half the Fortune 50. Licensing doesn't work on "well, I didn't mean it THAT way."


I admit I'm not fully up to date on whether it's actually "license issues" or something else. I'm not a lawyer. As a layman here's what I know. I go to the Arch wiki (https://wiki.archlinux.org/title/ZFS) and I see this warning under the DKMS section (as you advised):

> Warning: Occasionally, the dkms package might not compile against the newest kernel packages in Arch. Using the linux-lts kernel may provide better compatibility with out-of-tree kernel modules; otherwise zfs-dkms-staging-git (AUR) backports compatibility patches and fixes for the latest kernel package in Arch on top of the stable zfs branch.

So... my system might fail to boot after updates. If I use linux-lts, it might break less often. Or I can use zfs-dkms-staging-git, and my system might break even less often... or more often, because it looks like that's installing kernel modules directly from the master branch of some repo.

As a practical matter I couldn't care less whether my system fails to boot because of "license issues" or some other reason; I just want the lawyers to sort their shit out so I don't have to risk my system becoming unbootable at some random inopportune time. Until then, I've never hit a btrfs bug, so I'm going to keep on using it for every new build.


XFS? Bcachefs? Whatever you like, because those features may not be implemented at the filesystem layer?

xfs doesn't have data checksumming.

> Note: Unlike Btrfs and ZFS, the CRC32 checksum only applies to the metadata and not actual data.

https://wiki.archlinux.org/title/XFS

---

bcachefs isn't stable enough to daily drive.

> Nobody sane uses bcachefs and expects it to be stable

—Linus Torvalds (2024)

https://lore.kernel.org/lkml/CAHk-%3Dwj1Oo9-g-yuwWuHQZU8v%3D...


Linus is not a filesystem guy

You can get data checksumming for any filesystem with dm-integrity.
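
If anyone wants to try that route: standalone dm-integrity is driven by integritysetup from the cryptsetup project (device names hypothetical). One caveat: below the filesystem it can only detect corruption, not self-heal from a second copy the way a btrfs/ZFS scrub can:

    integritysetup format /dev/sdx1
    integritysetup open /dev/sdx1 idisk
    mkfs.ext4 /dev/mapper/idisk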

(read my comment again)

Honestly, I think btrfs isn't bloated enough for today's VM-enabled world. ext4, xfs, and hell, even exfat haven't gone anywhere, and if those fulfill your needs, just use them. But if you need the more advanced features that btrfs or zfs bring, those added features are quite welcome. IMO btrfs could use the benefits of being a cluster filesystem on top of everything it already does, because having a VM able to access a disk that is currently mounted by the host or another VM would be useful. Imagine if the disk exported to a VM could be mounted by another VM, locally or remotely, simultaneously. Arguably ceph fills this need, but having a btrfs-native solution would be useful.

Running VMs (and database servers) on btrfs performs really badly, so you have to disable CoW for them.

Otherwise you'll get situations where your 100GB VM image will use over a TB of physical disk space.

It's a shame really that this still isn't solved.
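
The usual workaround, for what it's worth, is to mark the images directory NOCOW before any files are created in it (the attribute only takes effect for newly created files; the path is hypothetical). Note this also turns off checksumming and compression for those files:

    chattr +C /var/lib/libvirt/images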


What’s the underlying issue? I used VMs with ZFS for storage for well over a decade with no issue.

I don't think thin provisioning btrfs makes a lot of sense. Before disabling CoW I'd rather use a different filesystem.

Are you sure your TRIM is working and the VM disk image is compacting properly? It's working for me but not really great for fragmentation.


Checksum self-healing on ZFS and BTRFS has saved my data from janky custom NAS setups more times than I can count. Compression is also nice, but the thing I like most is the ability to create many partition-like subvolumes without needing to allocate or manage space.
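
For anyone who hasn't used them: subvolumes draw from the pool's shared free space, so there's nothing to size up front, and a scrub is what triggers that checksum-driven self-healing from a redundant copy (paths hypothetical):

    btrfs subvolume create /data/photos
    btrfs scrub start /data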


