> - Writing to a fresh file is slower than writing to an existing file
> mold can link a 4 GiB LLVM/clang executable in ~1.8 seconds on my
> machine if the linker reuses an existing file and overwrites it.
> However, the speed decreases to ~2.8 seconds if the output file does
> not exist and mold needs to create a fresh file. I tried using
> fallocate(2) to preallocate disk blocks, but it didn't help. While 4
> GiB is not small, should creating a file really take almost a second?
1s difference here is insane these days. There must be something weird going on, even if the physical disk is one of those ancient spinning things.
He doesn't specify what file system he's using, but offhand, you would assume that what actually takes time isn't creating the file itself, but rather allocating all the blocks. A good first step would be to reproduce the issue and take a profile of both cases.
xfs, ext4 and btrfs all have delayed allocation, so they should only synchronously allocate the blocks if there is memory pressure or if they're triggering those well-meant (but in this case counterproductive) auto_da_alloc heuristics.
It must be stressful to write to LKML knowing that Phoronix is hiding behind a corner, waiting to sensationalize every little thing you write that sounds like it could give them free clicks, even before any sensible discussion of what you wrote has a chance of happening.
Presumably the existing file is already backed by pages in the page cache, while the new one still has to be allocated (+ whatever the IO subsystem is doing).
I'm interested in knowing what kind of workload this is targeting, with multi-GB executables being built at such a pace that a 0.2 second wait between them is unacceptable.
Performance optimization is not always just low-hanging fruit. When you start trying to optimize something, there's often large bottlenecks to clean up -- "we can add a hashmap here to turn this O(n^2) function into O(n) and improve performance by 10x" kind of thing. But once you've got all the easy stuff, what's left is the little things -- a 1% improvement here, a 0.1% improvement there. If you can find 10 of those 1% improvements, now you've got a 10% performance improvement. 0.2 seconds on its own isn't that much, but the reason mold is so fast is because the author has found a lot of 0.2 second improvements.
And even disregarding that, the linked LKML post
mentions LLVM/clang as a case of building a 4 GB executable. If you've ever built the LLVM project from source, there are about 50ish (?) binaries that need to be linked at the end of the build process -- it's not just clang, but all sorts of other tools, intermediate libraries and debugging utilities. So that is an example of a workload with "multi-GB executables being built at such a pace" -- saving 0.2 seconds per executable saves something like 10 seconds on the build.
I'm well aware of the joys of optimization, I just haven't come across someone building multi-GB executables at a pace where milliseconds spent linking mattered.
To me that's an exotic workload which sounds interesting, hence why I'm curious.
Well, keep in mind that the full linking step has to be done at the end of an incremental build. So if you're a developer actively working on a project with a 4GB executable, that linking time is part of your edit-compile-test cycle, and you have to wait for it every time you change a line of code.
The benchmarks on mold's README show that GNU gold takes 33 seconds to link clang, whereas mold takes 1.3 seconds. If you're a developer working on Clang, that's a pretty serious productivity improvement.
It's more about sending a message, and I support the idea very much.
It always starts with "it's only a fraction of a second" or "just 100kb more Javascript to load", and suddenly every website pulls in 25MB JS at least, and starting Windows calculator shows a splash screen (on a modern machine), because it takes that long to start up.
Sure, as I mentioned I'm just genuinely curious what the use-case is.
For example, running a compiler test suite I could understand, that would be quite impacted. But those tests wouldn't be multi-GB builds for the most part.
And I could understand being annoyed by it, but the author took steps to work around it, which is why I said they found it unacceptable.