
[return to "NFS: The Early Years"]
1. smarks+od[view] [source] 2022-06-21 00:35:07
>>chmayn+(OP)
Of course, the first commenter "willy" repeats the canard that statelessness makes no sense:

> The very notion of a stateless filesystem is ridiculous. Filesystems exist to store state.

It's the protocol that's stateless, not the filesystem. I thought the article made a reasonable attempt to explain that.

Overall the article is reasonable, but it omits one of the big issues with NFSv2: synchronous writes. Those Sun NFS implementations were based on Sun's RPC system; the server was required not to reply until the write had been committed to stable storage. There was a mount option to disable this, but using it exposed you to data corruption. Certain vendors (SGI, if I recall correctly) at some point claimed their NFS was faster than Sun's, but it got there by implementing asynchronous writes. This resulted in the expected arguments over protocol compliance and reliability vs. performance.
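
To make the trade-off concrete, here is a minimal Python sketch of the two behaviours. It is not Sun's code or any real NFS server; `handle_write` and its `sync` flag are hypothetical names used only for illustration, and the return value stands in for the RPC reply.

```python
import os


def handle_write(path, offset, data, sync=True):
    """Hypothetical server-side WRITE handler; the return value stands in for the reply."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        written = os.pwrite(fd, data, offset)
        if sync:
            # Synchronous behaviour: force the data to stable storage
            # before the reply is allowed to go out.
            os.fsync(fd)
        # With sync=False the reply is sent while the data may still be in a
        # volatile cache, so a server crash can lose a write the client
        # believes has succeeded.
        return written
    finally:
        os.close(fd)
```

The `fsync` call is where the latency comes from: the client cannot proceed until the disk has caught up, which is the cost the asynchronous implementations avoided.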

This phenomenon led to various hardware "NFS accelerator" solutions that put an NVRAM write cache in front of the disk in order to speed up synchronous writes. I believe Legato and the still-existing NetApp were based on such technology. Eventually the synchronous writes issue was resolved, possibly by NFSv3, though the details escape me.

◧◩
2. chasil+cy[view] [source] 2022-06-21 03:23:50
>>smarks+od
There is a historical document by Olaf Kirch that addresses many aspects of the stateless design.

It also discusses the idempotent replay cache that is covered in the original article.

https://www.kernel.org/doc/ols/2006/ols2006v2-pages-59-72.pd...
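
For readers unfamiliar with the replay cache idea, a rough sketch in Python follows. It assumes requests are identified by a (client, transaction id) pair; the `ReplayCache` class, its capacity, and the `remove_a` operation are all hypothetical and not taken from any real implementation.

```python
from collections import OrderedDict


class ReplayCache:
    """Hypothetical duplicate-request cache keyed by (client, xid)."""

    def __init__(self, capacity=128):
        self.capacity = capacity
        self._replies = OrderedDict()

    def call(self, client, xid, operation):
        key = (client, xid)
        if key in self._replies:
            # Retransmission of a request already executed: resend the
            # saved reply instead of running the operation a second time.
            self._replies.move_to_end(key)
            return self._replies[key]
        reply = operation()
        self._replies[key] = reply
        if len(self._replies) > self.capacity:
            # Evict the least recently used entry.
            self._replies.popitem(last=False)
        return reply


# A non-idempotent operation: removing a file twice gives different answers.
files = {"a.txt"}


def remove_a():
    if "a.txt" not in files:
        return "ENOENT"
    files.remove("a.txt")
    return "OK"


cache = ReplayCache()
first = cache.call("client1", 42, remove_a)  # executes the removal
retry = cache.call("client1", 42, remove_a)  # same xid: cached reply is replayed
```

Without the cache, the retransmitted remove would report that the file does not exist, even though the client's original request succeeded. Because the cache is finite, it narrows that window rather than closing it.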

◧◩◪
3. smarks+qz[view] [source] 2022-06-21 03:39:06
>>chasil+cy
“Why NFS Sucks” (2006), picking on a protocol that was over 20 years old at that point. Also cites “The Unix-Haters Handbook” in the abstract. Two strikes against its credibility already.

However, I did skim the paper, and it seems halfway reasonable, so I suppose I should read the whole thing. Of course nothing is above criticism, and there are many valid criticisms of NFS; but leading with “sucks” is just lazy.

◧◩◪◨
4. chasil+Gz[view] [source] 2022-06-21 03:43:11
>>smarks+qz
If you think that was bad, just listen to what Theo de Raadt had to say.

"NFSv4 is a gigantic joke on everyone....NFSv4 is not on our roadmap. It is a ridiculous bloated protocol which they keep adding crap to. In about a decade the people who actually start auditing it are going to see all the mistakes that it hides.

"The design process followed by the NFSv4 team members matches the methodology taken by the IPV6 people. (As in, once a mistake is made, and 4 people are running the test code, it is a fact on the ground and cannot be changed again.) The result is an unrefined piece of trash."

https://blog.fosketts.net/2015/02/03/vsphere-6-nfs-41-finall...

◧◩◪◨⬒
5. smarks+lH[view] [source] 2022-06-21 04:59:43
>>chasil+Gz
OK, I read the Olaf Kirch article, and the "NFS Sucks" title is mostly clickbait. He does point out a number of real shortcomings in NFS, some of which are partially addressed by NFSv4. He also admits that (as of 2006) there isn't anything better.

Locking has always been a problem in NFS. Kirch mentions that NLM was designed for POSIX semantics only. I frankly don't know if NLM is related to `rpc.lockd`, which appeared in SunOS 4 and possibly even SunOS 3 (mid-1980s), well before anything having to do with POSIX. Part of the problem is the confused state of file locking in the Unix world, even for local files. There was BSD-style `flock` and SYSV-style `lockf`, and there might even have been multiple versions of those. Implementing these in a distributed system would have been terribly complex. Even at Sun, at least through the mid-1990s, the conventional wisdom was to avoid file locking. If you really needed something that supported distributed updates, it was better to use a purpose-built network protocol.
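
As a small illustration of how the two local locking families differ, here is a Python sketch using the standard `fcntl` module. Note that Python's `fcntl.lockf` is a wrapper around `fcntl()`-style byte-range locks rather than the C library `lockf`, and the claim that both locks can be held at once assumes Linux, where the two mechanisms do not interact; `lock_both_ways` is a hypothetical helper name.

```python
import fcntl
import os


def lock_both_ways(path):
    """Take a BSD-style whole-file lock and a byte-range lock on the same file."""
    fd = os.open(path, os.O_RDWR)
    try:
        # BSD-style flock: advisory lock on the whole file, associated
        # with the open file description.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        # fcntl-style record lock: advisory lock on the first 10 bytes
        # only, owned by the process.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 10, 0, os.SEEK_SET)
        return True
    finally:
        # Closing the only descriptor releases both locks.
        os.close(fd)
```

Two lock types with different granularity and different ownership rules are awkward enough on one machine; a network protocol has to pick semantics for both and for their interaction.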

One thing "willy" got right in his comment is that NFS is an example of "worse is better". In its early version, it had the benefit of being relatively simple, as acknowledged in the LWN article. This made it easy to port and reimplement and thus it became widespread.

Of course being simple means there are lots of tradeoffs and shortcomings. To address these you need to make things more complex, and now things are "ridiculous" and "bloated". Oh well.
