So I think that at this point, it is important to clear up some misconceptions that people commonly have about the adaptive replacement cache (ARC) (sources: here and here; a.k.a. “adaptive responsive cache” (cf. here)) – namely that, per the blog post, it is almost exclusively a read cache. (Again, consult the Megiddo and Modha (IBM) paper that was published at FAST 2003.)
(If you Google it, you will also find a lot of places that erroneously call it the “adaptive read cache” – even though reads are primarily what it is used for, that is not what the name stands for.)
The blog post (which was originally hosted here, but no longer exists at that location) also describes, pictorially, how the adaptive replacement cache works. Again, even the pictures show that its primary mode of operation is for reads, rather than for writes.
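Since the original pictures no longer resolve, here is a simplified sketch in Python of the core idea from the Megiddo and Modha paper: ARC balances a recency list (T1) against a frequency list (T2), using “ghost” lists (B1, B2) of recently evicted keys to adapt the balance parameter p. This is an illustrative reduction of the published pseudocode, not ZFS’s actual implementation:

```python
from collections import OrderedDict

class SimplifiedARC:
    """Simplified sketch of the ARC policy from Megiddo & Modha (FAST 2003).

    T1 holds pages seen once recently (recency); T2 holds pages seen at
    least twice (frequency). B1/B2 are "ghost" lists of keys recently
    evicted from T1/T2; hits on them adapt the target size p of T1.
    """

    def __init__(self, c: int):
        self.c = c               # cache capacity, in pages
        self.p = 0               # adaptive target size for T1
        self.t1 = OrderedDict()  # recency list (in cache)
        self.t2 = OrderedDict()  # frequency list (in cache)
        self.b1 = OrderedDict()  # ghosts of T1 evictions (keys only)
        self.b2 = OrderedDict()  # ghosts of T2 evictions (keys only)

    def _replace(self, hit_in_b2: bool) -> None:
        if len(self.t1) + len(self.t2) < self.c:
            return  # cache not yet full; nothing to evict
        if self.t1 and (len(self.t1) > self.p or
                        (hit_in_b2 and len(self.t1) == self.p)):
            key, _ = self.t1.popitem(last=False)  # evict LRU of T1
            self.b1[key] = None
        elif self.t2:
            key, _ = self.t2.popitem(last=False)  # evict LRU of T2
            self.b2[key] = None

    def access(self, key) -> bool:
        """Touch `key`; return True on a cache hit."""
        if key in self.t1:   # second touch: promote recency -> frequency
            del self.t1[key]
            self.t2[key] = None
            return True
        if key in self.t2:   # frequency hit: refresh position in T2
            self.t2.move_to_end(key)
            return True
        if key in self.b1:   # ghost hit: recency working set is growing
            self.p = min(self.c, self.p + max(1, len(self.b2) // max(1, len(self.b1))))
            self._replace(False)
            del self.b1[key]
            self.t2[key] = None
            return False
        if key in self.b2:   # ghost hit: frequency working set is growing
            self.p = max(0, self.p - max(1, len(self.b1) // max(1, len(self.b2))))
            self._replace(True)
            del self.b2[key]
            self.t2[key] = None
            return False
        # complete miss: make room in the directory, then admit into T1
        if len(self.t1) + len(self.b1) == self.c:
            if len(self.t1) < self.c:
                self.b1.popitem(last=False)
                self._replace(False)
            else:
                self.t1.popitem(last=False)  # discard outright, no ghost
        elif len(self.t1) + len(self.t2) + len(self.b1) + len(self.b2) >= self.c:
            if len(self.t1) + len(self.t2) + len(self.b1) + len(self.b2) == 2 * self.c:
                self.b2.popitem(last=False)
            self._replace(False)
        self.t1[key] = None
        return False
```

The point of the two-list structure is the one the pictures made: ARC is a page-replacement policy for cached (read) data, adapting between recency and frequency as the workload shifts.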
Yes and no.
For reads, your statement would be correct.
For writes – from what I have learned ever since ZFS was invented back in 2005–2006 – it gets a little more complicated, because writes primarily go to the ZFS intent log rather than to the adaptive replacement cache. (Again, referring to the Megiddo and Modha paper: “A commonly used criterion for evaluating a replacement policy is its hit ratio – the frequency with which it finds a page in the cache.” (Megiddo and Modha, p. 1).)
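As a trivial illustration of the hit-ratio definition quoted above (the function name and example numbers are mine, not from the paper):

```python
# Hit ratio as defined in the Megiddo & Modha quote above:
# the fraction of accesses that find the page already in the cache.
def hit_ratio(hits: int, misses: int) -> float:
    total = hits + misses
    return hits / total if total else 0.0

# e.g. 900 hits out of 1,000 total accesses:
print(hit_ratio(900, 100))  # → 0.9
```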
Thus, when it comes to writes – from reading what people have written about this – I think that this is a HUGE misconception: a lot of the time, people tend to think of the adaptive replacement cache as serving both reads and writes, when that’s not really how it works, given that writes are sent to the ZFS intent log rather than to the adaptive replacement cache. (cf. Steve Tunstall, Principal Storage Engineer, Oracle – here)
Also cf. Oracle ZFS Storage Appliance Analytics Guide (here).
Also per Steve Tunstall, quote: “(Logzillas, on the other hand, are for fast synchronous write acknowledgements, and have nothing to do with ARC at all).”
Likewise, this is reflected on page 51 of the ZFS Administration Guide:
"The ZFS intent log (ZIL) is provided to satisfy POSIX requirements for synchronous transactions. For example, databases often require their transactions to be on stable storage devices when returning from a system call. NFS and other applications can also use fsync() to ensure data stability.
By default, the ZIL is allocated from blocks within the main pool. However, better performance might be possible by using separate intent log devices, such as NVRAM or a dedicated disk.
Consider the following points when determining whether setting up a ZFS log device is appropriate for your environment:
- Log devices for the ZFS intent log are not related to the database log files.
- Any performance improvement seen by implementing a separate log device depends on the device type, the hardware configuration of the pool, and the application workload. For preliminary performance information, see this blog:
http://blogs.oracle.com/perrin/entry/slog_blog_or_blogging_on"
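For completeness, the guide’s suggestion of a separate intent log device looks like this in practice. This is a config/admin sketch, not output from a real system; the pool name and the device name are hypothetical placeholders:

```shell
# Add a dedicated log (slog) device to a pool named "tank".
# c4t0d0 is a hypothetical Solaris-style device name; substitute your own.
zpool add tank log c4t0d0

# Verify: the device should now be listed under a separate "logs" section.
zpool status tank
```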
Basically, what this means is that the “dirty data” sitting in the buffer for a transaction group – i.e., pending writes – sits in the ZFS intent log, rather than in the adaptive replacement cache.
I think that it is important for people to realise and understand the distinction between these two, as it is one of the most common misconceptions about ZFS and how it works.
Reads ==> adaptive replacement cache.
Writes ==> ZFS intent log.
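The reads-vs-writes split above can be sketched as a toy model. To be clear, this is a caricature to illustrate the point being argued, not real ZFS code (all class and method names are mine); it just shows that the read path consults the cache while the synchronous write path goes through an intent log that is later committed in a transaction group:

```python
# Toy model of the read/write split described above (not real ZFS code).
class ToyPool:
    def __init__(self):
        self.disk = {}   # stable storage (the main pool)
        self.arc = {}    # read cache (ARC-like)
        self.zil = []    # intent log: (key, value) records

    def write_sync(self, key, value):
        # Synchronous write: the record is logged so it is on stable
        # storage before the call returns; it does NOT pass through
        # the read cache.
        self.zil.append((key, value))

    def txg_sync(self):
        # Transaction group commit: apply logged records to the main
        # pool, then clear the intent log.
        for key, value in self.zil:
            self.disk[key] = value
        self.zil.clear()

    def read(self, key):
        # Reads hit the cache first, then fall back to disk
        # (caching the result for next time).
        if key in self.arc:
            return self.arc[key]
        value = self.disk[key]
        self.arc[key] = value
        return value
```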
Well… that was what I was trying to test for, by seeing whether I could hammer a Solaris 10 ZFS install. But I ran into issues running zhammer.sh, as some of the commands/options it uses are still rather Linux-centric and aren’t available natively on Solaris.
re: 1 in 1e16 or 1 in 1e18 – that’s just the BER (bit error rate). It means that if you read 1e16 or 1e18 bits, you should expect, on average, about 1 bit out of that many to be read incorrectly.
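Back-of-the-envelope, that expectation scales linearly with the amount read. The read size in this example is mine, chosen for illustration; only the 1-in-1e16 figure comes from the discussion above:

```python
def expected_bit_errors(bytes_read: float, ber: float) -> float:
    """Expected number of incorrectly read bits for a given bit error rate."""
    bits_read = bytes_read * 8
    return bits_read * ber

# Reading 10 TB (1e13 bytes) from a device with a 1-in-1e16 BER:
print(expected_bit_errors(1e13, 1e-16))  # ≈ 0.008 expected bit errors
```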
But that’s just for the storage device itself – not the FS/OS nor anything else running on top of it (and I would also bet that the figure was measured in a lab environment, with who knows how many layers of lead shielding), versus an actual deployment environment, where nobody is trying to advertise the lowest possible number.
Agreed.