ZFS

Is ZFS a meme? Does this shit work? I'm considering using it instead of ext4 with data journaling (which is a 100% write penalty).

I benchmarked it against mdadm raid10 with the far2 layout on two disks, and it's okay I guess. raid10,far2 gives Nx read speed, so with 2 drives that's 2x read at 1x write. ZFS is only giving about 1x read in mirrored mode; maybe with 100 threads hitting it the total throughput would increase some. I can deal with the reduced read speed in exchange for all the data consistency checks, but that means nothing if the system is going to crash at any moment, or the next time a new ZFS update comes out, or the next time a new kernel update comes out.
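
If I get around to testing the multi-threaded case, something like this rough sketch is what I had in mind. The path and sizes are made up, and the test file needs to be much larger than RAM or you're just benchmarking the page cache:

# rough multi-threaded random-read test (python 3); /tank/testfile, the block
# size, thread count and duration are all placeholders, not real values
import os, random, threading, time

PATH = "/tank/testfile"
BLOCK = 128 * 1024          # 128 KiB per read
THREADS = 16
SECONDS = 30

size = os.path.getsize(PATH)
done = [0] * THREADS

def reader(i):
    fd = os.open(PATH, os.O_RDONLY)
    deadline = time.time() + SECONDS
    while time.time() < deadline:
        os.pread(fd, BLOCK, random.randrange(0, size - BLOCK))
        done[i] += BLOCK
    os.close(fd)

threads = [threading.Thread(target=reader, args=(i,)) for i in range(THREADS)]
for t in threads: t.start()
for t in threads: t.join()
print("aggregate: %.1f MB/s" % (sum(done) / SECONDS / 1e6))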

Is this reliable enough for production? Is bit rot paranoia overblown? I think ext4 with data journaling should take care of problems if the system crashes or the power goes out with no UPS, but it does nothing for bit rot, and neither does RAID unless you're using a mode with parity disks.

Attached: panic.png (538x253 179.74 KB, 38.23K)

Yes
Yes
On Solaris and Fr*eBSD yes, no idea what it is like on Linux.
Depends on how important your data is and what scenarios you're comparing

If you don't want to waste 8 gibibytes + 1 gibibyte of ram per terabyte of disk just use BTRFS.

Are you actually using those shitty terms? Computer science is binary, hence it doesn't use base-10 modifiers.

I've looked at btrfs, but that looks even less stable. These disks are going to be storing VM images; nothing particularly critical, but they will take days to regenerate if one of those files gets corrupted.
What I'm thinking of doing now is not having a filesystem on this at all: just run LVM across the partition and pump the LVM volumes straight through to the VMs, which will then use ext4 with data journaling. I don't see the point of using ext4 without data journaling to store VM image files that will themselves use data journaling; it seems like the cache would fuck this all up on power failure anyway, and running ext4 with data journaling twice, on the partition that stores the image files and again inside the VM, would cut the write performance down to 1/4, which seems stupid.
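
What I mean by pumping the volume straight through, roughly, as a sketch with the libvirt python bindings. The domain name, VG/LV path and target device are placeholders; cache='none' keeps the host page cache out of the write path:

# sketch: attach a raw LVM LV to a running libvirt guest as a virtio disk
# requires the libvirt python bindings; "guest1", /dev/vg0/vm-disk and vdb
# are made-up names
import libvirt

disk_xml = """
<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='none' io='native'/>
  <source dev='/dev/vg0/vm-disk'/>
  <target dev='vdb' bus='virtio'/>
</disk>
"""

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName("guest1")
# apply to the live guest and persist it in the domain config
dom.attachDeviceFlags(disk_xml, libvirt.VIR_DOMAIN_AFFECT_LIVE |
                                libvirt.VIR_DOMAIN_AFFECT_CONFIG)
conn.close()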

yes

/thread

A good meme, like thinkpads and gnu / linux.

but disk storage is done in decimal. It would have been confusing to mix both of them.

I've been running it on my laptop since last January and haven't run into problems with it. I've even run out of power multiple times (while writing files) and it still worked fine afterwards. Do note that the RAID 5 and RAID 6 modes are marked as unstable.

It being part of the kernel actually makes it look like a better option, maybe for using inside the VMs?

What's killing me with this rabbit hole I've dived into is the multiple layers of caching going on between these levels.

The stack:
sda, sdb
-> raid10
-> lvm vg
-> lvm lv
-> libvirt passthrough
-> guest file system

If the guest is using ext4 with data journaling, or btrfs, or zfs, or something else that ensures data is actually written, what happens when any one of these levels lies about when data is actually written, when it is in fact only written to cache? I've come to the conclusion that this is all paranoia at this point and I should stop digging this hole.
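
For what it's worth, the only thing the application layer can do about it is fsync and trust that the flush request actually gets passed down through virtio, device-mapper, md and the drive's own cache. A minimal sketch of what "actually written" means from userspace (the path is a placeholder):

# sketch: fsync is the only portable way for an application to know data hit
# stable storage (plus an fsync on the directory for newly created files)
import os

def durable_write(path, data):
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        os.write(fd, data)
        os.fsync(fd)                 # flush file data and metadata to the device
    finally:
        os.close(fd)
    dirfd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dirfd)              # make the directory entry durable too
    finally:
        os.close(dirfd)

durable_write("/srv/vm/test.bin", b"hello")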

The question then is: is there any point to having a filesystem with consistency checks hosting VM images that have their own filesystem, or should I just pass an LVM volume through and do it in the VM, or both?

Yes it's reliable. If it fits your benchmarks and supports your use-cases, use it. You should have complete and comprehensive backups anyway, so a catastrophic crash should in a worst case lose you an hour or two of data.
BTRFS is less stable, but far more promising, and has a lot more active development, especially on Linux. I've been running BTRFS for a few years and have only run into a few issues (my filesystem became corrupted once when it filled up completely about 3 or 4 years ago; I could still pull my files off, but I had to rebuild the filesystem. My snapshot-based backups became corrupted when the power went out during a backup a year ago, and I had to manually fix the snapshots. Other than that, it's been rock solid).

Regardless of what your filesystem is, your system is only as reliable as your backups. Running without backups is begging for catastrophe.

No it's not. That's just a marketing gimmick to make the drives seem bigger than they are. Hard drives and other storage are binary at both the hardware and software levels.

For a simple personal desktop use case it's an overengineered solution. Just wait for hammer2 or bcachefs. Until then: ext4, ext2, or FFS.

Other file systems don't even have basic features like file level checksumming or error correction. This is trivial shit that should exist on everything today.

True, but they build them to hold a decimal amount of storage. What you're describing would mean buying a 1 tebibyte drive whose box says 1.1 terabytes. In actuality the box says 1 terabyte and you get about 0.9 tebibytes.

Yeah it works. I've been using ZFS on Linux alongside a ProxMox hypervisor, and it's pretty spiffy. I went full consumer-whore with it and got all the properly spec'd hardware and shit, with 2 parity drives for comfy failover capacity. Me gusta.

ZFS loves ram, and will make great use out of whatever you give it, but it doesn't actually need that much ram to function fine. What you've been told is the same kind of bullshit as it "needing" ECC ram. You can easily deal with several terabytes of data on just 2gb of ram (4gb is preferable though). I've done it before with a previous setup, as have many other people. I use ZFS with proxmox (which is also fucking great), and after spending a few weeks prior reading up on it, it's been a breeze to deal with. Snapshots kick ass, and the data integrity scrubbing soothes my autism.
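
If you want to watch what the ARC is actually holding, ZFS on Linux exposes the kstats under /proc. Quick sketch; the path is the standard ZoL one, but the file format could vary between versions:

# sketch: read the current and maximum ARC size from the ZFS on Linux kstats
def arc_stats(path="/proc/spl/kstat/zfs/arcstats"):
    stats = {}
    with open(path) as f:
        for line in f.readlines()[2:]:      # first two lines are headers
            parts = line.split()
            if len(parts) == 3:             # name, type, value
                stats[parts[0]] = int(parts[2])
    return stats

s = arc_stats()
print("ARC size: %.1f GiB of %.1f GiB max" %
      (s["size"] / 2**30, s["c_max"] / 2**30))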

The true issues ZFS has are with expanding/shrinking/rebalancing arrays, and with the pile of "gotchas" and testing involved in maximizing performance for a particular kind of workload. And stay away from deduplication, which actually requires a fuckton of memory in exchange for very little benefit.

The better question is:
If my HDD starts to die can I recover the fucking data?

If the mechanical parts fail, you can take it to Louis and Jason for a platter swap.

hooktube.com/watch?v=J9P4UadRdNA

raid10 is supposed to cover this. The problem is if the hard drive starts spitting out bad data instead of marking the sector as bad and returning no data.

From what I can tell, if you scrub the raid10 and there's a conflict (one drive reports a 1 and the other reports a 0), mdadm is just going to go with the first drive and call it a 1.
It doesn't matter if you have 50 drives with 50 mirrors: if the first drive says it's a 0, it's a 0. It doesn't check for consensus.
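
You can at least make md tell you when the copies disagree: writing "check" to sync_action compares the mirrors without rewriting anything, and mismatch_cnt reports how many sectors differed. Rough sketch, with md0 as a placeholder device (and it still won't tell you which copy was right):

# sketch: run an md consistency check and report the mismatch counter
# needs root; /sys/block/md0/... is the standard md sysfs interface
import time

MD = "/sys/block/md0/md"

with open(MD + "/sync_action", "w") as f:
    f.write("check")                  # compare all copies, rewrite nothing

while True:
    with open(MD + "/sync_action") as f:
        if f.read().strip() == "idle":
            break
    time.sleep(10)                    # the check can take hours on big arrays

with open(MD + "/mismatch_cnt") as f:
    print("sectors where the copies disagreed:", int(f.read()))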

This is part of the reason why I'm iffy on using it. There are bug reports and general shit all over the internet about ZFS not relinquishing the cache RAM it's using when it's supposed to.

If you have a 16GB RAM system, and ZFS eats 8GB of it, and you try to spin up a VM that wants 10GB of RAM, you'll get a fucking kernel panic instead of ZFS giving up its cache RAM and resizing.
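
The standard workaround people point to is capping the ARC so it never gets close to that point: zfs_arc_max is a module parameter, and on ZFS on Linux it's also writable at runtime. Sketch, with 4 GiB as a purely made-up example:

# sketch: cap the ZFS ARC at 4 GiB at runtime (needs root); for a persistent
# cap you'd put "options zfs zfs_arc_max=..." in /etc/modprobe.d/zfs.conf instead
ARC_MAX = 4 * 2**30   # example value, size it for your own workload

with open("/sys/module/zfs/parameters/zfs_arc_max", "w") as f:
    f.write(str(ARC_MAX))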

I looked at btrfs, which I would hope doesn't have this problem since it's actually part of the kernel and should play nice with it, but the benchmarks show it's shit compared to ext4: you literally take a 100% read/write penalty on most loads.

Wrong. That would mean btrfs does not read or write data at all. Why would people use it if you couldn't write or read files from it?

I meant it cuts the read/write performance in half compared to ext4.

phoronix.com/scan.php?page=article&item=linux-412-fs&num=2

Attached: sqlite.png (559x453 41.77 KB, 39.66K)

Now try

If you are going to be working with a database or a VM disk, you need to disable COW on that specific file to improve performance for btrfs.
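
The no-COW attribute only takes effect on files that have no data yet, so the usual approach is to set +C on the directory and let new database/VM image files inherit it. Rough sketch; the path is a placeholder:

# sketch: mark a directory nodatacow on btrfs so files created inside it skip
# copy-on-write; existing files are not converted retroactively
import subprocess

VM_DIR = "/var/lib/libvirt/images"   # placeholder path

subprocess.run(["chattr", "+C", VM_DIR], check=True)
# verify: lsattr -d should show a 'C' in the attribute string
print(subprocess.run(["lsattr", "-d", VM_DIR],
                     capture_output=True, text=True).stdout)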

If there's no copy-on-write or journaling on that database file and the power gets cut mid-write, isn't the database file still fucked?

Are you saying all safeguards should be removed on databases because the database engine will deal with it?

Yes actually this is very standard. (I am not op).

Thanks, I'll have to look into that. Ultimately I have to work with a VM that is dealing with a database. I was going to pass the VM a straight LVM LV sitting on a raid10, then put ext4 on it with data journaling. But if that's unnecessary for the database, you're saying I can stick the database on a partition or volume or whatever, tune all the knobs for performance (caching, no journaling, barriers off, etc.), and it's still perfectly fine if power dies at some point, because postgres or sqlite will handle it?

postgresql.org/docs/8.4/static/wal-intro.html
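
The short version of that page: the database appends every change to a log and fsyncs the log before touching the data files, so after a crash it just replays the log. A toy sketch of the idea, nothing like how postgres actually lays things out:

# toy write-ahead-log sketch: log first, fsync the log, then apply; on startup
# replay the log to recover any applied-but-unflushed changes
import os, json

LOG = "wal.log"
DATA = {}

def apply(op):
    DATA[op["key"]] = op["value"]

def write(key, value):
    op = {"key": key, "value": value}
    with open(LOG, "a") as f:
        f.write(json.dumps(op) + "\n")
        f.flush()
        os.fsync(f.fileno())   # the record is durable before we touch DATA
    apply(op)

def recover():
    if os.path.exists(LOG):
        with open(LOG) as f:
            for line in f:
                apply(json.loads(line))

recover()
write("answer", 42)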

looks like this, thanks

This.

homosexual detected