Lustre solves the problem by getting out of the way. The metadata server simply tells the client which nodes the pieces of the file reside on, and the client then talks to those storage nodes directly. Reading the documentation, split brain can't happen with Lustre because the moment any of the metadata controllers goes offline, the whole thing stops.
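To make that concrete, here's a rough sketch of what that looks like from a Lustre client (the mount point and paths below are made up): lfs getstripe asks the metadata server which object storage targets hold a file's pieces, and lfs setstripe controls how new files get spread across them.

    # Hypothetical mount point and file names -- just to show the idea.
    # Ask the MDS which OSTs hold this file's objects and in what order.
    lfs getstripe /mnt/lustre/bigfile.dat

    # Stripe new files in this directory across 4 OSTs in 1MB chunks.
    lfs setstripe -c 4 -S 1M /mnt/lustre/scratch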
Definitely, with ZFS, if you don't start with a plan you're going to have a bad day.
The current design for a node is a 1U HP DL360 G7 server with roughly 144 or 288GB of RAM.
Internal storage is handled by the Smart Array controller. For the next one I build, I think I'm going to buy an LSI 9200-8i and route the cables so that all the internal and external storage is JBOD.
My current HBA card du jour for external storage is the LSI 9200-8e.
As for external JBOD enclosures, there are LOTS of choices. Generally I've stuck with the Promise J610s. It's essentially the "expansion" cabinet for a "smart" Promise array. Its beauty is in its abject stupidity.
As for drives, currently all my stuff is running 4TB Seagate SAS drives.
For "mainline" nodes, I configure them as raid10 pools (striped mirrors, in ZFS terms). This gets me ~29TB of usable storage per pool.
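A minimal sketch of that layout, assuming one 16-bay shelf per pool and placeholder device names (neither is stated above):

    # Sketch only: pool name, device names, and the 16-drive count are assumptions.
    # Eight 2-way mirrors striped together is the ZFS equivalent of raid10;
    # with 4TB drives that lands in the ~29TB ballpark quoted above.
    zpool create tank \
      mirror da0 da1    mirror da2 da3    mirror da4 da5    mirror da6 da7 \
      mirror da8 da9    mirror da10 da11  mirror da12 da13  mirror da14 da15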
For "backup" nodes, I configure them as raidz2 pools. All my backup nodes are 32-disk boxes, so the pools come out to about 116TB.
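Same idea for a backup node, again just a sketch: the post doesn't say how the 32 disks are split into vdevs, so the four 8-disk raidz2 groups below are a guess, not the actual layout.

    # Sketch only: vdev width, pool name, and device names are assumptions.
    # Each raidz2 vdev gives up two disks' worth of capacity to parity.
    zpool create backup \
      raidz2 da0  da1  da2  da3  da4  da5  da6  da7 \
      raidz2 da8  da9  da10 da11 da12 da13 da14 da15 \
      raidz2 da16 da17 da18 da19 da20 da21 da22 da23 \
      raidz2 da24 da25 da26 da27 da28 da29 da30 da31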
Obviously this design is a balance between cost and performance.
If I need more slots, I'll use a DL380 instead of a DL360 and stick in an Intel X540-T2 10GigE card.
It's fun to imagine what a filesystem would look like if you scaled up the disks and the interconnects.
Tim.