I run multiple arrays. You can implement RAID in software or in hardware. What should guide your choice: a) What level of failure you want to protect against. This should be determined by the value of the data. b) How much storage you require. c) How much money you would like to spend. d) How technical you are.
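To put rough numbers on points a) and b), here is a minimal sketch of the usual capacity arithmetic for the common RAID levels. It assumes all disks are the same size and ignores filesystem overhead; the function name `raid_summary` and its parameters are made up for illustration.

```python
def raid_summary(level: int, disks: int, disk_tb: float) -> dict:
    """Usable capacity (TB) and number of disk failures survived.

    Assumes equal-sized disks and ignores filesystem overhead.
    """
    if level == 0:    # striping only, no redundancy
        min_disks, usable, tolerated = 2, disks * disk_tb, 0
    elif level == 1:  # every disk holds a full copy
        min_disks, usable, tolerated = 2, disk_tb, disks - 1
    elif level == 5:  # one disk's worth of parity
        min_disks, usable, tolerated = 3, (disks - 1) * disk_tb, 1
    elif level == 6:  # two disks' worth of parity
        min_disks, usable, tolerated = 4, (disks - 2) * disk_tb, 2
    else:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < min_disks:
        raise ValueError(f"RAID {level} needs at least {min_disks} disks")
    return {"usable_tb": usable, "failures_tolerated": tolerated}


# Example: the same four 4 TB disks in RAID 5 and in RAID 6.
raid5 = raid_summary(5, 4, 4.0)
raid6 = raid_summary(6, 4, 4.0)
```

With four 4 TB disks, RAID 5 gives you 12 TB and survives one failure, while RAID 6 gives you 8 TB and survives two. That trade is what points a) through c) come down to.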
Things you need to know: RAID hard drives are different from commodity hard drives. Most of the difference is just programming (firmware), but what the hard drive does when it encounters an error is VERY important. Example: when a commodity hard drive (not a RAID hard drive) encounters an error, it enters a deep recovery cycle to attempt to repair the error, recover the data from the problematic area, and then reallocate a spare area to replace the problematic one. This process can take up to 2 minutes depending on the severity of the issue. Most RAID controllers and software RAIDs allow a hard drive only a very short amount of time to recover from an error. If a hard drive takes too long to complete this process, the drive will be dropped from the RAID array and marked as "bad" even though only part of the data is bad. A RAID drive that encounters something it can't read will simply report back quickly that it couldn't read the data, and allow the controller to repair that part of the RAID using the other disks.
This is VERY important to understand because a RAID 5 configuration holds only 1 disk's worth of parity (spread across the disks), which means that you can only afford to lose 1 disk at any given time and still be able to recover your data. When you do lose a disk, the array must rebuild it, which entails reading every single bit of data from the remaining disks to rebuild the replaced disk. If you have commodity drives and any one of them runs into an issue during that rebuild, the array is likely to mark the drive as bad and drop it from the array, meaning that it won't be able to finish rebuilding the array. It's kind of silly, but this single programmatic difference in the way these disks operate means that the drive manufacturers charge a slight premium for it.
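If it helps to see why a single-parity array can repair one unreadable block from the other disks, here is a toy sketch of XOR parity. It is not how any particular controller is implemented: real arrays work on fixed-size stripes and rotate the parity across disks, and the block contents and helper name `xor_blocks` here are made up.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# One stripe: three data blocks (one per disk) plus a parity block.
data = [b"disk", b"one!", b"two."]
parity = xor_blocks(data)

# Pretend the disk holding data[1] died or returned a read error.
# XOR-ing every surviving block with the parity gives the lost block back.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

Note that the rebuild needs every surviving block to be readable. That is why a second drive getting dropped in the middle of a RAID 5 rebuild is fatal.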
As for software versus hardware, I use software RAID with TrueNAS https://www.truenas.com/. This is an open source NAS operating system that uses ZFS, a file system and volume manager designed specifically for protecting data and recovering from errors in a RAID. RAID controllers are nice if you intend to just plug your storage devices into a Windows box. However, if you use a RAID controller with TrueNAS, the software can't communicate directly with the disks when there is an issue, as all of that functionality is offloaded to the RAID controller by design.
That said, when building a box for use with TrueNAS, it's important to understand that you need to use ECC memory because it will be using memory heavily for caching reads and writes. Any random cosmic ray can flip a bit in your RAM and corrupt your disk reads/writes, and ECC memory automatically detects and corrects single-bit errors of this kind. This sort of thing happens all the time. You need to protect against it if using a software RAID. So, in the end, you save on not having to purchase a RAID card, but you spend those potential savings on a motherboard that can accept ECC RAM and the RAM itself.
With TrueNAS, you can also set your array to "scrub" the data on a schedule of your choosing, so that it can regularly detect and correct disk issues at the sector level before you encounter a larger hardware issue with your disks, and so that rebuilds will go smoothly when you need to perform them. Again, when you rebuild, the machine will need to read every single bit from every single drive, so if something in a seldom-used file is bad, regular scrubbing lets you detect and correct it before you are in a vulnerable rebuild state. With TrueNAS, you're also going to want as much RAM as possible. The boxes that I have built and run all have a minimum of 32GB.
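To make the scrub idea concrete, here is a toy model: every block has a stored checksum, the scrub re-reads every copy, and any copy that no longer matches gets overwritten from one that does. This is only a sketch of the concept with two mirrored copies held in Python lists. It is not ZFS's actual on-disk logic, and the names `checksum` and `scrub` are made up.

```python
import hashlib


def checksum(block: bytes) -> str:
    """Checksum recorded when the block was written."""
    return hashlib.sha256(block).hexdigest()


def scrub(copies, checksums):
    """Verify every block on every copy; repair bad ones from a good copy.

    copies: list of lists of blocks, one inner list per mirror copy.
    Returns a list of (copy_index, block_index) pairs that were repaired.
    """
    repaired = []
    for i, expected in enumerate(checksums):
        good = next((c[i] for c in copies if checksum(c[i]) == expected), None)
        if good is None:
            raise ValueError(f"block {i}: no intact copy left")
        for n, copy in enumerate(copies):
            if checksum(copy[i]) != expected:
                copy[i] = good  # overwrite the damaged copy
                repaired.append((n, i))
    return repaired


blocks = [b"alpha", b"bravo", b"charlie"]
sums = [checksum(b) for b in blocks]
mirror_a = list(blocks)
mirror_b = list(blocks)
mirror_b[1] = b"brXvo"  # simulate silent corruption in a rarely read block

repaired = scrub([mirror_a, mirror_b], sums)
```

The point is the one made above: the bad block gets found and fixed while a good copy still exists, instead of being discovered halfway through a rebuild.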
I am also a HUGE fan of this motherboard for my NAS builds: https://www.newegg.com/asrock-rack-c3758d4u-2tp-intel-atom-c3758-series-processor-8-core-25w/p/N82E16813140020 It comes with an Atom CPU. This board is energy efficient, supports ECC, has tons of SATA connections, 10Gb network links, supports AES instructions within the CPU (if you're encrypting your data) and has more than enough cores for supporting lots of virtual machines for use in playing with the data store.
I also run my arrays in RAID 6, which means I could lose two disks at any given time and still be able to recover. What you need to understand is that every disk you add to an array multiplies your exposure to hardware failures. If you have 8 disks, your chances of having a hardware failure at any given time are much higher than if you only have 4 disks. RAID 5, again, means you could only lose 1 disk and still be able to recover.
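Here is a back-of-the-envelope way to see the 4-disk versus 8-disk difference. It assumes each disk fails independently with the same probability over some period. The 3% figure is a made-up placeholder rather than a measured failure rate, and disks from the same batch often do not fail independently, so treat the output as illustrative only.

```python
from math import comb


def p_at_least(k: int, n: int, p: float) -> float:
    """Probability that at least k of n disks fail, each independently
    with probability p over the period you care about (binomial model)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


P_FAIL = 0.03  # hypothetical per-disk failure probability, not a real spec

one_of_4 = p_at_least(1, 4, P_FAIL)  # any failure in a 4-disk array
one_of_8 = p_at_least(1, 8, P_FAIL)  # any failure in an 8-disk array
two_of_8 = p_at_least(2, 8, P_FAIL)  # more than RAID 5 can absorb
three_of_8 = p_at_least(3, 8, P_FAIL)  # more than RAID 6 can absorb
```

With these placeholder numbers, the chance of seeing at least one failure goes from roughly 11% with 4 disks to roughly 22% with 8. The `two_of_8` versus `three_of_8` comparison is the case for spending the second parity disk on a wider array.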
If you're just playing around, Windows supports building software RAIDs from Disk Management within the administrative tools. You just need to plug the disks in (with the box turned off) and tell it to build them. Not all versions of Windows support building RAID 5 (Server almost always does), and some Home editions don't support RAID 0 / 1 at all. Easy to google though.
Any other questions, I'm happy to answer and help. Good luck!