By Paul Comfort, CISSP, Lead Engineer, Chi Corporation
A little background before diving in. A SAN is a Storage Area Network. In x86 server environments, before there were SANs, each server contained or controlled its own storage in isolation, the same way that your PC has a hard drive inside of it. Typically this was internal or directly attached external storage with spinning platters and heads, which the server either controlled directly or, later, offloaded to a processor on the storage device itself. This configuration had several advantages and many disadvantages that the SAN attempted to overcome. If you are a storage admin, you may notice I am simplifying a lot, but I welcome the opportunity to discuss any storage-related topic in greater depth than I can write about here, so please reach out!
One advantage of this model was separation or segmentation of storage and performance levels to individual workloads. This means that a server primarily servicing network shares to a small network, for example, would need much less storage performance than, say, a server running a database that an entire ERP system depends on. It also means that if a batch job is being run against the ERP database, the network shares would not slow to a crawl. Install guides of the day told people to put log volumes on separate drives, sometimes also segmenting applications from data, and so on.
So there is one advantage, but the disadvantages of the pre-SAN storage model are numerous. Performance in most storage arrays is achieved by striping many disks together, with the number of disks depending on how many operations per second the storage needs to provide. If the ERP database requires 1200 IOPS and each disk can provide 120 IOPS, then you need a stripe of 10 disks to achieve this performance. RAID overhead also factors into the equation, though I am trying to keep things simple here. If you intend to use RAID 5 or 6, then one or two parity calculations, plus the associated parity writes, are required for every write the system requests. So you may end up wanting 12 disks in your stripe for 1200 IOPS, but let’s ignore RAID overhead for the time being.
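The effect of parity on disk count can be sketched with the common write-penalty rule of thumb. This is a rough illustration, not a sizing tool: the penalty values (4 back-end I/Os per write for RAID 5, 6 for RAID 6), the 10% write mix, and the function name are assumptions that do not come from the article, and real arrays with write caching will behave differently.

```python
# Rule-of-thumb back-end I/Os generated per front-end write (assumed values,
# not taken from the article): RAID 0 has no parity, RAID 5 costs four I/Os
# per write, RAID 6 costs six.
WRITE_PENALTY = {"raid0": 1, "raid5": 4, "raid6": 6}


def disks_for_iops(front_end_iops, iops_per_disk, write_pct, raid_level):
    """Return the number of disks needed to serve the requested IOPS.

    write_pct is the share of requests that are writes, as a whole percent.
    """
    penalty = WRITE_PENALTY[raid_level]
    # Reads cost one back-end I/O each; writes cost `penalty` each.
    # Work in hundredths so everything stays in integer arithmetic.
    backend_x100 = front_end_iops * ((100 - write_pct) + write_pct * penalty)
    per_disk_x100 = iops_per_disk * 100
    return -(-backend_x100 // per_disk_x100)  # ceiling division


if __name__ == "__main__":
    # 1200 IOPS and 120 IOPS per disk are the article's figures;
    # the 10% write share is hypothetical.
    for level in ("raid0", "raid5", "raid6"):
        print(level, disks_for_iops(1200, 120, 10, level))
```

With no parity the result is the 10 disks from the text. Under the assumed 10% write mix, RAID 5 comes out at 13 disks and RAID 6 at 15; a different write mix will give a different count.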
You find that your smallest available disk size is 73GB, but your database itself is only 100GB. You end up with 876GB of raw disk space (12*73) when you only needed 100GB. The rest of this storage is unusable because you can’t spare the IOPS, and you prefer not to make your ERP server also provide network shares. Now take the network share example. Let’s say you expect to need 100 IOPS at most, and latency is not a big issue if the requests are bursty at times. The catch is that you need 1.5TB of storage. Now we’re striping more than 20 of the same 73GB disks together just to get the capacity that we need. Suddenly we have more IOPS on the file server than we know what to do with. So you purchase the 147GB disks instead and only need about 11 in a stripe. You still have many times more IOPS available than you really need.
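The mismatch described above can be put into numbers. The sketch below sizes a dedicated stripe for whichever requirement is larger, capacity or IOPS, and reports what is left stranded. It ignores RAID overhead, so the ERP server comes out at 10 disks rather than the 12 used in the text, and it treats 1.5TB as 1500GB; the function and field names are hypothetical.

```python
from math import ceil


def dedicated_array(need_gb, need_iops, disk_gb, iops_per_disk):
    """Size a per-server stripe that satisfies both capacity and IOPS.

    RAID overhead is deliberately ignored to keep the arithmetic simple.
    """
    disks = max(ceil(need_gb / disk_gb), ceil(need_iops / iops_per_disk))
    raw_gb = disks * disk_gb
    iops = disks * iops_per_disk
    return {
        "disks": disks,
        "raw_gb": raw_gb,
        "stranded_gb": raw_gb - need_gb,    # capacity bought but not needed
        "iops": iops,
        "surplus_iops": iops - need_iops,   # performance bought but not needed
    }


if __name__ == "__main__":
    # Workload figures are the article's; 120 IOPS per disk as in the text.
    print("ERP, 73GB disks:    ", dedicated_array(100, 1200, 73, 120))
    print("Shares, 73GB disks: ", dedicated_array(1500, 100, 73, 120))
    print("Shares, 147GB disks:", dedicated_array(1500, 100, 147, 120))
```

Under these assumptions the ERP stripe strands 630GB of capacity, while the file server on 73GB disks needs 21 disks and ends up with 2420 IOPS it will never use. Moving to 147GB disks cuts the file server to 11 disks but still leaves well over ten times the required IOPS.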
Enter the SAN: You purchase a SAN with a stripe of 12 or 14 of the 147GB drives, and not only does your ERP system have the IOPS it needs, but you also have the extra capacity for your network shares on the same device. Everyone is happy, and you have saved yourself the pain of buying 2-3 times more disks, monitoring them in separate systems, and so on. Now, of course, multiply this effect over a typical environment with potentially dozens or hundreds of servers and you will see the clear advantage over local storage.
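To make the consolidation argument concrete, here is a minimal sketch comparing separate per-server stripes against one shared pool, using the article's two workloads and 147GB drives. It assumes 1.5TB means 1500GB and leaves out RAID parity and hot spares, so the pooled figure is a floor rather than a recommendation; the names are illustrative only.

```python
from math import ceil

DISK_GB = 147          # disk size from the text
IOPS_PER_DISK = 120    # per-disk IOPS from the text

# Workload figures from the text; 1.5TB is taken as 1500GB (an assumption).
WORKLOADS = {
    "erp_database": {"gb": 100, "iops": 1200},
    "network_shares": {"gb": 1500, "iops": 100},
}


def disks_needed(gb, iops):
    """Disks required to meet both capacity and IOPS, ignoring RAID overhead."""
    return max(ceil(gb / DISK_GB), ceil(iops / IOPS_PER_DISK))


# One dedicated stripe per server: each is sized for its own worst dimension.
separate = sum(disks_needed(w["gb"], w["iops"]) for w in WORKLOADS.values())

# One shared pool: capacity and IOPS are summed first, then sized once.
pooled = disks_needed(
    sum(w["gb"] for w in WORKLOADS.values()),
    sum(w["iops"] for w in WORKLOADS.values()),
)

if __name__ == "__main__":
    print("separate stripes:", separate, "disks")
    print("shared pool:     ", pooled, "disks")
```

With these numbers the separate stripes need 21 disks in total while the pool needs 11, because the capacity-heavy and IOPS-heavy workloads fill each other's gaps. The 12 or 14 drives mentioned in the text presumably leave room for the parity and headroom this sketch omits.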
SANs are built with redundancy for uptime and the prevention of data loss as primary goals. There are two controllers, in either an Active/Passive or an Active/Active configuration. Ignore what most vendors say when they start talking about which of those configurations is better. Both have downsides if the system is underpowered for the load, and both are just fine if the system is properly configured and managed. Both also have advantages, but Active/Active is the technically superior solution and usually carries a premium cost. In either case, a properly sized system can continue working without interruption if either controller fails at any time. Multipathing is used to allow the servers to see the storage through both controllers, which increases the path redundancy of the storage connections. Multiple power supplies, advanced RAID configurations, larger caches, and ultimately more performance can be had by using a SAN rather than separate physical disks for every server.
A SAN also facilitates technologies such as VMware’s vMotion, which has the ability to move a running virtual machine from one physical server to another without any downtime. It also can be used for shared storage in a number of Microsoft cluster scenarios. All of these increase availability. Some SANs also feature mirroring capabilities which can protect against SAN failure. It should be noted that, with few exceptions for specific vendors, the vast majority of SAN outages are due to power issues and not the failure of the SAN itself.
A SAN also allows for easier adjustments of disk storage capacity and performance. Logical drives can be expanded independently of the physical storage capacity available. In other words, if your server needs 10GB more capacity, you do not need to rebuild your local RAID array or purchase another RAID set for that server; you can just expand the disk presented from the SAN.
SANs use a combination of cache memory, SSD, and/or Flash to increase the performance of writes and frequently accessed data. These more complicated configurations would be even more costly to configure on individual servers.
More recently, vSAN has become popular. vSAN is essentially a virtual SAN built out of the local disks on individual servers. Hyperconverged solutions use this to expand storage seamlessly as new servers are added, without the need for a separate SAN. VMware’s vSAN is the primary example here, but there are several competing flavors of technology that provide this functionality. The advantage of vSAN over SAN is primarily its expansion simplicity. All vSAN solutions require more overhead on the hosting server than a similar solution configured with a traditional SAN. Before purchasing a hyperconverged solution, consider whether rapid or frequent expansion is a primary concern. If it is not, then a traditional SAN may suit the situation better.
I hope this has made it clear that there are significant advantages to SAN over local disk. The differences between vSAN and SAN are slightly more complicated, and some virtual SAN solutions are better than others. Choosing between them is something that Chi can help with after learning about your environment and needs.
Have questions or just want to discuss technology? Please feel free to contact me directly at pcomfort@chicorporation.com.