Last month we started talking about SANs and why they make sense in an enterprise environment. This month I am going to discuss a SAN feature called Thin Provisioning. As before, this article will simplify some concepts in order to be relevant to more people. If you wish to discuss this in more depth, please contact me or continue the conversation in the comments.
I mentioned that in a basic SAN you can grow the size of a logical disk independently of the physical disk that backs it. The next step in that feature set is thin provisioning. Thin provisioning allows you to allocate a logical disk the same way, except that the storage reports actual usage of the bits in that disk rather than the fully allocated capacity. So a newly allocated 1TB disk uses zero actual disk space.
There are two places where thin provisioning can come into play. The first is at the VMware level and the second is at the SAN level. Each layer can use thin or thick provisioning independently of the other, assuming your SAN has the ability to do thin provisioning. The rest of this article discusses the consequences of doing so, assuming a SAN that can do thin provisioning and a VMware environment.
If you are doing a combination of thin and thick provisioning between the SAN and VMware, your available free space will be different between the two.
VMware reports free capacity based on how the virtual disks are created:
- Thin provisioned disk: Used space only.
- Thick provisioned disk: Provisioned space.
If VMware is thick provisioned and your SAN is thin provisioned, both VMware and the SAN are correct about how much free capacity you have even though they show different values.
Consider the following examples and definitions on a single layer of abstraction:
- Thick provisioning: I have 100G of capacity. A 100G disk is marked as used and cannot be allocated to any other server whether or not there is data in that disk. A 100G OS disk that has 15G of data on it is reported as using 100G.
- Thin provisioning: I have 100G of capacity. A 100G disk is marked as allocated, but only the bits in it that are used are marked as used. A 100G OS disk that has 15G of data on it is reported as using 15G.
- Overprovisioning: I have 100G of capacity. I thin provision three 100G disks to three servers. Each disk has 15G of data on it and reports as using 15G. The total capacity is reported as 300G allocated, 45G used. You are 300% overprovisioned (300G allocated against 100G of capacity).
The above is the same for VMware as it is on your SAN.
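The three definitions above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how any particular SAN or VMware version does its accounting; the `Disk` class and both function names are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    size_gb: float     # capacity presented to the server
    written_gb: float  # data actually written to the disk
    thin: bool         # True = thin provisioned, False = thick


def reported_used_gb(disk: Disk) -> float:
    """Space the storage layer counts as used for this disk."""
    # Thin disks count only written data; thick disks count their full size.
    return disk.written_gb if disk.thin else disk.size_gb


def provisioning_percent(capacity_gb: float, disks: list[Disk]) -> float:
    """Allocated space as a percentage of the real capacity."""
    allocated_gb = sum(d.size_gb for d in disks)
    return 100.0 * allocated_gb / capacity_gb


capacity_gb = 100
thick_disk = Disk(size_gb=100, written_gb=15, thin=False)
thin_disk = Disk(size_gb=100, written_gb=15, thin=True)
three_thin = [Disk(size_gb=100, written_gb=15, thin=True) for _ in range(3)]

print(reported_used_gb(thick_disk))                    # 100
print(reported_used_gb(thin_disk))                     # 15
print(sum(reported_used_gb(d) for d in three_thin))    # 45
print(provisioning_percent(capacity_gb, three_thin))   # 300.0
```

Note that the percentage here is allocated space divided by real capacity, which is the convention this article uses when it talks about being "300% overprovisioned".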
When we then consider two layers of abstraction, VMware and the SAN, it can become confusing. The best way to think of it is to imagine two separate systems that don’t really understand each other, or even that the other one exists. I’ll try to make a simple example:
Your SAN says I have 34TB of capacity in this pool and that’s a fact. I can allocate that capacity to hosts as I please. In fact, I can allocate ten 4TB thin provisioned disks to VMware. Now I’m about 118% overprovisioned (40TB allocated against 34TB), but I expect that the disks I allocated won’t be filled, so it is fine. Whatever VMware does is fine so long as the total used capacity of all of those 4TB disks doesn’t exceed 34TB.
In VMware, if you thick provision a 500G disk from a 4TB datastore, the datastore immediately reports 3.5TB free. Your thin provisioned SAN will still report 4TB in that logical unit and 34TB total capacity.
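To see why both layers can be right while showing different numbers, here is a rough Python sketch of the same example. Assumptions that are mine and not part of the example: decimal units (1TB = 1000G), a hypothetical 50G of data actually written inside the 500G virtual disk, and a SAN that only counts blocks that have really been written (how zeroed blocks from thick virtual disks are counted varies by array).

```python
GB_PER_TB = 1000  # decimal units, assumed for simplicity

# SAN layer: one 34TB pool, ten thin provisioned 4TB disks handed to VMware.
pool_capacity_gb = 34 * GB_PER_TB
lun_sizes_gb = [4 * GB_PER_TB] * 10
san_allocated_gb = sum(lun_sizes_gb)
san_percent = 100 * san_allocated_gb / pool_capacity_gb

# VMware layer: a 4TB datastore with one 500G thick provisioned virtual disk.
datastore_size_gb = 4 * GB_PER_TB
virtual_disk_size_gb = 500
virtual_disk_written_gb = 50  # hypothetical figure for illustration

# VMware counts the whole thick disk as used, whatever is inside it.
vmware_free_gb = datastore_size_gb - virtual_disk_size_gb

# The thin SAN (under the assumption above) counts only written blocks.
san_used_gb = virtual_disk_written_gb
san_free_gb = pool_capacity_gb - san_used_gb

print(round(san_percent, 1))  # 117.6
print(vmware_free_gb)         # 3500
print(san_free_gb)            # 33950
```

VMware sees 3.5TB free in the datastore while the SAN sees almost the whole pool free. Neither number is wrong; they answer different questions.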
Note: if you thin provision in both VMware and your SAN, your storage environment does become easier to damage, because it is easier to overprovision to the point that you run out of disk space.
In order to reduce that likelihood, some people choose to only overprovision in their SAN and not in VMware, or only in VMware and not on their SAN. The consequence then is that VMware and the SAN report different numbers for capacity, but you only have one place to look to understand where you may have overprovisioned too much.
When both environments are
Should I overprovision, or how do I calculate overprovisioning?
If you don’t overprovision at all, or too little:
- You will never run out of capacity on a server without the server admin knowing.
- Simple to understand and do accounting at the storage level.
- Once allocated, you can’t (easily) shrink disks. A lot of disk (and money) ends up wasted.
The right amount of overprovisioning:
- Server admins have some ability to grow their data without asking for disk expansions from storage admins because you can allocate more free space to servers’ virtual disks.
- Most efficient use of actual disk capacity. You are not paying for disks just so they can be filled with zeros.
- It is impossible to find the optimal amount of overprovisioning.
Too much overprovisioning:
- Server admins almost always get the disks allocated that they ask for and everyone thinks that we have plenty of storage.
- The business is able to rapidly grow and change within the limits of physical capacity.
- At some point the physical capacity is going to become critical and you need to purchase disk immediately. The business needs to understand this and maintain budget for this eventuality.
- If physical capacity runs out, all disks in the pool will go read-only until disk is added. This will crash servers and sometimes irreparably break some things like databases.
The goal is optimal overprovisioning to get the best value out of your SAN, but as I mentioned, it’s impossible to find what is optimal. If you do happen to find what is optimal today, that may change tomorrow when someone adds a new server or dataset or someone in accounting uploads their personal photo album to their home drive. You have to lean towards too little overprovisioning because the consequence of too much is just not acceptable. Some businesses make the conscious choice to not overprovision at all because the consequences are too great and/or the procurement process to acquire more disk is too long.
Generally, the most commonly recommended level of overprovisioning when starting out is between 120% and 130%.
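As a rough worked example, here is how that 120-130% range translates into an allocation budget, using the 34TB pool from earlier. The function names are made up for this sketch and the percentage is allocated space divided by physical capacity, as in the rest of the article.

```python
def allocation_budget_gb(physical_gb: float, target_percent: float) -> float:
    """Total space you may allocate at a given overprovisioning level."""
    return physical_gb * target_percent / 100.0


def remaining_budget_gb(physical_gb: float, target_percent: float,
                        already_allocated_gb: float) -> float:
    """Space still available to hand out before passing the target level."""
    budget_gb = allocation_budget_gb(physical_gb, target_percent)
    return budget_gb - already_allocated_gb


pool_gb = 34_000        # 34TB pool, decimal units assumed
allocated_gb = 40_000   # ten 4TB thin provisioned disks

print(allocation_budget_gb(pool_gb, 120))               # 40800.0
print(allocation_budget_gb(pool_gb, 130))               # 44200.0
print(remaining_budget_gb(pool_gb, 120, allocated_gb))  # 800.0
print(remaining_budget_gb(pool_gb, 130, allocated_gb))  # 4200.0
```

In other words, with ten 4TB disks already allocated, that pool has somewhere between 800G and 4.2TB left to hand out before it passes the starting range.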
If your VMware admin is a different person from your storage admin, make them aware that thin provisioning is going on. Also, make your server admins aware that you are thin provisioning. The threat of their disk going read-only should be enough to convince them to check with you before they massively increase the amount of data that they are using. However, best practice is to keep virtual disk sizes reasonable. If you have a server that plans to use 200GB, don’t allocate 1TB just because they might need it. Allocate 250GB and let the server admin know they can have more when they need it. If every disk carries as much slack space as that 1TB allocation would, you just make it easier for an accident on one server to take down all the servers.
If you have any questions or would like to discuss thin provisioning for your environment, please reach out to me at firstname.lastname@example.org.