A virtual machine scale set is an Azure compute resource used to deploy and manage a set of identical virtual machines. When demand rises, for example when a web server receives a lot of traffic, a scale set can spread the load across additional instances. You can also apply an upgrade policy so that a certain number of virtual machines are updated in a given order while the rest keep serving your application.

In this example, we will create two virtual machine scale sets: one based on a Linux VM running Ubuntu Server 18.04 LTS and another based on a Windows VM running Windows Server 2016.

The commands above create a scale set named web server, based on the Ubuntu Server 18.04 LTS image, with one running instance in the tiptoe-rg4 resource group. They also generate the SSH keys for the admin user, which are used to access the scale set VMs remotely.

Once the scale set has been created, we can see it in the Azure portal. For this example, we will simulate a load and use CPU percentage as the metric to test our scale set. Remember that you can scale a virtual machine scale set on different metrics, depending on your application's needs.

The custom scale condition above increases the instance count by 1 each time the average CPU load stays above 70% for two minutes, up to a maximum of five instances if the load stays high.
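
To make the rule easier to reason about, here is a rough Python sketch that simulates it offline. It is not how Azure evaluates autoscale rules internally. The function name, the one-sample-per-minute assumption, and the choice to start a fresh window after each scale-out are all assumptions made for illustration.

```python
from collections import deque


def simulate_scale_out(cpu_samples, threshold=70.0, window=2, step=1,
                       start=1, maximum=5):
    """Return the instance count after each one-minute CPU sample.

    Assumptions (not taken from Azure): one sample per minute, and the
    rule fires when the average of the last `window` samples is above
    `threshold`.
    """
    recent = deque(maxlen=window)
    instances = start
    history = []
    for cpu in cpu_samples:
        recent.append(cpu)
        window_full = len(recent) == window
        if window_full and sum(recent) / window > threshold:
            # Add `step` instances but never exceed the configured maximum.
            instances = min(instances + step, maximum)
            # Start a fresh window after scaling (an assumption of this sketch).
            recent.clear()
        history.append(instances)
    return history


# Twelve minutes of sustained 95% CPU, similar to what the load test produces.
print(simulate_scale_out([95.0] * 12))
```

With sustained load, the simulated count climbs by one every two minutes and then stays at the maximum of five, which is the behaviour we expect to observe in the portal.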
To generate a CPU load, run the following Linux command on the first instance of the virtual machine scale set and wait for about 5 minutes.

dd if=/dev/zero of=/dev/null

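The command reads an endless stream of zero bytes from /dev/zero and discards them by writing to /dev/null. Because it never finishes on its own, it keeps a CPU core busy until you stop it (for example with Ctrl+C), which produces a sustained CPU spike.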
In the image below, we can see the instances increasing according to our custom scale condition.

Part Two - Virtual Machine Scale Set on Windows Server 2016

In this second part, we will reuse the idea of the CPU-based scale condition from the first example. I will use the portal to create this scale set, but you can also use the Azure CLI.

I will skip ahead to the Scaling tab and set the initial instance count to 1, with a minimum of 1 and a maximum of 10 VMs. The scale-out rule adds 2 VMs when the average CPU usage holds at the 70% threshold over five minutes, and the scale-in rule removes one VM when the average CPU usage falls to the 25% threshold.
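
As a quick sanity check of these settings, the following Python sketch applies one evaluation of the two rules to a given five-minute CPU average. The function and parameter names are hypothetical, and the inclusive comparisons (at or above 70%, at or below 25%) are an assumption; the operator you pick in the portal decides the real behaviour.

```python
def next_instance_count(current, avg_cpu, minimum=1, maximum=10,
                        out_threshold=70.0, in_threshold=25.0,
                        out_step=2, in_step=1):
    """Apply one evaluation of the scale-out and scale-in rules.

    `avg_cpu` stands for the average CPU percentage over the five-minute
    window. The inclusive comparisons are an assumption of this sketch.
    """
    if avg_cpu >= out_threshold:
        target = current + out_step
    elif avg_cpu <= in_threshold:
        target = current - in_step
    else:
        target = current
    # Keep the result inside the configured minimum and maximum.
    return max(minimum, min(target, maximum))


count = 1
for avg in [85.0, 85.0, 85.0, 85.0, 85.0, 50.0, 20.0, 20.0]:
    count = next_instance_count(count, avg)
    print(f"avg CPU {avg:5.1f}% -> {count} instance(s)")
```

The clamp at the end is why the count stops at 10 under sustained load and never drops below 1 when the VMs are idle.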

I will leave the upgrade policy set to Manual.

Once the scale set has been created, I will connect to an instance remotely. Scale sets are usually placed behind a load balancer.
To generate a CPU spike, I will paste the following script into Notepad and save it as a .vbs file.

While True
Wend

Once the .vbs file is saved, run it N times, where N is the number of logical processors on the VM. Each running copy keeps one logical processor busy. My instance has two, so I will run it twice.
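
If Python happens to be installed on the instance, the same idea can be sketched in a single script that starts one busy loop per logical processor. This is an alternative to the .vbs approach, not part of the original setup; the function names are hypothetical, and unlike the .vbs loop it stops by itself after the given number of seconds.

```python
import multiprocessing
import os
import time


def burn(seconds):
    """Spin in a tight loop for `seconds` and return the iteration count."""
    deadline = time.monotonic() + seconds
    iterations = 0
    while time.monotonic() < deadline:
        iterations += 1
    return iterations


def burn_all_cores(seconds):
    """Start one busy-loop process per logical processor and wait for them."""
    workers = os.cpu_count() or 1
    procs = [multiprocessing.Process(target=burn, args=(seconds,))
             for _ in range(workers)]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
    return workers


if __name__ == "__main__":
    # Short demo run; raise this to about 300 to mirror the five-minute wait.
    started = burn_all_cores(seconds=1.0)
    print(f"kept {started} logical processor(s) busy")
```

Separate processes are used rather than threads because a single Python process normally keeps only one core busy at a time.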

We then wait about 5 minutes. When I head back to the Instances blade in the portal, we should see the number of instances increase.