Drastically Reduce Azure PaaS Hosting Costs in Non-Prod Environments With Scheduled Vertical Scaling

Quick link to repository with one-click deployment to Azure: https://github.com/jraps20/jrap-AzureVerticalScaling

The Problem

Back in the Infrastructure as a Service (IaaS) days, an engineer could schedule Virtual Machines (VMs) to turn off and on outside working hours in non-prod environments. It was a quick way to save money on hosting costs. If you have multiple non-prod environments (Int, Dev, Stage, UAT, etc.), you could turn on the associated VMs at 8am and turn them off at 5pm, for example. This could quickly cut your bill by half or more. With the industry move to Platform as a Service (PaaS), this luxury doesn’t exist.

There are countless ways to trim costs in Azure PaaS, mostly revolving around tweaking individual resources. These savings add up, but they are rarely dramatic.

What is needed is the equivalent of the VM off/on mechanism for App Service Plans.

Scheduled Vertical Scaling of App Service Plans

My solution to this problem was to create a one-click deployment to Azure that adds runbooks to Scale Down and Scale Up all App Service Plans. The runbooks automate the same series of manual steps that would be required to scale each resource individually. There are a few options available with these runbooks:

  • Supply no parameters
    • The runbooks will traverse your entire subscription, find all Resource Groups, traverse all App Service Plans within them, and scale accordingly
  • Supply a Resource Group Name parameter
    • The runbooks will traverse all App Service Plans within the supplied Resource Group and scale accordingly
  • Supply a Resource Group Name and App Service Plan Name parameter
    • The runbooks will scale only the supplied App Service Plan
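
The three parameter combinations above can be pictured with a small sketch. This is not the runbooks' actual code (that lives in the repository); it is a plain Python model, and the `select_plans` function, the `inventory` dictionary, and the resource group names are all made up for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory inventory: resource group name -> App Service Plan names.
Inventory = Dict[str, List[str]]


def select_plans(
    inventory: Inventory,
    resource_group: Optional[str] = None,
    plan_name: Optional[str] = None,
) -> List[Tuple[str, str]]:
    """Return the (resource_group, plan) pairs a run would touch."""
    if plan_name is not None and resource_group is None:
        # The article only describes a plan name together with a
        # Resource Group Name, so a plan name alone is rejected here.
        raise ValueError("plan_name requires resource_group")

    if resource_group is None:
        # No parameters: every plan in every resource group.
        return [(rg, plan) for rg, plans in inventory.items() for plan in plans]

    plans = inventory.get(resource_group, [])
    if plan_name is None:
        # Resource group only: every plan in that group.
        return [(resource_group, plan) for plan in plans]

    # Both parameters: just the named plan, if it exists.
    return [(resource_group, plan) for plan in plans if plan == plan_name]


inventory = {
    "rg-dev": ["cd-hp", "cm-hp"],
    "rg-uat": ["cd-hp"],
}

print(select_plans(inventory))                     # whole subscription
print(select_plans(inventory, "rg-dev"))           # one resource group
print(select_plans(inventory, "rg-dev", "cm-hp"))  # one plan
```

Rejecting a plan name without a resource group is a choice made in this sketch, since plan names are only unique within a resource group.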

For all down-scaling operations, the original values of the resource are stored in Automation Variables. These include important metadata such as:

  • Current service tier, e.g. S1, S2, P1, etc.
  • Current Service Tier name, e.g. Basic, Shared, Standard, Premium
  • Unique settings per App Service (within an App Service Plan)
    • AlwaysOn
    • Use32BitWorkerProcess
    • ClientCertEnabled

During up-scaling, these variable values are read in order to scale the resource back to its original settings.
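
To make the save-and-restore round trip concrete, here is a minimal sketch in plain Python. It makes several assumptions: a dictionary named `variables` stands in for Automation Variables, the state is serialized as JSON, the Free tier is written as SKU `F1`, and the `Plan` class and variable naming scheme are invented for the example. The real runbooks may store and order things differently.

```python
import json
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Plan:
    name: str
    sku: str   # e.g. "S1", "P1"
    tier: str  # e.g. "Standard", "Premium"
    # App Service name -> its settings
    apps: Dict[str, Dict[str, bool]] = field(default_factory=dict)


# Stand-in for Automation Variables: a plain name -> string store.
variables: Dict[str, str] = {}


def scale_down(plan: Plan) -> None:
    # Save the original values first, so they survive the change.
    variables[f"{plan.name}.state"] = json.dumps(
        {"sku": plan.sku, "tier": plan.tier, "apps": plan.apps}
    )
    # Switch each app to values the Free tier accepts, then drop the plan.
    for settings in plan.apps.values():
        settings["AlwaysOn"] = False
        settings["Use32BitWorkerProcess"] = True
        settings["ClientCertEnabled"] = False
    plan.sku, plan.tier = "F1", "Free"


def scale_up(plan: Plan) -> None:
    saved = json.loads(variables.pop(f"{plan.name}.state"))
    # Raise the plan first, then reapply the saved per-app settings.
    plan.sku, plan.tier = saved["sku"], saved["tier"]
    plan.apps = saved["apps"]


plan = Plan(
    "cm-hp", "P1", "Premium",
    {"cm": {"AlwaysOn": True, "Use32BitWorkerProcess": False, "ClientCertEnabled": False}},
)
original_apps = json.loads(json.dumps(plan.apps))  # deep copy for comparison

scale_down(plan)
print(plan.sku, plan.tier, plan.apps)

scale_up(plan)
print(plan.sku, plan.tier, plan.apps == original_apps)
```

The order matters in this model: settings are relaxed before the plan drops, and the plan is raised before the settings are reapplied, because the lower tier would not accept the original values.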

Sitecore/App Service Plan Testing

Video tutorial for how to use this solution

The jrap-AzureVerticalScaling solution has been tested on Sitecore 9.1.0 XP Scaled. This deployment creates 8 App Service Plans. Depending on the “size” chosen, the pricing tiers can vary from Basic 1 (B1) to Premium 2 (P2) and everything in between. At B1 the cost is $54 per month; at P2 it is $292. Nothing prevents these tiers from being changed manually, either. I have seen Staging infrastructure at P3, for example ($584/mo).

All of the App Service Plans deployed with Sitecore contain App Services with unique settings that prevent an immediate drop to the Free tier:

[Legend]

  • App Service Plan
    • App Service
      • Setting(s) preventing an immediate drop to Free tier

Default Sitecore Specifics:

  • cd-hp
    • cd
      • AlwaysOn
      • Use32BitWorkerProcess
  • cm-hp
    • cm
      • AlwaysOn
      • Use32BitWorkerProcess
  • exm-dds-hp
    • exm-dds
      • AlwaysOn
      • Use32BitWorkerProcess
  • prc-hp
    • prc
      • AlwaysOn
      • Use32BitWorkerProcess
  • rep-hp
    • rep
      • AlwaysOn
      • Use32BitWorkerProcess
  • si-hp
    • si
      • AlwaysOn
      • Use32BitWorkerProcess
  • xc-basic-hp
    • cortex-reporting
      • AlwaysOn
      • Use32BitWorkerProcess
      • ClientCertEnabled
    • ma-ops
      • AlwaysOn
      • Use32BitWorkerProcess
      • ClientCertEnabled
    • ma-rep
      • AlwaysOn
      • Use32BitWorkerProcess
      • ClientCertEnabled
    • xc-search
      • AlwaysOn
      • Use32BitWorkerProcess
      • ClientCertEnabled
  • xc-resourceintensive-hp
    • cortex-processing
      • AlwaysOn
      • Use32BitWorkerProcess
      • ClientCertEnabled
    • xc-collect
      • AlwaysOn
      • Use32BitWorkerProcess
      • ClientCertEnabled
    • xc-refdata
      • AlwaysOn
      • Use32BitWorkerProcess
      • ClientCertEnabled

This list can be distilled to 3 key settings within App Services that prevent the reduction of an App Service Plan to the Free tier:

  • AlwaysOn
  • Use32BitWorkerProcess
  • ClientCertEnabled
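
As a rough illustration of how these three settings act as blockers, the check below reports which of them would have to change before an App Service could sit on the Free tier. The Free-compatible values (AlwaysOn off, 32-bit worker on, client certificates off) are inferred from this article, and the `free_tier_blockers` helper is hypothetical, not part of the solution.

```python
from typing import Dict, List

# Values the Free tier accepts for each of the three settings, as inferred
# from the article: AlwaysOn off, 32-bit worker process, no client certs.
FREE_TIER_VALUES: Dict[str, bool] = {
    "AlwaysOn": False,
    "Use32BitWorkerProcess": True,
    "ClientCertEnabled": False,
}


def free_tier_blockers(settings: Dict[str, bool]) -> List[str]:
    """Names of settings whose current value would block a drop to Free."""
    return [
        name
        for name, allowed in FREE_TIER_VALUES.items()
        if name in settings and settings[name] != allowed
    ]


# Example values shaped like the cm and xc-search entries in the list above.
cm = {"AlwaysOn": True, "Use32BitWorkerProcess": False, "ClientCertEnabled": False}
xc_search = {"AlwaysOn": True, "Use32BitWorkerProcess": False, "ClientCertEnabled": True}

print(free_tier_blockers(cm))
print(free_tier_blockers(xc_search))
```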

All of these are accounted for when scaling down the App Services: the settings are stored in Automation Variables. Upon Scale Up, the same Automation Variables are read and used to reapply the previous settings.

When Sitecore is in a scaled-down state (i.e. off), it truly is OFF. The site is inaccessible for a few reasons:

  • The swap to a 32-bit worker process results in DLLs becoming unreadable to the application
  • Disabling client certificates causes all connections to xConnect to fail

Since the goal of this solution is to “turn off” the App Services, an inaccessible site counts as success. If completely disabling the site is unacceptable, there are ways around it: the 64-bit worker process, client certificates, and AlwaysOn are all available at the Basic tier. While not free, Basic can still provide a significant drop in costs. This option is NOT currently implemented in the solution, though it would only require a single line change.
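
To put rough numbers on the Free versus Basic trade-off, here is a back-of-the-envelope sketch. It uses the monthly prices quoted earlier ($292 for P2, $54 for B1) and assumes a plan that is scaled up from 8am to 5pm, Monday to Friday. It also assumes billing is strictly proportional to hours, which is a simplification; the `blended_monthly_cost` function is purely illustrative.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def blended_monthly_cost(
    on_price: float, off_price: float, on_hours_per_week: float
) -> float:
    """Monthly cost of a plan that runs at on_price for part of each week
    and at off_price for the rest.

    Both prices are full-month prices; cost is treated as proportional
    to the share of hours spent at each tier.
    """
    on_share = on_hours_per_week / HOURS_PER_WEEK
    return on_price * on_share + off_price * (1 - on_share)


P2, B1, FREE = 292.0, 54.0, 0.0  # monthly prices quoted in the article
on_hours = 5 * 9                 # 8am-5pm, Monday to Friday (assumed schedule)

to_free = blended_monthly_cost(P2, FREE, on_hours)
to_basic = blended_monthly_cost(P2, B1, on_hours)

print(f"P2 always on:          ${P2:.2f}")
print(f"P2 with Free off-hours: ${to_free:.2f}")
print(f"P2 with B1 off-hours:   ${to_basic:.2f}")
```

Under these assumptions a single P2 plan comes to about $78 per month when dropped to Free off-hours and about $118 when dropped to B1, against $292 when left running.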

Conclusion

Since Azure tracks cost per hour, each hour scaled down is money saved. If you do not use your non-prod environments at all hours of the day, SCALE THEM BACK! If you are not comfortable dropping your solution all the way to the Free tier, the deployed runbooks can be updated to suit your needs. If you have scheduled tasks that run during off-hours, consider rescheduling them so that you can turn off the environment. In my opinion, this solution can result in a drastic reduction in hosting costs. If warm-up time is important, consider creating accompanying runbooks to warm up the apps that were turned off.

If you try out this solution, please reach out to me on Sitecore Slack (jraps) or Twitter and let me know your experience. If there are unaccounted-for edge cases, consider opening an Issue.
