Platform Engineering has a Problem – It’s Infrastructure-as-Code

Published: February 10th, 2023

In the context of cloud operations, the last decade was ruled by DevOps and Infrastructure-as-Code. But what is true DevOps? Is it developers running their own operations? Is it a different job role, where operators learn development skills to automate infrastructure provisioning using Infrastructure-as-Code (IaC) tools like Terraform?

Now that DevOps is widely understood in the latter context, software developers are far from understanding or operating infrastructure. IaC is purely a DevOps tool. When an organization claims to have automated their infrastructure, they mean automation for the efficiency of the DevOps teams, not automation for developers. Developers are still waiting on DevOps for days, for even small infrastructure changes.

While there were efficiencies gained by the current DevOps model, there is an increasing acknowledgement that we require a better approach to enable developer self-service. This has led to the Platform Engineering Discipline.

The high-level goals of platform engineering are:

Developer self-service for significant parts of infrastructure updates without DevOps subject matter expertise.
Built-in security and compliance controls.

Infrastructure-as-Code Limitations in Platform Engineering

From our survey of 40+ enterprises that have made substantial investments in platform engineering, we can see that the prevalent approach is for DevOps teams to build a DevOps platform with IaC as a core underlying technology. They create templates for the organization’s use cases and publish them in a CI/CD pipeline or a self-service catalog.

Here are the top reasons why platforms that build on top of IaC are failing these platform engineering goals:

IaC templates are rigid and fall short of changing developer requirements

It is true that DevOps teams can anticipate cloud infrastructure topologies to some extent and have IaC templates for those with a few customizable parameters. But in a microservices world there are thousands of other workflows and topologies possible, based on changing application needs and security controls. A manual approach, relying solely on DevOps personnel to constantly build and update myriad combinations in static scripts simply can’t scale.

Scripting tools can’t build lifecycle management

In cloud operations, people-triggered changes are only a subset of possible use cases. Many asynchronous operations need to be continuously performed. These range from detection of configurations, from desired state to reverting, or complex configuration scenarios where individual components have to be set up asynchronously and brought together later. It could be as simple as a certain component going down and needing to be restored. IaC is a script and runs when triggered to completion. it has no active lifecycle to operate continuously in the background.

Inability to build a concept of an environment

In any orchestration system, users have a concept of the environment they want to build. When they login to the platform and navigate to their environment, they expect to update, as well as view, the state and aspects of resources in that environment, be it the provisioning status, metrics, logs, faults, audit logs, compliance posture, etc. They may want to perform debugging functions, such as restarting services, SSHinginto a VM, or accessing a resource’s cloud console (S3 Console, for example) using access control boundaries within that environment. For example, in Kubernetes you choose the namespace to be the environment and management software like Rancher on top of K8S to provide these functions. If we must replicate this same concept across a broad infrastructure level platform, we cannot do it with only IaC. We need something like Kubernetes, but one whose scope spans all cloud operations. Terraform is a configuration updating and management software, not an orchestration system.

Many platform teams have tried to work around these problems with point solutions by using a disparate set of jobs for certain aspects of lifecycle management, building a thin UI shim on top of cloud accounts for visualization of resources while redirecting to other systems like DataDog for logging, metrics and alerts. But for most of the use cases, the DevOps team is still very much in the operations workflow. This completely defeats the concept of developer self-service as well as continuous compliance goals.

Learning from Other Successful Platforms

Two recent examples of successful cloud platforms are Amazon Web Services (AWS) in the context of Infrastructure-as-a-Service and Kubernetes for Container Orchestration. These are distributed system implementations using higher-level programming languages like Java and Go. You can’t build such complex systems using scripts and jobs.

Building a true DevOps orchestration platform requires a systems design approach. It also takes expert systems engineers and many years to build and mature, like it did for K8S and IaaS in the Public Cloud.

Article Tags

DuploCloud, IAC

About Venkat Thiruvengadam

Venkat Thiruvengadam is founder and CEO of DuploCloud

View all posts by Venkat Thiruvengadam

Cookie	Duration	Description
cookielawinfo-checkbox-advertisement	1 year	Set by the GDPR Cookie Consent plugin, this cookie is used to record the user consent for the cookies in the "Advertisement" category .
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
CookieLawInfoConsent	1 year	Records the default button state of the corresponding category & the status of CCPA. It works only in coordination with the primary cookie.
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Cookie	Duration	Description
__atuvc	1 year 1 month	AddThis sets this cookie to ensure that the updated count is seen when one shares a page and returns to it, before the share count cache is updated.
__atuvs	30 minutes	AddThis sets this cookie to ensure that the updated count is seen when one shares a page and returns to it, before the share count cache is updated.

Cookie	Duration	Description
__gads	1 year 24 days	The __gads cookie, set by Google, is stored under DoubleClick domain and tracks the number of times users see an advert, measures the success of the campaign and calculates its revenue. This cookie can only be read from the domain they are set on and will not track any data while browsing through other sites.
_ga	2 years	The _ga cookie, installed by Google Analytics, calculates visitor, session and campaign data and also keeps track of site usage for the site's analytics report. The cookie stores information anonymously and assigns a randomly generated number to recognize unique visitors.
_ga_WTGVKVXEZJ	2 years	This cookie is installed by Google Analytics.
_gat_gtag_UA_107693958_2	1 minute	Set by Google to distinguish users.
_gid	1 day	Installed by Google Analytics, _gid cookie stores information on how visitors use a website, while also creating an analytics report of the website's performance. Some of the data that are collected include the number of visitors, their source, and the pages they visit anonymously.
_jsuid	1 year	This cookie contains random number which is generated when a visitor visits the website for the first time. This cookie is used to identify the new visitors to the website.
at-rand	never	AddThis sets this cookie to track page visits, sources of traffic and share counts.
CONSENT	2 years	YouTube sets this cookie via embedded youtube-videos and registers anonymous statistical data.
iutk	5 months 27 days	This cookie is used by Issuu analytic system to gather information regarding visitor activity on Issuu products.
uvc	1 year 1 month	Set by addthis.com to determine the usage of addthis.com service.

Cookie	Duration	Description
IDE	1 year 24 days	Google DoubleClick IDE cookies are used to store information about how the user uses the website to present them with relevant ads and according to the user profile.
loc	1 year 1 month	AddThis sets this geolocation cookie to help understand the location of users who share the information.
mc	1 year 1 month	Quantserve sets the mc cookie to anonymously track user behaviour on the website.
test_cookie	15 minutes	The test_cookie is set by doubleclick.net and is used to determine if the user's browser supports cookies.
VISITOR_INFO1_LIVE	5 months 27 days	A cookie set by YouTube to measure bandwidth that determines whether the user gets the new or old player interface.
YSC	session	YSC cookie is set by Youtube and is used to track the views of embedded videos on Youtube pages.
yt-remote-connected-devices	never	YouTube sets this cookie to store the video preferences of the user using embedded YouTube video.
yt-remote-device-id	never	YouTube sets this cookie to store the video preferences of the user using embedded YouTube video.
yt.innertube::nextId	never	This cookie, set by YouTube, registers a unique ID to store data on what videos from YouTube the user has seen.
yt.innertube::requests	never	This cookie, set by YouTube, registers a unique ID to store data on what videos from YouTube the user has seen.

Cookie	Duration	Description
__gpi	1 year 24 days	No description
_heatmaps_g2g_101137905	10 minutes	No description
cf_7167_id	20 years	No description
cf_7167_person_last_update	session	No description
GoogleAdServingTest	session	No description
prism_252377639	1 month	No description
querylyvid	3 months	No description
xtc	1 year 1 month	No description

Platform Engineering has a Problem – It’s Infrastructure-as-Code

Infrastructure-as-Code Limitations in Platform Engineering

IaC templates are rigid and fall short of changing developer requirements

Scripting tools can’t build lifecycle management

Inability to build a concept of an environment

Learning from Other Successful Platforms

Article Tags

Subscribe to SDTimes

About Venkat Thiruvengadam

Related Articles

It’s time to embrace Monitoring-as-Code

Infrastructure as Code: The case for continuous delivery

AWS CloudFormation introduces the ability to disable automatic rollback

5 tips for securely adopting infrastructure as code