Platform Engineering produces a platform that functions as a crucial software layer used by internal end users, such as developers, operations teams, and other stakeholders. Therefore, it’s essential to manage the platform as a software product, with the same level of rigor, planning, and user-centric focus that you would apply to any customer-facing product.
1 Treat the Platform as a Product
Platform Development and Management
- Dedicated Platform Environments
The platform team should build and maintain the platform in isolated platform development environments. This separation ensures that platform development doesn’t interfere with the regular application development and allows the platform team to experiment, innovate, and iterate without impacting the stability of production environments.
- Versioning and Release Management
The platform should be released and versioned over time, following a clear and consistent versioning convention. This practice ensures that users—your internal customers—are always aware of changes, improvements, and potential deprecations. Regularly communicating these updates helps users understand how the platform evolves and how it benefits their work.
- Incremental Capability Integration
Capabilities are incorporated into the platform gradually, and each one may sit at a different level of maturity at any given time. This incremental approach allows the platform to evolve in alignment with user needs and feedback, so that new features are both valuable and stable by the time users depend on them.
- Traceability and Lifecycle Management
The platform team must establish clear traceability over the platform’s evolution, capabilities, and overall lifecycle. This traceability includes documenting changes, tracking the maturity of each capability, and maintaining a roadmap that outlines future developments. Such transparency helps build trust with users and ensures that the platform remains aligned with the organization’s goals.
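The versioning and traceability practices above can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: it assumes a semantic-versioning convention (MAJOR.MINOR.PATCH), a four-step maturity scale, and the rule that removing a capability is a breaking change. The names `Maturity`, `PlatformRelease`, `next_version`, and `release` are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Maturity(Enum):
    """Assumed maturity scale; adapt to your organization's own levels."""
    ALPHA = 1
    BETA = 2
    STABLE = 3
    DEPRECATED = 4


@dataclass
class PlatformRelease:
    version: str                                  # "MAJOR.MINOR.PATCH"
    capabilities: dict[str, Maturity]             # capability name -> maturity
    notes: list[str] = field(default_factory=list)  # human-readable changelog


def next_version(current: str, breaking: bool, capability_change: bool) -> str:
    """Pick the next version number from the kind of change being released."""
    major, minor, patch = (int(part) for part in current.split("."))
    if breaking:
        return f"{major + 1}.0.0"
    if capability_change:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


def release(previous: PlatformRelease,
            changes: dict[str, Optional[Maturity]]) -> PlatformRelease:
    """Build the next release; a value of None means the capability is removed."""
    capabilities = dict(previous.capabilities)
    notes: list[str] = []
    breaking = False
    for name, maturity in changes.items():
        old = capabilities.get(name)
        if maturity is None:
            # Removing an existing capability breaks its consumers.
            if capabilities.pop(name, None) is not None:
                breaking = True
                notes.append(f"removed {name}")
        else:
            capabilities[name] = maturity
            before = old.name.lower() if old else "new"
            notes.append(f"{name}: {before} -> {maturity.name.lower()}")
    version = next_version(previous.version, breaking, bool(notes))
    return PlatformRelease(version, capabilities, notes)


v1 = PlatformRelease("1.2.0", {"ci": Maturity.STABLE, "secrets": Maturity.BETA})
v2 = release(v1, {"secrets": Maturity.STABLE, "portal": Maturity.ALPHA})
print(v2.version, v2.notes)
```

Keeping the changelog notes next to the version and the maturity map means each release record carries the information users need: what changed, how mature each capability is, and whether the upgrade is breaking.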
Promoting Platform Versions Through the Delivery Pipeline
- Development Consistency
Development teams should build their applications on top of the platform so that the code they produce is compatible with the systems deployed in production. Because the development and production environments are then consistent, this practice largely avoids the "it works on my machine" problem.
- Continuous Integration and Quality Assurance
The platform must be integrated into the Continuous Integration (CI) process to perform quality assurance. By running tests and validations on platform-specific environments, the platform team can ensure that all changes meet the required standards before they are released.
- Product Compatibility and Testing
Products developed on the platform must release versions that are compatible with specific platform versions. These products should be thoroughly tested against the platform to ensure they work seamlessly, reducing the risk of issues in production.
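One way to enforce product and platform compatibility in the pipeline is a small gate that runs before a platform version is promoted. The sketch below assumes each product declares the lowest platform version it was tested against, and that compatibility holds within the same major version only. Both rules are assumptions for illustration, and the names `ProductRelease`, `is_compatible`, and `blocked_products` are hypothetical.

```python
from dataclasses import dataclass


def parse(version: str) -> tuple[int, int, int]:
    """Split "MAJOR.MINOR.PATCH" into a tuple that compares numerically."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


@dataclass(frozen=True)
class ProductRelease:
    name: str
    version: str
    min_platform: str  # lowest platform version this release was tested against


def is_compatible(product: ProductRelease, platform_version: str) -> bool:
    """Same major version, and the platform is at least the tested minimum."""
    required = parse(product.min_platform)
    actual = parse(platform_version)
    return actual[0] == required[0] and actual >= required


def blocked_products(products: list[ProductRelease],
                     platform_version: str) -> list[str]:
    """Names of products that would break if this platform version were promoted."""
    return [p.name for p in products if not is_compatible(p, platform_version)]


products = [
    ProductRelease("billing", "4.1.0", min_platform="2.3.0"),
    ProductRelease("search", "1.0.2", min_platform="1.9.0"),
]
print(blocked_products(products, "2.4.1"))
```

A CI job could fail the promotion whenever the returned list is non-empty, which makes the compatibility requirement visible before anything reaches production.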
2 Implementing Golden Paths
Building Abstractions for Internal Customers
- Golden Paths as Abstractions
The platform team creates Golden Paths—standardized, best-practice workflows and tools—as abstractions for internal customers (developers and other teams) to consume platform capabilities. These Golden Paths simplify complex processes and provide a clear, efficient way to leverage the platform’s features.
- Managing Golden Paths as APIs
Golden Paths should be treated as contracts between the consumers (internal teams) and the platform team. Like APIs, these contracts define the expectations, inputs, and outputs of the platform’s capabilities. This approach ensures that Golden Paths are reliable, maintainable, and can evolve without disrupting the teams that depend on them.
- Exposing Golden Paths
Golden Paths are exposed to users through high-level resources, such as code libraries, templates, or via a developer portal UI. The developer portal serves as a centralized hub where users can access documentation, examples, and tools to interact with the platform efficiently. This user-centric approach ensures that internal customers can easily discover and use the platform’s capabilities, leading to higher adoption and satisfaction.
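To make the "Golden Path as a contract" idea concrete, here is a minimal sketch of a code-library style Golden Path. It assumes a hypothetical "create service" path with a versioned contract, a small set of validated inputs, and a predictable output. The field names, limits, and the `create_service` function are illustrative assumptions, not part of any real platform API.

```python
import re
from dataclasses import dataclass

CONTRACT_VERSION = "v1"                  # bumped only on breaking contract changes
ALLOWED_TIERS = {"dev", "staging", "prod"}


class ContractError(ValueError):
    """Raised when a request does not satisfy the Golden Path contract."""


@dataclass(frozen=True)
class ServiceRequest:
    """Inputs the consumer provides; everything else is decided by the platform."""
    name: str
    team: str
    tier: str = "dev"
    replicas: int = 1


def validate(request: ServiceRequest) -> None:
    """Reject requests outside the contract before any resource is created."""
    if not re.fullmatch(r"[a-z][a-z0-9-]{1,30}", request.name):
        raise ContractError("name must be lowercase letters, digits, or hyphens")
    if request.tier not in ALLOWED_TIERS:
        raise ContractError(f"tier must be one of {sorted(ALLOWED_TIERS)}")
    if not 1 <= request.replicas <= 10:
        raise ContractError("replicas must be between 1 and 10")


def create_service(request: ServiceRequest) -> dict:
    """Return the outputs the contract promises to the consumer."""
    validate(request)
    return {
        "contract_version": CONTRACT_VERSION,
        "service": request.name,
        "owner": request.team,
        "namespace": f"{request.team}-{request.tier}",
        "replicas": request.replicas,
    }


result = create_service(
    ServiceRequest(name="orders-api", team="payments", tier="staging", replicas=2)
)
print(result["namespace"])
```

Consumers depend only on `ServiceRequest` and the returned dictionary, so the platform team can change how the service is actually provisioned without breaking anyone, as long as the contract version stays the same.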
What capabilities should a platform have?
The Platform Engineering Whitepaper released by the CNCF offers a good starting point on this topic:

Here are capability domains to consider when building platforms for cloud-native computing:
Capability | Description | Example CNCF/CDF Projects |
---|---|---|
Web portals for provisioning and observing capabilities | Publish documentation, service catalogs, and project templates. Publish telemetry about systems and capabilities. | Backstage |
APIs for automatically provisioning capabilities | Structured formats for automatically creating, updating, deleting and observing capabilities. | Kubernetes, Crossplane, Operator Framework, Helm, KubeVela |
Golden path templates and docs | Templated compositions of well-integrated code and capabilities for rapid project development. | ArtifactHub, Score |
Automation for building and testing products | Automate build and test of digital products and services. | Tekton, Jenkins, Buildpacks, ko, Carvel |
Automation for delivering and verifying services | Automate and observe delivery of services. | Argo, Flux, Keptn, Flagger, OpenFeature |
Development environments | Enable research and development of applications and systems. | Devfile, Nocalhost, Telepresence, DevSpace |
Application observability | Instrument applications, gather and analyze telemetry and publish info to stakeholders. | OpenTelemetry, Jaeger, Prometheus, Thanos, Fluentd, Grafana, OpenCost |
Infrastructure services | Run application code, connect application components and persist data for applications | Kubernetes, Kubevirt, Knative, WasmEdge; CNI, Istio, Cilium, Envoy, Linkerd, CoreDNS; Rook, Longhorn, Etcd |
Data services | Persist structured data for applications | TiKV, Vitess, SchemaHero |
Messaging and event services | Enable applications to communicate with each other asynchronously | Strimzi, NATS, gRPC, Knative, Dapr |
Identity and secret services | Ensure workloads have locators and secrets to use resources and capabilities. Enable services to identify themselves to other services | Dex, External Secrets, SPIFFE/SPIRE, Teller, cert-manager |
Security services | Observe runtime behavior and report/remediate anomalies. Verify builds and artifacts don't contain vulnerabilities. Constrain activities on the platform per enterprise requirements; notify and/or remediate aberrations | Falco, In-toto, KubeArmor, OPA, Kyverno, Cloud Custodian |
Artifact storage | Store, publish and secure built artifacts for use in production. Cache and analyze third-party artifacts. Store source code. | ArtifactHub, Harbor, Distribution, Porter |
In our experience, though, other capabilities are just as important as those the CNCF lists, such as platform upgrades, disaster recovery, and API management.

An important caveat regarding the scope of the platform team is highlighted in the whitepaper:
While platforms provide essential capabilities, it's critical to note that platform teams should not always implement these capabilities themselves. Managed service providers or dedicated internal teams can maintain the underlying implementations, while the platform serves as the thinnest reasonable layer that ensures consistency across these implementations and meets the organization’s requirements.
Therefore, it’s crucial to establish a clear understanding of ownership for each capability to avoid any grey areas in the stack that could lead to misunderstandings about which team is responsible for what.
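A simple way to surface grey areas is to keep an explicit ownership map and check it for gaps. The sketch below is an illustration only: the capability names come from the list above plus disaster recovery, while the team names and the `unowned` and `by_owner` helpers are hypothetical.

```python
from collections import defaultdict

# Capabilities the platform is expected to offer.
CAPABILITIES = [
    "Application observability",
    "Artifact storage",
    "Identity and secret services",
    "Disaster recovery",
]

# Who maintains the underlying implementation (hypothetical team names).
# An empty string or a missing key means nobody has been assigned yet.
OWNERS = {
    "Application observability": "observability-team",
    "Artifact storage": "managed-service-provider",
    "Identity and secret services": "",
}


def unowned(capabilities: list[str], owners: dict[str, str]) -> list[str]:
    """Capabilities with no owner assigned, in their original order."""
    return [c for c in capabilities if not owners.get(c, "").strip()]


def by_owner(owners: dict[str, str]) -> dict[str, list[str]]:
    """Group owned capabilities by the team responsible for them."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for capability, owner in owners.items():
        if owner.strip():
            grouped[owner].append(capability)
    return {owner: sorted(caps) for owner, caps in grouped.items()}


print(unowned(CAPABILITIES, OWNERS))
```

Running a check like this whenever the capability list changes keeps unassigned capabilities from going unnoticed until an incident exposes them.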