The Staging Environment Best Practices These Engineering Leaders Swear By

As technology continues to evolve, users have less and less patience when it performs poorly. So, in order to provide the best possible product for users, engineers need to follow the right staging environment practices.

But what works best? For two Colorado companies, the approaches differ.

Engineers at luxury retailer Nordstrom follow a continuous delivery system that allows them to test and deliver the various configurations in development at once, Director of Engineering Josh Maletz said. At healthtech company Hear.com, engineers use an isolation method to test and break lower environments in order to get the most utility out of them, DevOps Engineer Jack Cusick said.

Both continuous delivery and isolation have their benefits when it comes to developing staging environments. Built In Colorado caught up with Maletz and Cusick to learn more.

Jack Cusick

DevOps Engineer • Hear.com

The engineers at Hear.com, which provides customers with hearing care, use an isolation method when developing staging environments. One of the results? “Isolation gives our engineers the confidence that any outages or stress testing occurring on staging won’t affect our end customers,” DevOps Engineer Jack Cusick said.

What’s a critical best practice your team follows when developing staging environments?

Isolation is a best practice that our team follows for our staging environment. We use different Kubernetes clusters and AWS accounts for our staging and production environments. There are myriad benefits of this practice, but one worth highlighting is a reduced blast radius. Engineers need to feel comfortable testing and breaking lower environments to get the most utility out of them. Isolation gives our engineers the confidence that any outages or stress testing occurring on staging won’t affect our end customers.
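As a rough illustration of the isolation Cusick describes, the sketch below checks that two environments share neither an AWS account nor a Kubernetes cluster. It is not Hear.com's tooling: the `Environment` class, the account IDs and the cluster context names are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    """Where an environment lives. All values used below are made up."""
    name: str
    aws_account_id: str
    kube_context: str


# Hypothetical placeholder values, not real accounts or clusters.
STAGING = Environment("staging", "111111111111", "staging-cluster")
PRODUCTION = Environment("production", "222222222222", "production-cluster")


def assert_isolated(a: Environment, b: Environment) -> None:
    """Raise ValueError if two environments share an account or a cluster."""
    if a.aws_account_id == b.aws_account_id:
        raise ValueError(f"{a.name} and {b.name} share an AWS account")
    if a.kube_context == b.kube_context:
        raise ValueError(f"{a.name} and {b.name} share a Kubernetes cluster")


# Passes silently because the two environments have nothing in common.
assert_isolated(STAGING, PRODUCTION)
```

A check like this could run at the start of a deployment script, so a misconfigured staging job fails before it can touch anything production depends on.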

What processes does your team have in place for monitoring and maintaining the staging environment?

Our team monitors our staging environment with an elegant combination of Kubernetes, New Relic and Slack. However, the tools themselves aren’t terribly important. What’s most important is that we monitor staging in the same way that we monitor production. Our monitoring, alerting and (on our best days) gusto for troubleshooting staging outages are identical to what we have in production. This ensures we catch as many bugs as possible before production.
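One way to picture monitoring staging the same way as production is to define alert rules once and evaluate them per environment. The sketch below is only an assumption about how that might look: the rule names, metrics and thresholds are invented, and it does not call New Relic or Slack.

```python
from typing import Dict, List

# Hypothetical alert rules, defined once and shared by every environment.
ALERT_RULES = [
    {"name": "high_error_rate", "metric": "error_rate", "threshold": 0.05},
    {"name": "slow_responses", "metric": "p95_latency_ms", "threshold": 800},
]


def rules_for(environment: str) -> List[dict]:
    """Return the shared rules tagged with the environment they apply to."""
    return [{**rule, "environment": environment} for rule in ALERT_RULES]


def firing_alerts(environment: str, metrics: Dict[str, float]) -> List[str]:
    """Return the names of rules whose threshold the metrics exceed."""
    return [
        rule["name"]
        for rule in rules_for(environment)
        if metrics.get(rule["metric"], 0) > rule["threshold"]
    ]


sample = {"error_rate": 0.10, "p95_latency_ms": 100}
# The same metrics trigger the same alert in either environment.
assert firing_alerts("staging", sample) == firing_alerts("production", sample)
```

Because both environments read the same rule list, a threshold cannot be loosened for staging without also changing it for production.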


What’s a common mistake engineering teams make when it comes to staging environments?

A common mistake teams make when using their staging environment is incorporating it too late and too briefly in their release cycle. Oftentimes, code is rolling out late in the sprint under tight deadlines. This can lead to corners being cut during QA, smoke testing, etc. In these situations, a staging environment is not used to its fullest and may even do more harm than good. If you’re rolling quickly from testing to production, it may be time to revisit why you have staging and what you use it for. If necessary, look into development processes that pair well with light staging environments, like continuous deployment.


Josh Maletz

Director of Engineering • Nordstrom

For Nordstrom, a luxury fashion retailer that has more than 100 stores and a sizeable e-commerce footprint, there are a lot of systems that need support. As a result, the engineering team at the company follows a continuous delivery system that allows them to test multiple types of configurations in development at once, Director of Engineering Josh Maletz said.

What’s a critical best practice your team follows when developing staging environments?

Due to the variety of systems that we support, we are sticking to the basics and believe that our most critical practice in developing staging environments is continuous delivery. We support the credit services organization in Nordstrom, so we support a wide range of systems. We have staging environments for 20-year-old monolithic applications with hard-wired dependencies to vendor test environments. We also have serverless applications with infrastructure-as-code deployed to the cloud where we employ robust service virtualization techniques to exercise our system against simulated erratic behavior.

When battling these types of dependencies, we seek to test in isolation while also needing deep integration testing. We need flexible staging environments to support various configurations and ways to simulate vendor behavior. The practices supporting continuous delivery — deployment pipelines, automated testing, continuous integration, etc. — provide the optionality we need to build our staging environments for our different types of applications. Having our teams adopt the mindset of continuous delivery — focusing on getting fast feedback on new changes to the systems — has given us clarity to focus on learning how those changes affect our environments.
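To make the idea of simulating erratic vendor behavior concrete, here is a minimal sketch of a stand-in vendor that fails at a configurable rate, plus a caller that retries and then degrades gracefully. It is an illustration only and not Nordstrom's implementation: `SimulatedVendor`, `VendorTimeout` and `authorize_with_retry` are hypothetical names.

```python
import random


class VendorTimeout(Exception):
    """Raised by the simulated vendor to mimic an unresponsive dependency."""


class SimulatedVendor:
    """Hypothetical stand-in for a vendor test environment."""

    def __init__(self, failure_rate: float, seed=None):
        self.failure_rate = failure_rate
        self._rng = random.Random(seed)  # seedable for repeatable test runs

    def authorize(self, amount_cents: int) -> dict:
        # Fail with probability failure_rate; otherwise approve the request.
        if self._rng.random() < self.failure_rate:
            raise VendorTimeout("simulated vendor timeout")
        return {"approved": True, "amount_cents": amount_cents}


def authorize_with_retry(vendor, amount_cents: int, attempts: int = 3) -> dict:
    """Try the vendor a few times, then fall back to a safe answer."""
    for _ in range(attempts):
        try:
            return vendor.authorize(amount_cents)
        except VendorTimeout:
            continue
    return {"approved": False, "reason": "vendor_unavailable"}


# A vendor that always fails exercises the fallback path.
always_down = SimulatedVendor(failure_rate=1.0)
assert authorize_with_retry(always_down, 500)["approved"] is False
```

Setting `failure_rate` somewhere between 0 and 1 lets a staging run exercise both the happy path and the degraded path without depending on a hard-wired vendor test environment.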

What processes does your team have in place for monitoring and maintaining the staging environment?

We use the same logging, monitoring and telemetry tools in our staging environment that we use in production. This allows us to track how new changes affect the performance of the systems both in isolation and as part of the workflows. We can see if we just introduced a new bottleneck, if the validity of the system degrades or if systems halt properly when dependencies are not available, and we can ensure the proper alerting occurs when needed. We use our staging environments for learning and getting fast feedback on our latest changes, which includes knowing that our integration with, and use of, our health support systems is working, too. There is an extra cost associated, but the peace of mind gained from knowing your systems will perform for your customers is well worth it.


What’s a common mistake engineering teams make when it comes to staging environments?

Our staging environments tend to be shared with other teams — business, product, analysts, compliance, etc. These teams may be using the staging environments to evaluate the latest changes or may be getting a demo from the team. We’ve had issues where we will build feature toggles to support the release of a feature, or where we have methods for supporting zero-downtime deployments, but we don’t use them in staging — this is a big mistake.

When pushing changes this fast, we need to ensure the same tools we use to avoid disrupting the customer experience in production are also employed in staging, where we have possible ‘customers’ using the system. Not doing so only creates confusion and undermines our partners’ confidence in our systems. If one needs to test changes in isolation, that is achievable using sandboxes per developer or other methods. If you can get to a place where you can build a brand new environment on demand for a single developer, that means you can build a new environment on demand for anyone, including for staging. As it is with all things, communication is key, and the tools we already have for moving fast in production should be used for staging as well.
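The point about using feature toggles in staging can be sketched as a single toggle check that runs unchanged in every environment. This is a simplified assumption rather than a description of Nordstrom's system: the toggle name and user names are hypothetical.

```python
# Hypothetical toggle table; the same structure would be loaded in
# staging and in production so both behave the same way.
TOGGLES = {
    "new_statement_page": {"enabled_for_all": False, "enabled_for": {"dev-alice"}},
}


def is_enabled(feature: str, user: str, toggles: dict = TOGGLES) -> bool:
    """Return True if the feature is switched on for this user."""
    toggle = toggles.get(feature)
    if toggle is None:
        return False  # unknown features stay off by default
    if toggle.get("enabled_for_all", False):
        return True
    return user in toggle.get("enabled_for", set())


# The developer sees the unfinished feature; a partner sharing staging does not.
assert is_enabled("new_statement_page", "dev-alice")
assert not is_enabled("new_statement_page", "compliance-bob")
```

With the toggle in place, a half-finished feature can be deployed to a shared staging environment without changing what business or compliance partners see during a demo.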

