Fast software release cycles – how to avoid accidents at high speed

Why are fast release cycles so important for software development – and what strategies help to avoid accidents even though the team is producing at high speed?

Blur - fast software releases
Photo by chuttersnap on Unsplash

Fast release cycles create customer value

The goal of every software development team should be to deliver new functionality to the users as soon as possible. Why? Finished software that sits on the shelf waiting for the next release is not usable. It is incomplete work, wasted effort and money. To add value you need to put that shiny new feature into the hands of the customer. Only then do the new features make a difference in the real world. This means your software is only complete after release and deployment. The entire process from development and testing to deployment needs to be optimized for speed.

Fast release cycles enable flexibility

Or think about a situation where your tests have discovered a security problem in your software. Now you need to be able to fix it quickly. Or you may need to adapt to a breaking change in some other consumed service that is not even in your own hands. Things happen, and in cloud world you need to be flexible and able to adapt quickly. Once again – you need to be able to fix fast, but this only helps if you are also fast at testing and deployment. However, nobody wants to be reckless. Jump and see how it goes? You want to be sure that your fix works.

Fast release cycles - but no reckless jump into the unknown
Photo by Victor Rodriguez on Unsplash

Why incremental changes are your friend

The good news is that you won’t change the entire product from one day to the next. If planned accordingly, the team can break the work down into small steps. Ideally these can be tested individually to get immediate feedback. It works? Great. There’s a new problem? OK, we should know pretty well where it comes from, since only a small number of changes occurred since the last good version. And the developers will have these changes fresh in their minds. Fixing the problem should be much easier compared to yesterday’s approach, where many changes came to test a long time after implementation – and all at the same time. So let’s assume the new small change is implemented and tested separately.

Incremental step wise changes help to shorten release cycles
Photo by Lindsay Henwood on Unsplash

The next and final step is to deploy this incremental change – and we’re done? Sounds too good to be true, and indeed… How can you assure that the small change didn’t cause any side effects and break something within the existing overall system? Such a break is called a regression.

The new bottleneck: regression testing

So you need to test for regressions. And this basically means that you need an overall test of the entire system, which is often a huge effort. If you want to be on the safe side you will have to repeat this exercise over and over again, for each small incremental change. Now if such an overall test took days or weeks, it would kill the nice-and-small incremental approach. It would just be too slow and too expensive.

Software test lab
Photo by Ani Kolleshi on Unsplash

The only way out of this dilemma is…

Test automation – the enabler for high speed releases

Imagine a setup where you could prove with a click on a button that your software is doing what it is supposed to do. That today’s changes did not introduce any regression. Test automation aims at achieving exactly this. Manually clicking through an application may still have its place within an overall testing concept. But in no way should this be your normal approach. Put the test procedures in code and execute them automatically. This is what enables quick feedback on code changes – and therefore fast release cycles. This automated approach has the added benefit of repeatability – and of test status reports that the test framework will create automatically if set up accordingly.
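To make “test procedures in code” concrete, here is a minimal sketch using Python and pytest. The tiny pricing rule is invented for illustration; the point is that these checks re-run identically on every change, which is exactly the regression safety net described above.

```python
# Minimal regression test sketch with pytest. The tiny "module under test"
# is inlined here so the example is self-contained; in a real project it
# would live in its own package, with the tests in a tests/ folder.
import pytest


def apply_discount(total: float) -> float:
    """Hypothetical business rule: 10% discount on orders of 100 or more."""
    return total * 0.9 if total >= 100 else total


def test_discount_applies_above_threshold():
    assert apply_discount(200.0) == pytest.approx(180.0)


def test_no_discount_below_threshold():
    assert apply_discount(50.0) == pytest.approx(50.0)


def test_threshold_is_inclusive():
    # The kind of edge case a regression would typically break.
    assert apply_discount(100.0) == pytest.approx(90.0)
```

Run `pytest` in the project folder and every check executes automatically. Wired into a build pipeline, this becomes the click-on-a-button proof described above.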

Does this mean that testers are not required any more? Not at all – rather the opposite is true. Test automation won’t save cost – it is about saving time and improving quality. Or in other words: about avoiding regressions even though the team is going at high speed. This is where testers play a key role. However, with test automation the tester’s focus and know-how profile change completely. Yesterday testing meant manually executing test procedures over and over again. Today it means development and coding of tests. Testers are becoming developers – or developers take more and more responsibility for testing. Welcome once more to DevOps world.

Fast software release – mission accomplished?

So let’s assume the team works with incremental changes. You have automation in place to quickly test the changes for functionality and regressions. We are good to go – now what needs to happen to put the new version into production – into the hands of the users? This will be covered in the next article about deployment automation. Stay tuned.

Cattle or Pet – what IaC means and why you shouldn’t use admin UIs

Why manually installed servers are like pets

Before we look at new approaches, let’s see how IT infrastructure was managed in the past. The life of any IT system usually started with the basic server setup. Joe the admin would plug in the new hardware, configure hard drives and network and then install the operating system. Then, on top of that, whatever software or applications were required. He would do that manually, via scripts that got adapted to the new infrastructure because names, IP addresses etc. had to be changed for each new system. Then Joe would check if everything worked, maybe fine-tune and add whatever was required before putting the shiny new machine into production. In case of a problem he would troubleshoot it and correct his setup. And over time Joe would take care of his machine. Patch it with newer versions of the OS and system software. Look after backups. Maybe extend disks or memory. The server would be like Joe’s dog – a pet.

Pets are unique

If Joe’s pet had a problem, Joe would find out the cause and fix it. Maybe he would have to experiment with one or two settings. Look left and right. However, in the end it always worked. But the machine would get more unique over time – more unlike any other server in the world. A pet. A lot had to happen before Joe would set up his beloved server from scratch. A disaster, like maybe a virus. Then poor Joe would have to go through all his setup steps again, trying not to forget anything.

Fast forward to cloud world. Remember, you don’t own the hardware any more? You basically just rent it. Or you don’t even rent the hardware but just consume services (see: IaaS vs. SaaS). In no way can you continue to handle your infrastructure as Joe did. Well –

You can do that, but then the sky shall fall onto your head!

Why? First of all, you probably need to set up your cloud infrastructure more than once. You will need a production system, but you won’t use that for testing during development. So you need another environment for development. Or you may need to set up your entire infrastructure in another region, or with another provider. Or one day you may want to experiment with a new approach or run separate tests in parallel – yet another system required. It is crucial that all these systems have the exact same configuration. Otherwise be prepared for big surprises during release…

Cloud infrastructure is cattle

The only reasonable way to handle this is automated infrastructure setup. Don’t go down Joe’s road. He could do it manually and survive because he owned the hardware – in cloud world you don’t. Yours is more like this:

A herd of cattle. Your machines and all other cloud building blocks are standardized, and there are potentially many of them available. They don’t have their own personality – at least they should not. If one disappears, some other will take its place. That means you need to be prepared to replace it quickly, and this is where automated infrastructure setup comes in. You will have scripts to set everything up and configure it without ever touching an admin UI. Your infrastructure becomes code. Infrastructure as Code: IaC.

What infrastructure as code means

People often use the term “cattle vs. pets” when talking about IaC. This goes hand in hand with the “immutable server” concept: you never change infrastructure configuration once it is in place. Consider it immutable. Instead, fix your IaC code and run it again to set up a new machine. Delete the old one. This is fast and reliable and always gets you to the exact same state – 100% guaranteed. You can do this as often as required, and in the end you have a piece of (IaC) code that is tested and can be reused whenever required later on. You will probably have parameters that are specific to a certain environment, e.g. the URLs for your development and your production system. Make sure to keep them separate from your IaC code. You want to run the identical code for each variant of your system – this is how you ensure they are all identical.

IaC also solves another challenge: manual changes to your production system. Consider these a high-risk activity – don’t do it. Instead, pre-test and run your updated IaC code. This reduces the chances for human error, ensures reproducible results and at the same time provides full traceability of the production changes.

How to write IaC code

How would you manage and run this IaC code? Since it’s code, you will have it under version control and you should execute it via a build pipeline – see this article on tooling. And how exactly would you write the IaC code? There are several approaches. The large cloud providers each have their own standard, e.g. CloudFormation for AWS or ARM Templates for Azure. Or take a look at Terraform, which adds some nice features such as working across providers. If you need to automate the setup of individual virtual machines (which you should avoid – consider serverless instead), then Puppet, Chef or Ansible are probably the most popular options.
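To tie the two previous points together – a provider template plus environment-specific parameters kept outside the IaC code – here is a hedged Python sketch that deploys one and the same CloudFormation template to several environments via boto3. The template file, stack name and parameter names are assumptions for the example.

```python
# Sketch: one template, many environments. The template (stack.yaml) stays
# identical; only the parameter sets below differ per environment.
import boto3

# Environment-specific values live outside the template.
ENVIRONMENTS = {
    "dev":  {"InstanceType": "t3.micro", "DomainName": "dev.example.com"},
    "prod": {"InstanceType": "t3.large", "DomainName": "www.example.com"},
}


def deploy(env: str, template_path: str = "stack.yaml") -> None:
    """Create the stack for one environment from the shared template."""
    with open(template_path) as f:
        template_body = f.read()

    parameters = [
        {"ParameterKey": key, "ParameterValue": value}
        for key, value in ENVIRONMENTS[env].items()
    ]

    cloudformation = boto3.client("cloudformation")
    cloudformation.create_stack(
        StackName=f"myapp-{env}",
        TemplateBody=template_body,
        Parameters=parameters,
    )


if __name__ == "__main__":
    deploy("dev")  # identical code path for deploy("prod")
```

Because the same code runs for every environment, the environments cannot drift apart – which is exactly the guarantee described above.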



IaaS vs. SaaS – why the difference is very relevant for cloud software

Find out what the difference between IaaS (Infrastructure as a Service) and SaaS (Software as a Service) is – and what this has to do with cloud software architecture.

Lift and shift?

You may have heard of the term lift and shift – meaning you take some existing system or application that has been living happily in a data center for years, rip it out and move it over to some cloud provider. Usually the idea behind this is saving on data center infrastructure and management cost. And usually lift & shift means that you spin up the required number of virtual machines in the cloud and reconfigure their connections and backup settings. Sounds simple, but does this give you the benefits that true cloud solutions could provide? Most likely not. VMs are great, and they have revolutionized app and service provisioning. They have decoupled the 1:1 relationship between software and hardware and allowed for easy sharing of servers between different systems. And they are still going to be around for a long time. Using VMs from some cloud provider means consuming Infrastructure as a Service (IaaS).

However, we are in 2019, and VMs are not the latest-and-greatest technology any more – rather look at SaaS (Software as a Service) as the new mainstream. What does that mean?

Assume you need a database to store your customer and sales data. In ancient times you would have bought a server and installed your operating system and database software. You would have planned for regular updates of your systems to cater for security patches and bug fixes. If your server broke down you were in trouble, and hopefully had a disaster recovery plan that allowed for fast re-installation once the hardware was fixed or replaced. Then the VM concept entered the arena. You could just back up your entire server to a single file. Move it to another machine if required. And the virtual machine concept enabled better use of existing hardware by running several VMs on the same physical machine – a huge step forward.

The IaaS approach

With the cloud you can now use the exact same technology, but without owning any physical hardware yourself. Just sign up with a cloud provider and book as many cloud-based VMs as you need – with size and performance as required. This approach is called IaaS – Infrastructure as a Service. But it still leaves you with maintenance work for the VMs’ operating systems and databases. You still need to maintain and manage those setup procedures and maybe fine-tune your database system parameters. If you need high availability you’ll have to set up several VMs and manage the cluster. And if your database is sitting idle because there’s not much activity during non-office hours, you still pay for every single minute all your VMs are up and running. You could do better: don’t buy the VM for the database – buy the database service itself! Skip that middle layer. The cloud providers offer a broad range of database services – from classical relational DBs to NoSQL stores and caches, all fully managed, with high availability and data backup as rather simple configuration options.
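To give a feel for “buying the database service itself”, here is a sketch that requests a managed PostgreSQL instance from AWS RDS via boto3. All identifiers and sizes are made up for illustration, and in practice the credentials would come from a secrets store and the call would sit in your IaC code rather than an ad-hoc script.

```python
# Sketch: a managed database as a service call instead of a self-managed VM.
# Note how high availability and backups are plain configuration flags.
import boto3

rds = boto3.client("rds")

rds.create_db_instance(
    DBInstanceIdentifier="sales-db",      # hypothetical instance name
    Engine="postgres",
    DBInstanceClass="db.t3.micro",
    AllocatedStorage=20,                  # in GiB
    MasterUsername="dbadmin",
    MasterUserPassword="change-me",       # fetch from a secrets manager instead
    MultiAZ=True,                         # high availability: one flag, no cluster to build
    BackupRetentionPeriod=7,              # automated backups, kept for 7 days
)
```

Compare this with installing, clustering and patching a database across several VMs yourself – the middle layer is simply gone.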

The SaaS approach

This is the SaaS approach – software as a service. It will be much simpler to set up and maintain, it will typically be more cost efficient, and it is easier to scale up if required. Can you do this for existing legacy software that you can’t or don’t want to touch? Probably not – at least not if your legacy software requires a specific version of database xyz with specific settings and configuration. Can you go that way for a new development? Yes, of course – pick the most suitable option for your use case, and don’t forget to take a look at resource pricing before you make your choice. Your system architecture and your selection of cloud components as building blocks will have a huge impact on your future operating cost.

For many use cases the SaaS approach will be more interesting. Go that way if you can. Especially for new developments, consider SaaS over IaaS approaches, and if possible serverless over containers. For many companies, the lift & shift approach for existing applications can be the first step towards cloud-based IT. However, don’t stop there – over time you may want to investigate a more in-depth approach, where the existing application is restructured and optimized to leverage the cloud capabilities.



Cloud Native? What it means to develop software for the cloud

What is cloud native – and what does it mean to develop software for the cloud? Find out what makes software teams’ lives very different in cloud world. What is so special about cloud software?

Is there anything special at all? If ‘cloud’ only meant booking a virtual server and installing the application there – then the answer would be no. However, this would only leverage a small part of what is possible. Hidden within the term “cloud native” is the assumption that you want to realize on-demand scalability, flexible update capabilities and low administration overhead, and be able to adapt infrastructure cost dynamically to performance demands – just to name a few. Sounds wonderful… however these things won’t come all by themselves. Your software needs to be designed, built and deployed according to…

The cloud-rules-of-the-game

Huh, what’s this? First of all, don’t get me wrong: there is nothing bad about having a monolithic application residing on a single server or cluster – and from a development perspective this may be the simplest way to realize it. However, this monolith may get difficult to extend over time without side effects, it may be difficult to set it up for high availability, and it may be difficult to scale it with a growing number of users, data, or whatever else drives your scaling needs. And it may be hard to update your monolith to a newer version without impacting the users during the update.

Software monolith vs microservices

Now consider the same application based on what is called a microservice architecture: the monolith gets split up into smaller, decoupled services that interact with each other as providers and/or consumers of stable APIs.

What makes microservices well suited for cloud applications?

Let’s assume that each service exists not just once but with multiple instances up and running simultaneously. And let’s assume the consuming services are fault tolerant and can handle situations where a provider service doesn’t respond to a call. Wouldn’t that be cool? The overall system would be robust, and it would be very well suited to run on cloud infrastructure.

  • Because now if service xyz is starting to become a bottleneck, you can simply create more instances of that service to handle the extra load. Or even better, this happens automatically according to rules that you have configured up front. This approach is called “scaling out” (compared to the old-school “scaling up” approach where you would get a bigger server to handle more load).
  • Next, imagine that you need to update service xyz to a newer version. One way of doing this is to create additional service instances with the new version and remove the old ones over time – an approach called a “rolling update”.
  • Or you decide to add a new feature to your application. Since only 2 of your 5 services are impacted you will only need to update these 2. Less change is easier to handle and means less risk.

Getting a microservice architecture right is not easy, but once you have it the advantages are huge. Note that the microservice architecture as such has nothing to do with the cloud. However, both go very well together, because in a cloud environment you need to be prepared for micro outages of single services anyway. You don’t really control the hardware any more – at least you shouldn’t want to. In order to be cost efficient with a cloud approach, each service should be able to run on commodity infrastructure. From that you take – and pay for – only as much as required.
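What “fault tolerant” can mean in code: a small Python sketch of a consumer that retries a provider call with backoff and degrades gracefully if the provider stays down. The service URL and the fallback value are invented for the example.

```python
# Sketch of a fault-tolerant consumer: retry with exponential backoff,
# then fall back to the last known good value instead of crashing.
import time
import urllib.error
import urllib.request

LAST_KNOWN_PRICE = "9.99"  # e.g. refreshed from a cache on earlier successes


def fetch_price(url: str = "http://pricing-service/price", retries: int = 3) -> str:
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(url, timeout=2) as response:
                return response.read().decode("utf-8")
        except (urllib.error.URLError, TimeoutError):
            time.sleep(2 ** attempt)  # back off: 1s, 2s, 4s ...
    # Provider is having a micro outage – degrade gracefully.
    return LAST_KNOWN_PRICE
```

A consumer written like this survives the short outages that are normal in a cloud environment, instead of turning every hiccup into a system-wide failure.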

Where will your services live? – serverless vs. containers

For hosting your workers or compute loads, consider a serverless approach over a VM or container based one. This basically means that you only write the service code as such and determine when and how the logic will be triggered for execution. All the rest is handled by the cloud infrastructure. At Amazon’s AWS this technology is called Lambda, Microsoft named it Azure Functions, Google calls it Cloud Functions. The principle is always the same. There’s no virtual machine any more, not even Docker or Kubernetes containers – which means fewer things to manage and look after, and less operations and maintenance effort. And you only pay for the execution time. If nothing happens there’s no cost. If you suddenly require high compute performance, your serverless setup will scale automatically – if you have done your architecture homework. Serverless will e.g. require that your services are stateless: whatever information they need to keep between two executions of the service must be stored externally, e.g. in a cache or database service. As with microservices, the advantages need to be earned. Serverless will not solve all problems, but make sure your team (at least the architect) understands the concepts and can make informed decisions about its use.
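As a sketch of what “stateless” means in practice, here is a minimal AWS Lambda handler in Python that keeps its only piece of state – a visit counter – in DynamoDB. The table and key names are assumptions; the point is that a local variable would not survive between invocations.

```python
# Minimal stateless Lambda sketch: state lives in DynamoDB, not in the function.
import boto3

table = boto3.resource("dynamodb").Table("visit-counter")  # hypothetical table


def handler(event, context):
    # Atomically increment the counter in the external store. Any local
    # variable would be lost whenever this instance is recycled or scaled.
    result = table.update_item(
        Key={"id": "homepage"},
        UpdateExpression="ADD visits :one",
        ExpressionAttributeValues={":one": 1},
        ReturnValues="UPDATED_NEW",
    )
    return {"statusCode": 200, "body": str(result["Attributes"]["visits"])}
```

Because no state lives inside the function, the platform can run one instance or a thousand in parallel without any coordination on your side.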

Other game changers in cloud world

What else has changed in a world where dedicated physical servers seem to have disappeared? Even very trivial things need to be handled differently.

  • There’s no local hard drive any more. If your service runs in a VM or Docker container it may feel like there is, but remember: in cloud world machines are cattle. A VM/container may die or disappear and will be replaced by a new one. Bad luck if you had data on a local drive of that machine. So you’ll need to think about alternative ways of storing your data or settings. In cloud world you may want to use a storage service for that purpose, a central configuration service, a database, environment variables… the choice depends, as always, on the requirements. Make sure the team knows the available cloud building blocks.
  • If you are a Microsoft shop: there’s no registry any more. See above comments.
  • For logging there are no local files any more. These would not make much sense anyway in a world where services are distributed. You’ll rather send log output to a central logging service that consolidates the logs of the various services in one central place, making troubleshooting much easier. There are many open source solutions for logging, or you may just use the one your cloud provider offers – see the sketch after this list.
  • And finally, the term infrastructure gets a whole new meaning in cloud world. Infrastructure still exists, but now it needs to be managed very differently. You should strive to set it up automatically, based on scripts that you can re-run any number of times. This is crucial because you will need more than one cloud system. At least you should have one for development and testing which is separate from the real one – your production system. The two environments should be as identical as possible, otherwise your test results are not meaningful and you will chase phantom problems that are just caused by some infrastructure misconfiguration. Those scripts will set up your required cloud resources. That means you describe the infrastructure in code, just like your service logic. Infrastructure as Code (IaC) is the term for that. Check out this article for more.
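Referring back to the logging point above: a minimal Python sketch of structured logging to stdout, which most cloud platforms pick up and forward to their central logging service. The service name is illustrative.

```python
# Sketch: structured JSON logs to stdout instead of a local file.
# Cloud platforms typically collect stdout and ship it to a central
# logging service, where all service instances are searchable in one place.
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "service": "order-service",   # hypothetical service name
            "message": record.getMessage(),
        })


handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("order-service")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("order accepted")  # arrives in the central log, tagged by service
```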

So what is Cloud Native after all?

Hopefully it became clear that software needs quite a few specific considerations to feel comfortable in cloud world. Which is the reason why “lift and shift” of existing legacy software is a valid approach, but won’t leverage the full cloud potential. In order to run efficiently in the cloud, software must be designed for that purpose – this is the meaning of cloud native: software that is architected and optimized for a cloud environment. For legacy software that usually means: major refactoring or even a complete rewrite may be required for an efficient shift to the cloud.

So welcome to cloud world. Tremendous power and flexibility are at your disposal. However, you’ll need to architect your software with the cloud and its building blocks in mind. Decompose your application into services. Consider serverless approaches to reduce operation effort and improve scalability and availability. Focus on your domain knowledge and your specific value-add. For the basics, use existing building blocks wherever it makes sense.



What does cloud mean – and what are the real advantages?

We’re in 2019 and it seems like all new software projects are designed for the cloud. Seems like. Maybe this is not true yet despite all the hype – but what does ‘cloud’ mean? What are the drivers to use it, and what are the benefits?

Once more the internet is the big game changer. Network bandwidth at close-to-zero cost and with high availability has enabled a shift of IT workloads. Software that had been running on company servers and in internal data centers is gradually being transferred to cloud providers – e.g. Amazon Web Services (AWS), Microsoft Azure and the Google Cloud, to name the three largest ones.

Data Center
Photo by Tanner Boriack on Unsplash

What is “the cloud”?

These companies run huge data centers and sell their compute power in small slices to the masses. Of course IT and software still run on server hardware – but for the users it doesn’t feel like that any more. Users of cloud services are completely shielded from the physical hardware. They don’t need to think about all the basic installation and maintenance work that used to be required to get a large number of computers up and running – backing up data, upgrading and patching the operating system etc. The ‘cloud’ has established a high-level abstraction for all this and made it easy to consume compute power as required.

What started more than 10 years ago with some simple storage services as a sideline business of an online bookstore has since grown into an incredibly versatile web of services, ranging from virtual machines and networks to databases, caches, load balancers, streaming engines and many more basic building blocks of current cloud systems. All of this is available on demand within seconds or minutes. Performance KPIs can be selected and scaled as required. The cloud provider takes care of all the necessary heavy lifting in the background – highly automated and with redundant infrastructure that usually spans multiple data centers. The user just consumes, and only pays for what he needs.

Unlimited scalability

Need to set up a website for 200 users? 5 minutes and we’re online. Need to scale up to 1 million users? Just a few more minutes and here we go. Need to add a terabyte-size database cluster? Minutes again and the system is ready. Compare this to the weeks of planning, ordering of hardware, system installation and configuration that would have been required to set everything up locally. And now assume that the system was only needed for a 3-week marketing campaign. No problem, let’s stop and remove all services when it’s over, and the cost goes back down to zero immediately.

This is the power of the cloud. Virtually unlimited resources and flexibility, on-demand consumption of services and pay-as-you-go pricing models.

Are Cloud systems cost efficient?

Is the use of cloud resources always cost efficient? It depends. The huge system for the 3-week marketing campaign is most likely very cost efficient. Now if you think about replacing your well-managed local company servers and databases with cloud resources – depending on how it is done, you may end up with lower or even higher cost than today. Adopting the cloud usually means much more than just relocating existing servers as they are – that would be called lift & shift; check out this article for more. One thing that is not always easy with cloud services is exact pricing. Since you pay for resources ‘as required’, and resource cost may depend on load, data volume and other dynamic factors, the overall system cost may vary over time. Use the cost estimation tool of your favorite cloud provider and fill it with 2 or 3 typical load scenarios. This should give you a good idea about monthly cost and also highlight your major cost drivers.
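A back-of-the-envelope sketch of why cost depends on the load scenario, comparing an always-on VM with per-request serverless pricing. The prices are purely illustrative placeholders, not actual provider rates – plug in real numbers from your provider’s pricing page.

```python
# Illustrative cost comparison – all prices are made-up placeholders.
VM_PRICE_PER_HOUR = 0.10       # always-on VM: billed whether busy or idle
PRICE_PER_REQUEST = 0.0000002  # serverless: billed per invocation
HOURS_PER_MONTH = 730

for requests_per_month in (1_000_000, 100_000_000):
    vm_cost = VM_PRICE_PER_HOUR * HOURS_PER_MONTH            # 73.00 either way
    serverless_cost = PRICE_PER_REQUEST * requests_per_month
    print(f"{requests_per_month:>11,} requests/month: "
          f"VM ${vm_cost:.2f} vs serverless ${serverless_cost:.2f}")

# At 1M requests serverless costs cents; at 100M the picture shifts –
# which is exactly why you should model 2 or 3 typical load scenarios.
```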

Should all software run in the Cloud?

Cost aside, is each and every workload suitable for the cloud? Again, it depends. Despite highly reliable networks and globally distributed redundant infrastructure – it is nearly impossible to achieve the latency and robustness of a local system with a cloud-based approach. And you would rather not consider a cloud-based approach if your compute requirements are extremely simple and need to be very low cost. You would most likely not shift the control program of your washing machine to some cloud – just put it onto a $5 chip and use it for years to come. Which also eliminates the need to connect your washing machine in some way or another to the cloud…

Are Cloud Systems secure?

Finally, let’s talk about security. For many decision makers this is the most critical question when deciding for or against a cloud-based solution. Is critical data safe in some provider’s cloud data center, when you don’t even know where exactly it is located? And where your data is stored side by side with the data of millions of other customers? The answer is most likely yes, but as always when it comes to security, it depends. Security always requires measures on multiple levels. It starts with the operating system of the servers, the configuration and administration of the various application layers, the approach to authentication and authorization, backup, encryption etc.

Security on autopilot?

When consuming cloud services you can count on your provider to do his part of the job. But you still have to look after yours. An example: the cloud database service you are using may provide encryption and backup features. But it’s up to you to turn them on and manage the keys and access permissions appropriately.
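To make the “turn them on” point concrete: a sketch enabling default server-side encryption on an S3 bucket with boto3. The bucket name is invented; the provider supplies the encryption machinery, but flipping this switch is your job.

```python
# Sketch: the provider offers encryption – enabling it is your responsibility.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_encryption(
    Bucket="my-company-data",  # hypothetical bucket name
    ServerSideEncryptionConfiguration={
        "Rules": [
            {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "aws:kms"}}
        ]
    },
)
```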

In any case, all the capabilities for building highly secure solutions exist – but they need to be used appropriately. Compared to some locally managed PC, your data will most likely be much more secure in the cloud. And the overall system can have higher availability and automated backups if configured correctly.

When it comes to sharing and redistributing data, cloud solutions really shine. It is far easier to provide managed access to data located within some cloud resource than to securely open up your local system to outside access via the internet. So security is an area of concern, but nothing that should keep you away from using cloud systems.



Team productivity and cloud software development

This blog is for you if you want to set up your software development team for success. Produce steady, high quality outcomes based on best practice processes. Leverage the cloud to your advantage.

We’ll talk about team setup. How to organize work, track progress and avoid disasters. You’ll find some basic introductions, best practices, guidelines and checklists.

What I write about has been covered in books, articles, conferences. This blog condenses my personal views and learnings from what has worked and what hasn’t. You may look at it as sharing some practical experience. My views will not be true for everybody. Take whatever is helpful for you and leave the rest aside. This blog will not make you an expert in any of the topics covered – but it can provide an overview and guide you toward further reading.

This blog is for the software adventurers of our time. Join the ride.

Start on the path - towards team productivity, cloud software and other adventures of our time

Some recommendations to start reading:

Photo by Lawrence Walters on Unsplash