Network Automation Foundations

In my previous post, I covered the infrastructure required to get you off the ground with network automation. Even though this is the last post (and a very long one) on the network automation journey, it does not mean your journey finishes here. In fact, from this point onwards, you will be looking at evolving your network automation into something scalable and reliable. If you want to push even further, you will be looking into bringing in the elements required to turn your network into an autonomous network. Watch this space for future blogs on autonomous networks.

In this post, I will cover the foundation elements of network automation. I decided to call them foundation elements because they are the fundamental building blocks for any type of network automation you are going to design and implement. The building blocks are organised by function, and each function has a few examples of tools and frameworks.

A common problem with these tools and frameworks is that there are far too many options out there to choose from. Quite often, a tool is selected based on familiarity instead of technical merit. There is nothing wrong with selecting a tool by familiarity as long as it delivers the function it is supposed to deliver. My advice in this situation is always to select the right tool for the job.

Every network automation architecture will have some common elements. The common elements are essential components required to build a production-ready network automation architecture. These common elements and their examples are presented in the table below.

Orchestration | User Interfaces | Communications
Data Structures | Programming Languages | Software Packaging

Table: Fundamental building blocks for network automation

Let’s analyse the function and examples of each of these common network automation elements.

Note: this is not an exhaustive list of examples for each of the elements. Specific network automation use cases will require different elements and examples than the ones listed here.

Orchestration

An orchestration system is responsible for executing your distributed, isolated and idempotent automated workflows in a particular sequence and manner to achieve a specific outcome. Sounds complicated, huh? In very few words, the orchestration system is the glue that holds together your disparate network automation elements.
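To make the idea of idempotent steps executed in sequence a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: `ensure_vlan`, `run_workflow` and the dictionary standing in for a device are hypothetical names, not part of Ansible, SaltStack or any other tool.

```python
def ensure_vlan(device_state: dict, vlan_id: int, name: str) -> bool:
    """Make sure the VLAN exists with the given name.

    Returns True if something had to change, False if the device was
    already in the desired state (this is what makes the step idempotent).
    """
    vlans = device_state.setdefault("vlans", {})
    if vlans.get(vlan_id) == name:
        return False
    vlans[vlan_id] = name
    return True


def run_workflow(device_state: dict, steps: list) -> list:
    """Run the steps in order and report which ones changed anything."""
    return [(func.__name__, func(device_state, *args)) for func, args in steps]


# Hypothetical in-memory stand-in for a real network device.
state = {}
steps = [(ensure_vlan, (10, "users")), (ensure_vlan, (20, "voice"))]

first_run = run_workflow(state, steps)
second_run = run_workflow(state, steps)
print(first_run)
print(second_run)
```

Running the same workflow twice changes something only the first time, which is what makes it safe for an orchestration system to retry or re-run a step.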

Quite often, the orchestration system doesn’t get the attention it deserves until things either get out of control or grow too big. And sometimes this happens too late in the process and generates a considerable amount of rework. Fortunately, the appearance of Ansible and SaltStack in the networking space has provided a simple orchestration system to get things off the ground quickly. And for many, this is more than enough. So it should come as no surprise that Ansible and SaltStack are listed as examples of orchestration systems.

Many network engineers won’t see Jenkins as an orchestration tool. In fact, Jenkins won’t do what Ansible and SaltStack do. However, Jenkins provides you with a different type of orchestration: pipeline orchestration. Often, Jenkins is used to trigger Ansible playbooks. For instance, if you have a workflow to build a service, each stage of the workflow could become a stage in Jenkins: collect service information, configuration rendering, configuration validation, configuration push, service validation, and close service order. Jenkins will keep track of the success of each stage of the pipeline. If a failure occurs in a particular stage, actions can be programmed in Jenkins to either attempt a fix or roll back the rollout of the service.
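The stage-by-stage behaviour described above can be sketched in a few lines of plain Python. This is not Jenkins pipeline syntax; it only illustrates the pattern of tracking each stage and rolling back on the first failure. All function names, the `ctx` dictionary and the `simulate_outage` flag are hypothetical.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[dict], None]]


def run_pipeline(stages: List[Stage], ctx: dict,
                 rollback: Callable[[dict], None]) -> List[Tuple[str, str]]:
    """Run stages in order; on the first failure, record it, roll back and stop."""
    results = []
    for name, action in stages:
        try:
            action(ctx)
        except Exception as exc:
            results.append((name, f"failed: {exc}"))
            rollback(ctx)
            break
        results.append((name, "ok"))
    return results


# Hypothetical stage implementations; real ones would call your own tooling.
def collect_service_information(ctx):
    ctx["service"] = {"name": "demo-service", "vlan": 100}

def render_configuration(ctx):
    ctx["config"] = f"vlan {ctx['service']['vlan']}"

def validate_configuration(ctx):
    if not ctx["config"].startswith("vlan "):
        raise ValueError("unexpected configuration")

def push_configuration(ctx):
    ctx["pushed"] = True

def validate_service(ctx):
    if ctx.get("simulate_outage"):
        raise RuntimeError("service check failed")

def close_service_order(ctx):
    ctx["order_closed"] = True

def rollback_service(ctx):
    ctx["pushed"] = False


STAGES = [
    ("collect service information", collect_service_information),
    ("configuration rendering", render_configuration),
    ("configuration validation", validate_configuration),
    ("configuration push", push_configuration),
    ("service validation", validate_service),
    ("close service order", close_service_order),
]

for stage_name, status in run_pipeline(STAGES, {"simulate_outage": True}, rollback_service):
    print(f"{stage_name}: {status}")
```

In this run the service validation stage fails on purpose, so the pushed configuration is rolled back and the service order is never closed.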

It is worth noting that there are other commercially available tools not mentioned in this post. Some are quite good and others are convoluted. Depending on your business case, they may be a perfect fit. However, these solutions come with a price tag. It’s a trade-off that ultimately comes down to a risk management and time-to-market conversation.

User Interfaces

Developing network automation usually relies heavily on making API calls. Hence, many of us forget that a UI is sometimes required as part of the network automation. A UI could be a web page to present reports, to push buttons, to visualise workflows, etc. In this section, I have listed a few frameworks that will help you build nice UIs with little effort. There are many other frameworks out there, but the ones listed here are a good way to start.

Django and Flask will provide you with the infrastructure for the UI, as they are rapid application development frameworks. React.js can be used with both Django and Flask to add a nice touch to the look and feel of your UI.

If your goal is to present data in graphs, Grafana is the way to go. Grafana is a powerful tool to quickly generate graphs and dashboards with various types of data.

Communications

Communications play an important role in your automation architecture. Without them, no orchestration can be implemented. Even the most simplistic automation architecture (scripts) will, at the very least, require southbound communications, i.e., communication with the network devices.

The communications elements of your network automation architecture can, essentially, be broken down into two types: protocols and frameworks. Among the protocols, you will find NETCONF, REST, RESTCONF, and gRPC. Among the frameworks, you will find Kafka, ZeroMQ, RabbitMQ, and others. As with the other elements, this is neither an exhaustive list nor a definitive answer for all the communication problems you will have to deal with. These elements generally address most communications requirements. However, when you analyse your automation use case in detail, you may find the need to incorporate other communication tools into your architecture.

NETCONF (RFC 6241) is the standard protocol for network device configuration. With very few exceptions, networking vendors support NETCONF today.
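As a rough illustration of what travels over a NETCONF session, the sketch below builds a `<get-config>` RPC for the running datastore using only Python's standard library. It only constructs the XML; opening the SSH session and sending the message is left out, and the helper name `build_get_config` is my own.

```python
import xml.etree.ElementTree as ET

# Base NETCONF namespace defined in RFC 6241.
NETCONF_NS = "urn:ietf:params:xml:ns:netconf:base:1.0"

# Serialise the NETCONF namespace as the default namespace (no prefix).
ET.register_namespace("", NETCONF_NS)


def build_get_config(message_id: str, source: str = "running") -> str:
    """Return a <get-config> RPC as an XML string for the given datastore."""
    rpc = ET.Element(f"{{{NETCONF_NS}}}rpc", {"message-id": message_id})
    get_config = ET.SubElement(rpc, f"{{{NETCONF_NS}}}get-config")
    source_elem = ET.SubElement(get_config, f"{{{NETCONF_NS}}}source")
    ET.SubElement(source_elem, f"{{{NETCONF_NS}}}{source}")
    return ET.tostring(rpc, encoding="unicode")


print(build_get_config("101"))
```

In practice you would most likely let a NETCONF client library build and send these messages for you, but it helps to know that underneath it is just XML-encoded RPCs.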

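REST is the most widely used approach for system integration, and system-to-system communication will often rely on it. Strictly speaking, REST is an architectural style built on top of HTTP rather than a protocol in its own right.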

Similar to NETCONF but built on REST principles, RESTCONF is a protocol for handling network configuration through HTTP/HTTPS methods. Like NETCONF, RESTCONF provides CRUD (Create, Read, Update, Delete) operations.
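The CRUD mapping can be sketched as follows. The code only builds the HTTP requests with Python's standard library and never sends them. The host name, the `/restconf` root (which a real device may advertise differently) and the helper name `build_restconf_request` are assumptions for illustration; the media type and the method mapping follow RFC 8040.

```python
import json
import urllib.request

# One common CRUD-to-HTTP mapping in RESTCONF. PUT (create or replace)
# is also available but left out here to keep the sketch short.
CRUD_TO_HTTP = {
    "create": "POST",    # sent to the parent resource
    "read": "GET",
    "update": "PATCH",   # merges the payload into the target resource
    "delete": "DELETE",
}

YANG_JSON = "application/yang-data+json"


def build_restconf_request(base_url, operation, resource, payload=None):
    """Build (but do not send) a RESTCONF request for a data resource."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Accept": YANG_JSON}
    if data is not None:
        headers["Content-Type"] = YANG_JSON
    return urllib.request.Request(
        url=f"{base_url}/restconf/data/{resource}",
        data=data,
        headers=headers,
        method=CRUD_TO_HTTP[operation],
    )


# Hypothetical device and interface names.
resource = "ietf-interfaces:interfaces/interface=eth0"
read_req = build_restconf_request("https://device.example", "read", resource)
update_req = build_restconf_request(
    "https://device.example", "update", resource,
    {"ietf-interfaces:interface": {"description": "uplink"}},
)
print(read_req.get_method(), read_req.full_url)
print(update_req.get_method(), update_req.full_url)
```

Authentication and TLS settings are deliberately left out; they depend entirely on the device you are talking to.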

gRPC is a universal and feature-rich RPC framework. It can be used for both device communication and system-to-system communication.

Ultimately, what will dictate the protocols and frameworks you choose is what is available in the network devices and systems you have to deal with. In a greenfield environment, I would try to use gRPC as much as possible.

Data Structures

Data structures are an important part of any network automation. Whether you are automating service provisioning, operational tasks or build tasks, you are always dealing with data. Therefore, it is important to choose a robust data structure. More importantly, choose the right data structure for the job.

There are many data structure options out there: XML, JSON, YAML, GPB and OpenConfig are just a few examples. From my observations of how our industry uses them, the split is generally as follows: JSON is used in northbound interfaces, as most of them rely on REST; XML is primarily used for southbound communication, especially when pushing configurations; GPB is primarily used for streaming telemetry (OpenConfig and general gRPC); and, last but not least, YAML is primarily used for infrastructure configuration.
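To show how the same piece of data looks in two of these formats, here is a small sketch that serialises one interface record as JSON and as XML with Python's standard library. The record and its field names are made up for the example, and the flat XML layout is my own choice rather than any particular YANG model. YAML and GPB are left out because they need third-party packages.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical interface record.
interface = {"name": "eth0", "description": "uplink", "enabled": True, "mtu": 9000}


def to_json(record: dict) -> str:
    """Serialise the record as JSON, as a northbound REST API might return it."""
    return json.dumps(record, sort_keys=True)


def to_xml(record: dict, root_tag: str = "interface") -> str:
    """Serialise the record as flat XML, one child element per field."""
    root = ET.Element(root_tag)
    for key, value in record.items():
        child = ET.SubElement(root, key)
        # XML has no native booleans, so write them as lowercase text.
        child.text = str(value).lower() if isinstance(value, bool) else str(value)
    return ET.tostring(root, encoding="unicode")


print(to_json(interface))
print(to_xml(interface))
```

Note that JSON keeps the boolean and integer types, while in XML everything becomes text unless a schema says otherwise.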

There are many other types of data not mentioned here, though they usually fit very specific use cases. For example, my preference for representing a topology is to use a graph. I generally try to use libraries from the programming language of choice that let me export the graph into something other libraries can read (e.g., text). A good example of a graph library is Python’s networkx.
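Here is a minimal sketch of a topology as a graph. To keep it runnable without installing anything, it uses a plain adjacency dictionary from the standard library rather than a graph library; the node names and helper functions are hypothetical.

```python
from collections import deque


def build_topology(links):
    """Build an undirected adjacency map from (node_a, node_b) link pairs."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, source, target):
    """Breadth-first search: fewest hops between two nodes, or None."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


def export_edge_list(graph):
    """Export the graph as text, one 'a b' link per line, for other tools."""
    edges = sorted({tuple(sorted((a, b))) for a, nbrs in graph.items() for b in nbrs})
    return "\n".join(f"{a} {b}" for a, b in edges)


# Hypothetical four-node topology.
topology = build_topology([("pe1", "p1"), ("p1", "p2"), ("p2", "pe2"), ("p1", "pe2")])
print(shortest_path(topology, "pe1", "pe2"))
print(export_edge_list(topology))
```

A dedicated graph library gives you far more algorithms than this, but the idea is the same: nodes, links, and a text export that other tools can consume.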

The most important thing when choosing the data structures for your automation solution is to keep in mind that you are developing automation for machine-to-machine interaction. Hence, the data structure must be easy for machines to read, not necessarily for humans. It happens more often than you think that we choose a data structure that is very easy for us to read but difficult for machines to read. Being difficult for machines to read means that it is difficult (or computationally expensive) to represent the relationships among data entries. The short story is: choose wisely!

Programming Language

Choosing a programming language for your automation can be tricky because it involves multiple professional and personal aspects. For instance, if you are solely responsible for the coding (very unlikely), you are free to choose the language you are most familiar with. On the other hand, if there is a team, you need to consider the team’s preference. The important thing to remember here is that most network engineers are not like software developers, who can transition from language to language in the blink of an eye.

Whatever language you choose, it needs to meet the technical requirements. My recommendation is to choose a language that you are comfortable with, that meets the technical requirements and that has a rich set of libraries. Additionally, if you are developing your automation using containers, you can always hide the intricacies of the language inside a container. Sometimes, this is key to shortening the development time of your project.

With that, I am not going to suggest that you pick language A or B. However, what I have seen out in the field is a lot of network automation being developed in Python. Additionally, Go is getting a lot of attention and traction these days.

Software Packaging

While you are doing ad-hoc automation, software packaging isn’t really a problem, as most of the time you will be pulling things from git. However, when you start to do really serious automation, you need to consider how you are going to deploy it. More importantly, you need to consider how it is going to scale (preferably horizontally) and how resiliency will be achieved. Most of the scale and resiliency come from the software/automation architecture. However, its packaging plays a crucial role in enabling that scale and resiliency.

There are many options for packaging. However, the go-to option is containers deployed on Kubernetes. If you are using controllers and orchestrators, each of them may have its own packaging solution.

When choosing your packaging option, consider where you are on your automation journey and what your next steps are. If scale and resilience for your automation are on the horizon, then you should consider containers and Kubernetes. But if scale and resilience are still far away for you, you can use a much simpler packaging solution for your automation.

Putting all elements together

Once you have identified all the foundation elements of your network automation, it is time to put them together. As mentioned previously, this depends on how far along you are on your journey. If you are doing simple ad-hoc automation (early stages of the journey), then a Linux cron job on an automation server will do the job.

If you have gone past the ad-hoc automation point, you will be looking for a simple orchestration system that helps you with workflows, execution scheduling and event-driven automation. Ansible Tower, SaltStack and StackStorm are a few examples in this area. There are many others, and all of these have an open source version as well as a commercial version that often has extra features to add value to your network automation.
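Event-driven automation boils down to mapping events to actions. The sketch below shows that idea with a tiny in-process dispatcher in Python. It does not reflect how Ansible Tower, SaltStack or StackStorm implement it; the `EventBus` class, the event name and the handlers are all hypothetical.

```python
from collections import defaultdict


class EventBus:
    """Minimal in-process event dispatcher: event type -> list of handlers."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        """Call every handler registered for the event, in subscription order."""
        return [handler(payload) for handler in self._handlers.get(event_type, [])]


# Hypothetical handlers; real ones would trigger workflows or open tickets.
def collect_diagnostics(event):
    return f"collect diagnostics from {event['device']}"


def open_ticket(event):
    return f"open ticket for {event['device']} {event['interface']}"


bus = EventBus()
bus.subscribe("interface_down", collect_diagnostics)
bus.subscribe("interface_down", open_ticket)

actions = bus.publish("interface_down", {"device": "pe1", "interface": "ge-0/0/1"})
print(actions)
```

In a real deployment the events would typically arrive over one of the messaging frameworks mentioned in the Communications section rather than from a function call.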

The next step on the journey is evolving your network automation to leverage network controllers and orchestrators. When you get to that stage, it means you are probably looking to achieve higher levels of automation on your network. Now, just because you use an orchestrator or network controller, it doesn’t mean you have the magic button. Usually, there is a lot of work involved in getting these things well oiled before you start to see the benefits.

Whichever of the stages described above you are at, consider what you have developed so far, what your next stage is and what approach you want to take. In this analysis, you need to consider the lifecycle of the things you have developed: who created them, who maintains them today and tomorrow, and how you are going to evolve them. Is it worth throwing everything out of the window and replacing it with something else? Sometimes, the answer is yes (unfortunately). That is why it is important to incorporate elements of DevOps (agile) into your network automation development (e.g. CI/CD, faster release cycles, faster iterations, etc.). This will not only enable you to deliver robust and resilient network automation but also enable you to experiment much faster and in a safe way.

This post ends the network automation journey sequence of blog posts. I hope you have enjoyed a bit of not-so-technical conversation. Stay tuned for future posts and feel free to suggest topics through the contact form of the blog.