Troubleshooting Tips for Networking

Cisco Troubleshooting techniques like Divide and Conquer or Resolve or Escalate

Any job in networking requires good troubleshooting skills. You need to identify problems and fix them as fast as you can. To do it, you need to rely on great technical knowledge. However, this is simply not enough. Troubleshooting can be a stressful task, especially when solutions do not come to mind. To save your network – and your job – there are several techniques that will keep you on track even on difficult problems. Practicing them will grow your technical skills over time.

Troubleshooting step-by-step

Locate the Problem

When working on a network, you have to deal with the entire stack. That is, you need to ensure that everything is working, from the cable up to the application. This involves plenty of different technologies: framing, routing protocols, and packet filtering, to name a few. To check them all, you need to follow a specific order.

Depending on the troubleshooting technique you use, there are several orders you can follow. However, the final goal is always the same: exclude layers progressively. Ideally, you start by just knowing that there is a problem. Then, you need to identify where the problem is. Is it at the physical layer? Or at the network layer? To find out, you verify each layer and, once you are sure it has no problem, move to the next one.

In the following sections, we will see different approaches to locate the problem.

Bottom-up and Top-down Troubleshooting

To start this journey in troubleshooting, we should name the bottom-up and top-down approaches. Both are very simple to execute, yet effective.

With the bottom-up approach, you start by verifying the physical layer. First, you check the cabling to see if everything is correctly plugged in. Then, you move to the error counters on the interfaces to see if there is interference. If no problem exists at layer 1, you move to the data-link layer and check there. At this point, you might want to check encapsulation or ARP resolution. No luck even there? You will have to move to the network layer and check the routes. You work your way up the OSI stack until you reach the application configuration itself.

The opposite of the bottom-up is the top-down approach. With that, you will start checking the application configuration. If it is OK, you can move to the transport layer and check for firewalls or packet filters. If there is no problem, you can check the network layer, and continue on this path until you reach the physical layer.
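
The two orders can be sketched in a few lines of Python. This is only an illustration: the layer names follow the OSI model, while `first_failing_layer` and the per-layer check functions are hypothetical placeholders for whatever you actually verify (cabling, ARP, routes, and so on).

```python
from typing import Callable, Dict, List, Optional

# OSI layers from the bottom (physical) to the top (application).
OSI_LAYERS: List[str] = [
    "physical", "data-link", "network", "transport",
    "session", "presentation", "application",
]


def first_failing_layer(
    checks: Dict[str, Callable[[], bool]],
    bottom_up: bool = True,
) -> Optional[str]:
    """Run the per-layer checks in order and return the first layer that fails.

    `checks` maps a layer name to a function that returns True when the
    layer looks healthy. Layers without a check are skipped.
    """
    order = OSI_LAYERS if bottom_up else list(reversed(OSI_LAYERS))
    for layer in order:
        check = checks.get(layer)
        if check is not None and not check():
            return layer
    return None  # every check passed


# Hypothetical results: cabling and ARP are fine, but a route is missing.
example_checks = {
    "physical": lambda: True,
    "data-link": lambda: True,
    "network": lambda: False,
    "application": lambda: True,
}

print(first_failing_layer(example_checks, bottom_up=True))   # network
print(first_failing_layer(example_checks, bottom_up=False))  # network
```

Both directions reach the same faulty layer; what changes is how many healthy layers you verify before getting there.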

Both approaches are great, and they ultimately find the problem. However, they share a major drawback: time. If the problem is at the top of the stack and you started from the bottom (or vice versa), you have to check every layer in between before you reach it. To overcome that, there is a different troubleshooting method.

Divide-and-Conquer Troubleshooting Approach

The Divide-and-Conquer approach is probably the fastest one. Instead of starting from the physical layer or from the application layer, it starts in the middle. Your first step is to identify whether the network layer is working, and act accordingly. In most cases, you will be troubleshooting two applications on remote devices talking to each other. You connect to one device, and if you can ping the other, then you don’t have a routing problem. So, based on the ping, you can decide whether to move down to the data-link layer or up to the transport layer.

Cisco Divide and Conquer troubleshooting approach to isolate a problem fast.

Take a moment to truly understand the visualization of the Divide-and-Conquer troubleshooting approach. Imagine that you need to check why an FTP connection is not working. Here’s what you need to do.

  • Connect to the FTP client and ping the FTP server. If it doesn’t work…
    • Check for routes in routers along the path
    • Verify encapsulation and ARP problems
    • Check interface errors and, ultimately, cabling
  • If it works…
    • Check for access lists or firewalls in the path that filter the application traffic
    • Check the configuration on firewalls (if you have them)
    • Verify the configuration on both FTP client and FTP Server

This simple example shows how to approach troubleshooting a complex flow like FTP. You can apply the same logic to any troubleshooting you need to do.
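
The FTP example above can be written down as a small decision helper. Treat it as a sketch under assumptions: `ping_once` and `next_checks` are hypothetical names, the helper relies on the operating system's `ping` command, and its exit code is only a rough signal of reachability.

```python
import platform
import subprocess
from typing import List


def ping_once(host: str) -> bool:
    """Send one echo request with the system ping command.

    Returns True when the command exits successfully. This is a rough
    signal only: ICMP may be filtered even when the path is fine.
    """
    # Windows ping uses -n for the count; Linux and macOS use -c.
    count_flag = "-n" if platform.system().lower() == "windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=10,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def next_checks(ping_ok: bool) -> List[str]:
    """Return the ordered checks to run, depending on the ping result."""
    if ping_ok:
        # The network layer works: move up the stack.
        return [
            "Check for access lists or firewalls in the path",
            "Check the configuration on firewalls (if you have them)",
            "Verify the configuration on both FTP client and FTP server",
        ]
    # The network layer fails: move down the stack.
    return [
        "Check for routes in routers along the path",
        "Verify encapsulation and ARP problems",
        "Check interface errors and, ultimately, cabling",
    ]


# Example without touching the network: pretend the ping failed.
for step in next_checks(ping_ok=False):
    print(step)
```

On a real client you would call `next_checks(ping_once(server_address))`, where `server_address` is the address of your FTP server.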

Other approaches

There are three more approaches worth knowing about: shoot from the hip, follow the path, and replace components. While the first is not recommended for a novice network engineer, the other two are.

The shoot from the hip approach is just what the name says: quick, impulsive action. When it works, it is the fastest troubleshooting method possible. This is what experienced networkers can do, as they may recognize the pattern behind the problem. Typically, a new problem arises, and you think “I’ve seen this, I must do X to fix it”. If you are experienced enough, you are likely to fix it at the first shot. Maybe you will need to try a few different things, but in the end, you will fix it. However, if you are not confident, you may blow up the network. The key is to understand when it’s time to move from this approach to divide-and-conquer.

With the follow the path approach, if the ping doesn’t work, you run a traceroute. This way, you learn where in the network the problem is. Then, you can hop onto the last device in your trace and continue with divide-and-conquer from there. Rather than being an alternative to divide-and-conquer, it is a handy side-approach.
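
As a minimal sketch of "hop onto the last device of your trace", assume you have already turned the traceroute output into a list of hops, with `None` standing for a hop that did not answer. The function name and the sample addresses (taken from the documentation ranges) are illustrative only.

```python
from typing import List, Optional


def last_responding_hop(hops: List[Optional[str]]) -> Optional[str]:
    """Return the address of the last hop that answered, or None if none did."""
    for hop in reversed(hops):
        if hop is not None:
            return hop
    return None


# Hypothetical trace: three hops answer, then the trace goes silent.
trace = ["192.0.2.1", "198.51.100.1", "198.51.100.9", None, None, None]
print(last_responding_hop(trace))  # 198.51.100.9
```

The returned device is where you would connect next to continue with divide-and-conquer.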

Finally, we have the replace the components approach. This is even easier: you replace something to see whether the problem is generalized or specific to one user or device. A common application is to ask the user: “Are you the only one to have this problem? Do your colleagues have it too? Have you tried from a different device?”. Again, this is a side-approach for divide-and-conquer.

Troubleshooting in a Team

Resolve or Escalate

Networks are just too complex for a single person to handle. Therefore, there is a good chance you are going to work in a team. It’s natural that not everyone has the same skill set and experience level, so someone might be better at troubleshooting than someone else. To be even more precise, troubleshooting is a task of the Operations team. That team is often divided into two groups, first and second level (if the company is big enough). Each level has a specific role and mission.

The first level is the entry point for problems. Its role is to gather all the information from the users and perform basic troubleshooting. If they can fix the problem, that’s great. If they can’t, they need to pass the problem to the second level. The act of moving a problem to a more “advanced” team is called escalation.

The second level team has better skills. Unlike the first level team, which often works with procedures, this team has a deep understanding of the technologies and the infrastructure itself. They are likely to fix the problem or engage the vendor in case they can’t fix it. Here’s a snapshot of the approach.

Cisco Resolve or Escalate troubleshooting approach.

The key concept behind this is called Resolve or Escalate. Ideally, a problem should be in the hands of someone able to solve it, and this is what “resolve or escalate” is all about. When you have to handle a problem, you need to gather all the information you need to understand whether you can fix it or not. If you can, then go ahead and fix it; otherwise, pass it to the next level. There is no reason to wait, so you have to escalate as soon as you understand you won’t fix it.
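
The flow can be modeled as a short loop over the support levels. This is a toy sketch, not a ticketing system: the level names come from the text above, while `resolve_or_escalate` and the `can_fix` functions are hypothetical stand-ins for each team's own judgment.

```python
from typing import Callable, List, Optional, Tuple

# A level is a name plus a function telling whether that team can fix a problem.
Level = Tuple[str, Callable[[str], bool]]


def resolve_or_escalate(
    problem: str, levels: List[Level]
) -> Tuple[Optional[str], List[str]]:
    """Pass the problem up the levels until one of them can resolve it.

    Returns the name of the resolving level (or None) and the list of
    levels that escalated the problem on the way.
    """
    escalated_from: List[str] = []
    for name, can_fix in levels:
        if can_fix(problem):
            return name, escalated_from
        escalated_from.append(name)  # cannot fix it: escalate right away
    return None, escalated_from


# Hypothetical skills for each level.
levels = [
    ("first level", lambda p: p == "password reset"),
    ("second level", lambda p: p in {"password reset", "missing route"}),
    ("vendor", lambda p: True),
]

print(resolve_or_escalate("missing route", levels))
# ('second level', ['first level'])
```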

Documenting problems

If you are working in a team, the same problem is going to be handled by multiple people. Maybe you work in shifts, and you have to leave an unsolved problem for your colleagues at the end of the shift, or maybe you are just escalating. No matter the reason, the other person should not have to repeat the troubleshooting you already did.

To avoid that, you need to document everything. When you start troubleshooting, open a document (a Word file, or even a Notepad file) and write down everything you do. For example, you can note that you excluded the network layer because you checked X, Y, and Z. If you find something that just doesn’t feel right, but that you can’t relate to the problem, highlight it. Maybe a more experienced engineer will be able to relate it.

If you prepare a clear document, the next person doing the analysis will be able to pick up where you left off. This saves a lot of time for both of you, as you don’t need to explain the same things over and over.
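
If you prefer something more structured than a free-form document, a few lines of Python can keep the notes consistent. The class below is a hypothetical sketch: `TroubleshootingLog`, its methods, and the sample entries are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TroubleshootingLog:
    """A plain-text record of what was checked, in the order it was checked."""

    problem: str
    lines: List[str] = field(default_factory=list)

    def excluded(self, layer: str, evidence: str) -> None:
        """Record a layer you ruled out, with the checks that justify it."""
        self.lines.append(f"EXCLUDED {layer}: {evidence}")

    def anomaly(self, note: str) -> None:
        """Highlight something odd that you could not relate to the problem."""
        self.lines.append(f"!! UNEXPLAINED: {note}")

    def render(self) -> str:
        """Return the log as text, ready to hand over to a colleague."""
        return "\n".join([f"Problem: {self.problem}", *self.lines])


# Hypothetical hand-over notes.
log = TroubleshootingLog("FTP transfer from client to server fails")
log.excluded("network layer", "ping from client to server succeeds")
log.anomaly("input errors increasing on the server-facing interface")
print(log.render())
```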

Conclusion

In this article, we have seen some troubleshooting tips that will help you in your daily job. Furthermore, these concepts are part of the CCNA certification, and some of them are also part of the CCNP TSHOOT exam. Here’s what you absolutely need to take with you.

  • Use the divide-and-conquer approach, excluding the network layer as a first step
  • Combine it with follow the path and replace the components side-approaches to be even faster
  • If you can’t solve a problem, escalate it to the next level (resolve or escalate)
  • When troubleshooting, write down everything in a document, in case you have to hand the activity to a colleague

With this knowledge, we are now ready to dive into more advanced concepts of routing and switching, which will be part of the second exam of your CCNA.

Alessandro Maggio

Project manager, critical-thinker, passionate about networking & coding. I believe that time is the most precious resource we have, and that technology can help us not to waste it. I founded ICTShore.com with the same principle: I share what I learn so that you get value from it faster than I did.

2017-07-20T16:30:05+00:00
