Skip to content
This repository was archived by the owner on Sep 17, 2019. It is now read-only.

Added a framework to be able to diff shitty/ios config #277

Open
wants to merge 10 commits into
base: develop
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
189 changes: 189 additions & 0 deletions napalm_base/indent_differ.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,189 @@
from napalm_base.utils import py23_compat

import re


def parse_indented_config(config, current_indent=0, previous_indent=0, nested=False):
"""
This methid basically reads a configuration that conforms to a very poor industry standard
and returns a nested structure that behaves like a dict. For example:

{'enable password whatever': {},
'interface GigabitEthernet1': {
'description "bleh"': {},
'fake nested': {
'nested nested configuration': {}},
'switchport mode trunk': {}},
'interface GigabitEthernet2': {
'no ip address': {}},
'interface GigabitEthernet3': {
'negotiation auto': {},
'no ip address': {},
'shutdown': {}},
'interface Loopback0': {
'description "blah"': {}}}
"""
parsed = IndentedConfig()
while True:
if not config:
break
line = config.pop(0)
last = line.lstrip()
leading_spaces = len(line) - len(last)

# print("current_indent:{}, previous:{}, leading:{} - {}".format(
# current_indent, previous_indent, leading_spaces, line))

if leading_spaces > current_indent:
parsed[last] = parse_indented_config(config, leading_spaces, current_indent, True)
elif leading_spaces < current_indent:
config.insert(0, line)
break
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

break are hidden traps in general, it's good practice to avoid.
I would suggest having a simple for loop going through the lines.

Copy link
Member Author

@dbarrosop dbarrosop Jul 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are break hidden traps? Is there a PEP or something I can look at? I don’t think you can build this with a for loop but if you have ideas I am all ears.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, what is the argument here a break is very common pattern. Why should they be avoided?

Copy link
Member

@mirceaulinic mirceaulinic Jul 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using / avoiding break tends to be very subjective (everyone has a different opinion: https://stackoverflow.com/questions/3922599/is-it-a-bad-practice-to-use-break-in-a-for-loop, https://ubuntuforums.org/showthread.php?t=810001), so this is more like a suggestion.

I tend to avoid, in particular when the length is deterministic, like here (i.e., you are not waiting for some input you don't know how much it is going to take, but the config has a fixed size).
In this case, I think it is even simpler, as you can replace the break with return parsed.

I've dug a bit and finally found what I was looking for: http://wiki.c2.com/?IsBreakStatementArchaic, although speaking mainly about the switch-case, it covers also the usage inside the loops; it's a good article, well explained.

Some snippets:

It should be noted that break originally existed because C was originally (and for the most part still is) essentially a mid-high-level portable assembler.

I don't recommend that it be removed from the language, only that a better alternative is also offered. In other words, extend the language with a better construct. The old one is simply deprecated, but supported.

And I think there are very good alternatives in Python.

Anyway, the main reasoning on why better return in the favor of break:

Use a return statement, which causes execution to leave the current subroutine and resume at the point in the code immediately after where the subroutine was called (return address).

Which is a bit more optimized approach rather than a forced interrupt.

Copy link
Member Author

@dbarrosop dbarrosop Jul 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

particular when the length is deterministic, like here

That's where you are wrong. The length is not deterministic. I don't know where the block ends until I find the end of the block. This is a parser, I am consuming tokens and building an object as I make sense of the unstructured text. I have no prior knowledge or clue of how the original text look like until I look at it and I certainly can't predict how the resulting object will look like.

In this case, I think it is even simpler, as you can replace the break with return parsed.

Consider this case:

while True:
    if not data:
        break
    token = data.pop(0)
    # things happen
    if condition:
        break
return a

What you are proposing is to do this instead:

while True:
    if not data:
        return a
    token = data.pop(0)
    # things happen
    if condition:
        return a

Which is IMHO duplicating code because I have the same return statement twice and if I now have to process a before returning I will have to duplicate more code:

while True:
    if not data:
        # code to process a
        return a
    token = data.pop(0)
    # things happen
    if condition:
        # code to process a
        return a

With a break statement the code is as clear and I avoid duplicating code:

while True:
    if not data:
       break
    token = data.pop(0)
    # things happen
    if condition:
        break
# code to process a
return a

This is an extremely common pattern in parsers and search algorithms:

https://github.com/python/cpython/blob/6f0eb93183519024cb360162bdd81b9faec97ba6/Lib/email/parser.py#L42

If you look at the stdlib and search for parse you will see most parsers replicate that pattern.

And I think there are very good alternatives in Python.

For example?

Which is a bit more optimized approach rather than a forced interrupt.

I tested that statement:

def test_break():
    while True:
        break
    return

for _ in range(0, 10000000):
    test_break()

python test_break.py  1.71s user 0.15s system 99% cpu 1.874 total
def test_return():
    while True:
        return

for _ in range(0, 10000000):
    test_return()

python test_return.py  1.62s user 0.15s system 99% cpu 1.778 total

So, every 10.000.0000 executions we save 0.1s. I think we can leave with the extra cost of break if we save a few bytes of memory and avoid duplicating code :P

else:
if not nested:
parsed[last] = parse_indented_config(config, leading_spaces, current_indent, True)
else:
config.insert(0, line)
break

return parsed


def _can_have_multiple(command):
Copy link

@sjtarik sjtarik Jul 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a platform dependent list, although most Cisco rip-offs use pretty much the same syntax. For NXOS "ip access-list", ""object-group ip" should be added. I wonder if we can scan through the show running config lines or candidate config lines, split the words and check if one word, two words, and three words key have collusion (multiple occurences) if so build this EXACT_MATCHES list dynamically from this collusion list. This will only work if existing config (running or candidate) have collusions though.

"""
This method returns true if a command can have multiple instances of itself.
For example; interface Fa0, interface Fa1, neighor 1.1.1.1, neighbor 1.1.1.2, etc.

It is important this is up to date or the diff might fail to detect changes properly.
"""
EXACT_MATCHES = [
Copy link
Member

@mirceaulinic mirceaulinic Jul 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have two questions:

  1. What happens if a command that can have multiple instances does not appear in this list?
  2. How long is this list going to be?

Copy link
Member Author

@dbarrosop dbarrosop Jul 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. It might fail to generate a proper diff.
  2. Only god and John Chamber know.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I am worried that item 2 might be huge and that might cause this diff to be unreliable and quite a bit of work to maintain.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, maintenance is easy, we just have to add the missing statements as we find them. Lots could actually be autodiscovered as the identifier is going to be an IP address or a number so you can regex the command and try to figure out if there is an identifier. For example, logging syslog 1.1.1.1, ntp server 1.1.1.1, ip address 1.1.1.1. Not sure if it's worth adding the extra complexity/computation as commands are finite, at some point you will have a complete list.

Also, this is best effort. People is free to not trust our diff and go and yell their vendors until they provide a reliable mechanism.

"interface",
"router",
"access-list",
"policy-map",
"ip prefix",
"ipv6 prefix",
"neighbor",
"ip address",
"ipv6 address",
]
return any([command.startswith(e) for e in EXACT_MATCHES])


def _expand(d, action, indent):
"""
Returns a list of (action, subcommand)
"""
result = []
for k, v in d.items():
k = "{}{}".format(" " * indent * 2, k)
result.append((action, k))
result += _expand(v, action, indent+1)
return result


def merge(running, candidate, negators, indent=0):
"""
This method reads a running and a candidate config and returns a list of
(action, command) that are needed to converge.
"""
result = []
for command, subcommands in candidate.items():
if any([command.startswith(n) for n in negators]):
ncmd = " ".join(command.split(" ")[1:])
remove = running.find(ncmd)
for r in remove:
result.append(("remove", "{}{}".format(" " * indent * 2, r)))
result += _expand(running[r], "remove", indent+1)
elif command in running:
r = merge(running[command], subcommands, negators, indent+1)
if r:
result.append(("change", command))
result += r
elif command not in running:
result.append(("add", "{}{}".format(" " * indent * 2, command)))
result += _expand(subcommands, "add", indent+1)

remove = running.find(command)
for r in remove:
result.append(("remove", "{}{}".format(" " * indent * 2, r)))
result += _expand(running[r], "remove", indent+1)

return result


class IndentedConfig(object):

def __init__(self, config=None, comments="!", negators=["no", "default"]):
self.config = config if config is not None else ""

if self.config:
# let's get rid of empty lines and comments
lines_no_blanks = [line for line in self.config.splitlines()
if line.strip() and not line.startswith(comments)]
self.parsed = parse_indented_config(lines_no_blanks)
else:
self.parsed = {}
self.negators = negators

def items(self):
return self.parsed.items()

def keys(self):
return self.parsed.keys()

def values(self):
return self.parsed.values()

def __setitem__(self, item, value):
self.parsed.__setitem__(item, value)

def __getitem__(self, item):
return self.parsed.__getitem__(item)

def __contains__(self, key):
return key in self.parsed

def to_dict(self):
result = {}
for k, v in self.items():
if v:
result[k] = v.to_dict()
else:
result[k] = {}
return result

def find(self, command):
"""
Find commands in self that look like command.

This method relies on _can_have_multiple. When working properly it lets you do things like:

1. If you do `switchport mode trunk` and try to find it with _can_have_multiple
returning False, you would be able to match on `switchport mode access` as well,
which will allow you to realize you are changing one by the other.
2. Similarly as above, you can run `switchport trunk vlan 1,2,3,5` and realize you
are changing the existing command `switchport trunk vlan 1,2`.
3. You can also do `no neighbor 1.1.1.1` and match if _can_have_multiple is True, all
the commands you may have like `neighbor 1.1.1.1 remote-as 12345`,
`neighbor 1.1.1.1 route-map blah in` while not matching at all other neighbors.
"""
if not _can_have_multiple(command):
cmd = " ".join(command.split(" ")[0:-1])
command = cmd if cmd else command
regex = re.compile("^{}.*".format(command))
return [c for c in self.keys() if regex.match(c)]

def diff(self, candidate):
"""
Returns the diff of the configuration when applying the candidate on top of self.
"""
if isinstance(candidate, py23_compat.string_types):
candidate = IndentedConfig(candidate)
result = merge(self, candidate, self.negators)
m = {
"remove": "-",
"add": "+",
"change": " ",
}
return "\n".join(["{} {}".format(m[r[0]], r[1]) for r in result])
83 changes: 83 additions & 0 deletions test/unit/test_indent_differ.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,83 @@
from napalm_base import get_network_driver
from napalm_base import indent_differ

import pytest

import os


BASE_PATH = os.path.dirname(__file__)


driver = get_network_driver("mock")

test_cases_differ = [
"test_case_1",
"test_case_2",
]


class Test_Indent_differ(object):
"""Test Mock Driver."""

def test_parser(self):
candidate = '''
enable password whatever

interface Loopback0
description "blah"
interface GigabitEthernet1
description "bleh"

fake nested
nested nested configuration

switchport mode trunk

interface GigabitEthernet2
no ip address

interface GigabitEthernet3
no ip address
shutdown

negotiation auto'''

parsed = indent_differ.IndentedConfig(candidate)

expected = {'enable password whatever': {},
'interface GigabitEthernet1': {
'description "bleh"': {},
'fake nested': {
'nested nested configuration': {}},
'switchport mode trunk': {}},
'interface GigabitEthernet2': {
'no ip address': {}},
'interface GigabitEthernet3': {
'negotiation auto': {},
'no ip address': {},
'shutdown': {}},
'interface Loopback0': {
'description "blah"': {}}}

assert parsed.to_dict() == expected

@pytest.mark.parametrize("case", test_cases_differ)
def test_basic(self, case):
path = os.path.join(BASE_PATH, "test_indent_differ", case)
optional_args = {
"path": path,
"profile": ["mock"],
}

with driver("blah", "bleh", "blih", optional_args=optional_args) as d:
running = d.cli(["show running config"])["show running config"]

with open(os.path.join(path, "candidate.txt"), "r") as f:
candidate = f.read()

with open(os.path.join(path, "diff.txt"), "r") as f:
expected = f.read()

diff = indent_differ.IndentedConfig(running).diff(candidate)
assert diff.strip() == expected.strip(), diff
17 changes: 17 additions & 0 deletions test/unit/test_indent_differ/test_case_1/candidate.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
enable password whatever
default logging

no interface Loopback0

interface GigabitEthernet1
description "bleh"
switchport mode trunk
switchport trunk vlan 1,2,3,5

interface GigabitEthernet2
no ip address

no interface GigabitEthernet666

router bgp 65000
no neighbor 1.1.1.1
Loading