Route Explorer
enables network operators and engineers to deliver
a more predictable IP network infrastructure,
by providing unprecedented visibility, troubleshooting
and modeling of Layer 3 and routing dynamics in
IP networks. Whether enhancing the efficiency
of day-to-day network operations, reducing MTTR
for complex routing problems, or preparing IP
networks for new converged applications, Route
Explorer provides a tangible return on investment
by helping network engineering and operations
departments meet stringent SLAs for network services,
and increasing their responsiveness and productivity.
Enhanced IP Routing Operations
and Management
Route Explorer provides network operators and
engineers with the ability to understand and see,
for the first time, the dynamic routing operation
of their IP network, improving network availability
and performance, while reducing operating costs.
Route Explorer enables them to monitor their routed
(Layer 3) network in real time, become aware of
and pinpoint the root cause of problems sooner,
analyze historical routing events to troubleshoot
intermittent or past problems, and ensure that
their network is operating as intended.
Below are some examples of common ways in which
our customers use Route Explorer on a day-to-day
basis to monitor, maintain, troubleshoot and analyze
their IP networks. These brief animated examples
provide annotated, step-by-step Route Explorer
screenshots that walk you through using the product
to resolve some typical networking challenges.
Monitoring and Alerting on Changes in
Redundancy
Network redundancy is critical to service availability.
Redundancy is achieved by connecting critical
servers and workgroups to multiple routers. When
a primary route fails, an alternative, redundant
route takes over to provide connectivity. But
how do you know when a route has failed or when
only one path to a destination remains? The best
way to look for impaired redundancy is at the
IP layer. The routing protocols provide the fastest
and most accurate reflection of connectivity in
the network. This application example shows how
Route Explorer can help you monitor the redundancy
in your network.
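As a rough illustration of what "only one path remains" means at the
IP layer, the sketch below removes each link in turn and checks whether
the destination is still reachable. It is a minimal sketch, not Route
Explorer's method: the topology, the node names and both function names
are hypothetical.

```python
from collections import deque

def reachable(links, src, dst):
    """Breadth-first search over undirected links; True if dst is reachable from src."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

def single_points_of_failure(links, src, dst):
    """Return the links whose individual loss would disconnect src from dst."""
    critical = []
    for link in links:
        remaining = [other for other in links if other != link]
        if not reachable(remaining, src, dst):
            critical.append(link)
    return critical

# Hypothetical topology: a server dual-homed to two routers that both reach the core.
links = [("server", "r1"), ("server", "r2"), ("r1", "core"), ("r2", "core")]
print(single_points_of_failure(links, "server", "core"))     # no critical links

# After the r2-core adjacency is lost, only one path remains.
degraded = [link for link in links if link != ("r2", "core")]
print(single_points_of_failure(degraded, "server", "core"))  # links on the last path
```

An empty result means redundancy is intact. A non-empty result lists
the links on the last surviving path, which is the condition worth
alerting on.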
Detecting and Alerting on Flapping Routes
Flapping routes are bad for network availability
and bad for business. IP networks are generally
very robust. If a link or router should fail,
traffic would be rerouted automatically. But when
a route repeatedly goes up and down (“flaps”),
traffic will be rerouted continuously. Network
and application performance may be impacted, leading
to lost productivity and customer dissatisfaction.
While SNMP-based device managers might detect
flapping routes, their polling cycle may result
in detection of the flap after it has been occurring
for some time, or the flap could go completely
unnoticed. Detecting flaps at the IP layer is
the most timely and accurate method. This application
example shows how Route Explorer can alert you
to flapping routes as soon as they happen.
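The difference between polling and event-driven detection can be
sketched with a sliding-window counter that is fed every routing event
as it arrives. This is an illustrative sketch only: the class name, the
threshold of four transitions and the 60-second window are assumptions,
not Route Explorer settings.

```python
from collections import defaultdict, deque

class FlapDetector:
    """Flags a prefix as flapping when it changes state too often within a time window."""

    def __init__(self, max_transitions=4, window_seconds=60.0):
        self.max_transitions = max_transitions    # assumed alert threshold
        self.window_seconds = window_seconds      # assumed sliding window
        self._last_state = {}
        self._transitions = defaultdict(deque)

    def observe(self, prefix, timestamp, is_up):
        """Record one routing event; return True if the prefix is currently flapping."""
        previous = self._last_state.get(prefix)
        self._last_state[prefix] = is_up
        times = self._transitions[prefix]
        if previous is not None and previous != is_up:
            times.append(timestamp)               # a real up/down transition
        # Drop transitions that have fallen out of the sliding window.
        while times and timestamp - times[0] > self.window_seconds:
            times.popleft()
        return len(times) >= self.max_transitions

detector = FlapDetector()
events = [(0, True), (5, False), (10, True), (15, False), (20, True)]
alerts = [detector.observe("10.1.0.0/16", t, up) for t, up in events]
print(alerts)
```

Because every announcement and withdrawal is counted, the alert fires
on the fourth transition. A poller sampling once a minute could see the
route "up" at both samples and miss the flap entirely.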
Verifying Proper Network Operation After
Maintenance Activities
Routine maintenance is a fundamental part of
keeping your network running smoothly. But maintenance
activities can make the network susceptible to
the introduction of configuration errors. A conservative
estimate by Yankee Group puts misconfiguration-induced
network outages at 30% of all network problems.
Manual verification of proper network operation
after a maintenance window is equally prone to
human error, and waiting for trouble tickets to
highlight configuration errors is not good operating
practice. This application example shows how Route
Explorer can provide the quickest and most effective
way to ensure that your network is back up and
running as intended after a maintenance window.
VoIP Network Readiness
Route Explorer helps network engineers ensure
that their IP networks are ready and will remain
steady for VoIP roll-outs, by reducing one of
the chief causes of VoIP quality and availability
problems--network and routing instability. Route
Explorer complements VoIP service management tools
by providing real-time, network-wide insight into
VoIP-affecting Layer 3 dynamics in the underlying
IP network:
Route and Link Flapping
According to studies conducted in carrier-class
networks that are engineered for toll-quality
VoIP delivery, the number one cause of VoIP quality
degradation was link failures and resulting routing
problems such as route flapping. Unstable, flapping
links can be even more disruptive since they cause
constant rerouting of packets. Link and route
flapping cause disruptive levels of latency, as
well as dropped packets. Route Explorer provides
real-time monitoring and alerting on link and
route flapping, and preserves a completely
accurate historical record of routing events in
the network, enabling faster troubleshooting and
root cause analysis, as well as proactive
Call Path Tracing
One of the key steps in troubleshooting VoIP
problems is to understand the precise path that
the troubled calls took across the network when
the problem occurred. Before Route Explorer, there
was no practical and reliable way to perform this
analysis, since the state of routing could have
changed since the problem was seen. Route Explorer
can accurately identify routes between given
IP addresses through the network at any point
in time, aiding problem resolution.
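The idea of tracing a path "at any point in time" can be sketched as
replaying a recorded event log up to the moment of interest and then
running a shortest-path computation on the topology as it stood then.
The event format, router names and metrics below are hypothetical, and
the sketch assumes a simple link-state model with additive metrics.

```python
import heapq

def topology_at(events, when):
    """Replay (timestamp, action, a, b, metric) events up to `when`; return active links."""
    active = {}
    for timestamp, action, a, b, metric in sorted(events):
        if timestamp > when:
            break
        key = tuple(sorted((a, b)))               # treat links as undirected
        if action == "up":
            active[key] = metric
        else:
            active.pop(key, None)
    return active

def shortest_path(active, src, dst):
    """Dijkstra over the active links; returns (cost, [nodes]) or None if unreachable."""
    graph = {}
    for (a, b), metric in active.items():
        graph.setdefault(a, []).append((b, metric))
        graph.setdefault(b, []).append((a, metric))
    heap, done = [(0, src, [src])], set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return cost, path
        if node in done:
            continue
        done.add(node)
        for nxt, metric in graph.get(node, []):
            if nxt not in done:
                heapq.heappush(heap, (cost + metric, nxt, path + [nxt]))
    return None

# Hypothetical recorded history: the preferred link fails at t=300 and recovers at t=600.
events = [
    (0, "up", "gw-a", "core1", 10),
    (0, "up", "core1", "gw-b", 10),
    (0, "up", "gw-a", "core2", 20),
    (0, "up", "core2", "gw-b", 20),
    (300, "down", "core1", "gw-b", 10),
    (600, "up", "core1", "gw-b", 10),
]
print(shortest_path(topology_at(events, 200), "gw-a", "gw-b"))  # path before the failure
print(shortest_path(topology_at(events, 450), "gw-a", "gw-b"))  # path during the failure
```

A call that was reported as degraded at t=450 took the core2 path,
even though a trace run after t=600 would show core1 again.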
Route Optimization
Route Explorer enables network engineers to examine
their entire network to ensure that routing is
stable and optimized for VoIP delivery, across
protocols (OSPF, EIGRP, IS-IS, BGP and MPLS VPNs),
areas and Autonomous Systems. By proactively monitoring
and analyzing the state of routing, network engineers
can see problems as they emerge in the routing
plane, before they affect the forwarding plane.
Route Explorer also provides sophisticated modeling
of failure scenarios and routing metric changes
on the as-running routed topology, allowing network
engineers to ensure that there is sufficient redundancy
and optimal routing.
Internet and Inter-Domain Analysis
and Troubleshooting
Route Explorer's BGP Root Cause Analysis capability
provides network managers a way to identify and
diagnose complex BGP issues affecting mission-critical
Internet or inter-domain connectivity. BGP Root
Cause Analysis provides macro-level visibility
and automated analysis of the causal events that
can trigger millions of BGP routing updates, significantly
decreasing mean time to repair (MTTR) and increasing
service uptime. One of the tools available within
BGP Root Cause Analysis is the Root Cause Animation
feature, which shows a dynamic topology visualization
of the macro-level dynamics indicated by the
raw BGP event stream. Upon selecting an
event timeline of interest and launching the animation
tool, the user can play, slow down, fast-forward,
and rewind the animation to view how the multi-domain
peering structures and routes changed over time.
A multi-domain map and graphic representation
of route volume per router peering provides insight
into how external peers and next-hop peers have
affected IBGP peers and overall routing behavior.
Isolation of root-cause events such as peering
flaps, MED (Multi-Exit Discriminator) oscillations,
misconfigured community tags and unwanted back-door
paths is performed in minutes rather than days.
The animations can be saved in SVG (Scalable
Vector Graphics) format, a W3C
standard for producing high quality graphics.
Adobe has a free SVG browser plugin available
for download.
Below is a brief guide to the graphic representation
of the network in the BGP Root Cause Analysis
animation:
- The thickness of a peering indicates how many
prefixes are routed over that peering, rather
than how much traffic is flowing.
- Link colors indicate how the routes are changing:
- Black means the routes are not changing
- Blue means the peering is losing prefixes
- Green means the peering is gaining prefixes
- Yellow means the prefix count is flapping
too fast to animate
- A peering that has lost prefixes also
has a gray shadow that indicates the largest
number of prefixes it ever carried.
- The Route Explorer is shown as a rectangle
on the left. It (passively) peers with all the
site's BGP edge routers (or core route reflectors
if used), exactly as an interior router would
iBGP peer with them. (I.e., the recorder's view
of the BGP information is exactly the same one
seen by all members of the site's iBGP mesh.)
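The color legend above can be restated as a small decision function.
This is only a sketch of the legend as described: the function names
and the `max_animatable_changes` threshold are hypothetical, since the
text does not say at what rate a link turns yellow.

```python
def link_color(previous_count, current_count, changes_in_frame, max_animatable_changes=10):
    """Map a peering's prefix-count change over one animation frame to a legend color."""
    if changes_in_frame > max_animatable_changes:
        return "yellow"   # prefix count is flapping too fast to animate
    if current_count < previous_count:
        return "blue"     # peering is losing prefixes
    if current_count > previous_count:
        return "green"    # peering is gaining prefixes
    return "black"        # routes are not changing

def shadow_size(peak_count, current_count):
    """Gray shadow: the largest prefix count ever carried, shown only after a loss."""
    return peak_count if current_count < peak_count else None

print(link_color(30000, 30000, 0))    # black
print(link_color(30000, 12000, 3))    # blue
print(link_color(12000, 30000, 3))    # green
print(link_color(30000, 30000, 500))  # yellow
print(shadow_size(30000, 12000))      # 30000
```

Link thickness would be driven by `current_count` alone, since it
reflects how many prefixes are routed over the peering rather than
traffic volume.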
How to use the SVG animation:
At the bottom left of the window is the animation
clock (what point in the timeline of the event
is currently being shown). Below it is a large
"Start/Pause" button (click it once
to start the animation and again to pause). Below
that are buttons that take you to the beginning
or end of the animation. To the right of the Start
button are animation speed controls: the center
square selects "normal" speed (a value
built into the animation at the time it was created).
Each click on the upper triangle will double the
current speed and each click on the lower will
halve it. Below the speed controls is a button
that toggles between one-shot and continuous loop
playback mode. The plot to the right of the controls
shows how the prefixes varied with time on whichever
link is selected in the topology graph (most animations
have an "interesting" link selected at startup,
but you can click on any link to select it).
Click on any point in this graph to take you to
that time in the animation. To the right of the
plot is various information about the currently
selected edge.
Detecting BGP Failover and Slow Convergence
from High Volume BGP Updates
BGP issues can generate an overwhelming number
of routing updates that are beyond human ability
to analyze effectively in a timely manner. This
application example shows how Route Explorer can
greatly simplify the root cause diagnosis of a
large volume of BGP routing updates, resulting
in more rapid response to critical errors or more
proactive network optimization. The example shows
an animation of the U.C. Berkeley network's BGP
routing when a 500,000-event incident occurred.
During this incident, 30,000 prefixes failed over
twice from CalrenN-Qwest to Level-3 via a sub-optimal
6 AS-hop backup path. Convergence time was very
long--twenty minutes for each of the fail-overs,
and one minute for the fail-backs. Without Route
Explorer, it could take hours of analysis to determine
what happened.
Diagnosing BGP MED Oscillations
Floods of BGP updates caused by random routing
behavior, such as Multi-Exit Discriminator (MED)
oscillations, can create an operationally disruptive
level of BGP routing traffic, impairing even a
large network. This application example shows
how Route Explorer can diagnose a huge volume
of BGP updates generated due to an actual MED
oscillation at a Tier 1 ISP. The animation utilizes
anonymized network numbering, and shows four core
route reflectors--two in each of two PoPs. Both
pairs of route reflectors, Core1-a/b and Core2-a/b,
have paths to 4.5/16 via AS2. Core1-a/b also
have a path via AS1. The ISP is accepting MEDs
from AS2 and Core1 has the better MED. Core2-a/b
announce superior metrics and then withdraw their
AS2 route randomly and rapidly, on average every
10 microseconds (100,000 times per second each).
The links are colored yellow because the event
rate is too fast to animate. This flood causes
Core1-a and Core1-b to randomly switch paths on
average every 10 milliseconds (100 times per second).
That rate is so rapid that it appears only as
occasional blue flashes during the animation, which
indicate that the announce/withdraw cycles are
completing in less than a millisecond. The animation
shows 10 seconds of this issue, with a time scale
in milliseconds.
The actual event lasted for at least five days, continuously,
and accounted for 95% of the ISP's BGP traffic.
In other words, this one prefix generated 20 times
more iBGP traffic than all the rest of the Internet
combined, making diagnosis extremely difficult.
With Route Explorer's Root Cause Analysis capability,
diagnosis and problem resolution can be effected
within minutes of recording and analyzing the
BGP routing updates.
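The figures quoted in this example can be cross-checked with a few
lines of arithmetic. The variable names are illustrative. Note that a
95% share works out to 19 times the remaining traffic, which the text
rounds to 20.

```python
MICROSECONDS_PER_SECOND = 1_000_000
MILLISECONDS_PER_SECOND = 1_000

# One announce/withdraw every 10 microseconds, per Core2 route reflector.
core2_events_per_second = MICROSECONDS_PER_SECOND // 10

# One path switch every 10 milliseconds, per Core1 route reflector.
core1_switches_per_second = MILLISECONDS_PER_SECOND // 10

# The oscillating prefix accounted for 95% of BGP traffic; everything else is 5%.
oscillating_share = 95
remaining_share = 100 - oscillating_share
ratio = oscillating_share / remaining_share

print(core2_events_per_second)    # 100000
print(core1_switches_per_second)  # 100
print(ratio)                      # 19.0
```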