Answers to your Questions
Router Resiliency
How can I measure how
long route processing is "down" while a router is restarting
or "failing over" from the primary to a backup route processor?
The best way to measure routing downtime is to continually
withdraw active routes and measure the duration for which
the router does not reroute traffic. For a complete description
of this test scenario, see Journal Test Case JTC
082 Non-stop Routing.
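As a rough illustration of the measurement described above, the sketch below assumes the tester logs, for each route withdrawal, the time it was sent and whether traffic was then seen to move to the alternate path. The function name, the sample format, and the choice to report the longest unresponsive span are assumptions made for illustration; they are not taken from JTC 082.

```python
from typing import List, Tuple

# Hypothetical log entry: (withdraw_time_s, rerouted)
# rerouted is True if traffic moved to the alternate path after the withdrawal.
Sample = Tuple[float, bool]


def longest_unresponsive_span(samples: List[Sample]) -> float:
    """Return the longest span (seconds) in which withdrawals were not acted on.

    A span starts at the first withdrawal that was not followed by a reroute
    and ends at the next withdrawal that was. A span still open when the log
    ends is ignored, because its true length is unknown.
    """
    longest = 0.0
    span_start = None
    for withdraw_time_s, rerouted in samples:
        if not rerouted:
            if span_start is None:
                span_start = withdraw_time_s
        elif span_start is not None:
            longest = max(longest, withdraw_time_s - span_start)
            span_start = None
    return longest


# Illustrative data only: one withdrawal every 100 ms during a restart.
log = [(0.0, True), (0.1, True), (0.2, False), (0.3, False),
       (0.4, False), (0.5, True), (0.6, True)]
print(f"estimated routing downtime: {longest_unresponsive_span(log):.3f} s")
```

The resolution of such an estimate is limited by the interval between withdrawals, so a shorter interval gives a finer measurement.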
What is the difference
between Hitless OSPF Restart and Nguyen OSPF Restart?
These are two different sets of IETF draft recommendations
for OSPF Restart. Hitless OSPF Restart is specified in draft-ietf-ospf-hitless-restart-xx.txt
(Moy) and draft-lindem-ospf-hitless-extended-restart-xx.txt
(Lindem, Oswal). Nguyen OSPF Restart, also known as Graceful
Restart, is described in two drafts: OSPF Restart Signaling
is specified in draft-nguyen-ospf-restart-xx.txt (Zinin, Roy,
Nguyen) and OSPF Out-of-band LSDB resynchronization is specified
in draft-nguyen-ospf-oob-resync-xx.txt (Zinin, Roy, Nguyen).
A major difference between the two approaches is the way the
Restarting Router reacts to an external topology change that
occurs during the restart process. Hitless OSPF Restart may
abort the "Hitless Restart" procedure and fall back to a regular
restart. Nguyen OSPF Restart may ignore the topology change
until the "Graceful Restart" procedure has successfully completed.
Both procedures have advantages and disadvantages. For a discussion
of these procedures and a description of how to test them,
see Agilent's technical paper "Demystifying and Evaluating
Router Resiliency", one of the white papers in this edition
of INSIGHT.
What are the major test
cases for verifying and measuring Graceful/Hitless Restart?
To test Graceful/Hitless Restart, first verify continuity
of packet forwarding during restart. Secondly, check that
the Restarting Router and its "cooperating neighbor" routers
behave correctly when an external topology change occurs during
restart. Finally, measure the duration of the restart procedure.
These test cases are specified in the following Journal Test
Cases:
Why do I need three or
four routers to test Graceful/Hitless Restart?
During the Graceful/Hitless Restart procedure, the immediate
neighbors of the Restarting Router cooperate with the Restarting
Router to accelerate routing database resynchronization and
to shield the rest of the network from route updates (thereby
preventing potential route flapping). The "cooperating neighbors"
are intimately involved in the restart procedure and directly
affect the test results (such as restart duration). Therefore,
the "cooperating neighbor" routers must be part of the system
under test (SUT), and a total of at least three or four routers
is required, as specified in the following Journal Test Cases:
What do I need to do to
the Restarting Router during router resiliency testing?
Depending on what you are trying to test, you could force
router failure (or failover to a backup route processor) by
extracting a router control card or by issuing a router operating
system command to restart route processes. Alternatively,
you could reload the router software to verify "live" or "in-service"
hitless software upgrade.
Distributed Network Performance
What is the most accurate
method of measuring one-way packet latency between remote
sites?
The most accurate method is to synchronize all of the test
systems using GPS receivers. GPS provides a measurement resolution
of 100 nanoseconds (1 x 10^-7 seconds). While it is feasible
to synchronize test equipment using other methods, these other
methods are generally far less accurate or less reliable.
For example, the Network Time Protocol (NTP) provides an
accuracy of about 5 milliseconds on a LAN or about 30 milliseconds
(3 x 10^-2 seconds) across a WAN. For almost all applications,
this is not accurate enough, because national and international packet
latencies are typically on the order of 10 to 100 milliseconds,
and delays within metro and regional networks can be even smaller.
For more information, read the Network
Performance section introduction of The Journal of Internet
Test Methodologies.
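The comparison above can be worked through numerically. The short sketch below uses only the figures quoted in this answer and expresses each synchronization error as a fraction of the latency being measured. Treating the GPS resolution as the timing uncertainty is a simplifying assumption, and the function and constant names are illustrative.

```python
# Figures quoted in the text above.
GPS_RESOLUTION_S = 100e-9     # 100 nanoseconds
NTP_LAN_ACCURACY_S = 5e-3     # about 5 milliseconds on a LAN
NTP_WAN_ACCURACY_S = 30e-3    # about 30 milliseconds across a WAN


def relative_error(sync_error_s: float, latency_s: float) -> float:
    """Clock synchronization error as a fraction of the measured latency."""
    return sync_error_s / latency_s


methods = (
    ("GPS", GPS_RESOLUTION_S),
    ("NTP (LAN)", NTP_LAN_ACCURACY_S),
    ("NTP (WAN)", NTP_WAN_ACCURACY_S),
)

# Latencies at both ends of the 10 to 100 millisecond range.
for latency_s in (10e-3, 100e-3):
    for name, sync_error_s in methods:
        pct = 100.0 * relative_error(sync_error_s, latency_s)
        print(f"latency {latency_s * 1e3:5.0f} ms  {name:10s} error {pct:.4f} %")
```

With these figures, a 30 millisecond NTP offset is three times as large as a 10 millisecond latency, while 100 nanoseconds is about a thousandth of one percent of it.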
Where is distributed network
performance testing used?
Distributed testing is used for both in-service testing (within
a "live" network carrying real customer traffic) and out-of-service
testing (within an experimental or lab network, or within
a real network before deployment). For more information, read
the Network
Performance section introduction of The Journal of Internet
Test Methodologies.
What are the advantages
of GPS-synchronized testing?
By synchronizing multiple local and remote test systems using
GPS receivers, one-way packet latency and latency variation
can be measured from end to end. If the tester's instrumented
test packets are uniquely labeled, packets that have been
misdirected across the network can also be reliably detected.
Furthermore, with GPS-synchronized testing it is possible
to synchronize the starting and stopping of measurement and
capture systems and to accurately correlate measurement statistics
across multiple ports. For more information on measuring end-to-end
IP performance, read Journal Test Cases JTC
083 End-to-end In-service IP Performance and JTC
084 End-to-end Out-of-service IP Performance of The Journal
of Internet Test Methodologies.
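A minimal sketch of these measurements follows, assuming that every test packet carries a unique identifier and that the transmit and receive timestamps come from GPS-synchronized clocks. The record layout, the port names, and the definition of latency variation as maximum minus minimum are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical records.
# sent:     packet_id -> (expected_rx_port, tx_time_s)
# received: list of (packet_id, rx_port, rx_time_s)
Sent = Dict[int, Tuple[str, float]]
Received = List[Tuple[int, str, float]]


def analyze(sent: Sent, received: Received):
    """Return (one-way latencies, latency variation, misdirected packet ids)."""
    latencies = []
    misdirected = []
    for packet_id, rx_port, rx_time_s in received:
        expected_port, tx_time_s = sent[packet_id]
        if rx_port != expected_port:
            # The unique label shows this packet arrived at the wrong port.
            misdirected.append(packet_id)
            continue
        # Valid only because both clocks share the same time reference.
        latencies.append(rx_time_s - tx_time_s)
    variation = max(latencies) - min(latencies) if latencies else 0.0
    return latencies, variation, misdirected


# Illustrative data only.
sent = {1: ("B", 0.000), 2: ("B", 0.010), 3: ("C", 0.020)}
received = [(1, "B", 0.012), (2, "B", 0.025), (3, "B", 0.033)]
latencies, variation, misdirected = analyze(sent, received)
print(latencies, variation, misdirected)
```

Without a common time reference, the subtraction of a remote transmit time from a local receive time would include the unknown offset between the two clocks.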
How can distributed testing
help me test VPNs?
Two or more customer sites can be simulated to verify network
performance and QoS. This can be done before network deployment,
before VPN service activation, or for service assurance and
troubleshooting. At each site, the tester is used to simulate
the customer edge (CE) router and the customer's local network
behind the CE router. This is done by advertising the topology
of each simulated network using a routing protocol such as
OSPF or IS-IS. At the same time, traffic is sent between the
simulated sites (from edge to edge) and performance (latency,
loss, throughput) can be verified in real time. Performance-based
capture triggers can be used to detect when a performance metric crosses a
user-specified threshold (such as "latency > 100 ms") and then
to diagnose the cause of any performance bottleneck. For
more information on verifying BGP/MPLS VPN performance and
measuring VPN network capacity, read Journal Test Cases JTC
085 End-to-end BGP/MPLS VPN Performance and JTC
086 BGP/MPLS VPN Network Capacity of The Journal of Internet
Test Methodologies.
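The idea of a performance-based capture trigger can be sketched as follows. This is not the tester's actual interface; the function names, the sample format, and the fixed-size capture window are hypothetical, and only the "latency > 100 ms" threshold comes from the text above.

```python
from typing import List, Optional, Tuple

# Hypothetical measurement sample: (timestamp_s, latency_s)
Sample = Tuple[float, float]


def find_trigger(samples: List[Sample], threshold_s: float = 0.100) -> Optional[int]:
    """Return the index of the first sample whose latency exceeds the threshold."""
    for index, (_, latency_s) in enumerate(samples):
        if latency_s > threshold_s:
            return index
    return None


def capture_window(samples: List[Sample], trigger_index: int,
                   before: int = 2, after: int = 2) -> List[Sample]:
    """Return the samples around the trigger point for later diagnosis."""
    start = max(0, trigger_index - before)
    return samples[start:trigger_index + after + 1]


# Illustrative data only: latency spikes at t = 2.0 s.
samples = [(0.0, 0.020), (1.0, 0.030), (2.0, 0.150), (3.0, 0.040)]
index = find_trigger(samples)
if index is not None:
    print("triggered at", samples[index][0], "s")
    print("captured:", capture_window(samples, index, before=1, after=1))
```

Keeping some samples from before the trigger is a design choice here, since the cause of a bottleneck is often visible just before the threshold is crossed.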
Are there any special
requirements to perform distributed testing?
You will need a support network (WAN) to interconnect the
test equipment at each site. Generally, this should not be
the same as the network that you wish to test. You will also
need sufficient rack space at each Point of Presence, Central
Office, or lab site. Finally, you will need a suitable position
at each site to mount the GPS antenna. For more information,
read the Network
Performance section introduction of The Journal of Internet
Test Methodologies.
I want to test the potential
impact of a router failure or link failure on network availability.
How can I verify network reconvergence and measure IP route
convergence duration?
Use the "Network Convergence" test scenario, as specified
in Journal Test Case JTC
087 Network Convergence of The Journal of Internet Test
Methodologies.