Insight - For Next Generation Multi Services Testing

Answers to your Questions

Router Resiliency

How can I measure how long route processing is "down" while a router is restarting or "failing over" from the primary to a backup route processor?

The best way to measure routing downtime is to continually withdraw active routes and measure the duration for which the router does not reroute traffic. For a complete description of this test scenario, see Journal Test Case JTC 082 Non-stop Routing.
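As a rough illustration of the measurement described above, the following Python sketch estimates downtime from a log of route withdrawals. The `Probe` record, the `routing_downtime` function, and the 0.5 second `normal_delay` threshold are hypothetical names and values chosen for this example; they are not taken from JTC 082. The sketch assumes the tester records when each route was withdrawn and when traffic was first seen on the alternate path.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Probe:
    withdrawn_at: float           # seconds: when the tester withdrew the route
    rerouted_at: Optional[float]  # seconds: first packet seen on the alternate path, or None


def routing_downtime(probes: List[Probe], normal_delay: float = 0.5) -> Optional[float]:
    """Estimate how long the router failed to act on route withdrawals.

    A probe is "stalled" if traffic was never rerouted, or was rerouted later
    than normal_delay (an assumed baseline) after the withdrawal.
    Returns 0.0 if nothing stalled, or None if rerouting never resumed.
    """
    stalled = [
        p for p in probes
        if p.rerouted_at is None or p.rerouted_at - p.withdrawn_at > normal_delay
    ]
    if not stalled:
        return 0.0
    resumed = [p.rerouted_at for p in stalled if p.rerouted_at is not None]
    if not resumed:
        return None
    # Downtime runs from the first ignored withdrawal to the first late reroute.
    return min(resumed) - min(p.withdrawn_at for p in stalled)


if __name__ == "__main__":
    log = [
        Probe(0.0, 0.1), Probe(1.0, 1.1),
        Probe(2.0, 4.6), Probe(3.0, 4.6), Probe(4.0, 4.6),  # processed only after recovery
        Probe(5.0, 5.1),
    ]
    print(f"Estimated routing downtime: {routing_downtime(log):.1f} s")
```

The resolution of this estimate is limited by the interval between withdrawals, which is why the routes need to be withdrawn continually rather than once.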

What is the difference between Hitless OSPF Restart and Nguyen OSPF Restart?

These are two different sets of IETF draft recommendations for OSPF Restart. Hitless OSPF Restart is specified in draft-ietf-ospf-hitless-restart-xx.txt (Moy) and draft-lindem-ospf-hitless-extended-restart-xx.txt (Lindem, Oswal). Nguyen OSPF Restart, also known as Graceful Restart, is described in two drafts: OSPF Restart Signaling is specified in draft-nguyen-ospf-restart-xx.txt (Zinin, Roy, Nguyen) and OSPF Out-of-band LSDB resynchronization is specified in draft-nguyen-ospf-oob-resync-xx.txt (Zinin, Roy, Nguyen).

A major difference between the two approaches is how the Restarting Router reacts to an external topology change that occurs during the restart process. Hitless OSPF Restart may abort the "Hitless Restart" procedure and fall back to a regular restart. Nguyen OSPF Restart may ignore the topology change until the "Graceful Restart" procedure has successfully completed. Both procedures have advantages and disadvantages. For a discussion of these procedures and a description of how to test them, see Agilent's technical paper "Demystifying and Evaluating Router Resiliency", one of the white papers in this edition of INSIGHT.

What are the major test cases for verifying and measuring Graceful/Hitless Restart?

To test Graceful/Hitless Restart, first verify continuity of packet forwarding during restart. Second, check that the Restarting Router and its "cooperating neighbor" routers behave correctly when an external topology change occurs during restart. Finally, measure the duration of the restart procedure. These test cases are specified in the following Journal Test Cases:

Why do I need three or four routers to test Graceful/Hitless Restart?

During the Graceful/Hitless Restart procedure, the immediate neighbors of the Restarting Router cooperate with it to accelerate routing database resynchronization and to shield the rest of the network from route updates (thereby preventing potential route flapping). The "cooperating neighbors" are intimately involved in the restart procedure and directly affect test results such as restart duration. Therefore, the "cooperating neighbor" routers must be part of the system under test (SUT), and a total of at least three or four routers is required, as specified in the following Journal Test Cases:

What do I need to do to the Restarting Router during router resiliency testing?

Depending on what you are trying to test, you could force router failure (or failover to a backup route processor) by extracting a router control card or by issuing a router operating system command to restart route processes. Alternatively, you could reload the router software to verify "live" or "in-service" hitless software upgrade.


Distributed Network Performance

What is the most accurate method of measuring one-way packet latency between remote sites?

The most accurate method is to synchronize all of the test systems using GPS receivers. GPS provides a measurement resolution of 100 nanoseconds (1 x 10^-7 seconds). While it is feasible to synchronize test equipment using other methods, these methods are generally far less accurate or less reliable. For example, the Network Time Protocol (NTP) provides an accuracy of about 5 milliseconds (5 x 10^-3 seconds) on a LAN or about 30 milliseconds (3 x 10^-2 seconds) across a WAN. For almost all applications, this is not accurate enough, because the synchronization error is comparable to the latency being measured: national and international packet latencies are typically on the order of 10 to 100 milliseconds, and delays within metro and regional networks can be even smaller. For more information, read the Network Performance section introduction of The Journal of Internet Test Methodologies.
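To make the comparison concrete, the short Python sketch below expresses each synchronization error as a fraction of the latency being measured. It uses only the figures quoted above; treating the 100 nanosecond GPS resolution as the synchronization error is a simplifying assumption, and the function and dictionary names are illustrative.

```python
# Clock-synchronization error figures quoted in the answer above, in seconds.
SYNC_ERROR_S = {
    "GPS": 100e-9,          # 100 nanoseconds (assumed equal to the quoted resolution)
    "NTP over LAN": 5e-3,   # about 5 milliseconds
    "NTP over WAN": 30e-3,  # about 30 milliseconds
}


def relative_error(sync_error_s: float, true_latency_s: float) -> float:
    """Return the synchronization error as a fraction of the latency being measured."""
    return sync_error_s / true_latency_s


if __name__ == "__main__":
    for latency_s in (10e-3, 100e-3):  # typical national/international latencies
        for method, error_s in SYNC_ERROR_S.items():
            pct = 100 * relative_error(error_s, latency_s)
            print(f"{method:>12} on a {latency_s * 1e3:.0f} ms path: up to {pct:.4g}% error")
```

On a 10 millisecond path, a 30 millisecond clock error is three times the quantity being measured, whereas a 100 nanosecond error is a negligible fraction of it.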

Where is distributed network performance testing used?

Distributed testing is used for both in-service testing (within a "live" network carrying real customer traffic) and out-of-service testing (within an experimental or lab network, or within a real network before deployment). For more information, read the Network Performance section introduction of The Journal of Internet Test Methodologies.

What are the advantages of GPS-synchronized testing?

By synchronizing multiple local and remote test systems using GPS receivers, one-way packet latency and latency variation can be measured from end to end. If the tester's instrumented test packets are uniquely labeled, packets that have been misdirected across the network can also be reliably detected. Furthermore, GPS-synchronized testing makes it possible to synchronize the starting and stopping of measurement and capture systems and to accurately correlate measurement statistics across multiple ports. For more information on measuring end-to-end IP performance, read Journal Test Cases JTC 083 End-to-end In-service IP Performance and JTC 084 End-to-end Out-of-service IP Performance of The Journal of Internet Test Methodologies.
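The following Python sketch shows, under simplifying assumptions, how these measurements fall out of synchronized timestamps. The record layout and the `analyze` function are hypothetical and do not describe any particular tester; latency variation is taken here as the spread between the largest and smallest one-way latency, which is only one of several possible definitions.

```python
from typing import Dict, List, Tuple


def analyze(
    sent: Dict[int, Tuple[float, str]],
    received: List[Tuple[int, str, float]],
) -> Tuple[List[float], float, List[int]]:
    """Compute one-way latencies, latency variation, and misdirected packets.

    sent:     packet label -> (transmit time in seconds, expected receive port)
    received: list of (packet label, actual receive port, receive time in seconds)
    Timestamps are assumed to come from clocks synchronized to a common reference.
    """
    latencies: List[float] = []
    misdirected: List[int] = []
    for label, port, rx_time in received:
        tx_time, expected_port = sent[label]
        if port != expected_port:
            misdirected.append(label)  # unique label reveals a wrong exit point
            continue
        latencies.append(rx_time - tx_time)
    variation = max(latencies) - min(latencies) if latencies else 0.0
    return latencies, variation, misdirected


if __name__ == "__main__":
    sent = {1: (0.000, "site-B"), 2: (0.010, "site-B"), 3: (0.020, "site-B")}
    received = [(1, "site-B", 0.012), (2, "site-C", 0.025), (3, "site-B", 0.035)]
    latencies, variation, misdirected = analyze(sent, received)
    print("latencies (s):", latencies)
    print("variation (s):", variation)
    print("misdirected labels:", misdirected)
```

Without a common time reference, the subtraction `rx_time - tx_time` would include the offset between the two site clocks, which is why the synchronization matters.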

How can distributed testing help me test VPNs?

Two or more customer sites can be simulated to verify network performance and QoS. This can be done before network deployment, before VPN service activation, or for service assurance and troubleshooting. At each site, the tester is used to simulate the customer edge (CE) router and the customer's local network behind the CE router. This is done by advertising the topology of each simulated network using a routing protocol such as OSPF or IS-IS. At the same time, traffic is sent between the simulated sites (from edge to edge) and performance (latency, loss, throughput) can be verified in real time. Performance-based capture triggers can be used to detect when a QoS metric exceeds a user-specified threshold (such as "latency > 100 ms") and then to diagnose the cause of any performance bottleneck. For more information on verifying BGP/MPLS VPN performance and measuring VPN network capacity, read Journal Test Cases JTC 085 End-to-end BGP/MPLS VPN Performance and JTC 086 BGP/MPLS VPN Network Capacity of The Journal of Internet Test Methodologies.
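The idea behind a performance-based capture trigger can be sketched in a few lines of Python. This is an illustration only, not the tester's implementation: the `capture_on_trigger` function, its parameters, and the pre-trigger history size are assumptions made for the example. It keeps a short history of recent packets and, when a latency sample exceeds the threshold, returns that history together with the offending packet for diagnosis.

```python
from collections import deque
from typing import Iterable, List, Optional, Tuple

Sample = Tuple[int, float]  # (packet label, one-way latency in seconds)


def capture_on_trigger(
    samples: Iterable[Sample],
    threshold_s: float = 0.100,  # the "latency > 100 ms" example threshold
    pre_trigger: int = 3,        # assumed number of packets kept before the trigger
) -> Optional[List[Sample]]:
    """Return the packets leading up to and including the first threshold violation.

    Returns None if no sample exceeds the threshold.
    """
    history: deque = deque(maxlen=pre_trigger)
    for sample in samples:
        _, latency = sample
        if latency > threshold_s:
            return list(history) + [sample]
        history.append(sample)
    return None


if __name__ == "__main__":
    stream = [(1, 0.02), (2, 0.03), (3, 0.04), (4, 0.05), (5, 0.15), (6, 0.02)]
    print(capture_on_trigger(stream, pre_trigger=2))
```

Keeping the packets that preceded the violation is what makes the capture useful for finding the cause of a bottleneck, rather than only recording that one occurred.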

Are there any special requirements to perform distributed testing?

You will need a support network (WAN) to interconnect the test equipment at each site. Generally, this should not be the same as the network that you wish to test. You will also need sufficient rack space at each Point of Presence, Central Office, or lab site. Finally, you will need a suitable position at each site to mount the GPS antenna. For more information, read the Network Performance section introduction of The Journal of Internet Test Methodologies.

I want to test the potential impact of a router failure or link failure on network availability. How can I verify network reconvergence and measure IP route convergence duration?

Use the "Network Convergence" test scenario, as specified in Journal Test Case JTC 087 Network Convergence of The Journal of Internet Test Methodologies.


