White Paper

A NEW PARADIGM FOR
NETWORK PROBLEM SOLVING

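It's a paradox. Even as networks have become more reliable, organizations still spend a great deal of time troubleshooting and are under pressure to reduce the time it takes to resolve problems. This white paper discusses state-of-the-art network problem solving and how a new approach – based on the NETSCOUT OneTouch™ AT Network Assistant – can reduce troubleshooting time by one full week each month.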

Are Problems a Thing of the Past?

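Every year, networks become more reliable. New standards ensure interoperability, new devices simplify configuration, and advanced monitoring solutions detect problems before users are affected. IT departments have entered an era in which all problems are a thing of the past.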

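Or have they? A recent research study of over 300 network professionals in large- and medium-sized organizations found that:
  • 48 percent of organizations take more than half a day, on average, to close a trouble ticket
  • 46 percent of organizations are under pressure to reduce trouble ticket time
  • Network professionals spend about 25 percent of their time solving problems

With so many IT advances aimed at eliminating problems, how can this be? One explanation is that every advance in reliability and simplicity is offset by a new technology that makes things more complex: unified communications, 802.11ac, cloud computing or IPv6. Whatever the cause, there is much to be gained from solving problems more efficiently.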

To increase productivity, troubleshooting tools not only need to keep up with technology changes, but must continue to improve processes used to solve problems.

How Troubleshooting is Done Today

The vast majority (72 percent) of organizations do not follow a standardized troubleshooting process. Not only does the process vary within an organization, but the tools used to troubleshoot problems vary substantially. Survey respondents reported using eight different types of tools to solve problems, and 47 percent of problems required two or more tools. Given this variety of troubleshooting methods and tools, it is no surprise that 63 percent of troubleshooting sessions last more than an hour.

One part of problem solving is worth considering separately. In many cases, technicians cannot solve a problem on their own. Sometimes they need additional help with especially difficult problems. In other instances, the problem lies outside their domain of responsibility, and they need to work with a separate group inside (server management or application developers) or outside (service providers or equipment vendors) the enterprise. This is not uncommon – our survey shows that 41 percent of problems require this kind of collaboration. Collaboration can take too long for at least two reasons. First, the problem is not always readily apparent to the responsible party. Second, the technician may not have the ability to easily capture the trace files that are often required (19 percent of the time) for these problems.

Study of Problem-solving Techniques

This white paper refers to a NETSCOUT research study of 315 network professionals in April 2012. The respondents came primarily from medium- to large-sized networks in a variety of industries. Most of them were top-level networking support staff.

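Figure 1: Which of the following tools have you used to troubleshoot recent user problems?

Figure 2: How many trouble tickets do you handle per month, on average?

Figure 3: What is your group’s average time to close a trouble ticket? Note that 48 percent of respondents reported more than four hours.

Figure 4: What was the root cause of the last user problem you solved? (Multiple answers allowed.)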

The survey asked respondents to identify the root cause of their most recent user-reported problem (respondents could select more than one root cause). The most common cause was network problems (wired or Wi-Fi), occurring in 27 percent of instances. However, end-user configuration or operation problems combined were at fault in 42 percent of cases.

Changing the Problem-solving Paradigm

At NETSCOUT, we looked to shorten the entire problem-solving process. The process, as described earlier, traditionally consists of two steps – solo troubleshooting and collaboration when necessary. To streamline troubleshooting, NETSCOUT developed a three-step process and designed a new tool based on it. The three steps are:

1. Automated testing
2. Troubleshooting
3. Collaboration

The OneTouch™ AT Network Assistant enhances each of the steps and greatly reduces the time to solve problems.

Step 1: Automated Testing

It may seem counterintuitive that adding a step reduces time. But if the additional step actually saves more time in subsequent steps, the total time is reduced. That’s the idea behind automated testing.

The OneTouch AT identifies the most common network problems in about one minute¹. It performs a thorough network analysis from the end user’s point of view. Such an analysis, performed manually, would take roughly one hour². The OneTouch AT performs that same testing in about a minute (tests are configurable and can take anywhere from ten seconds to a few minutes, with most taking less than a minute). Results are then compared against user-defined limits to provide a simple pass/fail result. This approach allows technicians to find the most common problems that result in end-user complaints.
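The comparison of results against user-defined limits can be illustrated with a short sketch. This is not the OneTouch AT's implementation; the `Limit` structure, the measurement names and the numeric values are all hypothetical, chosen only to show how a set of measurements collapses into a simple pass/fail result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    """Upper bound a measurement must not exceed (hypothetical structure)."""
    name: str
    max_value: float
    unit: str


def evaluate(results: dict[str, float], limits: list[Limit]) -> dict[str, str]:
    """Return "PASS" or "FAIL" per limit; a missing measurement counts as FAIL."""
    verdicts = {}
    for limit in limits:
        measured = results.get(limit.name)
        ok = measured is not None and measured <= limit.max_value
        verdicts[limit.name] = "PASS" if ok else "FAIL"
    return verdicts


# Illustrative limits and measurements, not values from the study.
limits = [
    Limit("dns_lookup_ms", 100.0, "ms"),
    Limit("dhcp_response_ms", 500.0, "ms"),
    Limit("web_response_ms", 2000.0, "ms"),
]
results = {"dns_lookup_ms": 35.0, "dhcp_response_ms": 620.0}

verdicts = evaluate(results, limits)
# The overall result passes only if every individual check passes.
overall = "PASS" if all(v == "PASS" for v in verdicts.values()) else "FAIL"
```

Treating a missing measurement as a failure is a design choice in this sketch: a test that could not run is surfaced to the technician rather than silently skipped.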

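Automated testing has several advantages. First, it is much faster than typical trial-and-error testing. Second, it is more thorough than manual methods, which means it can find problems the technician might never have considered. Third, anyone, regardless of skill level, can run the tests and identify problems.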

Step 2: Troubleshooting

While the AutoTest uncovers a wide variety of problems on its own, not all problems can be found that way. The OneTouch AT provides a veritable arsenal of troubleshooting power to reduce the time spent in this phase.

AutoTest – even if the initial AutoTest does not identify the problem, all the measurement results are ready and available to help the technician understand what is happening. Further, the AutoTest can be modified in seconds and re-run to test a different server, application or wireless connection.

Wired Tests – a complete set of tests provides information on the cable, Power over Ethernet, the nearest switch and network services. The OneTouch AT features a web browser, Telnet and an SSH client to assist with configuration of network devices including switches and access points. It features a toner and can flash switch port lights to help locate unmarked cables in congested closets. A video probe can be connected to the OneTouch AT to inspect the endface of fiber optic connectors for contamination.

The OneTouch AT can pay for itself within 4 months with time savings while troubleshooting and validating networks.

Wireless Tests – the OneTouch AT provides more analysis of the Wi-Fi network than a library of wireless freeware and shareware tools, and provides answers in an easy-to-understand format. The OneTouch AT discovers all the networks, APs and clients in range and quickly identifies problems such as improper security, interference, bandwidth hogs, overloaded channels, unauthorized devices and more.

Step 3: Collaboration

As noted earlier, network technicians regularly need to work with others to solve problems. The process of getting the right information to the right people, however, can drag on for days. Even if the technician is able to work on other problems during this period, that’s little comfort to an end user who can’t get their job done or the IT manager missing targets for trouble ticket times.

The OneTouch AT includes features specifically designed to expedite collaborative troubleshooting with the Link-Live Cloud Service. With Link-Live Cloud Service, everything is always captured with zero-touch reporting, providing you with complete records of your work and results at all times. This comprehensive, searchable database makes it easy to identify changes in the network. There are no space limits, no time limits, and no need to retrieve data from the test device and transfer it to a storage location at the end of the day. This cloud service is available to all Handheld Network Tools: LinkSprinter, LinkRunner AT and AirCheck G2 all automatically upload their test results the moment they happen. If you don’t have connectivity at that drop, the testers can buffer results, which are pushed to the Link-Live Cloud Service when connectivity returns. Technicians no longer need to be the keepers and couriers of test data, or even remember to press the save button. As a result, they can double their productivity and complete weeks of work in days. In addition, managers can have access to the test results dashboards.
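The buffer-then-push behavior described above is a store-and-forward pattern, sketched below in general terms. This is not the Link-Live API: the `ResultBuffer` class, the `upload` callback and the simulated link state are all hypothetical, and a real tester would also need persistent storage and retry timing.

```python
from collections import deque
from typing import Callable


class ResultBuffer:
    """Queue test results locally and push them when an upload succeeds.

    `upload` is a hypothetical callback that returns True once a result has
    been accepted by the remote service and False when there is no connectivity.
    """

    def __init__(self, upload: Callable[[dict], bool]) -> None:
        self._upload = upload
        self._pending: deque[dict] = deque()

    def record(self, result: dict) -> None:
        """Store a result, then try to push everything that is waiting."""
        self._pending.append(result)
        self.flush()

    def flush(self) -> int:
        """Upload queued results in order; stop at the first failure."""
        sent = 0
        while self._pending:
            if not self._upload(self._pending[0]):
                break
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        return len(self._pending)


# Simulated connectivity for illustration only.
link = {"online": False}
uploaded: list[dict] = []


def fake_upload(result: dict) -> bool:
    if link["online"]:
        uploaded.append(result)
    return link["online"]


buffer = ResultBuffer(fake_upload)
buffer.record({"test": "autotest", "verdict": "PASS"})
buffer.record({"test": "autotest", "verdict": "FAIL"})
pending_while_offline = buffer.pending

link["online"] = True  # connectivity returns
sent_after_reconnect = buffer.flush()
```

A result is removed from the queue only after the upload callback confirms it, so a dropped connection never loses data; results also arrive in the order they were recorded.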

Reporting – a detailed report of everything that the OneTouch AT tested and observed will be recorded in the Link-Live Cloud Service. This allows the tech to show a colleague exactly what is happening when they are observing the problem. These results include results a less-experienced technician might not have looked at but are there for more knowledgeable team members to evaluate.

In-line Packet Capture – a trace file is indispensable for very difficult problems or as evidence to an outside group such as application developers, service providers or equipment suppliers. Collecting this information typically requires reconfiguring a switch or a network tap, which can take 30 minutes or more. Worse, many techs may not have access to switch provisioning or a tap. This means even more time is lost while the problem is handed off to someone else.

The OneTouch AT performs in-line packet capture with a few touches, without requiring access to a switch or tap. This means the tech can capture the problem packets immediately while the user demonstrates the problem.

Web Remote Interface – while it’s not always possible to get a colleague physically in the location of the problem, the OneTouch AT can be accessed and controlled through a remote device such as a PC, tablet or smartphone. Not only can the remote user see what the tech is seeing, but they can control the OneTouch AT and export trace files or reports to their device.

Camera – connect a webcam to the OneTouch AT USB port and the remote helper can see live video of the physical environment the tech is working in. This is useful if the tech is in a wiring closet or a data center and the remote colleague needs to see the switch or patch panel, for example.

Savings

The first step in estimating savings from the OneTouch AT is to look at the time saved in each of the three steps of the process.

Automated Testing – Table 2 compares the amount of time it would take to perform the AutoTest functions manually with the actual time of the AutoTest. The time depends on the technician’s skill and on how many applications need to be tested.

Troubleshooting – these savings are less straightforward to quantify, as they depend heavily on the actual problem and the skill of the technician. Users of other NETSCOUT Handheld Network Test Solutions generally report 30 percent to 40 percent faster troubleshooting, but we will set that aside and consider it a “bonus” saving in addition to what is demonstrated here.

Collaboration – to quantify the time savings in these scenarios, we compared the time to set up a packet capture using port mirroring (roughly 20 minutes) with the time for an inline packet capture with the OneTouch AT (three minutes). If the technician is unable to reconfigure the switch, the time savings are far greater.

Better collaboration delivers an even greater advantage by reducing the overall time needed to close a trouble ticket. Without the OneTouch AT, it is often difficult to get all the relevant data to the right people. This is a major reason trouble tickets drag on, sometimes for days. Armed with the OneTouch AT, the first responder easily generates a report or a trace file, shares it, and provides remote access for the rest of the team in real time. So while the total amount of time the staff spends on a problem may not be reduced, the time the end user spends waiting for a resolution is greatly reduced.

Again, such a benefit is hard to quantify – so we won’t – but for many organizations this may be more valuable than actual hours saved for the department staff.

Conclusion

An estimation of the savings expected from using the OneTouch AT is presented in Table 1. Even ignoring the time saved in solo troubleshooting, and assuming purchase of the top-of-the-line model, one would expect payback in less than six months.

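AutoTest savings
  Trouble tickets per technician per month               20 (medium)
  Minutes per trouble ticket                             90
  AutoTest time                                          1 minute
  Manual time for AutoTest functions (from Table 2)      60 minutes
  Time saved per trouble ticket                          59 minutes
  Time saved per month                                   19.7 hours per technician
Collaboration savings
  Trouble tickets requiring packet capture               19% (average)
  Number of packet captures required                     3.8
  Packet capture setup time                              20 minutes
  Packet capture setup time with OneTouch AT             3 minutes
  Time saved per capture                                 17 minutes
  Time saved per month                                   1.1 hours per technician
  Users per OneTouch AT                                  2
  Total time saved                                       41 hours per month
Dollar savings
  Hourly rate                                            $60
  Total monthly savings                                  $2,489
  OneTouch AT cost                                       $10,000
  Payback                                                4.0 months

Table 1: An estimation of time and cost savings provided by OneTouch AT.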
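The arithmetic behind Table 1 can be reproduced in a few lines. The sketch below uses only the inputs stated in the table; the variable names are ours, and the assumption that "time saved per month" is expressed in hours per technician is inferred from the figures rather than stated in the table.

```python
# Inputs taken from Table 1.
TICKETS_PER_TECH_PER_MONTH = 20
MANUAL_AUTOTEST_MINUTES = 60
AUTOTEST_MINUTES = 1
CAPTURE_SHARE = 0.19            # share of tickets that need a packet capture
MIRROR_SETUP_MINUTES = 20       # capture setup using port mirroring
INLINE_SETUP_MINUTES = 3        # capture setup with the OneTouch AT
TECHS_PER_UNIT = 2
HOURLY_RATE_USD = 60
UNIT_COST_USD = 10_000

# AutoTest savings: 59 minutes saved on each ticket.
minutes_saved_per_ticket = MANUAL_AUTOTEST_MINUTES - AUTOTEST_MINUTES
autotest_hours = TICKETS_PER_TECH_PER_MONTH * minutes_saved_per_ticket / 60

# Collaboration savings: 17 minutes saved on each packet capture.
captures_per_month = TICKETS_PER_TECH_PER_MONTH * CAPTURE_SHARE
minutes_saved_per_capture = MIRROR_SETUP_MINUTES - INLINE_SETUP_MINUTES
collaboration_hours = captures_per_month * minutes_saved_per_capture / 60

# Totals per OneTouch AT, shared by two technicians.
total_hours = (autotest_hours + collaboration_hours) * TECHS_PER_UNIT
monthly_savings_usd = total_hours * HOURLY_RATE_USD
payback_months = UNIT_COST_USD / monthly_savings_usd

print(f"AutoTest savings:      {autotest_hours:.1f} hours per technician")
print(f"Collaboration savings: {collaboration_hours:.1f} hours per technician")
print(f"Total time saved:      {total_hours:.0f} hours per month")
print(f"Monthly savings:       ${monthly_savings_usd:,.0f}")
print(f"Payback:               {payback_months:.1f} months")
```

Carrying the unrounded intermediate values through the calculation yields the table's totals of 41 hours, $2,489 and 4.0 months; summing the rounded per-technician figures (19.7 and 1.1) would give a slightly higher total.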

Figure 5: Summary of the concepts of this paper: the ways in which the OneTouch AT saves time throughout the entire problem-solving process as compared to traditional methods.

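Appendix A – An Hour of Troubleshooting in One Minute

Understanding the root cause is key, but it involves an extra step. For example, when you visit the doctor, whatever your chief complaint, a nurse first measures your weight, temperature and blood pressure and usually reviews your medical history. This step not only saves the doctor’s time but often catches problems that might otherwise be overlooked.

The same concept applies to testing networks. Since your team doesn’t usually include a nurse, the OneTouch AT automates a complete test of “network vital signs” into an AutoTest that compresses an hour of traditional testing into about a minute. The results are then compared against user-defined limits to provide a simple pass/fail result. This approach not only saves time but also enables technicians to solve more problems.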

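Test step                                      OneTouch Network Assistant (AutoTest)   Traditional method: tools                  Traditional method: time
Basic connectivity (wired or Wi-Fi)            1 minute (all four steps)               Cable tester, computer, Wi-Fi utilities    5 minutes
Infrastructure services                                                                Computer, utilities                        5 minutes
Wireless operation and performance                                                     Two computers, iPerf                       10 minutes
Network services and application performance                                           Packet capture, protocol analyzer          40 minutes (three applications)

Table 2: The OneTouch AutoTest completes nearly an hour of manual testing in one minute.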

Basic Connectivity – OneTouch AT tests both wired and Wi-Fi connectivity. On the wired side, it checks the physical layer (including cabling and Power over Ethernet), verifies speed and duplex settings, and identifies the switch port and VLAN. For Wi-Fi, it verifies connectivity and security settings for the nearest AP and tests connection speeds.

Infrastructure Services – OneTouch AT tests availability and response time of DNS and DHCP across both the wired and wireless network.

Wireless Operation and Performance – this exclusive test measures the actual wireless performance by sending a stream of traffic out the wireless port, through the nearest AP and the wired infrastructure, and back to its wired port. The test runs simultaneously in the reverse direction with programmable upstream and downstream rates. The test provides measurement results for throughput, loss, latency and jitter in both directions. The latter two measurements are vital for the quality of real-time applications such as streaming video or voice over Wi-Fi.
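The four measurements can be illustrated with a small calculation over a stream of timestamped test packets in one direction. This is a sketch under stated assumptions, not the OneTouch AT's method: the `Probe` record is hypothetical, the sample values are invented, and jitter is computed here as the mean absolute difference between consecutive latencies, which is a simplification of the smoothed estimators used in practice.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass(frozen=True)
class Probe:
    """One test packet (hypothetical record); received_ms is None if it was lost."""
    sent_ms: float
    received_ms: Optional[float]
    size_bytes: int


def summarize(probes: list[Probe]) -> dict[str, float]:
    """Compute loss, latency, jitter and throughput for one direction."""
    delivered = [p for p in probes if p.received_ms is not None]
    if not delivered:
        raise ValueError("no probes were delivered")
    latencies = [p.received_ms - p.sent_ms for p in delivered]
    # Simplified jitter: mean absolute change between consecutive latencies.
    deltas = [abs(b - a) for a, b in zip(latencies, latencies[1:])]
    duration_ms = max(p.received_ms for p in delivered) - min(p.sent_ms for p in probes)
    if duration_ms <= 0:
        raise ValueError("measurement window must be longer than zero")
    delivered_bits = 8 * sum(p.size_bytes for p in delivered)
    return {
        "loss_ratio": 1 - len(delivered) / len(probes),
        "latency_ms": mean(latencies),
        "jitter_ms": mean(deltas) if deltas else 0.0,
        "throughput_bps": delivered_bits / (duration_ms / 1000),
    }


# Invented sample: five 1250-byte packets sent 10 ms apart, the third one lost.
upstream = [
    Probe(0.0, 5.0, 1250),
    Probe(10.0, 17.0, 1250),
    Probe(20.0, None, 1250),
    Probe(30.0, 36.0, 1250),
    Probe(40.0, 45.0, 1250),
]
stats = summarize(upstream)
```

Running the same function over the downstream packets gives the second half of a bidirectional result.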

Network Services and Application Performance – these tests provide a detailed breakdown of application performance by analyzing an actual interaction with the server/service under test.Seven different tests/applications are supported and can be customized for specific sites or applications whether hosted locally or through cloud-based providers.

The detailed breakdown provides the results shown in Figure 7 and shows the results for both the wired and Wi-Fi network side-by-side for easy comparison.

While the overall measurement can be used to quantify the performance of the application, the detail can be used to determine the cause of the slow performance, whether it’s the network, the application server or the DNS server. While an expert can provide this level of analysis in ten or twenty minutes with a protocol analyzer (once they have a trace file), the OneTouch AT provides it in under a minute as part of a standard AutoTest.
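As a rough illustration of how a response-time breakdown points to a cause, the sketch below attributes a slow transaction to whichever phase dominates the total. The phase names, the phase-to-component mapping and the sample timings are assumptions made for this example; they are not the OneTouch AT's actual result fields, and a real diagnosis would also weigh each phase against its expected value.

```python
# Hypothetical mapping from a measured phase to the component it implicates.
PHASE_TO_COMPONENT = {
    "dns_lookup_ms": "DNS server",
    "tcp_connect_ms": "network",
    "server_response_ms": "application server",
}


def suspect(breakdown: dict[str, float]) -> tuple[str, float]:
    """Return the component behind the longest phase and its share of the total."""
    total = sum(breakdown[phase] for phase in PHASE_TO_COMPONENT)
    if total <= 0:
        raise ValueError("breakdown must contain positive timings")
    slowest = max(PHASE_TO_COMPONENT, key=lambda phase: breakdown[phase])
    return PHASE_TO_COMPONENT[slowest], breakdown[slowest] / total


# Invented side-by-side results for the same application over wired and Wi-Fi.
wired = {"dns_lookup_ms": 12.0, "tcp_connect_ms": 8.0, "server_response_ms": 480.0}
wifi = {"dns_lookup_ms": 14.0, "tcp_connect_ms": 95.0, "server_response_ms": 41.0}

wired_component, wired_share = suspect(wired)
wifi_component, wifi_share = suspect(wifi)
```

In the invented data, the wired path is dominated by server response time while the Wi-Fi path is dominated by connection setup, which is the kind of contrast a side-by-side view makes visible.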

Figure 6: The test measures performance of the wired and Wi-Fi network.

Figure 7: The application performance tests provide a detailed breakdown of the response time of servers and services.

Appendix B – Problems That Can Be Discovered by the OneTouch AT Network Assistant AutoTest

Fiber problems: Wrong SFP/vendor mismatch, Dirty fiber endface*, Dead port/broken fiber, Low power
Twisted pair problems: Open cable, Bad cable mapping, Shorted cables, Mislabeled/undocumented cables*, Too long cable, Dead port
PoE: Not present or disabled, Switch unable to supply adequate power, Wrong pins, Low voltage, Non-Ethernet voltage, Low power under load, Class 4 negotiation mismatch
Link: Polarity mismatch, Low link level, Receive pair issues (MDIX), Speed mismatch, Duplex mismatch
Switch port: Incorrect switch, Incorrect port, Incorrect data VLAN*, Incorrect voice VLAN*, Unstable switch uptime*, Switch congestion*, Switch errors*, FCS errors*, Frame size errors*, Other frame errors (7)*, Excessive broadcast traffic*, Excessive multicast traffic
Wi-Fi: Security settings wrong, AP missing, AP misconfigured, AP not connected, WLAN controller problems*, Excessive noise*, AP congestion*, Channel overutilized*, Too many APs on channel*, AP overlap on channels*, Roaming problems*, Unauthorized APs*, Finding rogue APs*, Ad hoc networks*, Insufficient network coverage, Slow connection, Bandwidth hogs (APs and clients)*, Bad client NIC*
Veri-Fi: QoS settings wrong*, MTU problems*, Port problems*, Upstream bandwidth issues, Downstream bandwidth issues, IPv6 issues*, Excessive loss, Latency issues*, Jitter issues*, Sequencing issues*
DHCP: Missing, Slow, Out of addresses, Incorrect lease time, Rogue DHCP server, Wired versus Wi-Fi configuration problems, Duplicate static IP address, IP address hijacked, Incorrect address delivery, Incorrect subnet delivery, Incorrect router address, Incorrect DNS address
DNS: Missing, Slow, No secondary server*, Wired versus Wi-Fi configuration problems*
Gateway: Missing/failed, Not IPv6 capable*, Unstable gateway uptime*, Overloaded*, Bad traffic*, Incorrect routing protocols*
Discovery: Wrong VLAN*, Wrong subnet*, Unexpected IPv4/IPv6 devices
Web (HTTP): DNS lookup failure, DNS lookup slow, Server unavailable, Slow connectivity, Server slow to start, Server slow to complete, Wired versus Wi-Fi transport problems, IPv4 versus IPv6 transport problems
Ping (ICMP): DNS lookup failure, DNS lookup slow, Server unavailable, Slow connectivity, MTU misconfigurations, Wired versus Wi-Fi transport problems, IPv4 versus IPv6 transport problems

Connect (TCP): DNS lookup failure, DNS lookup slow, Server unavailable, Slow connectivity, Firewall misconfigured for ports, Wired versus Wi-Fi transport problems, IPv4 versus IPv6 transport problems
Multicast (IGMP): Server not multicasting, Switch IGMP snooping disabled, Incorrect port configuration, Server authentication
File (FTP): Slow WAN, DNS lookup failure, DNS lookup slow, Server unavailable, Slow connectivity, Server slow to start, Server slow to complete, Wired versus Wi-Fi transport problems, IPv4 versus IPv6 transport problems
Video (RTSP): Slow WAN, DNS lookup failure, DNS lookup slow, Server unavailable, Slow connectivity, Server slow to start, Server slow to complete, Wired versus Wi-Fi transport problems, IPv4 versus IPv6 transport problems

* May require a secondary test after the AutoTest.

1 See appendix A, “An Hour of Troubleshooting in One Minute”
2 See appendix B, “Problems That can be Discovered by the OneTouch AT Network Assistant AutoTest”

 
 