This topic describes the troubleshooting tools that are provided with Wanware. Using them can help you save time and money. Time is saved by diagnosing problems in the field without special data communications instrumentation. Money is saved by using the traffic statistics gathering capabilities to provide information for traffic studies.
We don't expect everyone installing and administering this product to be a data communications expert. The information here may be more than you need. However, the tools are no less useful for having background information on data communications described along with them.
Three kinds of information can help you determine your problem:
statistics
Statistics are cumulative counts which can be examined at any time. These are often useful during isolation of both functional and performance problems.
protocol traces
Protocol traces show you the protocol exchanged with the remote system from the local system's point of view.
event logs
Event logs provide a history of events which occurred at particular interfaces.
You access most of this information with tsgstat. tsgtrace allows you to examine X.25 protocol information in real time, or store it for later analysis.
tsgstat and tsgtrace are installed in the /sbin/ directory. tsgtrace is installed so that only root can run it. Never make it generally executable, because the line trace it produces can include passwords.
tsgstat is menu-driven and is easy to use. To invoke tsgstat, as root, enter the command:
tsgstat [ -h ] | [ [ -n netid ] | network_name]
tsgstat displays on standard output the status of the specified X.25 link, network_name. network_name is defined in file /etc/packetnets.
tsgstat presents a menu allowing easy one key selection:
Network Name: link0 ID: 0, The Link is UP
1: Serial Statistics
2: Frame Level Statistics
3: Packet Level Statistics
4: Link Configuration (non-Packet)
5: Link Configuration (Packet)
n: Change Network
?: Choices
q: Quit
Choice: _
To display the choices press the "?" key. To quit, press "q" or the ESC key. To change the active network press "n" and choose one of the available networks presented in the menu by typing the network ID:
Available Networks:
ID: Name:
=== ============>
0 link0
1 link1
Enter Network ID: _
Counts are set to zero whenever X.25 services are started.
To display the non-packet level configuration parameters for the currently selected network, select option 4.
To display the packet level configuration parameters for the currently selected network, select option 5.
To display the frame level statistics for the currently selected network, select option 3.
The following table describes the information that appears.
| Field | Description |
| --- | --- |
| The Link is status | status is UP when the host's frame level is successfully communicating with the remote's frame level and link startup frames have been exchanged. If UP, then the host has sent SABM and received a proper UA or SABM. Otherwise, status is DOWN. |
| Bad length frames | Three counts of frames received with incorrect lengths are displayed. |
| Bad frames | Two counts of bad frames are displayed: frames with illegal address bytes (that is, address bytes other than 01 or 03 hexadecimal), and frames with an unsolicited Final bit. Non-zero counts indicate a protocol problem at the remote. |
| Local T1 timeouts | The number of times that the X.25 re-transmission timer ran out. This is detected by the remote when a frame is received with the Poll bit set. A non-zero count indicates a link problem since the host had to poll the remote for a response. |
| P bits received | The number of frames received with the Poll bit set. When non-zero, this is an indirect measure of how often the remote's T1 timer timed out; the remote usually sends a command with the Poll bit set when the T1 timer times out. Non-zero counts may mean that the remote is keeping an idle link active by periodically sending an RR command with Poll bit set. If it occurs locally, it indicates a link problem. |
| Disconnected link because of inability to transmit | A non-zero count indicates that the host is not receiving expected responses from the serial transmission hardware. Often this occurs if a link is configured for external clocking, but is not receiving a clock signal from the modem. |
| Disconnected link because of no remote response | The number of times that the remote did not respond to a time-out recovery (transmission of RR with Poll bit set) for N2 retries. A non-zero count indicates an inactive remote, severed link/cable or modem problems. |
| Frame type counts: Information and Supervisory Frames | Counts are displayed for each type of information and supervisory frames transmitted and received. High I-frame and RR counts are normal since only I and RR frames are exchanged when the frame level is running successfully. Non-zero RRp, REJ and RNR counts are not unusual since X.25 protocol resolves minor problems using RRp, REJ and RNR frames. High REJ counts indicate that data transmission is succeeding, but that the link is subject to errors. A non-zero RNR transmitted count indicates that the host is occasionally running out of buffers. A non-zero RNR received count indicates that the remote is running out of buffers. |
| Frame Type Counts: Unnumbered Frames | Counts are displayed for each type of unnumbered frame transmitted and received. The HDLC protocol uses unnumbered frames when the link is DOWN (not able to transfer information) or to indicate an unrecoverable error. Non-zero FRMR transmitted or received counts indicate a protocol implementation problem or HDLC parameter mismatch. Non-zero SABM, DISC, UA transmitted or received counts indicate attempts to bring up the link. If only the transmitted counts are increasing or if only the received counts are increasing, then there is a link or cable problem. If the only non-zero received count is for DM, it indicates that the remote cannot bring up the link. BAD frames counts are the number of frames received and transmitted with an illegal address, or with an illegal or out-of-sequence control field (that is, caused an FRMR frame to be transmitted). |
| I field of last FRMR rx | This field contains the diagnostic information from the last Frame Reject (FRMR) frame received. Three or five bytes are displayed depending on whether HDLC is running in basic or extended operation. (Extended operations increase the size of some fields and therefore 5 bytes are returned in a frame reject rather than the 3 bytes returned in basic operation.) |
| I field of last FRMR tx | This field contains the diagnostic information from the last FRMR frame transmitted. Three or five bytes are displayed depending on whether HDLC is running in basic or extended operation. |
| Frame level ran out of buffers | The number of times that the frame level was unable to get a buffer from the free queue. A non-zero count means that the host's RNR transmission threshold should be increased. |
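The interpretations in the table above can be collected into a small checklist. The following Python sketch is illustrative only: the dictionary keys are hypothetical names invented for this example, and the values are assumed to be copied by hand from the tsgstat frame level display. It is not part of Wanware and does not read anything from tsgstat itself.

```python
def frame_level_hints(counts):
    """Return troubleshooting hints for non-zero frame level counters.

    `counts` maps hypothetical counter names (chosen for this sketch) to
    values copied from the tsgstat frame level display.
    """
    # Each entry pairs a counter name with the interpretation from the table.
    rules = [
        ("local_t1_timeouts", "link problem: the host had to poll the remote"),
        ("disc_unable_to_transmit",
         "no response from serial hardware: check clocking from the modem"),
        ("disc_no_remote_response",
         "inactive remote, severed link/cable, or modem problem"),
        ("frmr_tx_or_rx",
         "protocol implementation problem or HDLC parameter mismatch"),
        ("out_of_buffers", "increase the host's RNR transmission threshold"),
    ]
    # Missing counters are treated as zero, so partial input is accepted.
    return [f"{name}: {hint}" for name, hint in rules if counts.get(name, 0) > 0]


if __name__ == "__main__":
    sample = {"local_t1_timeouts": 0, "disc_unable_to_transmit": 403}
    for line in frame_level_hints(sample):
        print(line)
```

Only counters whose meaning is unambiguous in the table are included; counts such as RRp, REJ and RNR are left out because the table describes non-zero values for them as normal.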
To display the channel statistics for the currently selected network, select option 1.
The following table describes the information that appears.
| Field | Description |
| --- | --- |
| Frames received OK | The number of frames received without a CRC error. A rapidly increasing count indicates an active receive data path. |
| # bytes received OK | The number of data bytes (protocol headers plus contents of X.25 packets) received. The average frame size is the # bytes received OK count divided by the Frames received OK count. |
| Frames transmitted OK | The number of frames transmitted. |
| # bytes transmitted OK | The number of characters transmitted excluding flags and CRCs. |
| underrun count | The number of frames unsuccessfully transmitted because characters were not supplied to the serial communications interface fast enough to maintain synchronization. Significant values (10% of Frames transmitted OK) indicate that X.25 is overloaded, possibly due to one or both links being clocked at higher than rated speeds. |
| overruns count | The number of frames rejected because characters were not read into memory as fast as they arrived. Usually indicates that the X.25 link is running above its maximum rated speed. |
| receive aborts | The number of abort sequences (strings of 7 or more contiguous 1 bits) received. A non-zero value is not a problem unless it is greater than 10% of good frames received. This may indicate a problem with the Received Data signal or with the remote transmitter. |
| crc errors | The number of frames received with a bad FCS (Frame Check Sequence), that is, the Cyclic Redundancy Check bytes received do not match the values calculated from the data received. A non-zero count indicates a noisy data link or problems with the remote transmitter. |
| Frames < 2 bytes | The number of frames that were too short, that is, less than 2 bytes long. A non-zero count indicates a noisy data link or problems with the remote transmitter. |
| no recv buffer available | The number of times that a buffer was unavailable during a receive operation. If this count is increasing rapidly, X.25 traffic levels may be overloading the host's configured Streams resources. |
| CTS changes | The number of times that the Clear to Send (CTS) signal changed. A rapidly increasing count indicates a modem or cable problem. |
| DCD changes | The number of times that the Data Carrier Detect (DCD) pin changed. A rapidly increasing count indicates a data link or modem problem. |
| DSR changes | The number of times that the Data Set Ready (DSR) pin changed. A rapidly increasing count indicates a problem with the modem or the cable between the modem and the communications controller. |
| RI changes | The number of times that the Ring Indicator (RI) pin changed. |
| CTS | The current state of the CTS signal. |
| DCD | The current state of the DCD signal. |
| DSR | The current state of the DSR signal. |
| RI | The current state of the RI signal. |
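The arithmetic mentioned in the table above (average frame size and the 10% thresholds) can be worked through as follows. This Python sketch is an illustration under stated assumptions: the dictionary keys are hypothetical names for values read off the tsgstat display, and treating "10% of frames transmitted OK" as "at or above 10%" is an interpretation, not something the table states exactly.

```python
def channel_checks(stats):
    """Apply the arithmetic described in the channel statistics table.

    `stats` uses hypothetical key names chosen for this sketch.
    """
    frames_rx = stats["frames_received_ok"]
    frames_tx = stats["frames_transmitted_ok"]
    report = {}

    # Average frame size = bytes received OK / frames received OK.
    report["avg_rx_frame_size"] = (
        stats["bytes_received_ok"] / frames_rx if frames_rx else 0.0
    )
    # Underruns are treated as significant at 10% of frames transmitted OK.
    report["underruns_significant"] = (
        frames_tx > 0 and stats["underruns"] * 10 >= frames_tx
    )
    # Receive aborts matter only above 10% of good frames received.
    report["aborts_significant"] = (
        frames_rx > 0 and stats["receive_aborts"] * 10 > frames_rx
    )
    return report


if __name__ == "__main__":
    print(channel_checks({
        "frames_received_ok": 2000,
        "bytes_received_ok": 256000,
        "frames_transmitted_ok": 1000,
        "underruns": 5,
        "receive_aborts": 250,
    }))
```

With these sample numbers the average received frame size is 128 bytes, the underrun count is below the threshold, and the receive abort count exceeds it.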
tsgtrace allows you to see X.25 frames exchanged between the remote and Wanware in real-time, as they are being transmitted.
You can redirect the standard output of a tsgtrace invocation to a disk file so that it can be sent to support personnel for protocol analysis.
tsgtrace displays the contents of X.25 frames exchanged over the specified X.25 network, network_name. network_name is defined in file /etc/packetnets.
Consult the tsgtrace manual page for usage information.
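Because a trace can include passwords, a saved trace file deserves the same care as tsgtrace itself. The following Python sketch shows one way to capture a command's standard output into a file that only its owner can read. The helper name, the output path, and the tsgtrace argument form in the commented usage are assumptions made for illustration; the manual page remains the authority on the real syntax.

```python
import os
import subprocess


def capture_output(cmd, path):
    """Run `cmd` and write its standard output to `path`.

    The file is created with mode 0600; if it already exists, its
    existing permissions are left unchanged.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as out:
        # Blocks until the command exits or is interrupted.
        return subprocess.run(cmd, stdout=out, check=False).returncode


# Hypothetical usage, as root. The argument form is an assumption:
# capture_output(["/sbin/tsgtrace", "link0"], "/var/tmp/link0.trace")
```

Since tsgtrace reports frames in real time, a capture like this would run until the trace is stopped.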
The incoming call daemon, x25daemon, logs its activity in /var/log/x25/x25dlog.
Note that a call disposed of by the daemon may still be cleared by other application software.
Once you have started the X.25 protocol stack, it automatically tries to start the HDLC (Layer II) protocol.
The first step in checking the link is to run tsgstat. The first line of the FRAME LEVEL information tells you explicitly whether or not the link is up. If the link is UP (ready to transfer Information, or I frames), you can proceed to making X.25 calls. If not, you have some more investigating to do.
The following is an example of statistics that are displayed when you select option 3 from the tsgstat menu.
In the example shown above, the clue to the reason the link is not up is in the line:
Inability to transmit: 0403, no remote response 0000
This line indicates that the local system is unable to transmit. Either your modem is not providing clocks, the cable connecting the card with the modem is not passing them through, or you need to use internal clocking but have not provided a non-zero value for the speed parameter in the X.25 protocol parameters.
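The reasoning applied to that line can be sketched in Python. This is a minimal illustration, assuming the line has exactly the format quoted above and that the counters are zero-padded decimal numbers; neither assumption is confirmed by the product documentation, and the function name is hypothetical.

```python
import re

# Matches the frame level line quoted above.
LINE_RE = re.compile(
    r"Inability to transmit:\s*(\d+),\s*no remote response\s*(\d+)"
)


def disconnect_reason(line):
    """Classify the two disconnect counters on the frame level line."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    # Assumption: the counters are printed as zero-padded decimal numbers.
    unable_to_tx, no_response = (int(group) for group in match.groups())
    if unable_to_tx:
        return ("local transmit problem: check modem clocks, cable, "
                "or the speed parameter")
    if no_response:
        return ("remote not responding: inactive remote, severed "
                "link/cable, or modem problem")
    return "no disconnects recorded"


if __name__ == "__main__":
    print(disconnect_reason(
        "Inability to transmit: 0403, no remote response 0000"
    ))
```

The transmit counter is checked first because a link that cannot transmit will not get a response from the remote either.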
Once you have an X.25 link that is up, you can try to make calls to other hosts to make sure that your configuration is set up correctly.
With the link running, you can use xpad to make a PAD (X.3, X.28, X.29) call to a network-connected host. The following example shows the command used to initiate a call.
xpad -r -d 0302092100086 -u A
The first 0 in the address may need to be a 1 for your network's addressing scheme for making inter-network calls. The address above is that of the Datapac Information Service, a computer connected to Canada's Datapac network that provides information on the service.
When you invoke this command, a display similar to the following appears:
Call Made
Awaiting Call Acceptance ..Call Connected. Outbound packet size = 256
WELCOME to the Datapac Information System.
Your previous session was 1991-02-12 11:12:23 EST
Your packet size is likely to be 128 instead of 256, but you should be able to establish a connection. If you do not, you will either receive a clearing message with an indication of why the call was cleared, or the connection attempt will "hang". When a connection hangs, you see:
Awaiting Call Acceptance ..
until T21 times out (default 30 seconds).
The most common cause of a hung call is a mismatch between the number of virtual circuits locally configured and the number that your network provider is using. Reduce the parameter bwc_num until you can connect a call.
Flow control parameter negotiation causes a fair number of connection problems. The clue is usually in the Clearing Diagnostic associated with the call - 03/41 or 03/42, for example. 03/41 indicates an invalid facility in a call setup packet, while 03/42 indicates an invalid value for a facility parameter. Some common solutions to facility-related problems are:
Change force_negotiate from YES to NO.
By default, force_negotiate is YES, so the software inserts all the flow control negotiation facilities to make sure that both ends explicitly know what values are being used. Some X.25 carriers or switch manufacturers allow only a subset of these parameters, and some do not allow them at all, depending on what the remote system thinks the link parameters ought to be.
Reduce the value for class_recv and class_send.
In the default configuration, an artificially high value for the throughput class negotiation parameters is set. This reduces the probability that the local node will clear incoming call packets requesting a particular value for throughput class, which is what it does when the remote system specifies a value in excess of class_recv and class_send.
This has the unfortunate side effect that outgoing call requests or call accepts may have an artificially high value for the throughput class parameters, and the remote call may clear if you specify a value in excess of the actual link speed.
| Parameter Value | Baud Rate |
| --- | --- |
| 7 | 1200 bps |
| 8 | 2400 bps |
| 9 | 4800 bps |
| 10 | 9600 bps |
| 11 | 19200 bps |
| 12 | 48000 bps |
| 14 | 64000 bps |
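Since a call may be cleared when the throughput class exceeds the actual link speed, one reasonable approach is to pick the largest parameter value whose rate does not exceed that speed. The Python sketch below does this using only the values listed in the table above; the function name is hypothetical, and whether this choice suits a particular network is an assumption to confirm with your provider.

```python
# Parameter values and rates copied from the table above.
THROUGHPUT_CLASSES = {
    7: 1200,
    8: 2400,
    9: 4800,
    10: 9600,
    11: 19200,
    12: 48000,
    14: 64000,
}


def class_for_speed(link_bps):
    """Return the largest table value whose rate does not exceed link_bps."""
    fitting = [
        value for value, bps in THROUGHPUT_CLASSES.items() if bps <= link_bps
    ]
    if not fitting:
        raise ValueError(f"no table entry at or below {link_bps} bps")
    return max(fitting, key=THROUGHPUT_CLASSES.get)


if __name__ == "__main__":
    # A 9600 bps link maps to parameter value 10.
    print(class_for_speed(9600))
```

The result is a candidate value for class_recv and class_send on a link running at the given speed.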
tsgtrace may be able to show you the call clears. With call clearing problems, the most important things to establish are who sent the clear (whether it was transmitted or received) and what the cause and diagnostic were. Cause and diagnostic bytes are described in X.25 Cause and Diagnostic Bytes.
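When recording a call clear, it helps to note the sender, cause and diagnostic together. The Python sketch below is a hypothetical note-taking helper, not a Wanware tool: it knows only the two diagnostics discussed earlier in this topic (03/41 and 03/42) and refers everything else to the cause and diagnostic reference.

```python
# Meanings limited to the two diagnostics discussed in this topic.
KNOWN_CLEARS = {
    ("03", "41"): "invalid facility in a call setup packet",
    ("03", "42"): "invalid value for a facility parameter",
}


def describe_clear(code, direction):
    """Summarize a clear given as 'cause/diagnostic', such as '03/41'.

    `direction` is 'transmitted' or 'received', from the local
    system's point of view.
    """
    if direction not in ("transmitted", "received"):
        raise ValueError("direction must be 'transmitted' or 'received'")
    cause, separator, diagnostic = code.strip().upper().partition("/")
    if not separator:
        raise ValueError("expected a code of the form cause/diagnostic")
    # A transmitted clear came from the local system; a received one did not.
    sender = "local system" if direction == "transmitted" else "remote or network"
    meaning = KNOWN_CLEARS.get(
        (cause, diagnostic), "see X.25 Cause and Diagnostic Bytes"
    )
    return (f"clear sent by {sender}, cause {cause}, "
            f"diagnostic {diagnostic}: {meaning}")


if __name__ == "__main__":
    print(describe_clear("03/41", "received"))
```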
Copyright © 1997-2004 The Software Group Limited. All Rights Reserved.
® Netcom is a registered trademark of The Software Group Limited.