Troubleshoot the network

Display servers on the network

Every Wiretap Server has a unique ID (UUID). To list all servers on a network and their IDs, run /usr/discreet/wiretap/tools/current/wiretap_server_dump. For example, wiretap_server_dump -d "Backburner" returns one of:

Test Wire with sw_framestore_dump

At least two Wire hosts must be set up to test Wire connectivity. Run /usr/discreet/sw/tools/sw_framestore_dump to see a list of available framestores. If some hosts are not visible:
  • Check the filesystem and networking configurations in sw_framestore_map.
  • Check the port number in sw_probed.cfg.
  • Use sw_ping to test the connection to other Wire hosts.
  • Check that each framestore has a unique Framestore ID.

Test network with ping and sw_ping

Ping the hostname of each machine that should be accessible through Wire. If the ping fails, try the machine's IP address instead of its hostname. If that succeeds, verify how the machine resolves host names on the network. Host name resolution should look at the local setup file first, then validate on the network. The /etc/nsswitch.conf file should include a hosts line that lists the name validation sources in the following order: hosts: files nis dns.
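The resolution order can also be checked from a script. The following is a minimal sketch, not part of the product tools: the function names hosts_order and hosts_files_first are hypothetical, and it only assumes the usual "hosts: source source ..." layout of nsswitch.conf.

```shell
# Print the lookup sources from the "hosts:" line of an nsswitch-style file.
# Usage: hosts_order [file]   (defaults to /etc/nsswitch.conf)
hosts_order() {
    awk '$1 == "hosts:" { $1 = ""; sub(/^ +/, ""); print; exit }' \
        "${1:-/etc/nsswitch.conf}"
}

# Succeed only if "files" is the first source on the hosts line.
# Usage: hosts_files_first [file]
hosts_files_first() {
    case "$(hosts_order "$1")" in
        files|files\ *) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, hosts_files_first || echo "hosts line does not start with files" prints a warning when the local file is not consulted first.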

Run /usr/discreet/sw/sw_ping -framestore <framestore_name> -r -w -size <packetsize> -loop <n> where <framestore_name> is the name of the framestore to ping, <packetsize> is the size of the read/write buffer (in bytes), and <n> is the number of times to execute the test. Syntax help is available with sw_ping --help.
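To run the same test against several framestores in one pass, a small wrapper can help. This is a sketch only: sw_ping_all and the SW_PING variable are hypothetical names, the flags are the ones shown above, and it assumes that sw_ping exits with a non-zero status when the test fails, which is not confirmed here.

```shell
# Path to sw_ping as given above; set SW_PING beforehand to use another path.
SW_PING=${SW_PING:-/usr/discreet/sw/sw_ping}

# Run a read/write sw_ping test against each framestore name given.
# Usage: sw_ping_all <packetsize> <n> <framestore_name>...
# Prints one "ok" or "FAILED" line per framestore after the sw_ping output.
# Returns 1 if any test failed.
# Assumption: sw_ping exits non-zero when the test fails.
sw_ping_all() {
    size=$1; loops=$2; shift 2
    failed=0
    for fs in "$@"; do
        if "$SW_PING" -framestore "$fs" -r -w -size "$size" -loop "$loops"; then
            echo "ok      $fs"
        else
            echo "FAILED  $fs"
            failed=1
        fi
    done
    return "$failed"
}
```

The packet size, loop count, and framestore names passed to sw_ping_all are whatever suits your site; none are prescribed here.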

Verify the local host has write permissions to the remote host

Try to access the clip library directory of the remote host: cd /hosts/<remote_machine>/usr/discreet/clip. If error messages appear, work through the following checks.

Check NFS and automounting daemons are running

Run: chkconfig --list | grep nfs and then chkconfig --list | grep amd.

If NFS or AMD is off on any of those run levels, run chkconfig nfs on and chkconfig amd on, then restart your network: /etc/init.d/network restart. You might also consider rebooting your workstation.
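The chkconfig listing can be checked mechanically. The sketch below uses a hypothetical helper, service_on_at, that reads chkconfig --list output and tests the run levels you pass in. Run levels 3 and 5 in the usage line are an assumption (the usual multi-user and graphical levels on Red Hat style systems); substitute the levels that matter on your workstation.

```shell
# Read `chkconfig --list` output on stdin and succeed only if the named
# service is "on" at every run level given.
# Usage: chkconfig --list | service_on_at nfs 3 5   (3 and 5 are assumed levels)
service_on_at() {
    svc=$1; shift
    # Keep only the first line for this service, with tabs turned into spaces.
    line=$(awk -v s="$svc" '$1 == s && !seen++ { print }' | tr '\t' ' ')
    [ -n "$line" ] || return 1        # service not listed at all
    for lvl in "$@"; do
        case " $line " in
            *" $lvl:on "*) ;;         # this level is on, keep checking
            *) return 1 ;;
        esac
    done
    return 0
}
```

For example, chkconfig --list | service_on_at amd 3 5 || echo "amd is off" reports when the automounter is disabled at either level.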

Check network interfaces

Run ifconfig. If your network interface is up and running, its entry includes a line containing UP, similar to: UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1. Otherwise, check the connections on your network card. A green light appears when there is a good connection between your network card and its destination. If you reconnect cables on Linux, you must restart the network interface with ifconfig <interface_name> up.
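The flag check can be scripted as well. This is a minimal sketch with a hypothetical helper, iface_is_up, that reads ifconfig output for one interface and looks for both the UP and RUNNING flags on the same line. It assumes the older space-separated flag layout shown above, not the flags=<...> layout printed by newer ifconfig versions.

```shell
# Read `ifconfig <interface_name>` output on stdin and succeed only if some
# line carries both the UP and the RUNNING flag as separate words.
# Usage: ifconfig <interface_name> | iface_is_up
iface_is_up() {
    awk '{
             up = 0; running = 0
             for (i = 1; i <= NF; i++) {
                 if ($i == "UP") up = 1
                 if ($i == "RUNNING") running = 1
             }
             if (up && running) found = 1
         }
         END { exit !found }'
}
```

An interface that is configured but has no cable attached typically shows UP without RUNNING, so the helper reports it as not ready.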

View InfiniBand statistics

View InfiniBand information for a workstation

The InfiniBand driver installed on your workstation provides two commands that output statistics and information about InfiniBand ports. In a terminal, as root:

cat /proc/iba/<driver_id>/1/port<x>/stats
cat /proc/iba/<driver_id>/1/port<x>/info

where <x> is the port number on the device, and <driver_id> is the HCA driver ID for your device, for example: mt25218. A report appears in the terminal for each command. To find out the HCA (Host Channel Adapter) driver ID, type ls /proc/iba/ | grep mt. The driver number, beginning with mt, is returned.
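The lookup and the two cat commands can be combined. The following sketch uses two hypothetical helpers, hca_driver_id and ib_port_report. It assumes the /proc/iba layout described above and takes the first entry whose name begins with mt; the optional directory argument exists only so the helpers can be tried against a copy of the tree.

```shell
# Print the first entry under the given directory whose name starts with "mt".
# Usage: hca_driver_id [dir]   (defaults to /proc/iba)
hca_driver_id() {
    ls "${1:-/proc/iba}" 2>/dev/null | grep '^mt' | head -n 1
}

# Print the stats report, then the info report, for one port.
# Usage: ib_port_report <port_number> [dir]
ib_port_report() {
    base=${2:-/proc/iba}
    id=$(hca_driver_id "$base")
    if [ -z "$id" ]; then
        echo "no mt* driver entry under $base" >&2
        return 1
    fi
    cat "$base/$id/1/port$1/stats" "$base/$id/1/port$1/info"
}
```

Run as root, ib_port_report 1 prints both reports for port 1 of the detected driver.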

Generate InfiniBand log files

As root, in a terminal, run /sbin/iba_capture <path and name of output gzip file>. A GZIP file is generated that includes a number of log files from your system. The help for this command incorrectly indicates that the output is a TAR file, when it is in fact a GZIP file.

View port statistics for a Mellanox IS5030 switch

The Mellanox IS5030 switch is the recommended switch model for QDR InfiniBand networks. The nominal speed for QDR connections is 10 Gbps per lane.

In a browser on the same subnet as the InfiniBand switch, enter the IP address of the InfiniBand switch (default 10.10.10.252). In the login page enter the credentials (default admin / admin). Click Ports and in the Ports page click a port to view information. If you have ports with DDR connections that appear to be running at SDR speed (2.5 Gbps instead of 5 Gbps), unplug the cable and then plug it back in. The connection should run at normal DDR speed afterwards. This issue occurs because of a bug in the switch firmware.

View port statistics for an InfiniCon InfinIO 9024 switch

The InfinIO 9024 InfiniBand switch is the recommended switch model for DDR InfiniBand networks. The nominal speed for DDR InfiniBand connections is 5 Gbps per lane.

In a browser go to the IP address of the InfiniBand switch, usually 10.10.10.252. In the main page of the Device Manager click Port Stats to view the status of each port on the switch.