Troubleshooting NBU Media Servers



Purpose

The NetBackup (NBU) media servers are the machines that receive client data and write it to the VTL, deduplication appliance, or tape. When a media server is "down", the clients that use it to write data cannot be backed up. The NBU administrators are responsible for keeping the media servers available so that clients can have their data written to an allocated backup resource.

This document will show the procedures and tools used to monitor the NBU media servers.

Procedures

1) Log in to THE MONITORING SYSTEM and check the status of the NBU services.

1.a) After logging in, navigate .... services and hosts are up. If any hosts or services are down, the page shows which hosts have down services in red boxes.
1.b) If you cannot access the monitoring system and the environment has only one NBU server acting as both master and media server, log in to the NBU remote desktop management server for that environment, start the JAC (Java Administration Console), log in, and check the processes there.
1.b.1) Navigate to the Activity Monitor in the JAC by clicking on it.
1.b.2) Click on the Daemons tab.
1.b.3) Verify that the critical NBU daemons are running. The critical daemons are: ltid, nbsl, bprd, vmd, bpcompatd, nbstserv, nbvault, nbsvcmon, NB_dbsrv, nbrmms, bpdbm, nbjm, nbrb, nbemm, nbpem, and nbevtmgr.
1.c) If all the daemons are running, proceed to the next step.
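
The daemon check in step 1.b.3 can also be scripted when the console is not convenient. The following is a minimal sketch, not part of the documented procedure: it compares a set of running process names against the daemon list above. The function names are hypothetical, and the `ps`-based helper assumes a Unix-like server where the daemon names appear as process names.

```python
import os
import subprocess

# Daemon names copied from step 1.b.3 of this procedure.
CRITICAL_DAEMONS = frozenset({
    "ltid", "nbsl", "bprd", "vmd", "bpcompatd", "nbstserv", "nbvault",
    "nbsvcmon", "NB_dbsrv", "nbrmms", "bpdbm", "nbjm", "nbrb", "nbemm",
    "nbpem", "nbevtmgr",
})


def missing_daemons(running_names):
    """Return the critical daemons absent from running_names, sorted."""
    return sorted(CRITICAL_DAEMONS - set(running_names))


def running_process_names():
    """Collect process names from `ps` (assumption: Unix-like server)."""
    out = subprocess.run(
        ["ps", "-e", "-o", "comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    # Some platforms print a full path, so keep only the base name.
    return {
        os.path.basename(line.strip())
        for line in out.splitlines()
        if line.strip()
    }
```

Run on the server itself, `missing_daemons(running_process_names())` should return an empty list when every daemon in the list is up; any names it returns are the ones to investigate.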


2) Next, check the current jobs for error codes that could indicate what is causing problems for the media server.

2.a) To start the Activity Monitor for the environment where the problem is reported, log in to the JAC (Java Administration Console).
2.b) To view the Activity Monitor, click on it in the window.
2.c) Make sure the view is showing the Jobs information.
2.d) Scroll through the recent jobs and review the start times, Media Server, State, Status, client, KB per second, and so on to get a general idea of any potential issues.
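
When the job list is long, the review in step 2.d can be approximated by grouping failed jobs by media server and status code. The sketch below assumes the job rows have already been exported into Python dictionaries. The keys `media_server`, `state`, and `status` are hypothetical names chosen to mirror the Activity Monitor columns, and a status of 0 is treated as success.

```python
from collections import Counter


def failed_jobs(jobs):
    """Keep finished jobs whose exit status is not 0 (0 means success)."""
    return [j for j in jobs if j["state"] == "Done" and j["status"] != 0]


def failures_by_media_server(jobs):
    """Count failed jobs per (media_server, status), most frequent first."""
    counts = Counter(
        (j["media_server"], j["status"]) for j in failed_jobs(jobs)
    )
    return counts.most_common()
```

A media server that appears at the top of the result with the same status code repeated many times is a reasonable first place to look.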


3) One common problem is the availability of media, which can be checked with the process below. The main item to look for is whether there are any tapes in the "Scratch Pools".

3.a) From the JAC, click on Media and Device Management.
3.b) Expand the options under Media by clicking the circle with a line next to the item; see the highlighted items in the picture below.
3.c) Click on Scratch to show the items in the Scratch Volume Pool.
3.d) Scroll through the list of media in the right-hand pane and verify there are media in the correct robot with the correct media type.
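
The check in step 3.d amounts to counting scratch media per robot and media type. A minimal sketch follows, assuming the media list has been exported into dictionaries. The keys `pool`, `robot`, and `media_type` are hypothetical, as is the default pool name `"Scratch"`, which should be changed to match the environment.

```python
from collections import Counter


def scratch_counts(volumes, scratch_pool="Scratch"):
    """Count scratch-pool volumes per (robot, media_type)."""
    return Counter(
        (v["robot"], v["media_type"])
        for v in volumes
        if v["pool"] == scratch_pool
    )


def scratch_shortages(volumes, required, scratch_pool="Scratch"):
    """Return the (robot, media_type) pairs in `required` with no scratch media."""
    counts = scratch_counts(volumes, scratch_pool)
    return [pair for pair in required if counts[pair] == 0]
```

Here `required` is the list of robot and media type combinations that the affected media server writes to; any pair returned by `scratch_shortages` has no scratch media available.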


4) Check the storage devices to verify there are no "Down" drives or robots.

4.a) From the JAC, click on Media and Device Management.
4.b) Click on Device Monitor and scroll to find items that do not show either "Active" or "TLD" in the "Control" column. In the picture below, one drive is in a <Mixed> control state, which indicates that one or more media servers cannot access the device.
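
The filter described in step 4.b can be expressed as a short function. This is a sketch under the assumption that the Device Monitor rows are available as dictionaries. The keys `name` and `control` are hypothetical, and the two healthy values are taken directly from the step above.

```python
# Control values that step 4.b treats as healthy.
HEALTHY_CONTROL_STATES = frozenset({"Active", "TLD"})


def problem_drives(drives):
    """Return the drives whose Control value is neither "Active" nor "TLD"."""
    return [d for d in drives if d["control"] not in HEALTHY_CONTROL_STATES]
```

Any drive returned here, such as one in the <Mixed> state, needs a closer look from each media server that shares it.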


