MTBF 8.5.x – Mean Time Between Failures Sandbox app updated for Domino 8.x – Credit to John Paganetti
What is MTBF? It stands for Mean Time Between Failures, a statistical term for how long, on average, a system runs between one failure and the next. In this case, it is also the name of a tool that captures vital information every time a server is shut down or crashes anywhere in your environment: when it went down and for how long. It does this automatically, whether you remember to post a change in change control or not. It even has a place to post comments about the event. It also provides % up-time measurements and other statistics. When you say your servers had an availability of 99.999% over the past 5 years, now you can prove it. This is all stored in a database where you can let management see just how robust and solid your Domino environment really is. No longer do you have to take a beating for network outages that made it look like Domino was down. You’ve got the proof! And the nice thing is how easy it is to implement: a small .EXE runs every time the server is started, and again once a day to collect stats.
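To put numbers like 99.999% in perspective, here is a minimal Python sketch of the textbook availability and MTBF arithmetic. It is not the MTBF tool's own logic (I have not seen how John computes his statistics); the function names and the sample outage data are made up for illustration.

```python
from datetime import datetime


def uptime_stats(window_start, window_end, outages):
    """Return (availability %, MTBF in hours, downtime in seconds).

    outages is a list of (down_at, up_at) datetime pairs that fall
    inside the reporting window. Hypothetical helper, not part of MTBF.
    """
    total = (window_end - window_start).total_seconds()
    downtime = sum((up - down).total_seconds() for down, up in outages)
    uptime = total - downtime
    availability_pct = 100.0 * uptime / total
    # Textbook MTBF: operating time divided by the number of failures.
    # With no failures in the window it is undefined, so return None.
    mtbf_hours = (uptime / len(outages)) / 3600 if outages else None
    return availability_pct, mtbf_hours, downtime


def allowed_downtime_minutes(availability_pct, days):
    """Downtime budget, in minutes, for a target availability over `days`."""
    return days * 24 * 60 * (1 - availability_pct / 100.0)


if __name__ == "__main__":
    # Made-up example: a 30-day window with two outages.
    start, end = datetime(2024, 1, 1), datetime(2024, 1, 31)
    outages = [
        (datetime(2024, 1, 5, 2, 0), datetime(2024, 1, 5, 3, 0)),
        (datetime(2024, 1, 20, 12, 0), datetime(2024, 1, 20, 12, 30)),
    ]
    print(uptime_stats(start, end, outages))
    print(allowed_downtime_minutes(99.999, 5 * 365))
```

By this arithmetic, "five nines" over five 365-day years leaves a budget of only about 26 minutes of total downtime, which is why it helps to have every shutdown and crash recorded automatically.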
This is total plagiarism, but I give full credit where it is due. John Paganetti of Iris Associates Inc. (IBM) is the developer responsible for this application. The original version (which worked on R5 to ND7) was posted in the Sandbox at LDD. But as you all know, the Sandbox was taken down, so there was nowhere for this updated version to be posted. A while back, I made a post on IdeaJam requesting that MTBF be updated to work on 8.x, and John graciously fixed it and sent me the code. Now I am sharing it with you. It’s in the sidebar on the right, in the flash widget box.
Recently I posted on IdeaJam requesting this tool be added to the standard software. Please go there and vote for it. The complete instructions on implementing it on your servers can be found in the Help Using document of the database. Below is an excerpt of the first steps involved. Also check the Help About document and if you get the chance, drop John a note of thanks for a cool tool.
Mean Time Between Failures Installation Instructions
First Server Installation:
Place the MTBF executable in the Notes Program Directory on your Domino Server.
Place the MTBF.NTF template in the Notes Data Directory on your Domino Server.
From an Administrative Client: Create a New Database MTBF.NSF using the MTBF.NTF Template
At the Notes Server Console: > load mtbf -a
The -a is only necessary if this server has not yet been “added” to the list of servers to be monitored.
Wait until the task runs to completion. From the Notes Server Console, check its status by running > show task until the MTBF task no longer appears.
Verify your first server has been added correctly by opening MTBF.NSF and checking the 1) Server Information view.
Now is a good time to update the MTBF.NSF ACL. It is recommended that the Server Group LocalDomainServers has Manager Access. The rest is up to your discretion.
Add MTBF to the ServerTasks = line in your NOTES.INI on your Domino Server.
(Crash and shutdown information is updated each time the server is restarted, when MTBF runs for the first time after startup.)
Add MTBF -F to the ServerTasksAt5 = line in your NOTES.INI on your Domino Server or create a Program Record for this server to run MTBF -F once a day or every other day or once a week … based on your desires.
(This will perform the once-a-day exhaustive log searching, compute-intensive mean time calculations and statistic generation.)
Create a Program record for this server in the Name & Address book to periodically run MTBF
(This is a very fast and inexpensive operation that updates Server Elapsed Time in the 1) Server Information view.)
(An hourly update is recommended. You may choose to run it less often, but running it more often is not recommended.)
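If you have a lot of servers, the two NOTES.INI changes above can be scripted. The following is only a sketch, assuming you work on a copy of NOTES.INI held in a string and that tasks are comma-separated; the helper name add_server_task and the sample task lists are my own illustrations, not part of MTBF or Domino. Back up the real file before changing anything.

```python
def add_server_task(ini_text, setting, task):
    """Return ini_text with `task` appended to the `setting=` line.

    Hypothetical helper: matches the setting name case-insensitively,
    tolerates spaces around '=', and leaves the line alone if the task
    is already listed. Adds the setting if it is missing entirely.
    """
    lines = ini_text.splitlines()
    for i, line in enumerate(lines):
        name, sep, value = line.partition("=")
        if sep and name.strip().lower() == setting.lower():
            tasks = [t.strip() for t in value.split(",") if t.strip()]
            if task.lower() not in (t.lower() for t in tasks):
                tasks.append(task)
            lines[i] = name.strip() + "=" + ",".join(tasks)
            break
    else:
        # No matching line found: add a new one at the end.
        lines.append(setting + "=" + task)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Illustrative NOTES.INI fragment; your task lists will differ.
    sample = (
        "[Notes]\n"
        "ServerTasks=Update,Replica,Router\n"
        "ServerTasksAt5=Statlog\n"
    )
    updated = add_server_task(sample, "ServerTasks", "MTBF")
    updated = add_server_task(updated, "ServerTasksAt5", "MTBF -F")
    print(updated)
```

Running the helper twice gives the same result as running it once, so it is safe to re-apply. The hourly run still needs a Program record, which this sketch does not create.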