What steps should be taken to perform a basic Notification Server performance health check?
The following steps focus on eliminating Notification Server console timeouts and slow responses, but they also apply to many other Notification Server performance problems.
- Upgrade to the latest version of the solution that is experiencing the timeout or performance issue. This is particularly important for Patch Management Solution and Software Delivery Solution, both of which had problems in prior versions with handling very large tables or contained inefficient SQL queries.
- Ensure that the database used space is appropriate to the managed computer count. A general rule of thumb is 1 MB of space per managed computer (5,000 computers equals approx. 5 GB of space in the database). Don't forget that the physical file size of the database will be larger than the actual space used. Use the appropriate SQL management tool to see actual space used in the database.
How do I determine what the database table sizes are per solution?
If used data space is higher than expected, check the individual table sizes and prune the Event tables as necessary. Extremely large event tables indicate an ongoing database timeout problem during the nightly purging process. Several Notification Server reports depend upon aggregating event data, and millions of rows from 6 or more months ago will drastically increase report generation times.
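The sizing rule of thumb above can be turned into a quick arithmetic check. The following is a minimal sketch, not part of any Altiris tooling: the function names and the `tolerance` margin are assumptions made for illustration, and the actual used space still has to be read from the SQL management tool.

```python
# Rule of thumb from the article: about 1 MB of used database space per managed computer.
MB_PER_COMPUTER = 1


def expected_used_space_mb(managed_computers: int) -> int:
    """Return the expected used space in MB for a given managed computer count."""
    return managed_computers * MB_PER_COMPUTER


def is_oversized(actual_used_mb: float, managed_computers: int, tolerance: float = 1.5) -> bool:
    """Flag a database whose used space exceeds the expected size by a margin.

    `tolerance` is a hypothetical safety factor chosen for this sketch;
    the article does not define one.
    """
    return actual_used_mb > expected_used_space_mb(managed_computers) * tolerance


if __name__ == "__main__":
    # 5,000 managed computers -> 5,000 MB, roughly 5 GB.
    print(expected_used_space_mb(5000))
    print(is_oversized(actual_used_mb=9000, managed_computers=5000))
```

If the check flags the database, the individual table sizes are the next place to look, as described above.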
- Check the SQL Index fragmentation levels and rebuild indexes as necessary. Heavily fragmented indexes can have a severe impact on performance. Rebuilding indexes will also help free wasted space in the database.
How can I reorganize or rebuild my Altiris database indexes on a SQL 2005 server for improved performance?
How can I defragment database indexes on SQL 2000?
- Review IIS logs for heavy traffic consumers. Any standard IIS log parser can aggregate visits by IP address and URL. Any IP address visiting the same URL more than 100 times over an 8-hour period indicates an unhealthy managed agent, a broken Notification Server Web service, or an agent configuration policy with an overly aggressive interval.
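The IIS log review described above could be sketched roughly as follows. This assumes logs in the W3C extended format with a `#Fields:` header that includes `date`, `time`, `c-ip`, and `cs-uri-stem`. As a simplification, visits are counted in fixed 8-hour blocks of the day rather than in a sliding window, and the function names are illustrative only.

```python
from collections import Counter
from datetime import datetime

THRESHOLD = 100   # visits per period, from the article
WINDOW_HOURS = 8  # period length, from the article


def parse_w3c_lines(lines):
    """Yield (timestamp, client_ip, url) tuples from W3C extended log lines."""
    fields = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns for the rows that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        parts = line.split()
        if len(parts) != len(fields):
            continue  # skip malformed rows
        row = dict(zip(fields, parts))
        try:
            ts = datetime.strptime(row["date"] + " " + row["time"], "%Y-%m-%d %H:%M:%S")
            ip, url = row["c-ip"], row["cs-uri-stem"]
        except (KeyError, ValueError):
            continue  # required column missing or unparsable timestamp
        yield ts, ip, url


def heavy_consumers(records, threshold=THRESHOLD):
    """Return (ip, url, count) for each IP/URL pair exceeding the threshold
    within a single fixed 8-hour block, busiest first."""
    counts = Counter()
    for ts, ip, url in records:
        bucket = (ts.date(), ts.hour // WINDOW_HOURS)
        counts[(bucket, ip, url)] += 1
    flagged = [(ip, url, n) for (bucket, ip, url), n in counts.items() if n > threshold]
    return sorted(flagged, key=lambda item: item[2], reverse=True)
```

A typical use would be `heavy_consumers(parse_w3c_lines(open(path)))`, where `path` points at one IIS log file. Each flagged IP address is then a candidate for the agent, Web service, or policy interval problems listed above.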
- Review agent configuration intervals and collection update intervals. The top two sources of processing load are agent configuration requests and collection rebuilds.
- For production purposes, Altiris agent configuration intervals should be no less than 1 hour, with most enterprise environments using 4–6 hour check-in intervals.
- For production purposes, Delta and Policy Change collection update intervals should be no less than 30 minutes, with most enterprise environments using 1–3 hour update intervals. A general rule of thumb is to update collections twice as frequently as the Altiris Agent checks in (for example, a 2-hour collection update interval for a 4-hour agent interval). Stagger the start times on the collection update schedules by 10–15 minutes to avoid concurrency problems. The full collection update schedule should remain at once per day.
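The interval guidance above can be expressed as a small sanity check. This is only a sketch of the arithmetic: the function names are hypothetical, the interval values must be entered by hand from the console, and nothing here reads from or writes to Notification Server.

```python
# Minimums from the article, in minutes.
MIN_AGENT_INTERVAL_MIN = 60
MIN_COLLECTION_INTERVAL_MIN = 30


def review_intervals(agent_interval_min, collection_interval_min):
    """Return a list of warnings for intervals below the recommended minimums."""
    warnings = []
    if agent_interval_min < MIN_AGENT_INTERVAL_MIN:
        warnings.append("Agent configuration interval is below 1 hour.")
    if collection_interval_min < MIN_COLLECTION_INTERVAL_MIN:
        warnings.append("Collection update interval is below 30 minutes.")
    return warnings


def suggested_collection_interval(agent_interval_min):
    """Apply the rule of thumb: update collections twice as often as agents
    check in, but never more often than every 30 minutes."""
    return max(MIN_COLLECTION_INTERVAL_MIN, agent_interval_min // 2)


def staggered_offsets(schedule_names, step_min=15):
    """Assign each schedule a start offset in minutes, spaced step_min apart."""
    return {name: i * step_min for i, name in enumerate(schedule_names)}


if __name__ == "__main__":
    print(review_intervals(agent_interval_min=30, collection_interval_min=10))
    print(suggested_collection_interval(240))  # 4-hour agent interval
    print(staggered_offsets(["Delta", "Policy Change"]))
```

For a 4-hour agent interval the sketch suggests a 2-hour collection update interval, which falls inside the 1–3 hour range that the article describes as typical.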
- Review the SQL tuning and configuration articles as listed in article "NSEs are not being processed and the Altiris Console is too slow."
NSEs are not being processed and the Altiris Console is too slow
- For very large environments, also review article "Common problems for very large environments."
Common problems for very large environments