Scheduled jobs and tasks take a long time to be picked up by clients.
When looking at the task itself on the SMP Console, it appears as "Queued".
Targeted client machines are not notified that there is a new task to run. Anything that was previously scheduled runs fine. However, after the Symantec Management Agent service is restarted, the task is received and executed. This happens randomly, with no specific pattern, and usually affects only a very small number of client machines.
ITMS 8.1 RU7
Task execution relies on a tickle connection between the Symantec Management Platform (SMP), a remote Task Server (TS), and the Symantec Management Agent (SMA) on a client. When a task is scheduled, the SMP sends a tickle packet and the task definition to the Task Server that holds the registration record for that specific client. The TS then sends a tickle packet to the client with the notification 'I have a task for you, please take it'. In most cases, the symptoms above mean that the tickle connection between the task server and the client is not working. Without the tickle connection, the client periodically asks the TS "do you have a task for me?". By default this request is made once every 30 minutes; the interval is controlled by the Client Task Agent (CTA) policy on the SMP under Settings>Notification Server>Task Settings>Task Agent Settings, "Check Task Server for new task every:".
The client establishes the tickle connection when it registers with a task server, so the connection is restored when the SMA is restarted or when you press the 'Reset Agent' button on the Task Status tab of the SMA UI.
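To make the effect of the polling fallback concrete, the following is a minimal sketch of how long a queued task can wait when the tickle connection is down. It is a simplified model, not product code: the function name and parameters are hypothetical, the tickle is treated as instantaneous, and polls are assumed to occur at fixed multiples of the interval counted from the last poll.

```python
def minutes_until_pickup(scheduled_at_min, last_poll_min,
                         poll_interval_min=30, tickle_ok=False):
    """Estimate how many minutes a task stays queued before the client fetches it.

    Hypothetical model: times are minutes on a shared clock, and the client
    polls at last_poll_min + k * poll_interval_min for k = 0, 1, 2, ...
    """
    if tickle_ok:
        # Simplification: a working tickle notifies the client right away.
        return 0
    elapsed_since_poll = scheduled_at_min - last_poll_min
    # Minutes remaining until the next poll at or after the schedule time.
    return (-elapsed_since_poll) % poll_interval_min


# A task scheduled 1 minute after a poll waits for almost the full interval.
print(minutes_until_pickup(scheduled_at_min=1, last_poll_min=0))
# With a working tickle connection the wait is modeled as zero.
print(minutes_until_pickup(scheduled_at_min=1, last_poll_min=0, tickle_ok=True))
```

Under these assumptions the worst case is just under one full interval, which matches the symptom of tasks sitting in "Queued" for up to about 30 minutes with the default setting.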
Tickle connection might be broken by:
A restart of task server service (atrshost service) on a remote Task Server
Problems with the network adapter on a task server or on a client
SMP is not aware that the client machine is actually registered to a Task Server so it doesn't assign the task to it
In this particular instance, the following was also noticed:
We looked at some of those client machines, and they were shown as "assigned to a task server" under the Task Status tab
Looking at the database, the following query will show servers that have agents in this state:
join vRM_Computer_Item c on c.Guid = ti.ResourceGuid
join vRM_Computer_Item c2 on c2.Guid = ti.TaskServerGuid
left join Inv_Client_Task_Resources ctr on ctr._ResourceGuid = ti.ResourceGuid
and ti.EndTime is null
and ctr._ResourceGuid is null
We went to one of the task servers to which the agent was registered, and we restarted the "Client Data Loader" service.
After that, the client machines in the queued state got the task and ran it.
It looks like the quick workaround is to restart the "Client Data Loader" service on the affected task server, after which things return to normal for a while.
That is why the task was received and executed when the customer originally restarted the SMA service: the restart reset the task server connection, forcing the SMP to recognize that a task server was assigned to the client machine.
This issue has been addressed in the ITMS 8.5 release. It is one of several Task Management areas that the Symantec Development team worked to improve in ITMS 8.5:
1. 8.5 includes more stability with the CTDataloader and AtrsHost services.
2. 8.5 fixes a problem with the IP address check that was preventing clients from establishing a tickle connection.
It appeared that the client machines were not registered to a task server from the SMP perspective, even though the client machines themselves reported that they were registered to it.
The current workaround for 8.1 RU7 or earlier is to restart the Altiris Client Data Loader (CTDataloader) service on the Task Server assigned to the affected machines (this also restarts the AtrsHost service).
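If the restart needs to be repeated on several task servers, it could be scripted. The sketch below is one possible approach, not a vendor-supplied tool. It assumes that "CTDataloader" is the actual Windows service name (verify this on the task server first), that the script runs locally on the task server with administrative rights, and that the standard Windows `net stop` and `net start` commands are acceptable. It does not handle dependent services: `net stop` may ask for confirmation if other services depend on the one being stopped. The function names are hypothetical.

```python
import subprocess

# Assumption: "CTDataloader" is the Windows service name of the
# Altiris Client Data Loader; confirm it on the task server before use.
SERVICE_NAME = "CTDataloader"


def build_restart_commands(service_name=SERVICE_NAME):
    """Return the stop and start commands as argument lists."""
    return [
        ["net", "stop", service_name],
        ["net", "start", service_name],
    ]


def restart_service(service_name=SERVICE_NAME, dry_run=True):
    """Run the commands in order; with dry_run=True only return them."""
    commands = build_restart_commands(service_name)
    if not dry_run:
        for command in commands:
            # check=True raises CalledProcessError if a command fails.
            subprocess.run(command, check=True)
    return commands


# Dry run: show what would be executed without touching any service.
for command in restart_service():
    print(" ".join(command))
```

The dry-run default is a deliberate choice so that the commands can be reviewed before anything is stopped; call `restart_service(dry_run=False)` only on the affected task server.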
A few things to keep in mind if the workaround above doesn't help and you need to troubleshoot this issue further:
Collect verbose logging on SMP, TS, and SMA. Enable extended logging for Task Management/Task Server:
Enable verbose logging on the SMP, on the remote TS, and on the problematic client
Enable extended logging on the SMP and on the remote TS using the attached file "ExtendedLoggingv2.7z"
Reproduce the problem with a problematic client
Collect the logs from all computers (SMP, remote TS, problematic client) and specify the task instance GUID for investigation (Step 1)