This article contains recommendations and guidelines for configuring Discover Servers to scan File System targets efficiently. You can use the attached spreadsheet to calculate the values of the recommended settings.
Bear in mind the following factors while planning to configure Discover Servers for File System scan targets:
Scan throughput is affected by the available network bandwidth, number of CPU cores, and the total system memory of the participating Discover Servers.
Scan throughput is affected by the complexity of the configured policies.
A higher active user count on a particular File System server could reduce scan performance.
Scan performance is affected by the distances between the participating Discover Servers and the File System target being scanned.
Symantec recommends the following settings for each Discover Server. You can change these settings in the crawler.properties file on the Discover Server, located in <SymantecDLPinstalldirectory>\protect\config.
Note: On newer versions of Enforce, these settings are modified in the Enforce console at System > Servers and Detectors > Overview > Server / Detector Detail > Advanced Settings.
crawler.threadpoolsize = 30 (default is 15), where crawler.threadpoolsize represents the maximum number of crawler threads. Note: Use the recommended value only if your setup conforms to the recommended hardware configuration in the table below.
MessageChain.NumChains = 1 * number of CPU cores (or 2 * number of CPU cores if you can verify that hyper-threading is enabled), where MessageChain.NumChains represents the number of messages that the FileReader processes in parallel.
MessageChain.CacheSize = 2 * MessageChain.NumChains where MessageChain.CacheSize represents the size of the Detection (MessageChain) queue.
FileReader.MaxFileSystemCrawlerMemory = (crawler.threadpoolsize + MessageChain.NumChains + MessageChain.CacheSize) * FileReader.MaxFileSize, where FileReader.MaxFileSystemCrawlerMemory represents the total runtime memory for all running threads.
BoxMonitor.FileReaderMemory = 4 * FileReader.MaxFileSystemCrawlerMemory where BoxMonitor.FileReaderMemory represents a dynamic memory pool holding all runtime data about the FileReader. This value should be less than the assigned system memory.
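The formulas above chain together, so one way to check them by hand is a short script. The following is a minimal sketch in Python, not a Symantec tool: the function and field names are hypothetical, the memory results are expressed in whatever unit you use for FileReader.MaxFileSize, and the sketch assumes that "Crawler Threads" in the formula means crawler.threadpoolsize.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscoverTuning:
    """Computed values; memory fields use the same unit as max_file_size."""
    crawler_threadpoolsize: int
    messagechain_numchains: int
    messagechain_cachesize: int
    max_file_system_crawler_memory: int
    file_reader_memory: int


def recommend_tuning(cpu_cores, max_file_size, hyper_threading=False,
                     crawler_threads=30):
    """Apply the article's formulas. All names here are illustrative."""
    if cpu_cores < 1 or max_file_size < 1 or crawler_threads < 1:
        raise ValueError("cpu_cores, max_file_size and crawler_threads must be positive")
    # MessageChain.NumChains: 1 per core, or 2 per core with hyper-threading.
    num_chains = cpu_cores * (2 if hyper_threading else 1)
    # MessageChain.CacheSize = 2 * MessageChain.NumChains
    cache_size = 2 * num_chains
    # FileReader.MaxFileSystemCrawlerMemory =
    #   (crawler threads + NumChains + CacheSize) * FileReader.MaxFileSize
    crawler_memory = (crawler_threads + num_chains + cache_size) * max_file_size
    # BoxMonitor.FileReaderMemory = 4 * FileReader.MaxFileSystemCrawlerMemory
    file_reader_memory = 4 * crawler_memory
    return DiscoverTuning(crawler_threads, num_chains, cache_size,
                          crawler_memory, file_reader_memory)


def fits_system_memory(tuning, system_memory):
    """True if the FileReader memory is below the assigned system memory."""
    return tuning.file_reader_memory < system_memory
```

For example, with 8 cores, no hyper-threading, and an example maximum file size of 30,000,000 (an input chosen only for illustration), `recommend_tuning(8, 30_000_000)` gives NumChains = 8, CacheSize = 16, a crawler memory of 1,620,000,000, and a FileReader memory of 6,480,000,000 in the same unit.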
In addition, if you use the Grid Scanning feature, Symantec recommends configuring the following settings:
crawler.grid.follower.queuesize = 2 * crawler.threadpoolsize where crawler.grid.follower.queuesize represents the maximum number of files for detection that can be added to the grid queue.
crawler.grid.queuesize.multiplier = 4 * crawler.threadpoolsize where crawler.grid.queuesize.multiplier represents the grid scan request queue size per detection server.
You can use the attached spreadsheet to calculate the values for all of the recommended settings.
Note: The grid scanning feature is available in Symantec Data Loss Prevention version 15.0 and later.
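The two grid settings above both derive from crawler.threadpoolsize. As a rough sketch (the helper names are hypothetical and not part of Symantec Data Loss Prevention), they can be computed and printed in the same "name = value" notation this article uses:

```python
def recommend_grid_settings(crawler_threadpoolsize=30):
    """Derive the grid settings from crawler.threadpoolsize (illustrative helper)."""
    if crawler_threadpoolsize < 1:
        raise ValueError("crawler_threadpoolsize must be positive")
    return {
        # Maximum number of files for detection that can be added to the grid queue.
        "crawler.grid.follower.queuesize": 2 * crawler_threadpoolsize,
        # Grid scan request queue size per detection server.
        "crawler.grid.queuesize.multiplier": 4 * crawler_threadpoolsize,
    }


def to_setting_lines(settings):
    """Render settings as 'name = value' lines, one per setting."""
    return "\n".join(f"{name} = {value}" for name, value in settings.items())
```

With the recommended crawler.threadpoolsize of 30, this yields a follower queue size of 60 and a queue size multiplier of 120.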
Scan target configuration guidelines
Symantec recommends the following guidelines for configuring File System scan targets:
Scan mode guidelines:
When you select Grid as the scan mode, ensure that the grid-specific tuning parameters are configured on all of the Discover Servers in the grid.
To configure a grid scan, you must select at least two Discover Servers.
To initialize a grid scan, at least two of the selected Discover Servers must be available.
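The two server-count rules above are easy to confuse: one applies when the scan is configured, the other when it starts. The following sketch only illustrates that distinction. It is not how Enforce performs the check, and the function name and status strings are hypothetical.

```python
def grid_scan_status(selected_servers, available_servers):
    """Classify a grid scan setup against the two minimum-server rules."""
    selected = set(selected_servers)
    # Only servers that are both selected and currently available count
    # toward starting the scan.
    usable = selected & set(available_servers)
    if len(selected) < 2:
        return "cannot configure: select at least two Discover Servers"
    if len(usable) < 2:
        return "cannot initialize: at least two selected servers must be available"
    return "ready"
```

For instance, selecting three servers when only one of them is reachable passes the configuration rule but fails the initialization rule.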
Target configuration guidelines:
To avoid scanning unnecessary files, configure filters on the File Type, Date Modified, and File Size attributes so that only the items you expect to scan are included.
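To see how such filters narrow a scan, it can help to model them outside the product. The sketch below is purely illustrative and does not reflect how Symantec Data Loss Prevention evaluates filters. The function name, parameters, and sample values are all assumptions.

```python
from datetime import datetime
from pathlib import PurePath


def matches_filters(name, modified, size_bytes, *, extensions,
                    modified_after, max_size_bytes):
    """Return True if a file passes all three illustrative filters."""
    # File type: compare the lower-cased extension against an allow-list.
    if PurePath(name).suffix.lower() not in extensions:
        return False
    # Date modified: skip files last changed before the cutoff.
    if modified < modified_after:
        return False
    # File size: skip files larger than the limit.
    return size_bytes <= max_size_bytes
```

A file is scanned only if it passes all three checks, so each added filter can only shrink the set of files sent to detection. For example, an allow-list of `{".docx", ".xlsx"}` combined with a cutoff date excludes both old documents and every file of another type.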
Summary of configuration recommendations
Recommended Configuration (Single Server)
Recommended Configuration (Grid Leader and 10 Discover servers)
Number of CPU cores
For more information, refer to the grid scanning performance guidelines in the Symantec Data Loss Prevention 15.x Administration Guide.