Guidelines for provisioning Symantec Data Loss Prevention scans of Box targets
Last Updated March 04, 2016
This article contains recommendations for provisioning Discover servers to achieve a scan rate of approximately 23 GB per hour per detection server when running Box target scans. It also includes an Excel worksheet to help you calculate the scale of your deployment based on the amount of data in your repository.
Symantec recommends the following detection server system resources for Box scanning:
16 CPU cores per server
30 GB of system memory per server
Symantec recommends the following server settings for each detection server:
15 crawler threads (this is the default setting)
BoxMonitor.FileReaderMemory = -Xmx16G
FileReader.MaxFileSystemCrawlerMemory = 1999M
MessageChains.NumChain = 32
MessageChains.CacheSize = 64
Symantec recommends these additional best practices to achieve a scan rate of approximately 23 GB per detection server per hour:
Divide the Box users and groups between your detection servers so that each user or group is assigned to exactly one server.
Use a different Box account for authorization on each detection server to avoid reaching the Box account limit. Using the same account for authorization on multiple detection servers may decrease the scan rate.
Doubling both the number of CPU cores and the amount of system memory on each detection server scanning Box increases the scan rate by approximately 40%.
If a large percentage of the files you are scanning are small (a few KB each, for example), you will reach the Box API limit more quickly than you would when scanning fewer, larger files. Scanning a large number of small files therefore lowers the scan rate that you can expect to achieve.
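The first practice in the list above, assigning each Box user or group to exactly one detection server, can be sketched as a simple round-robin split. This is only an illustration: the function name `partition_users` and the server labels are hypothetical and are not part of Symantec Data Loss Prevention, and the split balances the number of users per server, not the amount of data each user owns.

```python
def partition_users(users, server_names):
    """Assign each Box user or group to exactly one detection server.

    Hypothetical helper for planning only; it does not call any
    Symantec or Box API. Users are de-duplicated and sorted so the
    result is deterministic, then dealt out round-robin.
    """
    if not server_names:
        raise ValueError("at least one detection server is required")

    assignments = {name: [] for name in server_names}
    for index, user in enumerate(sorted(set(users))):
        server = server_names[index % len(server_names)]
        assignments[server].append(user)
    return assignments
```

For example, calling `partition_users` with five users and the two labels `"s1"` and `"s2"` puts three users on `s1` and two on `s2`, with no user appearing on both. If some users own far more data than others, a split weighted by data volume would likely balance scan time better than this count-based one.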
Following these system resource, setting, and best practice recommendations should allow each detection server to scan approximately 23 GB of content stored on Box per hour. You can use the attached worksheet to calculate how long it would take to scan large Box repositories depending on the number of detection servers you can deploy.
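As a rough illustration of this kind of calculation, the following Python sketch estimates scan duration from repository size and server count. It is not the formula used in the attached worksheet, which is not reproduced here. It assumes that throughput scales linearly with the number of detection servers (each server scanning its own set of users with its own Box account) and that the approximately 40% gain from doubling CPU cores and memory applies uniformly; the function and constant names are hypothetical.

```python
# Figures taken from this article; both are approximate.
BASELINE_GB_PER_HOUR = 23.0    # per detection server at the recommended sizing
DOUBLED_RESOURCES_GAIN = 0.40  # gain from doubling CPU cores and memory


def estimate_scan_hours(repository_gb, server_count, doubled_resources=False):
    """Estimate the hours needed to scan a Box repository.

    Assumption: the aggregate scan rate is the per-server rate multiplied
    by the number of servers. Real scans may be slower, for example when
    many small files cause the Box API limit to be reached.
    """
    if repository_gb < 0:
        raise ValueError("repository_gb must not be negative")
    if server_count < 1:
        raise ValueError("server_count must be at least 1")

    rate_per_server = BASELINE_GB_PER_HOUR
    if doubled_resources:
        rate_per_server *= 1 + DOUBLED_RESOURCES_GAIN

    return repository_gb / (rate_per_server * server_count)
```

Under these assumptions, `estimate_scan_hours(10000, 4)` gives roughly 108.7 hours for a 10,000 GB repository on four servers, and `estimate_scan_hours(10000, 4, doubled_resources=True)` gives roughly 77.6 hours. Treat the result as a planning estimate only.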