Many organizational data management policies distinguish between "Web Application Screens" showing data to the end-user and File Downloads, targeted at exporting some data from the Web Application for off-line use and sharing.

Organizational Data Loss Prevention and Data Security policies stipulate restricting File Downloads, as a suspected means of larger-scale data leakage, under certain conditions. Typical examples of such restrictions are:

  • Restrict file downloads for users using non-corporate-managed devices
  • Restrict file downloads for users/sessions with high risk score (due to behavioral or other heuristics)
  • Restrict file downloads for users based on geographical location (for example, users outside of certain geographical boundaries)
  • Any combination of the above scenarios, for example, restrict file downloads from a folder of sensitive files for users accessing from outside of certain geographical boundaries from mobile devices
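The restrictions above can be thought of as conditions evaluated per session. The following is a minimal sketch of that idea, not an actual policy engine: the `SessionContext` fields, the risk threshold, and the country list are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical values, chosen only for this illustration.
ALLOWED_COUNTRIES = {"US", "DE"}
RISK_THRESHOLD = 70


@dataclass(frozen=True)
class SessionContext:
    """Hypothetical per-session attributes a policy could inspect."""
    managed_device: bool      # corporate-managed device?
    risk_score: int           # assumed 0-100 scale, higher is riskier
    country: str              # two-letter country code of the client
    mobile: bool              # accessing from a mobile device?
    sensitive_folder: bool    # requested resource is in a sensitive folder?


def downloads_restricted(ctx: SessionContext) -> bool:
    """Return True if file downloads should be restricted for this session."""
    # Rule 1: non-corporate-managed devices.
    if not ctx.managed_device:
        return True
    # Rule 2: high risk score.
    if ctx.risk_score >= RISK_THRESHOLD:
        return True
    # Rule 3 (combined): sensitive folder + outside allowed geography + mobile.
    if ctx.sensitive_folder and ctx.mobile and ctx.country not in ALLOWED_COUNTRIES:
        return True
    return False
```

Each rule here only decides whether downloads should be restricted; recognizing that a given response actually is a download is a separate problem.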

The above policies cannot be enforced in a 100% "hermetic", deterministic manner, because there is no deterministic definition of what a file download is in a web application. In general, the type of content received by a Web Application Client (usually, but not always, a Web Browser) is indicated by the Content-Type HTTP Response Header. The value of this header is the MIME Type of the content returned by the server to the client. The Types of content that are blocked can be found in the following Link.

In recent years the number of different MIME Types used by Web Applications has grown significantly, and, due to advanced AJAX Web Applications and Frameworks, in some cases it isn't possible to identify a File Download event in a deterministic manner. Here are some examples of HTTP Responses that can be interpreted as a File Download in some cases and as normal application activity in others:

  • text/plain content type can mean either a text file loaded by client-side JavaScript or a text file downloaded by the user by clicking on its link
  • text/csv content type similarly can mean either a file loaded by client-side JavaScript or a Comma-Separated Values file representing a spreadsheet downloaded by the user by clicking on its link
  • application/atom+xml content type (used for Atom Web Feeds) could either be considered a download or a legitimate information feed

Additionally, applications can choose to transmit data either via Web Sockets or by setting the content type to text/plain and encoding any binary file in Base64 format.
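To illustrate the Base64 case, here is a short sketch showing how a binary payload can travel in a response labeled text/plain. The function name and the sample payload are hypothetical; the point is only that the Content-Type header alone says nothing about what the body really contains.

```python
import base64


def as_text_plain_response(binary_payload: bytes) -> tuple[dict[str, str], str]:
    """Wrap arbitrary bytes in a response that looks like plain text."""
    # Base64 output uses only ASCII characters, so it is valid text/plain content.
    body = base64.b64encode(binary_payload).decode("ascii")
    headers = {"Content-Type": "text/plain"}
    return headers, body


# Hypothetical payload: bytes that begin like a PDF file.
payload = b"%PDF-1.7\n" + b"rest of the file"
headers, body = as_text_plain_response(payload)

# Client-side code can trivially recover the original binary file.
recovered = base64.b64decode(body)
```

A filter that looks only at the Content-Type header would treat this response as ordinary text, even though the client ends up holding the original file.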

Given the above examples (and the list is by no means exhaustive), it is important to understand that any mechanism for identifying File Downloads can only be based on heuristics and will not be hermetic. In addition, a balance needs to be struck between restricting File Downloads and allowing the application logic to function correctly.

Luminate Secure Access Cloud (TM) provides a means of granular governance for user and data access in web application sessions and uses heuristic mechanisms to identify File Downloads and distinguish them from the "normal" functioning of Web Applications. These mechanisms are constantly updated based on analytics of Web Application traffic.

Current heuristics include the following mechanisms:

  • White-listing and Black-listing of specific Content-Type values
  • Analyzing Content-Disposition Header (if present) as a strong indicator of downloads
  • Analyzing referrer URI for the requested resource
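As a rough illustration of how the first two mechanisms could be combined, consider the sketch below. It is not the product's actual logic: the function name, the listed Content-Type values, and the three-way verdict are assumptions made for this example. Referrer analysis is left out, since how it is weighed is not described here.

```python
# Hypothetical lists; real deployments would maintain their own.
BLOCKED_TYPES = {"application/zip", "application/pdf", "application/octet-stream"}
ALLOWED_TYPES = {"text/html", "text/css", "application/json"}


def classify_response(headers: dict[str, str]) -> str:
    """Return 'download', 'not_download', or 'ambiguous' for a response."""
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}

    # A Content-Disposition of "attachment" is a strong download indicator.
    disposition = normalized.get("content-disposition", "").strip().lower()
    if disposition.startswith("attachment"):
        return "download"

    # Drop parameters such as "; charset=utf-8" before comparing the type.
    media_type = normalized.get("content-type", "").split(";")[0].strip().lower()
    if media_type in BLOCKED_TYPES:
        return "download"
    if media_type in ALLOWED_TYPES:
        return "not_download"

    # Types such as text/plain or text/csv cannot be decided from headers alone.
    return "ambiguous"
```

For example, a text/csv response with `Content-Disposition: attachment` would be classified as a download, while the same content type without that header stays ambiguous and would need further signals.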

For any comments and suggestions on the heuristics used for Download Detection please reach out to the Luminate Support Team.