Special characters in content keyword lists under Email Data Protection.cloud

Article ID: 161430

Products

Email Security.cloud

Issue/Introduction

Use of special characters and/or wildcards under keyword lists.

How Data Protection interprets or uses special characters to match text in emails when specified in a keyword list.

Resolution

Valid and invalid characters and characteristics in Content Control lists

List type: description, valid and invalid characters, and characteristics of each list type
Keyword Lists

The content of an email can be matched against entries in a predefined or a custom list of words and phrases.

  • Digits are supported. A space is not supported as a literal character.
  • So foo<space>bar detects foo followed by bar regardless of the number of spaces you enter.
  • The following characters are supported both as literal characters and as a space: " & ' < > . _ + = { } [ ] : ; @ ~ # | / , ! £ $ % ^ ( )

So content that contains one of these characters is detected. However, the same content without the character is also detected.

  • The following character is supported: -
  • The character ! at the beginning of a content phrase means NOT. Use ! to make an exception of a phrase that includes a word that you typically block. For example, to block breast but permit chicken breast, include breast and !chicken breast in your list. The ! must appear at the start of the phrase. Chicken !breast detects the literal "chicken !breast".
  • The backslash character (\) is treated as an escape character. The backslash enables you to treat special characters as literal characters. So to look for the character "*" rather than using it as a wildcard, enter \*.
  • Two backslashes (\\) detect the literal "\".
  • A backslash followed by a question mark, i.e., "\?" detects the literal character "?".
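The NOT rule described in the list above can be illustrated with a short sketch. It is not the service's implementation: the helper name `is_blocked` is hypothetical, matching is assumed to be a case-insensitive substring test, and exception phrases are assumed to be masked out before the remaining entries are checked.

```python
def is_blocked(text: str, entries: list[str]) -> bool:
    """Return True if text contains a blocked phrase outside any exception."""
    lowered = text.lower()
    # Only entries that START with "!" are exceptions; drop the marker.
    exceptions = [e[1:].lower() for e in entries if e.startswith("!") and len(e) > 1]
    # Every other entry is a blocked phrase; a "!" elsewhere stays literal.
    blocked = [e.lower() for e in entries if not e.startswith("!")]
    # Assumption: mask each exception phrase, then search what is left.
    for phrase in exceptions:
        lowered = lowered.replace(phrase, " ")
    return any(phrase in lowered for phrase in blocked)


entries = ["breast", "!chicken breast"]
print(is_blocked("Grilled chicken breast tonight", entries))
print(is_blocked("breast screening results", entries))
```

Under these assumptions the first message passes and the second is flagged, because the exception only covers the full phrase "chicken breast".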

Wildcards are supported with the following characters (these are only recognized as wildcards and are not translated literally):

  • * by itself represents zero or more characters. Thus B*d matches Bold, Bid, and Billiard.
  • ? by itself represents a single character. Thus B?d matches Bid, Bad, and Bod.
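To make the escape and wildcard rules concrete, the following sketch translates a keyword entry into a Python regular expression. The function name `keyword_to_regex` is hypothetical, and several behaviors are assumptions rather than documented facts: matching is case-sensitive, * may span any characters, a run of spaces in the entry matches one or more whitespace characters, and the punctuation characters that double as spaces are treated here as literals only.

```python
import re


def keyword_to_regex(entry: str) -> str:
    """Translate one keyword-list entry into a regex source string."""
    out = []
    i = 0
    while i < len(entry):
        ch = entry[i]
        if ch == "\\" and i + 1 < len(entry):
            # A backslash makes the following character literal.
            out.append(re.escape(entry[i + 1]))
            i += 2
            continue
        if ch == "*":
            out.append(".*")  # zero or more characters
        elif ch == "?":
            out.append(".")  # exactly one character
        elif ch == " ":
            # Skip the rest of a run of spaces; the count is not significant.
            while i + 1 < len(entry) and entry[i + 1] == " ":
                i += 1
            out.append(r"\s+")
        else:
            out.append(re.escape(ch))
        i += 1
    return "".join(out)


for word in ("Bold", "Bid", "Billiard"):
    print(word, bool(re.fullmatch(keyword_to_regex("B*d"), word)))
```

The loop checks the B*d examples from the list above against whole words; how the service anchors a match inside longer text is not stated in this article.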
MIME types

MIME type lists can be used where email content and attachment conditions are required. The MIME types can be matched against entries in a predefined or a custom list of types.

If you are unsure of useful file extensions for your custom MIME type lists, you can copy and paste entries from the predefined lists.

  • Digits are supported
  • Spaces are not supported
  • The following characters are not supported: ! " £ % ^ & ( ) = { } [ ] : ; @ ' ~ # | < > , ?
  • The following characters are supported: $ - _ + .
  • * is supported as a wildcard only
  • / is supported as a type/subtype separator only
  • Wildcards are supported to indicate all subtypes for the specified type, for example:
    • type/*
  • Entries must take one of the following forms:
    • type/subtype: a specific type and subtype combination
    • type/*: all subtypes for the specified type
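Because the service checks only the form of an entry, it can help to check custom entries before adding them. The sketch below is one way to do that under stated assumptions: the helper `is_valid_mime_entry` is hypothetical, ASCII letters are assumed to be allowed alongside digits and the supported characters $ - _ + ., and * is accepted only as the whole subtype.

```python
import re

# Assumption: letters are allowed in addition to digits and $ - _ + .
_TOKEN = r"[A-Za-z0-9$\-_+.]+"
# Either type/subtype or type/* (wildcard for all subtypes).
_MIME_ENTRY = re.compile(rf"{_TOKEN}/(?:\*|{_TOKEN})")


def is_valid_mime_entry(entry: str) -> bool:
    """Check that an entry has the form type/subtype or type/*."""
    return _MIME_ENTRY.fullmatch(entry) is not None


for entry in ("image/*", "application/pdf", "image", "text/pl ain"):
    print(entry, is_valid_mime_entry(entry))
```

This checks syntax only. It does not confirm that the type or subtype is a registered MIME type.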

Validation of MIME type and subtype text is not performed.

File names

File names of email attachments can be matched against entries in a custom list.

  • Digits are supported
  • Spaces are supported
  • The following characters are not supported: " & : ' | / < > ?
  • The following characters are supported: ! £ $ % ^ ( ) - _ + = { } [ ] ; @ ~ # , .
  • The use of * as a wildcard is allowed, for example: topsecr*, *.exe, and file*.com
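The file name wildcard can be approximated as follows. This is a sketch, not the service's matcher: `filename_matches` is a hypothetical helper, and matching is assumed to be case-sensitive and applied to the whole file name. Python's `fnmatch` module is deliberately avoided because it also treats ? and [ ] as special, whereas [ ] are ordinary supported characters here.

```python
import re


def filename_matches(pattern: str, filename: str) -> bool:
    """Match a file name against a list entry where only * is a wildcard."""
    # Escape every piece between asterisks so [ ] { } ( ) . stay literal.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, filename) is not None


for pattern, name in [
    ("topsecr*", "topsecret.docx"),
    ("*.exe", "setup.exe"),
    ("file*.com", "file123.com"),
]:
    print(pattern, name, filename_matches(pattern, name))
```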
URLs

URL lists can be used to detect content in the form of a URL within an email body, header, or subject. URL lists enable you to restrict the communication of specified URLs around the business. Restricting the sending or receipt of URLs removes encouragement for employees to access specific websites. Use this in combination with the Web Security service to provide complete protection against a user accessing inappropriate or malicious websites.

URL entries must be of the following formats:

  • http://www.xxxxxx.com
  • https://www.xxxxx.com

Wildcards are supported with the following characters (these are only recognized as wildcards and are not translated literally):

  • * represents zero or more characters
  • ? represents a single character

Thus:

  • http://www.*.com stops all URLs that take the .com format
  • http://www.ford*.com stops http://www.fordcar.com and http://www.fordescort.com
  • http://www.ford.* stops http://www.ford.com, http://www.ford.co.uk, etc.
Domain lists

Domain lists can be used where sender or recipient conditions are required. The sender or recipient of an email can be matched against entries in a custom list.

  • The use of digits is supported
  • The use of spaces is not supported
  • The following characters are not supported because they are not permitted in domain names by RFC standards: ! " £ $ % ^ ( ) _ + = { } [ ] : ; @ ' ~ # | / < > , ?
  • The following character is supported: -
  • The following character is supported as a sub-domain separator only: .
  • You can use the * as a wildcard within the domain section, for example:
    • *.example.com stops any subdomain of example.com, or
    • example.* stops example.com, example.co.uk, and example.net
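The domain list rules above can be sketched as a small check-and-match helper. The name `domain_matches` is hypothetical. The sketch assumes ASCII letters are allowed in entries, that comparison is case-insensitive (domain names are not case-sensitive), and that *.example.com does not match the bare domain example.com.

```python
import re

# Letters (assumed), digits, hyphen, the dot separator, and the * wildcard.
_ALLOWED = re.compile(r"[A-Za-z0-9.*-]+")


def domain_matches(entry: str, domain: str) -> bool:
    """Match a sender or recipient domain against a domain-list entry."""
    if _ALLOWED.fullmatch(entry) is None:
        raise ValueError(f"unsupported character in entry: {entry!r}")
    # Only * is a wildcard; everything else is compared literally.
    regex = ".*".join(re.escape(p) for p in entry.lower().split("*"))
    return re.fullmatch(regex, domain.lower()) is not None


print(domain_matches("*.example.com", "mail.example.com"))
print(domain_matches("example.*", "example.co.uk"))
```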

You can enter characters in extended character set languages into your email content lists. So list items in Japanese, Korean, Chinese, and Russian are identified in the scanning process.