Table: Data source file requirements and considerations
Requirement or consideration
Only one word per delimited field consisting of at least two alphanumeric characters
Each entry must contain at least two alphanumeric characters. Single character entries in a field are unsupported.
Also, a data source file can only have a single word per delimited field. For example, if you want to match the data "1st Street", place "1st" in one delimited field. Place "Street" in a separate (but following) field. If you place two or more words in a field, a match is less likely.
For example, if you place the string "1st Street" in a single delimited field, you have placed multiple "words" in the same cell. A match is then unlikely, because it can occur only when the examined data is in a tabular format. In a tabular format, the two strings (1st) and (Street) are evaluated as one string (1st Street). Similar behavior occurs when you try to match languages that treat white space differently, such as Korean or Chinese.
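The following minimal Python sketch shows one way to check a row before you upload a data source file. It is not part of Symantec Messaging Gateway; the function names `field_problem` and `check_row` are hypothetical, and the sketch assumes that words are separated by white space.

```python
from typing import Dict, List, Optional


def field_problem(field: str) -> Optional[str]:
    """Describe why a field is likely unsupported, or return None if it looks fine."""
    # Assumption: words are separated by white space.
    if len(field.split()) > 1:
        return "more than one word"
    if sum(ch.isalnum() for ch in field) < 2:
        return "fewer than two alphanumeric characters"
    return None


def check_row(fields: List[str]) -> Dict[int, str]:
    """Map the index of each problematic field to a description of its problem."""
    problems = {}
    for index, field in enumerate(fields):
        problem = field_problem(field)
        if problem is not None:
            problems[index] = problem
    return problems
```

For the row `["1st Street", "A", "1st", "Street"]`, `check_row` flags field 0 (more than one word) and field 1 (fewer than two alphanumeric characters) and accepts the last two fields.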
Only specific separators are recognized for credit card and number patterns
Symantec Messaging Gateway recognizes only certain separator characters when it attempts to match record entries in credit card and number pattern fields.
The recognized separator characters (other than space) for credit card and number pattern fields are as follows:
Pound sign (#)
Plus sign (+)
Symantec Messaging Gateway interprets any number that contains an unrecognized separator as a word. For example, 4123*6666*7777*8888 does not return a valid match against a credit card number field, because Symantec Messaging Gateway interprets this content as the single word 4123*6666*7777*8888.
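As a rough illustration, the following Python sketch reports the characters in a number that would likely cause it to be read as a word. It assumes that the recognized separators are only the ones named in this table (space, pound sign, and plus sign); the name `unrecognized_separators` and the `RECOGNIZED` constant are hypothetical, and you should extend the set if your product documentation lists more separators.

```python
# Assumption: only the separators named in this table are recognized.
RECOGNIZED = frozenset(" #+")


def unrecognized_separators(value, recognized=RECOGNIZED):
    """Return the non-digit characters in value that are not recognized separators."""
    return {
        ch for ch in value
        if not ("0" <= ch <= "9") and ch not in recognized
    }
```

An empty result means that the value contains only digits and recognized separators. A non-empty result, such as the asterisk in 4123*6666*7777*8888, identifies the characters to replace.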
Use a delimiter other than commas to separate adjacent number patterns
Use a tab or pipe (|) instead of a comma as your field delimiter to separate two adjacent fields that are number patterns. Otherwise, the Record validator may interpret the two numbers from adjacent fields as a single number.
For example, assume that you have two adjacent fields, Age and Weight, and that you separate them with a comma, as in 25,150. Symantec Messaging Gateway might interpret 25,150 as a single number that belongs to the Age field, instead of 25 belonging to the Age field and 150 belonging to the Weight field.
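One way to avoid the ambiguity is to export the data source file with a pipe delimiter. The following sketch uses the Python standard library `csv` module; the function name `to_pipe_delimited` and the choice of a plain newline as the line terminator are assumptions made for illustration.

```python
import csv
import io


def to_pipe_delimited(rows):
    """Serialize rows as pipe-delimited text, one record per line."""
    buffer = io.StringIO()
    # The pipe delimiter keeps adjacent number fields such as 25 and 150 apart.
    writer = csv.writer(buffer, delimiter="|", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()
```

With this approach, the Age and Weight values from the example are written as 25|150 and cannot be read as the single number 25,150.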
Data source file must have the minimum number of columns required by the Structured Data template
Ensure that your data source file contains the columns that you want to use to define a view. For example, assume that you use a Structured Data policy that calls for a minimum of three fields to trigger a violation. Those three fields must be mapped in the record so that Symantec Messaging Gateway can reference them.
For example, assume that you use the EU Data Protection Directives policy template. Any view that accesses the EU Data Protection Directives policy should be configured to match entries in at least four of the following five fields: last name, email, phone, account number, and user name.
Pattern fields should match the data types that are used in the policy template
The pattern fields must correspond to the data types that your Structured Data policy template uses. For example, assume that you want to create a policy with the Customer Data Protection template. You should use the pattern fields that correspond to Social Security number, credit card number, phone, and email columns.
If the mappings in your record do not match the columns in the header row, Symantec Messaging Gateway counts the actual header row as invalid. The header row is considered invalid because it returns values other than those expected.
Credit card numbers must pass the Luhn checksum test
All credit card numbers must pass the Luhn checksum test to produce a match. A number passes when its Luhn checksum total is congruent to 0 modulo 10, that is, when the total is evenly divisible by 10. The Luhn test is used to distinguish valid numbers from random collections of digits.
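The following Python sketch implements the standard Luhn algorithm so that you can check the credit card numbers in a data source file before you upload it. It is an illustration only, not the code that Symantec Messaging Gateway uses; the function name `passes_luhn` is hypothetical, and the sketch ignores every character that is not an ASCII digit.

```python
def passes_luhn(number):
    """Return True if the digits in number pass the Luhn checksum test."""
    digits = [int(ch) for ch in number if "0" <= ch <= "9"]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                # Equivalent to adding the two digits of the doubled value.
                digit -= 9
        total += digit
    return total % 10 == 0
```

For example, the widely used test number 4111 1111 1111 1111 yields a total of 30 and passes, while changing its last digit to 2 yields 31 and fails.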
CRLF line breaks that precede rows in a data source file are included in row counts
If the data source file does not begin with CRLFs, Symantec Messaging Gateway skips the header row in the row count. If a data source file begins with one or more CRLFs, Symantec Messaging Gateway treats the first CRLF as the header row. As a result, it returns the values from all subsequent rows, including those for the actual header row.
For example, assume that one column is mapped to recognize the US ZIP code pattern and one or more CRLFs begin the data set. Symantec Messaging Gateway counts the actual header row as a normal row. It expects to return a 5-digit number in that column. When the actual header row returns a word value instead of a 5-digit number, Symantec Messaging Gateway counts it as an invalid row.
Symantec Messaging Gateway ignores the CRLFs that occur within a data set or at the end of a data set. Such CRLFs are not counted as rows.
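A simple precaution is to remove any line breaks that precede the first row before you upload the file. The following Python sketch assumes that the file content is already available as a string; the function name `strip_leading_line_breaks` is hypothetical.

```python
def strip_leading_line_breaks(text):
    """Remove the CR and LF characters that precede the first row of text."""
    # Only leading line breaks are removed; breaks between rows are kept.
    return text.lstrip("\r\n")
```

The sketch leaves line breaks within and at the end of the data set untouched, and it does not remove leading lines that contain only spaces or tabs.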
Rows that occur more than 99 times are not matched
Because of implementation limitations, Symantec Messaging Gateway cannot match any rows that occur more than 99 times.
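To find rows that exceed this limit, you can count the occurrences of each row before you upload the file. The following Python sketch assumes that "occur" means exact duplicates of the full row text; the function name `overrepresented_rows` and the `MAX_OCCURRENCES` constant are hypothetical.

```python
from collections import Counter

# Assumption: 99 is the limit stated in this table.
MAX_OCCURRENCES = 99


def overrepresented_rows(lines, limit=MAX_OCCURRENCES):
    """Return the rows that occur more than limit times, with their counts."""
    # Trailing line breaks are dropped so that identical rows compare equal.
    counts = Counter(line.rstrip("\r\n") for line in lines)
    return {row: count for row, count in counts.items() if count > limit}
```

You can pass an open file object or a list of lines. Any row in the result is a candidate to deduplicate.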