1.
A Watermark within Information Centric Analytics (ICA) will determine how far back in time a data source query will look back when it executes against the Source System. The user will be allowed to select a query column as defined in the Query Details. When defining the watermark column, the column should either be a date-based column, or it should be an Integer based column. When loading data through ICA’s integration wizard it will track the latest value defined for the column using the SQL Server Max() function. The next time the data source query executes ICA will use the Watermark Value to determine where the previous query left off and it will only bring back records greater than the last known watermarked date or integer field. Since it is performing this comparison it is much more effective to use either an integer or a date field.
Figure 1 This figure depicts where one would define the watermark for a data source query
2.
When no watermark column is defined for an SQL based data source query ICA will truncate all data in the staging table, it will pull back all data defined within the SQL statement and it will reload the staging table with the results from the query. It differs from a data source query using a watermark in that the data source query that leverages a watermark does not get truncated and reloaded on a nightly basis, it will instead keep appending to the staging table on a nightly basis.
Figure 2 This statement depicts a sample process with no Watermark defined, data will be truncated and reloaded nightly

Figure 3 This statement depicts a sample process with a Watermark defined, data will not be truncated from stage, ICA will only add data to the staging table
Thanks for your feedback. Let us know if you have additional comments below. (requires login)