Data Deduplication
Automated data deduplication in CYBERQUEST is set up in three steps:
Step 1: Configure Deduplication settings at agent level:
The parameters used for deduplication are:
"deduplicateData": false, //does not deduplicate by default
"deduplicateWindow": 300, // expressed in seconds, defines deduplication individual event window
"deduplicateMaxWindow": 600, // expressed in seconds, defines maximum event window
"deduplicateMaxCount": 1000, // expressed in number of maximum event per deduplcated event
"deduplicateDropShortTermStorage": false, // sets duplicate events to be dropped from shortTermDataStorage, by default false
"deduplicateDropLongTermStorage": false // sets duplicate events to be dropped from longTermDataStorage
Step 2: Configure Deduplication activation at Data Source level
Step 3: Watch for deduplicated events in the Browser Module. They carry the following fields:
FirstSeen - when the first message in the duplicate dataset appeared;
LastSeen - when the last message in the duplicate dataset appeared;
DuplicateCount - how many times the message was duplicated;
DuplicationHash - the hash used to identify matching messages (it is the common element shared by duplicate messages);
isDuplicate - set on every message detected as a duplicate except the first one (true if the message is a duplicate, otherwise false);
isLastDuplicate - true only on the last duplicate message, false on all others.
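The two flags are easiest to read on a small example. The Python sketch below labels the messages of a single duplicate dataset following the descriptions above. The function flag_group and the "timestamp" key are hypothetical and used only for illustration, and the case of a dataset with a single message (no flags set) is an assumption.

```python
def flag_group(timestamps):
    """Label the messages of one duplicate dataset, given their arrival times."""
    ordered = sorted(timestamps)
    last = len(ordered) - 1
    return [
        {
            "timestamp": ts,                 # hypothetical per-message field
            "FirstSeen": ordered[0],         # arrival of the first message
            "LastSeen": ordered[-1],         # arrival of the last message
            "isDuplicate": i > 0,            # false only on the first message
            # Assumption: a lone message is never marked as last duplicate.
            "isLastDuplicate": i == last and i > 0,
        }
        for i, ts in enumerate(ordered)
    ]


records = flag_group([30, 10, 20])
```

For three messages, isDuplicate is false, true, true and isLastDuplicate is false, false, true, in arrival order.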
Manual data deduplication is also possible, in order to mark events as duplicates; see Events Manual Deduplication.