Who should use this feature?
The manage duplicate entries feature automatically identifies duplicate entries (based on a configurable similarity threshold) and supports manually setting entries as duplicates.
Once you select a primary entry and confirm its duplicates, the duplicates are archived, though they can still be made visible as part of the judging process.
None of this changes the original entries (e.g. duplicate entries are not merged, and no data is moved between entries), so all duplicate-management actions are non-destructive, reversible and changeable.
What is automatically identified as a duplicate entry?
Entries with the same or very similar entry name, in the same category.
If you are on a Pro plan, you'll find the Manage duplicates button at the bottom left of the Manage entries list view:
The first step on the Manage duplicates page is to click the Scan for duplicates button. Scanning is fairly quick but computationally intensive, so it only runs on demand.
The scan process compares every entry with all other entries created before it for similarity, then displays a list of all identified duplicate entries.
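The comparison described above can be sketched roughly as follows. This is only an illustration: the product's actual similarity measure is not documented here, so Python's `difflib.SequenceMatcher` is used as a stand-in, and the `Entry` fields, the `similarity` helper and the `scan` function are all hypothetical names.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Entry:
    # Hypothetical record; field names are illustrative only.
    id: int
    name: str
    category: str
    created: int  # creation order, e.g. a timestamp


def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score for two entry names.

    difflib's ratio is an assumed stand-in for the real measure.
    """
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def scan(entries: list[Entry], threshold: float = 85.0) -> list[tuple[int, int]]:
    """Pair each entry with earlier entries in the same category that match."""
    ordered = sorted(entries, key=lambda e: e.created)
    pairs = []
    for i, later in enumerate(ordered):
        for earlier in ordered[:i]:  # only entries created before this one
            if earlier.category != later.category:
                continue
            if similarity(earlier.name, later.name) >= threshold:
                pairs.append((earlier.id, later.id))
    return pairs


entries = [
    Entry(1, "Acme Solar Project", "Energy", created=1),
    Entry(2, "Acme Solar Projct", "Energy", created=2),   # near-identical name
    Entry(3, "Acme Solar Project", "Design", created=3),  # different category
    Entry(4, "River Cleanup", "Energy", created=4),
]
duplicate_pairs = scan(entries)  # pairs of (earlier id, later id)
```

Because every entry is compared against all earlier ones, the work grows roughly with the square of the number of entries, which is consistent with the scan being run only on demand.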
The entry created and submitted first is treated as the "primary" entry. The primary can be changed. The primary entry in an identified duplicate set is the one that will be the basis of judging.
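As a rough illustration of this default choice, the sketch below picks the earliest-created submitted entry in a duplicate set. The `Entry` fields and the `default_primary` helper are hypothetical, and reading "created and submitted first" as "earliest-created among the submitted entries" is an assumption rather than documented behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    # Hypothetical record; field names are illustrative only.
    id: int
    created: int     # creation order, e.g. a timestamp
    submitted: bool  # whether the entry has been submitted


def default_primary(duplicate_set: list[Entry]) -> Optional[Entry]:
    """Return the earliest-created submitted entry, or None if none is submitted."""
    submitted = [e for e in duplicate_set if e.submitted]
    if not submitted:
        return None
    return min(submitted, key=lambda e: e.created)


duplicate_set = [
    Entry(1, created=1, submitted=False),  # created first but never submitted
    Entry(2, created=2, submitted=True),
    Entry(3, created=3, submitted=True),
]
primary = default_primary(duplicate_set)
```

In this example the unsubmitted entry is skipped even though it was created first, so entry 2 becomes the default primary.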
The objective of a program manager is then to work through sets of duplicates to:
- Compare entries if necessary (via the action overflow)
- Select a different primary if preferred (with the radio button)
- Set entries as Not a duplicate if that is the case (via the action overflow)
- Confirm and archive duplicates when satisfied
- ... and eventually empty the Manage duplicates page
Things to note:
- Only a submitted entry can be set as primary (as that will be the judged entry, and only submitted entries can be judged)
- A set of duplicates with no submitted entries (and therefore no primary) cannot be confirmed as duplicates
- The scan process will handle a maximum of 3,000 entries in a batch to keep things running smoothly. After Confirm + archive is done, running the scan again will continue scanning more entries, and it will not rescan entries already handled
- If an entry name or category is changed after a confirm is done, the entry will be flagged for scanning again with the next scan
- Adjacent to the scan button is a summary of when the last scan was done, how many entries remain to be scanned (if any), and whether entries need to be rescanned (e.g. after a change has been made)
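The batching and rescan behaviour described above can be pictured with a small sketch. The `ScanState` record and the `next_batch` and `run_scan` helpers are hypothetical bookkeeping, not the product's internals, and the sketch leaves out the Confirm + archive step that happens between scans.

```python
from dataclasses import dataclass

BATCH_LIMIT = 3000  # maximum entries handled per scan, as stated above


@dataclass
class ScanState:
    # Hypothetical bookkeeping for one entry.
    id: int
    scanned: bool = False       # already handled by an earlier scan
    needs_rescan: bool = False  # name or category changed since it was handled


def next_batch(entries: list[ScanState], limit: int = BATCH_LIMIT) -> list[ScanState]:
    """Pick entries that are unscanned or flagged for rescanning, up to the limit."""
    pending = [e for e in entries if not e.scanned or e.needs_rescan]
    return pending[:limit]


def run_scan(entries: list[ScanState], limit: int = BATCH_LIMIT) -> int:
    """Mark one batch as handled and return how many entries still remain."""
    for e in next_batch(entries, limit):
        e.scanned, e.needs_rescan = True, False
    return sum(1 for e in entries if not e.scanned or e.needs_rescan)


entries = [ScanState(i) for i in range(7000)]
remaining = [run_scan(entries) for _ in range(3)]  # three successive scans

entries[0].needs_rescan = True  # e.g. the entry name was edited
flagged = len(next_batch(entries))
```

With 7,000 entries, three scans are needed to clear the queue, and a later edit to one entry puts only that entry back into the next batch.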
In Settings > General > Entries, there is a Minimum entry similarity percentage setting, which can be set between 50% and 100%.
The default is 85%, and the optimum seems to be around 80% to 85%. Changing the setting after a scan has been done will require scanning again.
Any entries already confirmed as duplicates will remain so.
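To get a feel for how the percentage affects what is flagged, here is a minimal sketch. It assumes a generic string-similarity ratio (Python's `difflib.SequenceMatcher`) rather than the product's own measure, so real scores will differ; `validate_threshold` and `is_match` are hypothetical helpers.

```python
from difflib import SequenceMatcher

MIN_ALLOWED, MAX_ALLOWED, DEFAULT = 50, 100, 85  # range and default from the text


def validate_threshold(percent: float) -> float:
    """Reject values outside the 50-100 range the setting allows."""
    if not MIN_ALLOWED <= percent <= MAX_ALLOWED:
        raise ValueError(f"threshold must be between {MIN_ALLOWED} and {MAX_ALLOWED}")
    return percent


def is_match(name_a: str, name_b: str, percent: float = DEFAULT) -> bool:
    """Compare two names against the threshold (assumed similarity measure)."""
    score = 100 * SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()
    return score >= validate_threshold(percent)


# The same pair of names under a stricter and a looser setting.
strict = is_match("Jonathan Smith", "Jon Smith", 85)
loose = is_match("Jonathan Smith", "Jon Smith", 75)
```

Under this stand-in measure the pair scores about 78%, so it is missed at 85% and caught at 75%. A lower setting catches more variants at the cost of more false positives to review.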
On the manager's view of each entry is a summary of duplicates, where a program manager can:
- have an overview of a duplicate set
- observe or change the primary
- see those that are confirmed as duplicates and archived
- compare duplicate entries
It may be that the automatic scanning process does not identify a duplicate (e.g. because a nominee's nickname is used, which substantially differs in spelling from their name used in other nominations), in which case a program manager can manually set entries as duplicates:
- via the Manage entries list view
- as a bulk action
- or acting on a single entry via the action overflow
The judging duplicates settings will be released very soon - watch this space.