Manage duplicate entries

The Manage duplicate entries feature automatically identifies duplicate entries (based on a configurable similarity threshold) and also supports manually setting entries as duplicates. Once a primary entry has been selected and the duplicates confirmed, the duplicates are archived, but they can still be made visible to judges as part of the judging process.

None of this changes the original entries (e.g. duplicate entries are not merged, and no data is moved between entries), so every action taken while managing duplicates is non-destructive, reversible, and changeable.

What is automatically identified as a duplicate entry?

Entries with the same or very similar entry name, in the same category.

Scanning for and confirming duplicates

If you are on a Pro plan, you'll find the Manage duplicates button at the bottom left of the Entries list view in the Manage workspace:

Manage duplicates button

The first step on the Manage duplicates page is the Scan for duplicates button. Although scanning is quite quick, it is computationally intensive, so it is only done on demand.

Scan for duplicates button

The scan process compares every entry with all other entries created before it for similarity, then displays a list of all identified duplicate entries.
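The comparison described above can be pictured with a short sketch. This is only an illustration under stated assumptions: the product's actual similarity algorithm is not documented here, so Python's difflib.SequenceMatcher stands in for it, and the Entry fields and the scan_for_duplicates function are hypothetical names.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Entry:
    id: int
    name: str
    category: str
    created: int  # hypothetical creation order (lower = earlier)


def similarity(a: str, b: str) -> float:
    """Return a 0-100 score for two entry names (stand-in metric, an assumption)."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio() * 100


def scan_for_duplicates(entries, threshold=85.0):
    """Compare each entry with every earlier entry in the same category."""
    ordered = sorted(entries, key=lambda e: e.created)
    pairs = []
    for i, later in enumerate(ordered):
        for earlier in ordered[:i]:
            # Only entries in the same category are candidates.
            if later.category != earlier.category:
                continue
            if similarity(later.name, earlier.name) >= threshold:
                pairs.append((earlier.id, later.id))
    return pairs


entries = [
    Entry(1, "Jane Smith", "Best Leader", created=1),
    Entry(2, "Jane Smyth", "Best Leader", created=2),
    Entry(3, "Jane Smith", "Best Newcomer", created=3),
    Entry(4, "Robert Brown", "Best Leader", created=4),
]
print(scan_for_duplicates(entries))  # [(1, 2)]
```

In this sketch, entry 3 has an identical name but sits in a different category, so it is not paired with entry 1. Because every entry is compared with all earlier ones, the work grows roughly with the square of the number of entries, which is consistent with scanning being run only on demand.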

List of duplicate entries

The entry created and submitted first is treated as the "primary" entry. The primary can be changed. The primary entry in an identified duplicate set is the one that will be the basis of judging.

The objective of a program manager is then to work through sets of duplicates to:

  • Compare entries if necessary (via the action overflow)
  • Select a different primary if preferred (with the radio button)
  • Set entries as Not a duplicate if that is the case (via the action overflow)
  • Confirm and archive duplicates when satisfied
  • ... and eventually empty the Manage duplicates page

Duplicate options list

  • Only a submitted entry can be set as primary (as that will be the judged entry, and only submitted entries can be judged)
  • A set of duplicates with no submitted entries (and therefore no primary) cannot be confirmed as duplicates
  • The scan process will handle a maximum of 3,000 entries in a batch to keep things running smoothly. After Confirm + archive is done, running the scan again will continue scanning more entries, and it will not rescan entries already handled
  • If an entry name or category is changed after a confirm is done, the entry will be flagged for scanning again with the next scan
  • Adjacent to the scan button is a summary of when the last scan was done, how many entries remain to be scanned (if any), and whether entries need to be rescanned (e.g. after a change has been made)
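The rules about primaries can be summarised in a small sketch. It assumes that "created and submitted first" means the earliest-created entry among those that have been submitted; the Entry fields and both function names are hypothetical and do not reflect the product's internals.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    id: int
    created: int     # hypothetical creation order (lower = earlier)
    submitted: bool  # only submitted entries can be judged


def default_primary(duplicate_set: List[Entry]) -> Optional[Entry]:
    """Pick the earliest-created submitted entry, or None if none are submitted."""
    submitted = [e for e in duplicate_set if e.submitted]
    return min(submitted, key=lambda e: e.created) if submitted else None


def can_confirm(duplicate_set: List[Entry]) -> bool:
    """A set with no submitted entry has no primary, so it cannot be confirmed."""
    return default_primary(duplicate_set) is not None


drafts_only = [Entry(1, created=1, submitted=False), Entry(2, created=2, submitted=False)]
mixed = [Entry(3, created=1, submitted=False), Entry(4, created=2, submitted=True)]

print(can_confirm(drafts_only))   # False
print(default_primary(mixed).id)  # 4
```

In the second set, the earlier entry is skipped because it has not been submitted, so the later submitted entry becomes the default primary.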

Setting the similarity threshold

In the Manage workspace, under Settings > Entries > General, is a setting for the Minimum entry similarity percentage. This can be set between 50% and 100%.

Similarity percentage threshold

The default is 85%, and the best results tend to come from a setting of around 80% to 85%. If you change the setting after a scan, you will need to scan again.

Any entries already confirmed as duplicates will remain so.
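To give a feel for how the threshold affects what is flagged, here is a minimal sketch. It assumes a stand-in similarity metric (Python's difflib.SequenceMatcher), not the product's own algorithm, so the scores are illustrative only; is_flagged and the constant names are hypothetical.

```python
from difflib import SequenceMatcher

# Range and default taken from the setting described above.
MIN_THRESHOLD, MAX_THRESHOLD, DEFAULT_THRESHOLD = 50, 100, 85


def is_flagged(name_a: str, name_b: str, threshold: float = DEFAULT_THRESHOLD) -> bool:
    """Return True if two names meet the similarity threshold (stand-in metric)."""
    if not MIN_THRESHOLD <= threshold <= MAX_THRESHOLD:
        raise ValueError("threshold must be between 50 and 100")
    score = SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio() * 100
    return score >= threshold


print(is_flagged("Jane Smith", "Jane Smyth"))              # True  (score is about 90)
print(is_flagged("Jane Smith", "J. Smith"))                # False (score is about 78)
print(is_flagged("Jane Smith", "J. Smith", threshold=75))  # True
```

A lower threshold catches looser matches such as abbreviated names, at the cost of more false positives to review and mark as Not a duplicate.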

Duplicates summary

On the manager's view of each entry is a summary of duplicates, where a program manager can:

  • have an overview of a duplicate set
  • observe or change the primary
  • see those that are confirmed as duplicates and archived
  • compare duplicate entries

Manually setting duplicates

It may be that the automatic scanning process does not identify a duplicate (e.g. because a nominee's nickname is used, which substantially differs in spelling from their name used in other nominations), in which case a program manager can manually set entries as duplicates:

  • via the Entries list view 
  • as a bulk action
  • or acting on a single entry via the action overflow

Set as duplicate from entry ellipsis

Judging duplicates

The number of duplicate entries and their content may have a bearing on judging, so there are two options for displaying this information to judges. These are set on each score set, so duplicates can be treated differently at different stages of judging.

In the Manage workspace under Judging > Score sets, select the score set in question, then, from the 'Display' tab, scroll down to the 'Display' widget. You will see two checkboxes to control what is shown to judges in relation to duplicate entries:

Duplicate entries checkboxes in score set

Based on these settings, a duplicates box is displayed at the bottom of the entry for judges to view. They can see the total number of duplicates, who the entrants/nominators are and, optionally, links to those duplicates for review. All judging, scoring and commenting is against the primary entry only.

Example: The view of an entry, scrolled to the bottom, as seen by a judge

 Judge view of duplicates
