Fuzziness is a concept used to compare two names and recognize similarities even if they are not exactly the same. It allows for small differences like spelling variations, typos, or missing letters so that names that look or sound alike can still be matched.
Flagright uses advanced fuzziness logic to match a name and/or aliases with the individual/business name and aliases in the screening watchlist. This approach accounts for variations in spelling, word order, and data inconsistencies, enabling the identification of potential matches even when the input data does not perfectly align with the records in the watchlist.
To enhance precision, Flagright provides three customizable name-matching methods, each tailored to specific scenarios:
Method 1: Fuzziness is calculated using only the Levenshtein Distance (Edit Distance).
Method 2: Fuzziness is calculated using Levenshtein Distance while ignoring special characters and spaces.
Method 3: An upgraded version of Method 2 that incorporates name pairings and similarity calculations. For example, "ALAGADOREY KALIAPPAN" and "KALIAPPAN ALAGADOREY" will have a fuzziness of 0, as corresponding name components are compared individually (e.g., "ALAGADOREY" with "ALAGADOREY" and "KALIAPPAN" with "KALIAPPAN").
Levenshtein Distance Logic
The matching algorithm calculates the minimum number of single-character edits (insertions, deletions, or substitutions) required to transform one string into another.
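The edit-distance calculation described above can be sketched with the standard dynamic-programming recurrence. This is a minimal illustration rather than Flagright's actual implementation; the function name `levenshtein` is hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into "" takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(
                prev[j] + 1,         # delete ch_a
                curr[j - 1] + 1,     # insert ch_b
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


print(levenshtein("John", "Jon"))     # 1 (delete "h")
print(levenshtein("Smith", "Smyth"))  # 1 (swap "i" for "y")
```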
Example 1 (Method 1):
User’s Name: "John Smith"
Watchlist Name: "Jon Smyth"
Checks:
"John" vs. "Jon" → 1 deletion (remove "h")
"Smith" vs. "Smyth" → 1 substitution (swap "i" for "y")
Total Levenshtein Distance (d) = 2
Maximum Length including spaces: max(10, 9) = 10
Fuzziness (%) = (2/10) × 100 = 20%
Depending on the fuzziness threshold (e.g., 0-10% or 0-20%), this match might be classified as a false positive or a true positive.
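The arithmetic in Example 1 can be reproduced with a short sketch. It assumes that Method 1 divides the edit distance by the length of the longer raw string, as the example does; the names `levenshtein` and `method1_fuzziness` are hypothetical and not part of any Flagright API.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def method1_fuzziness(name: str, watchlist_name: str) -> float:
    """Fuzziness (%) = d / max length * 100, counting spaces and
    special characters as ordinary characters."""
    longest = max(len(name), len(watchlist_name))
    if longest == 0:
        return 0.0  # two empty strings are identical
    return 100.0 * levenshtein(name, watchlist_name) / longest


print(method1_fuzziness("John Smith", "Jon Smyth"))  # 20.0
```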
Example 2 (Method 2):
User’s Name: "John @ Smith"
Watchlist Name: "Jon Smyth"
Checks:
Remove the special character "@" and the spaces
"John" vs. "Jon" → 1 deletion (remove "h")
"Smith" vs. "Smyth" → 1 substitution (swap "i" for "y")
Total Levenshtein Distance (d) = 2
Maximum Length excluding spaces and special characters: max(9, 8) = 9
Fuzziness (%) = (2/9) × 100 = 22.22%
Depending on the fuzziness threshold (e.g., 0-10% or 0-20%), this match might be classified as a false positive or a true positive.
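Example 2 can be sketched the same way. The version below assumes that "special characters and spaces" means every non-alphanumeric character, which may differ from the exact character set Flagright ignores; the names `strip_non_alphanumeric` and `method2_fuzziness` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def strip_non_alphanumeric(name: str) -> str:
    """Assumption: drop everything that is not a letter or digit."""
    return "".join(ch for ch in name if ch.isalnum())


def method2_fuzziness(name: str, watchlist_name: str) -> float:
    """Method 1 applied after removing spaces and special characters."""
    a = strip_non_alphanumeric(name)            # "John @ Smith" -> "JohnSmith"
    b = strip_non_alphanumeric(watchlist_name)  # "Jon Smyth" -> "JonSmyth"
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return 100.0 * levenshtein(a, b) / longest


print(round(method2_fuzziness("John @ Smith", "Jon Smyth"), 2))  # 22.22
```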
Example 3 (Method 3):
User’s Name: "Smith John"
Watchlist Name: "Jon Smyth"
Checks:
Pair similar name components: "Smith" with "Smyth" and "John" with "Jon"
"John" vs. "Jon" → 1 deletion (remove "h")
"Smith" vs. "Smyth" → 1 substitution (swap "i" for "y")
Total Levenshtein Distance (d) = 2
Maximum Length excluding spaces: max(9, 8) = 9
Fuzziness (%) = (2/9) × 100 = 22.22%
Depending on the fuzziness threshold (e.g., 0-10% or 0-20%), this match might be classified as a false positive or a true positive.
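The pairing step in Method 3 is not specified in detail here, so the sketch below is only one plausible reading: it tries every way of pairing the name components and keeps the pairing with the smallest total edit distance, then divides by the longer total length without spaces, as in Example 3. The brute-force search, the padding of missing components with empty strings, and the names `tokens` and `method3_fuzziness` are all assumptions.

```python
from itertools import permutations


def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def tokens(name: str) -> list[str]:
    """Split on whitespace and drop non-alphanumeric characters."""
    cleaned = ("".join(ch for ch in word if ch.isalnum()) for word in name.split())
    return [word for word in cleaned if word]


def method3_fuzziness(name: str, watchlist_name: str) -> float:
    a, b = tokens(name), tokens(watchlist_name)
    # Assumption: pad the shorter name with empty components so every
    # component has a partner; an unmatched component costs its full length.
    size = max(len(a), len(b))
    a += [""] * (size - len(a))
    b += [""] * (size - len(b))
    # Brute force over all pairings; only practical for a few components.
    best = min(
        sum(levenshtein(x, y) for x, y in zip(a, candidate))
        for candidate in permutations(b)
    )
    longest = max(sum(map(len, a)), sum(map(len, b)))
    return 100.0 * best / longest if longest else 0.0


print(round(method3_fuzziness("Smith John", "Jon Smyth"), 2))                # 22.22
print(method3_fuzziness("ALAGADOREY KALIAPPAN", "KALIAPPAN ALAGADOREY"))     # 0.0
```

Because the number of pairings grows factorially with the number of components, a production matcher would likely need a more efficient assignment strategy for long names.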
