How to recode 40,000 brand mentions without training data using a Levenshtein distance matrix in R.
If you have ever dealt with open-ended questions in online interviews, you have definitely run into the problem of quantifying them. The interviewees’ responses are often full of typos and mismatches that need to be recoded properly; even worse, most brands have various aliases that you have to account for in your analysis. I’ll show you how to correctly classify 95%-99% of the given answers, without even using a Machine Learning…
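
To make the idea concrete, here is a minimal sketch of recoding with a Levenshtein distance matrix. It is not necessarily the exact approach used in this article: the `responses` and `brands` vectors are made-up example data, the `max_dist` cutoff of 2 is an assumed value you would tune on your own data, and base R's `adist()` is used to compute the distances.

```r
# Hypothetical example data: raw survey answers and a reference list of brands.
responses <- c("addidas", "Nkie", "adidas", "pumma", "reebok", "asdfgh")
brands    <- c("Adidas", "Nike", "Puma", "Reebok")

# Levenshtein distance matrix: one row per response, one column per brand.
# adist() is part of base R (package utils); case is ignored when comparing.
dist_matrix <- adist(responses, brands, ignore.case = TRUE)
dimnames(dist_matrix) <- list(responses, brands)

# For every response, find the closest brand and the distance to it.
nearest  <- apply(dist_matrix, 1, which.min)
min_dist <- apply(dist_matrix, 1, min)

# Assumed cutoff: answers further than 2 edits from every brand stay unclassified.
max_dist <- 2
recoded  <- ifelse(min_dist <= max_dist, brands[nearest], NA_character_)

print(data.frame(response = responses, recoded = unname(recoded), distance = min_dist))
```

Answers that are not within the cutoff of any brand come back as `NA`, so they can be reviewed by hand or matched against an extended alias list instead of being forced into the wrong category.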
