This repository contains a Python notebook that automates matching district names between two datasets: `people_of_india_clean_2014.csv` and `minority_conc_census_2011.csv`. Each district name is split into n-grams, and the n-gram sets are compared using Jaccard similarity to identify the most similar pairs.
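
The n-gram and Jaccard matching described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the use of character bigrams, the lowercase/alphanumeric normalization, and the function names `char_ngrams`, `jaccard`, and `best_match` are all assumptions made for the example.

```python
def char_ngrams(name, n=2):
    """Return the set of character n-grams of a normalized name.

    Assumption: names are lowercased and stripped of non-alphanumeric
    characters before n-grams are taken; the notebook may normalize differently.
    """
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum())
    if len(cleaned) < n:
        # Names shorter than n yield a single short gram (or nothing if empty).
        return {cleaned} if cleaned else set()
    return {cleaned[i:i + n] for i in range(len(cleaned) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity |A intersect B| / |A union B| of two sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match(name, candidates, n=2):
    """Return (candidate, score) with the highest Jaccard similarity to name."""
    if not candidates:
        return None, 0.0
    grams = char_ngrams(name, n)
    scored = [(jaccard(grams, char_ngrams(c, n)), c) for c in candidates]
    score, match = max(scored, key=lambda pair: pair[0])
    return match, score


# Hypothetical example: two spellings of the same district name.
match, score = best_match("Barddhaman", ["Bardhaman", "Bankura"])
print(match, round(score, 3))
```

In practice one would likely apply a minimum-score threshold before accepting a match, since the highest-scoring candidate can still be a poor one when a district has no counterpart in the other dataset.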