An Example of Dirty Data (with format errors, missing values, and duplicate values)

    City           Country        Population   Area     Density
r1  New York       USA            8734520      6400     26403
r2  Philadelphia   United States  "1,204,542"  "3,231"  NaN
r3  New York City  USA            8734520      6400     26403
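
The three kinds of problem named in the title can be made concrete with a short Python sketch over the rows of the table above. It is only an illustration and is not taken from the quoted talk. The names `ROWS`, `parse_number`, and `find_issues` are hypothetical. The rules are assumptions: a number written with thousands separators counts as a format error, `NaN` counts as a missing value, and two rows with identical numeric columns count as possible duplicates. The inconsistent country spelling ("USA" versus "United States") is not checked.

```python
# Rows copied from the table above, kept as strings.
# Assumption: the double quotes around the r2 numbers are CSV quoting and are dropped here.
ROWS = [
    ("r1", "New York", "USA", "8734520", "6400", "26403"),
    ("r2", "Philadelphia", "United States", "1,204,542", "3,231", "NaN"),
    ("r3", "New York City", "USA", "8734520", "6400", "26403"),
]


def parse_number(text):
    """Return (value, had_format_error); a missing value is returned as None."""
    if text.strip().lower() in ("", "nan"):
        return None, False
    cleaned = text.replace(",", "")  # strip thousands separators
    return float(cleaned), cleaned != text


def find_issues(rows):
    """Flag format errors, missing values, and possible duplicates per row id."""
    issues = []
    seen = {}  # numeric columns -> id of the first row that had them
    for row_id, _city, _country, *numbers in rows:
        parsed = [parse_number(n) for n in numbers]
        if any(had_error for _, had_error in parsed):
            issues.append((row_id, "format error"))
        if any(value is None for value, _ in parsed):
            issues.append((row_id, "missing value"))
        key = tuple(value for value, _ in parsed)
        if key in seen:
            issues.append((row_id, "possible duplicate of " + seen[key]))
        else:
            seen[key] = row_id
    return issues


for row_id, problem in find_issues(ROWS):
    print(row_id, problem)
```

Under these assumptions the sketch flags r2 for a format error and a missing value, and r3 as a possible duplicate of r1. Matching on numeric columns alone is a deliberately crude heuristic: it catches "New York" versus "New York City" here, but real entity resolution would need fuzzier matching on the names as well.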

-- Jiannan Wang

from "When Data Cleaning Meets Crowdsourcing"

Quoted on Mon Mar 9th, 2015