Session - {janitor} clean data
Packages like {janitor} have functions that do much of the cleaning that data like this requires.
The example data is built on the following slides.
Removes spaces and changes the % to a word
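The slide does not name the function, so the following is a minimal sketch that assumes it means janitor's clean_names(). The tibble `raw` is a hypothetical two-row extract that reuses the column headers from the example table.

```r
library(tibble)
library(janitor)

# Two rows borrowed from the example table, keeping its original awkward headers
# (`raw` is a hypothetical name used only for this sketch)
raw <- tribble(
  ~Ethnicity, ~`%`, ~`estimated.number.(thousands)`,
  "Bangladeshi", 91.9, 354,
  "Chinese", 98.6, 265
)

# clean_names() rewrites the column names only; the values are left untouched
cleaned <- clean_names(raw)
names(cleaned)
```

The names should come back in lower-case snake_case with the % spelled out as a word, so they can be typed without backticks.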
Often code removes duplicates, but sometimes you’ll want to see all the duplicated information:
duplicates <- tibble::tribble(
  ~Ethnicity, ~`%`, ~`estimated.number.(thousands)`,
  "All", 90.8, 48098,
  "All", 90.8, 48098,
  "All", 90.8, 48098,
  "Bangladeshi", 91.9, 354,
  "Chinese", 98.6, 265,
  "Indian", 90.4, 1077,
  "Pakistani", 91.1, 767,
  "Asian other", 95.6, 620,
  "Black", 92.8, 1376,
  "Mixed", 96, 547,
  "White", 90.5, 42296,
  "Other", 94.5, 796
)
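One way to see all the duplicated rows rather than drop them is janitor's get_dupes(). This is a sketch, not necessarily the approach the session goes on to use. The name `duplicates_small` is hypothetical: it is a cut-down copy of the table above so that the snippet runs on its own.

```r
library(tibble)
library(dplyr)
library(janitor)

# Cut-down copy of the table above (hypothetical name) so the snippet is self-contained
duplicates_small <- tribble(
  ~Ethnicity, ~`%`, ~`estimated.number.(thousands)`,
  "All", 90.8, 48098,
  "All", 90.8, 48098,
  "All", 90.8, 48098,
  "Bangladeshi", 91.9, 354,
  "Chinese", 98.6, 265
)

# Keep every copy of each repeated row and add a dupe_count column;
# with no columns named, get_dupes() compares whole rows
dupes <- get_dupes(duplicates_small)

# For contrast, distinct() keeps a single copy of each row
deduped <- distinct(duplicates_small)
```

Passing a column name, for example get_dupes(duplicates_small, Ethnicity), checks for repeats in that column only.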