TLDR: Convert your problem file to UTF-8 with Sublime Text by opening the file and using “Save with Encoding” > UTF-8. Alternatively, use iconv -t UTF-8//TRANSLIT -c Zip_Zhvi_SingleFamilyResidence.csv > new_file.csv
When does this error happen?
I wanted to parse the housing data from Zillow's research page. Data broken down by zip code is a great way to look at single-family home values.

However, when I downloaded this data set as “Zip_Zhvi_SingleFamilyResidence.csv”, I could not simply load it into pandas.

This last line seemed like the clue:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf1 in position 4: invalid continuation byte
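To see what that message means, here is a minimal sketch in plain Python. The sample bytes are a made-up stand-in (not taken from the Zillow file), assuming the culprit is a Latin-1 “ñ”: byte 0xf1 is “ñ” in Latin-1, but it cannot stand alone in UTF-8.

```python
# Hypothetical sample: "Cañon City" as Latin-1 bytes, where "ñ" is the single byte 0xf1.
raw = b"Ca\xf1on City"


def try_decode(data: bytes, encoding: str):
    """Return (text, None) on success or (None, error) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as err:
        return None, err


# Decoding as UTF-8 fails with the same kind of error pandas raised.
utf8_text, utf8_err = try_decode(raw, "utf-8")
print(utf8_err)

# Decoding as Latin-1 succeeds, because every byte maps to a character.
latin_text, latin_err = try_decode(raw, "latin-1")
print(latin_text)
```

The position in the error is the byte offset of the offending byte within the data being decoded, which is 2 in this toy sample.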
Well, what format is that file?
Using a Mac, we can use file -I <file_name>
Oh, great! It’s “us-ascii”, so we just pass that encoding into pandas, right?

Oh, maybe I need to specify the encoding I want. WHY, PANDAS, WHY!?

Why does this error happen?
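Some bytes in the file are not valid UTF-8. Byte 0xf1 is outside the ASCII range and cannot stand on its own in UTF-8; in Latin-1 (ISO-8859-1) it is “ñ”, so the file most likely contains a few characters saved in a different encoding. Maybe you accidentally opened the file in Excel before opening ipython, or Zillow saves in a crazy format.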
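If you want to see exactly which bytes are causing trouble, a rough sketch like the one below can help. The helper names find_non_ascii and report are made up for this post, and showing each byte as Latin-1 is only a guess at the real encoding.

```python
from pathlib import Path


def find_non_ascii(data: bytes, limit: int = 10):
    """Return up to `limit` (offset, byte value) pairs for bytes outside the ASCII range."""
    hits = []
    for offset, value in enumerate(data):
        if value > 0x7F:
            hits.append((offset, value))
            if len(hits) >= limit:
                break
    return hits


def report(path: str) -> None:
    """Print the offset, hex value, and Latin-1 reading of each suspicious byte."""
    for offset, value in find_non_ascii(Path(path).read_bytes()):
        as_latin1 = bytes([value]).decode("latin-1")  # assumption: Latin-1
        print(f"offset {offset}: 0x{value:02x} (Latin-1: {as_latin1!r})")
```

Calling report("Zip_Zhvi_SingleFamilyResidence.csv") would list the first few non-ASCII bytes in the downloaded file.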
Awesome, let’s just convert it
Let’s use the *nix program iconv to convert the file. According to the man page (man iconv), “The iconv program converts text from one encoding to another encoding.” Great!
Let’s use this.
iconv -f us-ascii -t utf-8 < Zip_Zhvi_SingleFamilyResidence.csv > new_zip_code_file.csv
“cannot convert”
But iconv, that’s your only job… you know, unix philosophy, one program, one job done well etc etc.
Turns out that if you append “//TRANSLIT” to the target encoding, characters are transliterated when needed and possible (man page). The -c flag silently discards any characters that still cannot be converted.
Solution 1 – iconv with //TRANSLIT
> iconv -t UTF-8//TRANSLIT -c Zip_Zhvi_SingleFamilyResidence.csv > new_file.csv
> mv new_file.csv Zip_Zhvi_SingleFamilyResidence.csv
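If you would rather stay in Python, the same conversion can be sketched with the standard library. This is not what iconv does internally, just a rough equivalent. The function name convert_to_utf8 is made up for this post, and the default of Latin-1 is an assumption based on the 0xf1 byte in the error.

```python
from pathlib import Path


def convert_to_utf8(src: str, dst: str, source_encoding: str = "latin-1") -> None:
    """Re-encode the file at `src` as UTF-8 and write the result to `dst`."""
    # Assumption: the file really is in `source_encoding`. Latin-1 maps every
    # byte to a character, so decoding never fails, but the text is only
    # correct if that guess is right.
    text = Path(src).read_bytes().decode(source_encoding)
    # Work in bytes on both sides so line endings are left untouched.
    Path(dst).write_bytes(text.encode("utf-8"))
```

For example, convert_to_utf8("Zip_Zhvi_SingleFamilyResidence.csv", "new_file.csv") would write a UTF-8 copy. Unlike iconv with -c, nothing is discarded here; every byte is reinterpreted under the assumed encoding.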
Solution 2 (easier to remember) – Sublime Text
Is there a better free editor than Sublime? Be a good citizen and buy your license.
Step 1: Open your file in Sublime Text
Step 2: Save with Encoding > UTF-8
DONE!

read_csv to your heart’s desire 🙂
ipython> data = pd.read_csv("new_file.csv")