Monday, March 5, 2018

pandas read_csv from https with Python 3.6.4

On macOS, if you use Python 3.6 and pandas to read a CSV file over HTTPS:
import pandas as pd

california_housing_dataframe = pd.read_csv("https://storage.googleapis.com/mledu-datasets/california_housing_train.csv", sep=",")
california_housing_dataframe.describe()
you may get an error like:
urllib.error.URLError:

To fix this issue:
Open a terminal and run the script that ships with the python.org installer (quote the path, since it contains spaces):
"/Applications/Python 3.6/Install Certificates.command"
Python 3.6 on macOS uses an embedded version of OpenSSL, which does not use the system certificate store, so HTTPS certificates cannot be verified until a CA bundle is installed. More details here.
(To be explicit: macOS users can probably resolve this by opening Finder and double-clicking Install Certificates.command.)
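To check whether your interpreter can actually find a CA bundle, before or after running the script, here is a minimal diagnostic sketch that uses only the standard library. The helper name describe_ca_paths is made up for illustration; ssl.get_default_verify_paths() is the real call doing the work.

```python
import ssl


def describe_ca_paths():
    """Report where this interpreter's OpenSSL looks for CA certificates.

    describe_ca_paths is a hypothetical helper name, not part of any library.
    """
    paths = ssl.get_default_verify_paths()
    return {
        # Compiled-in default locations that OpenSSL will try.
        "openssl_cafile": paths.openssl_cafile,
        "openssl_capath": paths.openssl_capath,
        # cafile/capath are None when the file or directory does not exist.
        "cafile_found": paths.cafile is not None,
        "capath_found": paths.capath is not None,
    }


if __name__ == "__main__":
    for key, value in describe_ca_paths().items():
        print(f"{key}: {value}")
```

If both cafile_found and capath_found are False, the interpreter most likely has no certificates to verify against, which is consistent with the URLError above.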
Or read the CSV over HTTPS with a workaround that downloads it with requests first:
from io import StringIO

import pandas as pd
import requests


url = "https://storage.googleapis.com/mledu-datasets/california_housing_train.csv"
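# Download the file with requests instead of letting pandas open the URL itself.
response = requests.get(url)
response.raise_for_status()  # stop on an HTTP error instead of parsing an error page
# Wrap the downloaded text in a file-like object so pandas can parse it.
c = pd.read_csv(StringIO(response.text))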
print(c.head())
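
If you would rather not depend on requests, a similar idea can be sketched with the standard library by handing urllib an SSL context built from an explicit CA bundle. This is only a sketch under assumptions: fetch_text and its cafile parameter are hypothetical names, and you have to supply a path to a PEM bundle yourself (for example the one returned by certifi.where(), if the certifi package is installed).

```python
import ssl
import urllib.request


def fetch_text(url, cafile=None, timeout=30):
    """Download url and return the body as text.

    fetch_text and cafile are hypothetical names for this sketch.
    cafile is a path to a PEM file of CA certificates; when it is None,
    Python falls back to its default certificate locations.
    """
    # Build a verifying context; certificates are still checked.
    context = ssl.create_default_context(cafile=cafile)
    with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
        # Use the charset from the response headers if present, else UTF-8.
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset)
```

The returned text can go through the same StringIO step as the workaround above, e.g. pd.read_csv(StringIO(fetch_text(url, cafile=certifi.where()))). Unlike disabling verification, this keeps certificate checks on and only changes where the trusted certificates come from.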

