I have a data set that contains '-' and 'na' values. How can I convert these values, which should be treated as missing, to NaN using the na_values parameter of read_csv?
df = pd.read_csv('austin_weather.csv'.na_values==['na','-'])
CodePudding user response:
Your code is not correct; try this:
import pandas as pd
from pprint import pprint
# test.csv:
# h1;h2;h3;h4
# test1;test2;-;na
# incorrect syntax ('.' instead of ',' and '==' instead of '='):
# df = pd.read_csv('test.csv'.na_values==['na','-'])
# >> AttributeError: 'str' object has no attribute 'na_values'
# correct usage of pandas read_csv():
df = pd.read_csv('test.csv', na_values=['na','-'], sep=';')
pprint(df)
# >> h1 h2 h3 h4
# >> 0 test1 test2 NaN NaN
CodePudding user response:
As Jeffrey Ram and Mortz explained in the comments, pandas.read_csv arguments need to be separated by a comma (,), and values have to be assigned with a single equals sign (=), not compared with ==.
Use this instead:
df = pd.read_csv('austin_weather.csv', na_values=['na','-'])
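To check that the markers were actually converted, you can count the missing values per column afterwards. This is only a minimal sketch: the CSV text and its column names (date, temp, rain) are made up for illustration and stand in for austin_weather.csv, which I don't have.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for austin_weather.csv;
# the column names are assumptions, not the real file's header.
csv_text = "date,temp,rain\n2020-01-01,50,-\n2020-01-02,na,0.1\n"

# na_values adds 'na' and '-' to the strings pandas treats as missing.
df = pd.read_csv(io.StringIO(csv_text), na_values=["na", "-"])

# Count the NaN entries in each column to confirm the conversion.
missing_per_column = df.isna().sum()
print(missing_per_column)
```

Each of temp and rain should report one missing value here, and date should report none.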