My raw data, in CSV format, looks like this:
text,location,user_follower_count,user_verified
shameful a poor man from bihar is killed j amp k administration under central rule did not even arrange for his body to be flown,NA,NA,NA
being right about anything can be its own reward but you migh more for aquarius,NA,NA,NA
they don t want herd immunity they want a herd mentality,NA,NA,NA
I am trying to read the above file as follows:
raw_data = pd.read_csv(raw_tweet_data,sep=",", header='infer')
But everything gets jammed under the text column, and the location, user_follower_count and user_verified columns are all NaN. I tried both the delimiter and sep parameters, and neither works. Why is that?
CodePudding user response:
Try passing the column names explicitly when reading, along with the other parameters, e.g. pd.read_csv(..., names=[...]). Note that read_csv has no columns parameter; the argument is called names. I once faced this type of problem, and it was solved by naming the columns.
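A minimal sketch of naming the columns explicitly, in case it helps. The inline sample (one row copied from the question) and the variable names are only for illustration; header=0 tells pandas to discard the file's own header row in favour of the supplied names.

```python
import io
import pandas as pd

# Illustrative stand-in for the real file: header plus one row from the question.
sample = io.StringIO(
    "text,location,user_follower_count,user_verified\n"
    "they don t want herd immunity they want a herd mentality,NA,NA,NA\n"
)

cols = ["text", "location", "user_follower_count", "user_verified"]

# names= supplies the column labels; header=0 replaces the existing header row.
named = pd.read_csv(sample, names=cols, header=0)
print(named.columns.tolist())
```

This only controls the column labels. It does not change how the "NA" values themselves are parsed.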
CodePudding user response:
The problem was the "NA" string in the location, user_follower_count and user_verified columns: by default, pandas reads the string "NA" as NaN. Once I changed the string "NA" to "not available", the issue was resolved. A bit tricky. Thanks.
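If editing the file is not convenient, an alternative is to tell pandas not to apply its default missing-value strings. A minimal sketch, assuming the literal text "NA" should be kept as-is; the inline sample below stands in for the real file and the variable names are illustrative.

```python
import io
import pandas as pd

# Two rows from the question, used as a stand-in for the real CSV file.
csv_text = (
    "text,location,user_follower_count,user_verified\n"
    "being right about anything can be its own reward but you migh more for aquarius,NA,NA,NA\n"
    "they don t want herd immunity they want a herd mentality,NA,NA,NA\n"
)

# Default behaviour: "NA" is in pandas' built-in list of missing-value markers.
default_df = pd.read_csv(io.StringIO(csv_text))
print(default_df["location"].isna().all())

# keep_default_na=False disables that built-in list, so "NA" stays a string.
literal_df = pd.read_csv(io.StringIO(csv_text), keep_default_na=False)
print(literal_df["location"].tolist())
```

With keep_default_na=False nothing is treated as missing unless it is listed in na_values, so a column such as user_follower_count stays as strings if it contains "NA".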