I have an Excel document in which the data for 3 columns is stored in a single column, separated by ",". I want to split it into separate columns while calling pd.read_excel(). I tried usecols, but it did not work. I would also like to name the columns in the same pd.read_excel() call.
CodePudding user response:
Not sure how your .xlsx file is formatted, but it looks like you should be using pandas.read_csv() instead.
So maybe something like pandas.read_csv(filename, sep=',', names=['Name', 'Number', 'Gender'])
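A minimal sketch of that call, assuming the file really is plain comma-separated text (a genuine .xlsx workbook is a zipped binary format that read_csv will not parse). The sample rows are made up for illustration and are read from an in-memory buffer instead of a file on disk:

```python
import io

import pandas as pd

# Hypothetical file contents; in practice, pass the path of a real .csv file.
csv_text = "Alice,1,F\nBob,2,M\n"

# names= supplies the column labels. When names is given and header is not,
# pandas treats the first line as data rather than as a header row.
df = pd.read_csv(io.StringIO(csv_text), sep=",", names=["Name", "Number", "Gender"])
print(df)
```

If the file does have its own header row, pass header=0 together with names so the original header is replaced instead of being read as data.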
CodePudding user response:
Pandas provides a method to split a string around a given separator/delimiter. The result can be stored as a list in a Series, or it can be expanded into multiple DataFrame columns. Python's built-in split() method works similarly, but it can only be applied to an individual string, whereas the pandas str.split() method can be applied to a whole Series. The method has to be called through the .str accessor (for example, series.str.split(',')) to distinguish it from Python's built-in method; calling split() directly on a Series raises an error.
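As a rough sketch of how str.split() could be applied here: the DataFrame below is a stand-in for what pd.read_excel() would return for a sheet with a single comma-joined column (the values are made up), and expand=True turns the split pieces into separate columns:

```python
import pandas as pd

# Stand-in for pd.read_excel("file.xlsx"): one column of comma-joined values.
raw = pd.DataFrame({"a,b,c": ["1,2,3", "4,5,6"]})

# Split the only column on "," and expand the pieces into separate columns.
df = raw.iloc[:, 0].str.split(",", expand=True)

# Reuse the comma-joined header for the column names; any list of names works.
df.columns = raw.columns[0].split(",")

# The split pieces are strings, so convert them to integers.
df = df.astype(int)
print(df)
```

This keeps everything in memory, and the column names can be replaced with any list of the right length.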
CodePudding user response:
The text inside your Excel file is comma-separated. One way to handle it is to convert the Excel file to text before reading it, like so.
Your Excel file:
a,b,c
0 1,2,3
1 4,5,6
Convert to text & read again.
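import pandas as pd
# Dump the single-column sheet to a plain text file, without the index
with open('file.txt', 'w') as file:
    pd.read_excel('file.xlsx').to_string(file, index=False)
# Re-read the text as CSV so the commas separate the columns
df = pd.read_csv("file.txt", sep=",")
print(df)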
Which prints:
a b c
0 1 2 3
1 4 5 6