I am having a problem when I pull data from Google Sheets into a pandas DataFrame. When I display the DataFrame, I see numbers above the column names.
Here is what I have tried so far:
df = df.rename_axis(None)
df.rename_axis(None, axis = 1)
df = df.reset_index(drop=True)
del df.index.name
df = df.iloc[1: , :]
None of these removed the numbers above the columns. Does anyone have any other suggestions for removing them?
When I input print(df.columns)
the result is:
RangeIndex(start=0, stop=10, step=1)
When I print(df.iloc[0, :])
the result is:
0 Day
1 Currency
2 Spend
3 Total Order Value
4 CVR (Click through)
5 Clicks
6 Impressions
7 CTR
8 CPC
9 Conversions (Click through)
Name: 0, dtype: object
CodePudding user response:
Since you haven't provided an executable code snippet, it's hard to be sure, but from the image you provided it looks as though the column names have ended up in the first row of the dataframe. You can tell because the first row index (0) lines up with the string values you wanted to use as column names.
The index numbers along the top are in fact the (default) column names.
You can confirm this by doing the following:
print(df.columns) # This will print the actual column names
print(df.iloc[0, :]) # This will print the first row of values
Can you add the output of these statements to your question? Once we have confirmed this, we can think about how to solve the problem.
If I am correct in my diagnosis, you can see this answer for how to fix the problem.
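If the diagnosis holds (default integer column labels, with the intended names sitting in row 0), one common fix is to promote the first row to the header and then drop it from the data. The snippet below is a minimal sketch, not necessarily what the linked answer does; the `rows` list is a made-up stand-in for whatever the Google Sheets call returned, and only three of the ten columns are shown.

```python
import pandas as pd

# Hypothetical stand-in for the sheet data: the header strings sit in
# row 0, so pandas assigns default integer column labels (0, 1, 2, ...).
rows = [
    ["Day", "Currency", "Spend"],
    ["2021-01-01", "USD", "10.5"],
    ["2021-01-02", "USD", "12.0"],
]
df = pd.DataFrame(rows)

# Promote the first row to column names, then drop it from the data.
# .tolist() keeps the row label (0) from becoming the columns' name.
fixed = df.iloc[1:].reset_index(drop=True)
fixed.columns = df.iloc[0].tolist()

# If the raw rows are still available, the header can instead be
# split off at construction time.
direct = pd.DataFrame(rows[1:], columns=rows[0])

print(fixed.columns)
```

Both `fixed` and `direct` end up with the real names as columns and a fresh 0-based row index. The values remain strings here, so a numeric conversion may still be needed afterwards.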
CodePudding user response:
I believe the problem arises when you read in your CSV file. Pandas is treating those numbers at the top as your headers, when in fact the real headers are in the first row of data. There are two fixes I can suggest.
The first is to tell pandas which row actually holds your header:
df = pd.read_csv("PATH", header=1)
The second is much the same as the first, except that you tell pandas to skip the first row:
df = pd.read_csv("PATH", skiprows=1)
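To see what these two options do, here is a small sketch that assumes the file really does contain a stray row of numbers above the real header. The `raw` string is made-up sample content, read through `io.StringIO` in place of a file path.

```python
import io

import pandas as pd

# Hypothetical file contents: a stray row of numbers sits above the
# real header row.
raw = "0,1,2\nDay,Currency,Spend\n2021-01-01,USD,10.5\n"

# header=1 uses the second line (0-based row 1) as the header and
# ignores everything above it.
by_header = pd.read_csv(io.StringIO(raw), header=1)

# skiprows=1 discards the first line, so the default header=0 then
# picks up the real header.
by_skip = pd.read_csv(io.StringIO(raw), skiprows=1)

print(by_header.columns)
print(by_skip.columns)
```

Under that assumption both calls give the same frame. If the file has no stray row, the default `header=0` already treats the first line as the header, and either option would instead turn the first data row into the header.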