Home > database >  Turning a Dataframe into a Series with .squeeze("columns")
Turning a Dataframe into a Series with .squeeze("columns")

Time:07-21

I'm studying how to work with data right now and so I'm following along with a tutorial for working with Time Series data. Among the first things he does is read_csv on the file path and then use squeeze=True to read it as a Series. Unfortunately (and as you probably know), squeeze has been depricated from read_csv.

I've been reading documentation to figure out how to read a csv as a series, and everything I try fails. The documentation itself says to use pd.read_csv('filename').squeeze('columns') , but, when I check the type afterward, it is always still a Dataframe.

I've looked up various other methods online, but none of them seem to work. I'm doing this on a Jupyter Notebook using Python3 (which the tutorial uses as well).

If anyone has any insights into why I cannot change the type in this way, I would appreciate it. I'm not sure if I've misunderstood the tutorial altogether or if I'm not understanding the documentation.

I do literally type .squeeze("columns") when I write this out because when I write a column name or index, it fails completely. Am I doing that correctly? Is this the correct method or am I missing a better method?

Thanks for the help!

    shampoo = pd.read_csv('shampoo_with_exog.csv',index_col= [0], parse_dates=True).squeeze("columns")

CodePudding user response:

I would start with this...

#Change the the stuff between the '' to the entire file path of where your csv is located. 
df = pd.read_csv(r'c:\user\documents\shampoo_with_exog.csv')

To start this will name your dataframe as df which is kind of the unspoken industry standard the same as pd for pandas.

Additionally, this will allow you to use a "raw" (the r) string which makes it easier to insert directories into your python code.

Once you are are able to successfully run this you can simply put df in a separate cell in jupyter. This will show you what your data looks like from your CSV. Once you have done all that you can start manipulating your data. While you can use the fancy stuff in pd.read_csv() I mostly just try to get the data and manipulate it from the code itself. Obviously there are reasons not to only do a pd.read_csv but as you progress you can start adding things here and there. I almost never use squeeze although I'm sure there will be those here to comment stating how "essential" it is for whatever the specific case might be.

  • Related