Adding a Column to a Filtered and Sorted Dataframe

Time:04-27

I am attempting to add a column with current token prices to just the top 9 tokens in this DataFrame. I started with a DataFrame of 37,000 cryptocurrency token pairs (liquidity pools). I've filtered the df to show only pairs with the token 'WETH' in the column 'Asset2'. I've then sorted the df by the number of tokens in the pool in descending order, and I only want to find prices for the first 9 rows.

I have the prices in another df, but when I attempt to combine the two DataFrames I get an error.

Here is the process for filtering and sorting the df:

df_filtered = df[df['Asset2'].str.contains('WETH-USD', na = False)]
df_sorted = df_filtered.sort_values(by=['level2'], ascending=False)
df_top10 = df_sorted[0:9]

Then here is the process for finding the token prices:

import yfinance as yf

priceslist = []
for x in df_top10['Asset1']:
  try:
    price = yf.Ticker(x).info['regularMarketPrice']
    print(price)
    priceslist.append(price)
  except KeyError:
    priceslist.append(float('0'))

But when I attempt to create a new column with these prices...

    df_prices = pd.DataFrame(priceslist)

    df_top10['prices'] = df_prices

I get this error:

    <ipython-input-71-47f058100846>:3: SettingWithCopyWarning:
    A value is trying to be set on a copy of a slice from a DataFrame.
    Try using .loc[row_indexer,col_indexer] = value instead
    See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
    df_top10['prices'] = df_prices

How can I combine these two DataFrames?

Here is some other info:

df_top10.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 9 entries, 36552 to 4666
Data columns (total 11 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   address          9 non-null      object 
 1   asset_addresses  9 non-null      object 
 2   asset_symbols    9 non-null      object 
 3   labels           9 non-null      object 
 4   avg_liquidity    9 non-null      object 
 5   liquidity        0 non-null      object 
 6   level1           9 non-null      float64
 7   level2           9 non-null      float64
 8   Asset1           9 non-null      object 
 9   Asset2           9 non-null      object 
 10  prices1          0 non-null      float64
dtypes: float64(3), object(8)
memory usage: 864.0  bytes


df_prices.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9 entries, 0 to 8
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   0       7 non-null      float64
dtypes: float64(1)
memory usage: 200.0 bytes


print(df_prices.head())
           0
0  2871.6995
1  2871.6995
2        NaN
3        NaN
4   496.1454

CodePudding user response:

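I think part of the problem is that the two DataFrames have different indexes: df_top10 keeps the row labels from the original df (36552 to 4666), while df_prices has a fresh RangeIndex (0 to 8), so pandas cannot line the rows up by label. Note also that the column in df_prices is named with the integer 0, not the string '0'. I would use insert with the raw values, or join after resetting the index:

Try

df_top10.insert(10, "prices", df_prices[0].to_numpy())

or

df_top10 = df_top10.reset_index(drop=True).join(df_prices[0].rename("prices"))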
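
One thing worth checking with either approach is index alignment. Here is a minimal sketch with made-up data (the tickers, index labels, and prices are hypothetical stand-ins, not the asker's values) showing that pandas matches rows by label, not by position, when a Series is assigned as a column:

```python
import pandas as pd

# Toy stand-in for df_top10: a non-sequential index left over from filtering and sorting.
top = pd.DataFrame({"Asset1": ["AAA", "BBB", "CCC"]}, index=[36552, 120, 4666])

# Toy stand-in for df_prices: a fresh RangeIndex 0..2 and a single column named 0 (an int).
prices = pd.DataFrame([10.0, 20.0, 30.0])

# Label-aligned assignment: no label in `top` matches 0, 1, or 2, so every value becomes NaN.
aligned = top.copy()
aligned["prices"] = prices[0]

# Positional assignment: a NumPy array carries no index, so values are placed in row order.
positional = top.copy()
positional["prices"] = prices[0].to_numpy()

print(aligned)
print(positional)
```

Assigning the plain Python list (priceslist in the question) should behave like the positional case, since a list has no index to align on.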

CodePudding user response:

Just use copy() on the line df_top10 = df_sorted[0:9] like:

df_top10 = df_sorted[0:9].copy()

Without copy() you are taking a slice of the original dataframe, which is what the message is pointing out. Note that it is a warning rather than an error, so the code still runs.
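
A minimal end-to-end sketch of this suggestion, assuming a small made-up frame in place of the real 37,000-row one (the tickers, level2 values, and prices are hypothetical, and the price lookup is replaced by a hard-coded list so the example runs offline):

```python
import pandas as pd

# Small stand-in for the original df.
df = pd.DataFrame({
    "Asset1": ["AAA", "BBB", "CCC", "DDD"],
    "Asset2": ["WETH-USD", "DAI-USD", "WETH-USD", "WETH-USD"],
    "level2": [5.0, 9.0, 7.0, 1.0],
})

# Same filter and sort steps as in the question.
df_filtered = df[df["Asset2"].str.contains("WETH-USD", na=False)]
df_sorted = df_filtered.sort_values(by="level2", ascending=False)

# .copy() makes df_top an independent frame, so adding a column does not
# trigger SettingWithCopyWarning.
df_top = df_sorted.iloc[:2].copy()

# Hypothetical prices, one per row of df_top, in row order.
priceslist = [12.5, 3.0]

# A plain list is assigned by position, so the leftover index labels do not matter.
df_top["prices"] = priceslist

print(df_top)
```

Assigning the list directly also avoids building the intermediate df_prices frame, whose RangeIndex would not match the labels in df_top.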
