Create new dataframe from another dataframe

Time:06-01

I've created a dataframe. I'd like to create a new dataframe from it, based on conditions on its values. My Python code is as follows:

import pandas as pd

df = pd.DataFrame({'A':[1,2,3,4,5,6,7,8,9,10],'B':[10,20,30,40,50,60,70,80,90,100]})
df
    A   B
0   1   10
1   2   20
2   3   30
3   4   40
4   5   50
5   6   60
6   7   70
7   8   80
8   9   90
9   10  100

import pywt
import numpy as np

import scipy.signal as signal
import matplotlib.pyplot as plt
from skimage.restoration import denoise_wavelet
wavelet_type='db6'


def new_df(df):
  df0 = pd.DataFrame()
  if (df.iloc[:,0]>=1) & (df.iloc[:,0]<=3):
    df0['B'] = denoise_wavelet(df.loc[(df.iloc[:,0]>=1) & (df.iloc[:,0]<=3),'B'], method='BayesShrink', mode='soft', wavelet_levels=3, wavelet='sym8', rescale_sigma='True')
  elif (df.iloc[:,0]>=4) & (df.iloc[:,0]<=6):
    df0['B'] = denoise_wavelet(df.loc[(df.iloc[:,0]>=4) & (df.iloc[:,0]<=6),'B'], method='BayesShrink', mode='soft', wavelet_levels=3, wavelet='sym8', rescale_sigma='True') 
  else:
    df0['B']=df.iloc[:,1]
  return df0

I want a new dataframe in which the values in column B that meet the conditions are denoised, while the remaining values are left alone and carried over unchanged. My code gives me this error message: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

My desired output should look like this:

    A   B
0   1   15*
1   2   25*
2   3   35*
3   4   45*
4   5   55*
5   6   65*
6   7   70
7   8   80
8   9   90
9   10  100

#* marks denoised values; the actual numbers may differ when you get the result.
#this is just for a demo.

Maybe my approach is wrong. Could you please help me?

CodePudding user response:

(df.iloc[:,0]>=1) returns a pandas Series of boolean values indicating which elements in the first column of df are greater than or equal to 1.

In the line

if (df.iloc[:,0]>=1) & (df.iloc[:,0]<=3):

the & operator combines the two boolean Series element-wise, so the condition is itself a Series with one True/False value per row. An if statement needs a single True or False, so it cannot decide which branch to take when given a whole Series, and pandas raises the ValueError.

Pandas gives you a hint in the error message as to what might solve the problem: e.g. if you wanted to check whether any element in df.iloc[:,0] was greater than or equal to 1, you could use (df.iloc[:,0]>=1).any(), which returns a single bool that you could then combine with the result of (df.iloc[:,0]<=3).any(). Without more context on the problem or what you're trying to do, it is hard to help you further.
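
The difference between a boolean Series and a single bool can be seen in a small sketch. It reuses the DataFrame from the question; the variable names (mask, raised, any_in_range and so on) are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                   'B': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]})

# Element-wise comparison: one True/False per row, not a single bool.
mask = (df['A'] >= 1) & (df['A'] <= 3)

# Using the Series directly in an `if` raises the ValueError from the question.
try:
    if mask:
        pass
    raised = False
except ValueError:
    raised = True

# Reductions collapse the Series to a single bool that `if` accepts.
any_in_range = bool(mask.any())  # True if at least one row has 1 <= A <= 3
all_in_range = bool(mask.all())  # True only if every row has 1 <= A <= 3

# The mask itself is what selects rows, with no `if` needed.
selected_b = df.loc[mask, 'B'].tolist()
```

Note that .any() and .all() answer a question about the whole column. To treat individual rows differently, the mask is normally used for row selection, as in the last line.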

CodePudding user response:

Note that since you are already filtering the data when passing it to denoise_wavelet, you don't really need the if statements, but you should assign the returned values back to the same selection of rows in the DataFrame. Here is my approach: it first copies the original DataFrame and then replaces the desired rows with the "denoised" data.

import numpy as np
import pandas as pd
import scipy.signal as signal
import matplotlib.pyplot as plt
from skimage.restoration import denoise_wavelet
wavelet_type='db6'

df = pd.DataFrame({'A':[1,2,3,4,5,6,7,8,9,10],'B':[10,20,30,40,50,60,70,80,90,100]})


def new_df(df):
    # Work on a copy so the original DataFrame is left untouched.
    df0 = df.copy()
    # Boolean masks selecting the rows to denoise (first column in 1..3 and in 4..6).
    mask_low = (df.iloc[:,0]>=1) & (df.iloc[:,0]<=3)
    mask_mid = (df.iloc[:,0]>=4) & (df.iloc[:,0]<=6)
    # rescale_sigma expects a bool, not the string 'True'.
    df0.loc[mask_low,'B'] = denoise_wavelet(df.loc[mask_low,'B'].values, method='BayesShrink', mode='soft', wavelet_levels=3, wavelet='sym8', rescale_sigma=True)
    df0.loc[mask_mid,'B'] = denoise_wavelet(df.loc[mask_mid,'B'].values, method='BayesShrink', mode='soft', wavelet_levels=3, wavelet='sym8', rescale_sigma=True)
    return df0

new_df(df)

However, I don't really know how denoise_wavelet works, so I can't tell whether the result is correct, but the values from index 6 to 9 are left unchanged.
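
Since the output of denoise_wavelet is hard to check by eye, the copy-and-assign pattern can be tried with a simpler function first. This is a minimal sketch: segment_mean is a hypothetical stand-in for denoise_wavelet (it is not part of scikit-image), and apply_to_ranges is a hypothetical helper name.

```python
import numpy as np
import pandas as pd


def segment_mean(values):
    # Hypothetical stand-in for denoise_wavelet: replace every value
    # in the segment with the segment's mean.
    values = np.asarray(values, dtype=float)
    return np.full_like(values, values.mean())


def apply_to_ranges(df, ranges, func):
    # Copy first so the input DataFrame is not modified.
    out = df.copy()
    # The transformed values are floats, so make the column float up front.
    out['B'] = out['B'].astype(float)
    for low, high in ranges:
        # between() is inclusive on both ends by default.
        mask = df['A'].between(low, high)
        out.loc[mask, 'B'] = func(df.loc[mask, 'B'].to_numpy())
    return out


df = pd.DataFrame({'A': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                   'B': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]})

result = apply_to_ranges(df, [(1, 3), (4, 6)], segment_mean)
print(result)
```

Rows where A is between 1 and 6 are replaced segment by segment, the rows with A from 7 to 10 keep their original B values, and df itself is not changed. Swapping segment_mean for a call to denoise_wavelet with the desired arguments should give the behaviour described in the answer.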
