How to insert range of 'NaN' values each n rows into dataframe in Python

Time:07-10

Good morning guys, I have a simple question for you.

Consider this dataframe:

N Lat Long 
1 37.536866 15.068850 
2 37.536867 15.068850 
3 37.536868 15.068850 
4 37.536868 15.068850 
5 37.536869 15.068850 
6 37.536869 15.068850 
7 37.536870 15.068850 
8 37.536871 15.068850 
9 37.536871 15.068850 

How do I replace the values in the Lat column with NaN using a step of 2, that is, keep one row and then blank the next two? Here is what the final dataframe should look like:

N Lat Long 
1 37.536866 15.068850 
2 NaN 15.068850 
3 NaN 15.068850 
4 37.536868 15.068850 
5 NaN 15.068850 
6 NaN 15.068850 
7 37.536870 15.068850 
8 NaN 15.068850 
9 NaN 15.068850 

Thank you in advance.

CodePudding user response:

You can do this:

import numpy as np
import pandas as pd

data = np.array(
    [
        1,37.536866,15.068850,
        2,37.536867,15.068850,
        3,37.536868,15.068850,
        4,37.536868,15.068850,
        5,37.536869,15.068850,
        6,37.536869,15.068850,
        7,37.536870,15.068850,
        8,37.536871,15.068850,
        9,37.536871,15.068850
    ]
)

df = pd.DataFrame(data.reshape(-1, 3), columns=["N", "Lat", "Lng"])

df.loc[(df["N"] - 1) % 3 != 0, "Lat"] = np.nan
print(df)
     N        Lat       Lng
0  1.0  37.536866  15.06885
1  2.0        NaN  15.06885
2  3.0        NaN  15.06885
3  4.0  37.536868  15.06885
4  5.0        NaN  15.06885
5  6.0        NaN  15.06885
6  7.0  37.536870  15.06885
7  8.0        NaN  15.06885
8  9.0        NaN  15.06885
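The mask above depends on the N column counting up from 1 without gaps. As a minimal sketch of the same idea that uses row position instead (the name `pos` is only illustrative, and the dataframe is rebuilt here so the snippet runs on its own):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "N": range(1, 10),
    "Lat": [37.536866, 37.536867, 37.536868, 37.536868, 37.536869,
            37.536869, 37.536870, 37.536871, 37.536871],
    "Lng": [15.068850] * 9,
})

# Row positions 0, 1, 2, ... independent of the index labels and of N.
pos = np.arange(len(df))

# Keep positions 0, 3, 6, ... and blank Lat in the two rows after each kept row.
result = df.copy()
result.loc[pos % 3 != 0, "Lat"] = np.nan
print(result)
```

This should give the same pattern even if N has gaps or the index is not a plain 0..n-1 range.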

CodePudding user response:

Here is some simple code to achieve what you want.

There may be other methods that run faster.

import numpy as np
import pandas as pd

data = np.array([1,37.536866,15.068850,
2,37.536867,15.068850,
3,37.536868,15.068850,
4,37.536868,15.068850,
5,37.536869,15.068850,
6,37.536869,15.068850,
7,37.536870,15.068850,
8,37.536871,15.068850,
9,37.536871,15.068850])

df = pd.DataFrame(data.reshape(-1, 3), columns=['N', 'Lat', 'Lng'])

Case: Step size 2 for the latitude value

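df_new = df.copy()
# Assign through a single .loc call; the chained form df_new['Lat'].loc[...] = ...
# may modify a temporary object instead of df_new.
# Note: this float remainder test is sensitive to rounding error.
df_new.loc[~(np.fmod(df_new['Lat'], 2e-6) < 1e-7), 'Lat'] = np.nan
print(df_new)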
     N        Lat       Lng
0  1.0  37.536866  15.06885
1  2.0        NaN  15.06885
2  3.0  37.536868  15.06885
3  4.0  37.536868  15.06885
4  5.0        NaN  15.06885
5  6.0        NaN  15.06885
6  7.0  37.536870  15.06885
7  8.0        NaN  15.06885
8  9.0        NaN  15.06885
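Because a floating-point remainder can land just below the divisor instead of near zero, the value-based test can be fragile. One possible workaround, sketched here under the assumption that the latitudes carry exactly six decimal places, is to convert them to integer micro-degrees first (the name `micro` is made up for this example):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "N": range(1, 10),
    "Lat": [37.536866, 37.536867, 37.536868, 37.536868, 37.536869,
            37.536869, 37.536870, 37.536871, 37.536871],
    "Lng": [15.068850] * 9,
})

# Assumption: six decimal places, so Lat * 1e6 rounds to an exact integer.
micro = (df["Lat"] * 1e6).round().astype("int64")

# Keep only latitudes whose last (sixth) decimal digit is even.
df_new = df.copy()
df_new.loc[micro % 2 != 0, "Lat"] = np.nan
print(df_new)
```

Integer arithmetic avoids the tolerance value entirely, at the cost of hard-coding the number of decimals.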

Case: Step size 2 for the index

df_new = df.copy()
# With the default 0..n-1 index, blank the 2nd and 3rd row of every group of three.
df_new.loc[1::3, 'Lat'] = np.nan
df_new.loc[2::3, 'Lat'] = np.nan
print(df_new)
     N        Lat       Lng
0  1.0  37.536866  15.06885
1  2.0        NaN  15.06885
2  3.0        NaN  15.06885
3  4.0  37.536868  15.06885
4  5.0        NaN  15.06885
5  6.0        NaN  15.06885
6  7.0  37.536870  15.06885
7  8.0        NaN  15.06885
8  9.0        NaN  15.06885
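If the number of rows to blank between kept rows needs to vary, the index-based case could be wrapped in a small helper. This is only a sketch: `blank_between` and its `gap` parameter are hypothetical names, not part of pandas.

```python
import numpy as np
import pandas as pd

def blank_between(df, column, gap=2):
    """Return a copy of df where `column` keeps every (gap + 1)-th row
    and is NaN in the `gap` rows that follow each kept row."""
    # True at positions 0, gap + 1, 2 * (gap + 1), ...
    keep = np.arange(len(df)) % (gap + 1) == 0
    out = df.copy()
    # Series.where keeps values where the condition is True, NaN elsewhere.
    out[column] = out[column].where(keep)
    return out

df = pd.DataFrame({
    "N": range(1, 10),
    "Lat": [37.536866, 37.536867, 37.536868, 37.536868, 37.536869,
            37.536869, 37.536870, 37.536871, 37.536871],
    "Lng": [15.068850] * 9,
})

print(blank_between(df, "Lat", gap=2))
```

With `gap=2` this reproduces the layout asked for in the question; `gap=1` would blank every other row instead.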