Copying a pandas DataFrame column leads to NaN


I have this very simple Python pandas DataFrame called "sa":

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7 entries, 0 to 6
Data columns (total 5 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   searchAppearance  7 non-null      object 
 1   clicks            7 non-null      int64  
 2   impressions       7 non-null      int64  
 3   ctr               7 non-null      float64
 4   position          7 non-null      float64
dtypes: float64(2), int64(2), object(1)
memory usage: 408.0  bytes

with these values

          searchAppearance  clicks  impressions       ctr   position
0      AMP_TOP_STORIES     376          376  0.022917   8.108978
1        AMP_BLUE_LINK   55670        55670  0.051522  13.158574
2      PAGE_EXPERIENCE   68446        68446  0.039298  20.056293
3       RECIPE_FEATURE   40175        40175  0.042920   4.186674
4  RECIPE_RICH_SNIPPET   37428        37428  0.069153  18.726152
5                VIDEO      72           72  0.025361  15.896090
6              WEBLITE       1            1  0.001055  51.493671

all is good there.

now I do


this leads to

          searchAppearance  clicks  impressions       ctr   position  ctr-test
0      AMP_TOP_STORIES     376          376  0.022917   8.108978  0.039522
1        AMP_BLUE_LINK   55670        55670  0.051522  13.158574  0.026543
2      PAGE_EXPERIENCE   68446        68446  0.039298  20.056293  0.051098
3       RECIPE_FEATURE   40175        40175  0.042920   4.186674       NaN
4  RECIPE_RICH_SNIPPET   37428        37428  0.069153  18.726152       NaN
5                VIDEO      72           72  0.025361  15.896090       NaN
6              WEBLITE       1            1  0.001055  51.493671       NaN

Do you see all the NaN values? They only start at the fourth row (index 3), which does not make any sense to me. The DataFrame info still looks good:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7 entries, 0 to 6
Data columns (total 6 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   searchAppearance  7 non-null      object 
 1   clicks            7 non-null      int64  
 2   impressions       7 non-null      int64  
 3   ctr               7 non-null      float64
 4   position          7 non-null      float64
 5   ctr-test          3 non-null      float64
dtypes: float64(3), int64(2), object(1)
memory usage: 464.0  bytes

I don't get it. What is going wrong? I am using Google Colab.

Seems like a bug, but not in my code? Any idea on how to debug this? (If it's not in my code.)

CodePudding user response:

The output of sa.info() includes this line:

5   ctr-test          3 non-null      float64

It seems that these three non-null values end up in the first three rows of sa.
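
The assignment from the question is not shown, so the following is only a sketch of one way to get this pattern, assuming the new column came from a different, shorter DataFrame (the name `other` is hypothetical). pandas aligns a Series on index labels when it is assigned as a column, so any row label of sa that the source lacks becomes NaN:

```python
import pandas as pd

# Trimmed-down version of "sa" using the values from the question.
sa = pd.DataFrame({
    "searchAppearance": ["AMP_TOP_STORIES", "AMP_BLUE_LINK", "PAGE_EXPERIENCE",
                         "RECIPE_FEATURE", "RECIPE_RICH_SNIPPET", "VIDEO", "WEBLITE"],
    "ctr": [0.022917, 0.051522, 0.039298, 0.042920, 0.069153, 0.025361, 0.001055],
})

# Hypothetical second frame with only three rows (index labels 0, 1, 2).
other = pd.DataFrame({"ctr": [0.039522, 0.026543, 0.051098]})

# Assignment aligns on index labels: labels 3..6 do not exist in "other",
# so those rows of the new column are filled with NaN.
sa["ctr-test"] = other["ctr"]

nan_count = int(sa["ctr-test"].isna().sum())
print(sa)
print("NaN rows:", nan_count)
```

Note that the three non-null ctr-test values in the question differ from the ctr values in the same rows of sa, which also suggests they came from some other object.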

CodePudding user response:


Stupidity on my side: I mixed up my DataFrames / variables.
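
For anyone debugging a similar mix-up, a quick check before assigning is to compare the index of the source with the index of the target. This is a minimal sketch; the helper `missing_labels` and the frame `other` are made-up names for illustration, not part of the original code:

```python
import pandas as pd


def missing_labels(target: pd.DataFrame, source: pd.Series) -> pd.Index:
    """Return the row labels of target that have no match in source.

    These are the rows that would become NaN after `target[col] = source`.
    """
    return target.index.difference(source.index)


# Stand-ins for the two objects that got mixed up (values are illustrative).
sa = pd.DataFrame({"ctr": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]})
other = pd.DataFrame({"ctr": [0.9, 0.8, 0.7]})

missing = missing_labels(sa, other["ctr"])
print("Source length:", len(other), "target length:", len(sa))
print("Labels that would be NaN:", list(missing))
```

If the list is not empty, the source is probably not the object you meant to copy from, or its index needs to be reset or aligned first.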

