Home > Back-end >  To Create index lookup in python dataframe
To Create index lookup in python dataframe

Time:10-12

enter image description here

To create an Index for the ‘IMEI & IMSI’ pair as follows:

  • IMEI index (named A): a number is assigned for each IMEI in sequential order, starting with the integer number 1.
  • IMSI index (named as B): a number is assigned for each IMSI, in the sense that: if an IMEI appears more than once, the paired IMSIs will be assigned in sequential order, e.g.: 1, 2, 3...]2

This is what I am trying


import pandas as pd
import numpy as np
df1 = pd.readcsv('file.csv')
df1 = df1[['Imei','Imsi']]
df1 = df1.groupby(['Imei']).count()
df1 = df1.rename(columns = {'Imsi': 'Occurences'})
df1 = df1.sort_values(by=['Occurences'],ascending= False)
df1 = df1.reset_index()


I am able to count occurrences but how can I print them in order starting from 1 like shown in the image

CodePudding user response:

Use cumsum to compute IMEI Index and groupby_cumcount to compute IMSI Index.

# Setup
df = pd.DataFrame({'Imei': {0: 'A', 1: 'A', 2: 'B', 3: 'B', 4: 'C', 5: 'C', 6: 'C', 7: 'D', 8: 'E'}})

df['IMEI Index'] = df['Imei'].ne(df['Imei'].shift()).cumsum()
df['IMSI Index'] = df.groupby('Imei').cumcount().add(1)

Output:

>>> df
  Imei  IMEI Index  IMSI Index
0    A           1           1
1    A           1           2
2    B           2           1
3    B           2           2
4    C           3           1
5    C           3           2
6    C           3           3
7    D           4           1
8    E           5           1
  • Related