Replace pandas column value with increments of numbers

import pandas as pd
data = {'Account':['Paul','Jenn']}


df = pd.DataFrame(data=data)

The desired output would be 1 for Paul and 2 for Jenn. The solution would form the basis of a for loop over a much bigger dataset, replacing account names with numeric values.

CodePudding user response:

You could do something like this: first, create a dictionary that maps each unique name in Account to a number, ordered by first appearance. Then use .replace() to swap the values in the series for these numbers. This ensures that Paul is always replaced by 1 and Jenn by 2, however many times each name appears.

import pandas
import json


data = {'Account':['Paul','Jenn']}
df = pandas.DataFrame(data=data)

name_mapping = json.loads(pandas.Series(
    index=df.Account.unique(),
    data=range(1, len(df.Account.unique()) + 1)
).to_json())

df.Account = df.Account.replace(name_mapping)

Output:

>>> df
   Account
0        1
1        2
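
A shorter route to the same numbering, offered as a sketch rather than as part of the answer above: pandas.factorize assigns integer codes to values in order of first appearance, starting at 0, so adding 1 gives the numbers asked for. The extra 'Paul' row is an assumption added here to show that repeated names share one number.

```python
import pandas as pd

# The third row is hypothetical, added to show that duplicates share a code
df = pd.DataFrame({'Account': ['Paul', 'Jenn', 'Paul']})

# factorize returns (codes, uniques); codes start at 0 in order of first appearance
codes, uniques = pd.factorize(df['Account'])

# Shift by 1 so the numbering starts at 1 instead of 0
df['Account'] = codes + 1
print(df)
```

The `uniques` array keeps the original names, so the numbers can be translated back later if needed.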

CodePudding user response:

Not entirely sure what you're trying to do; please elaborate further if I misunderstood the question :) If you want to replace the 'name' column with the index incremented by a value:

df['name'] = df.index + value
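
Applied to the data from the question, a minimal sketch of this index-based idea might look like the following. It assumes the default integer index and an offset of 1 (the number is an assumption, chosen so the numbering starts at 1). Note that it numbers rows, not names, so a repeated name would get a different number on each row.

```python
import pandas as pd

df = pd.DataFrame({'Account': ['Paul', 'Jenn']})

value = 1  # assumed offset so the numbering starts at 1

# The default index is 0, 1, ...; adding the offset gives 1, 2, ...
df['Account'] = df.index + value
print(df)
```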

Increase the values of a column (or do any other arithmetic operation on a column):

df['column name'] += value
# or do that over a column and add the result to another column
df['result'] += df['other column'] + value

Count the number of occurrences of each value in the name column:

df = pd.DataFrame({"name": ["Paul", "Jenn", "Paul"]})
# count the number of occurrences of each name
df["count"] = df.groupby("name")['name'].transform('count')
# in case you don't want duplicate rows
df.drop_duplicates(inplace=True)
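
If only the per-name totals are needed rather than a new column, a sketch using Series.value_counts() could look like this. It uses the same sample data as the counting example and is an alternative, not the method given in the answer.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Paul", "Jenn", "Paul"]})

# value_counts returns a Series indexed by name, holding how often each name occurs
counts = df["name"].value_counts()
print(counts)
```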