How to create a list from Pandas Series?

Time:08-12

My dataframe has a NAME column that looks like this. I am trying to turn each row into a list of names, for example ["Mike", "Jean"]:

0. Mike, Jean
1. May, Weather
2. Jack, 100

What I've tried:

df["NAME"] = df["NAME"].str.split(",")
for i in range(len(df["NAME"])):
    df["NAME"][i] = df["NAME"][i] .split(",")

OUTPUT

0. [Mike, Jean]
1. [May, Weather]
2. [Jack, 100]

OUTPUT I WANT

0. ["Mike", "Jean"]
1. ["May", "Weather"]
2. ["Jack", "100"]

I am new to Python and Pandas.

CodePudding user response:

Assuming this input:

df = pd.DataFrame({'Name': ['Mike, Jean', 'May, Weather', 'Jack, 100']})

When you run:

df['Name'].str.split(', ')

and get:

0      [Mike, Jean]
1    [May, Weather]
2       [Jack, 100]
Name: Name, dtype: object

The [Mike, Jean] format is just how pandas displays lists inside a Series; the display omits the quotes around strings.

The real data is indeed a Series of lists, as shown by an explicit conversion of the Series to a list:

df['Name'].str.split(', ').to_list()

output:

[['Mike', 'Jean'],
 ['May', 'Weather'],
 ['Jack', '100']]
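
One way to confirm this, as a minimal sketch reusing the same input (the variable names `names` and `first` are only illustrative), is to pull a single element out of the Series and inspect it directly. The quotes show up as soon as the element is printed with `repr`, because it is an ordinary Python list of strings:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Mike, Jean', 'May, Weather', 'Jack, 100']})

# Split each string on ", " to get one list per row
names = df['Name'].str.split(', ')

# A single element is a plain Python list of str
first = names.iloc[0]
print(type(first))              # <class 'list'>
print(repr(first))              # ['Mike', 'Jean']
print(type(names.iloc[2][1]))   # <class 'str'>: "100" stays a string
```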

CodePudding user response:

You don't need a for-loop. You can do the split with:

df['Name'] = df['Name'].str.split(', ')

This returns a pandas Series containing one list per row. pandas displays it without quotes, but each element is a string:

0      [Mike, Jean]
1    [May, Weather]
2       [Jack, 100]

If you want the Series' values as a plain Python list, you can then use:

name_lists = df['Name'].tolist()

Returning:

[['Mike', 'Jean'], ['May', 'Weather'], ['Jack', '100']]
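
If the spacing around the commas is not consistent (an assumption; the question only shows a comma followed by one space), a plain list comprehension that strips each piece is one possible alternative. The sample data and the name `name_lists` below are illustrative:

```python
import pandas as pd

# Hypothetical data with uneven spacing around the commas
df = pd.DataFrame({'Name': ['Mike,Jean', 'May ,  Weather', 'Jack, 100']})

# Split on the bare comma, then strip whitespace from every piece
name_lists = [[part.strip() for part in value.split(',')] for value in df['Name']]

print(name_lists)
# [['Mike', 'Jean'], ['May', 'Weather'], ['Jack', '100']]
```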

CodePudding user response:

You can iterate over the values of your dataframe and convert each row of the resulting NumPy array into a list (see the example code below):

import pandas as pd

df = pd.DataFrame()

df['Name'] = ['Name1', 'Name2']
df['FirstName'] = ['FirstName1', 'FirstName2']

L = []
for row in df.values:
    L.append(list(row))
    
print(L)
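
The same result can probably be had without the explicit loop. As a minimal sketch on the same two-column frame (the name `rows` is illustrative), `tolist()` on the underlying array returns one list per row:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Name1', 'Name2'],
                   'FirstName': ['FirstName1', 'FirstName2']})

# One list per row, built directly from the underlying NumPy array
rows = df.values.tolist()

print(rows)
# [['Name1', 'FirstName1'], ['Name2', 'FirstName2']]
```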

Cheers

CodePudding user response:

The code snippet below should do what you need, assuming the column holds strings that include the brackets :)

import pandas as pd

df = pd.DataFrame(["[Mike, Jean]" , "[May, Weather]", "[Jack, 100]"], columns=['name'])

df.head()

output:

             name
0    [Mike, Jean]
1  [May, Weather]
2     [Jack, 100]

df['type_name']  = df.apply(lambda y: type(y['name']), axis=1)

df['name1'] = df.apply(lambda y: y['name'].replace('[', '').replace(']', '').split(", "), axis=1)

df['type_name1']  = df.apply(lambda y: type(y['name1']), axis=1)

df.head()

output:

             name      type_name           name1      type_name1
0    [Mike, Jean]  <class 'str'>    [Mike, Jean]  <class 'list'>
1  [May, Weather]  <class 'str'>  [May, Weather]  <class 'list'>
2     [Jack, 100]  <class 'str'>     [Jack, 100]  <class 'list'>

final_list = df['name1'].values.tolist()

print(final_list)

output:

[['Mike', 'Jean'], ['May', 'Weather'], ['Jack', '100']]
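
For the same bracketed strings, the pandas string accessor offers a shorter route than the row-wise `apply`. This is only a sketch under the same assumption that every value is a string wrapped in square brackets:

```python
import pandas as pd

df = pd.DataFrame(["[Mike, Jean]", "[May, Weather]", "[Jack, 100]"], columns=['name'])

# Strip the surrounding brackets, then split each string on ", "
final_list = df['name'].str.strip('[]').str.split(', ').tolist()

print(final_list)
# [['Mike', 'Jean'], ['May', 'Weather'], ['Jack', '100']]
```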