Pandas: Index the first element of each list in a dataframe column of lists

I have a pandas DataFrame column in which every value is a list, and I would like to extract the first element of each list. How can I do this?

Working example

My original dataset is a pandas data frame that looks like:


import pandas as pd

# Import raw dataset from URL
url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data'
column_names = ['MPG', 'Cylinders', 'Displacement', 'Horsepower',
                'Weight', 'Acceleration', 'Model Year', 'Origin', 'Carname']
# Columns are separated by runs of whitespace; '?' marks missing values
train = pd.read_csv(url, names=column_names,
                    na_values='?', sep=r'\s+',
                    skipinitialspace=True)

temp1 = pd.DataFrame(train["Carname"].str.split())
print(temp1)

                            Carname
0                 [plymouth, champ]
1                    [amc, matador]
2     [chevroelt, chevelle, malibu]
...                             ...
1489         [vw, dasher, (diesel)]
1490                [honda, accord]
1491             [ford, escort, 4w]

The desired result for this would be something like,

    'plymouth'
    'amc'
    'chevroelt'
    .....

CodePudding user response:

You can index through the string accessor with .str[0]. Although it is called the string accessor, it also works element-wise on a column of lists:

temp1['Carname'].str[0]         # str[0] for first element in list

Result:

0      chevrolet
1          buick
2       plymouth
3            amc
4           ford
         ...    
393         ford
394           vw
395        dodge
396         ford
397        chevy
Name: Carname, Length: 398, dtype: object
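
For readers who want to try this without downloading the dataset, here is a minimal sketch on a small hand-made Series. The sample names and the variable names (names, tokens, first) are made up for illustration and are not from the original data. It also shows what I would expect for rows where the list is empty: .str[0] should give a missing value rather than raise.

```python
import pandas as pd

# Hypothetical sample data standing in for the "Carname" column
names = pd.Series(["plymouth champ", "amc matador", ""], name="Carname")

# Split each name on whitespace; every row becomes a list of words
tokens = names.str.split()

# .str[0] picks the first item of each list;
# the empty list in the last row yields a missing value (NaN)
first = tokens.str[0]
print(first)
```

If only the first word is needed, the two steps can be chained as names.str.split().str[0] without building an intermediate DataFrame.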

CodePudding user response:

You could apply a function to the Series that selects the first element of each list, resulting in a new Series:

s = temp1['Carname']            # the Series of lists from the question
s_first_elements = s.apply(lambda x: x[0])
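
One caveat with the plain lambda: x[0] raises an IndexError on an empty list and a TypeError on a missing value. Below is a rough sketch of a more defensive variant, assuming you want a missing value in those cases. The helper name first_or_none and the sample data are hypothetical and only there for illustration.

```python
import pandas as pd

# Hypothetical sample data: two normal rows, an empty list, and a missing value
s = pd.Series([["plymouth", "champ"], ["amc", "matador"], [], None])

def first_or_none(x):
    # Return the first item of a non-empty list, otherwise None
    if isinstance(x, list) and x:
        return x[0]
    return None

s_first_elements = s.apply(first_or_none)
print(s_first_elements)
```

On clean data this should give the same values as the one-line lambda. The guard only matters when some rows are empty or missing.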