Here is my code.
import pandas as pd
data = pd.read_csv('cast.csv')  # read_csv already returns a DataFrame
print(data)
The data look like this.
title year name type \
0 Closet Monster 2015 Buffy #1 actor
1 Suuri illusioni 1985 Homo $ actor
2 Battle of the Sexes 2017 $hutter actor
3 Secret in Their Eyes 2015 $hutter actor
4 Steve Jobs 2015 $hutter actor
... ... ... ... ...
74996 Mia fora kai ena... moro 2011 Penelope Anastasopoulou actress
74997 The Magician King 2004 Tiannah Anastassiades actress
74998 Festival of Lights 2010 Zoe Anastassiou actress
74999 Toxic Tutu 2016 Zoe Anastassiou actress
75000 Fugitive Pieces 2007 Anastassia Anastassopoulou actress
character n
0 Buffy 4 31.0
1 Guests 22.0
2 Bobby Riggs Fan 10.0
3 2002 Dodger Fan NaN
4 1988 Opera House Patron NaN
... ... ...
74996 Popi voulkanizater 11.0
74997 Unicycle Race Attendant NaN
74998 Guidance Counselor 20.0
74999 Demon of Toxicity NaN
75000 Laundry Girl 25.0
[75001 rows x 6 columns]
First, I want only the rows where type is "actor", sorted by year, so I sort the data by year and group it by type.
grouped = data.sort_values(['year'],ascending=True).groupby(["type"])
data_2 = pd.DataFrame(grouped.get_group('actor'))
print(data_2)
Here is the result.
title year \
21879 From the Manger to the Cross; or, Jesus of Naz... 1912
20819 Katastrofen i Dokken 1913
44273 Prins for en dag 1913
44272 Prins for en dag 1913
17190 Ballettens Datter 1913
... ... ...
44824 Devil's Cove 2019
7343 Bses Slwl I: The Musical Journey 2020
35687 Roses in the Concrete 2020
24732 Nostradamus Mission 3: Alien Invasion 2020
28874 Inside Me 2023
name type character n
21879 James D. Ainsley actor John the Baptist 6.0
20819 Hakon Ahnfelt-R?nne actor Mental Patient 4.0
44273 Carl Alstrup actor Journalist Herbert 1.0
44272 Carl Alstrup actor Prince Karl Heinrich 1.0
17190 Svend Aggerholm actor Count de Croisset NaN
... ... ... ... ...
44824 Ron Althoff actor Officer Bradley NaN
7343 Sudarshan Acharya actor Sudarshan Acharya NaN
35687 Darren Alford actor Seth NaN
24732 Misan Akuya actor Anunnaki warrior NaN
28874 Antonio Alcala actor Max NaN
[50000 rows x 6 columns]
Next, I want the rows whose first name is "Aaron". My idea is to group the data by name and then split each name to get the first name.
# Group by a single column label so each key is a plain string
grouped_2 = data_2.groupby("name")
for keys, group in grouped_2:
    letter_name = keys.split(" ")
    if letter_name[0] == "Aaron":
        print(group)
The result looks like this.
title year name type character n
8266 The Slingers 2013 Aaron (II) Acosta actor Bradley 1.0
8267 Unbreakable Bond 2017 Aaron (II) Acosta actor James 13.0
title year name type character n
9431 Hitters 2017 Aaron (II) Adair actor Sonny NaN
title year name type character n
10426 Night Shift (II) 2009 Aaron (II) Adams actor Paul 7.0
title year name type character n
27366 Detention 2011 Aaron (II) Albert actor Young Principal Verge 51.0
title year name type character n
10427 The Standard Man 2009 Aaron (III) Adams actor Guest of Zach NaN
title year name type \
32555 Patch Adams 1998 Aaron (III) Alexander actor
character n
32555 Children's Ward Patient 43.0
title year name type character n
10428 Blood on the Highway 2008 Aaron (VII) Adams actor Vampire 76.0
title year name type character n
32556 Show of Hands 2008 Aaron (VII) Alexander actor Vince 9.0
title year name type character n
32558 Big Mistake 2014 Aaron (XI) Alexander actor Nemec NaN
32559 Two Men in Town 2014 Aaron (XI) Alexander actor Bar Patron NaN
32557 After the Fall 2014 Aaron (XI) Alexander actor Bartender NaN
title year name type character n
585 Surrender 2003 Aaron (XV) actor Submissive 19.0
title year name type character n
586 Two Coyotes 2001 Aaron (XVIII) actor Lorenzo 8.0
title year name type character n
32560 Love Like This 2014 Aaron (XX) Alexander actor Bernard 11.0
title year name type character n
478 Bloodshed and Emeralds 1999 Aaron Aames actor Cardinal Feelito NaN
title year name type character n
1875 Blood Justice 1995 Aaron Abbott actor Anthony's Thug NaN
1876 Dead Tides 1997 Aaron Abbott actor Lt. Quartermaster Green 17.0
title year name type character n
2929 In the Closet 2009 Aaron Abdullah actor Spirit Boy #2 NaN
title year name type character n
4227 The Rhino Brothers 2002 Aaron Abernethy actor Extra 61.0
title year name type character n
4517 Dagitab 2014 Aaron Abion actor Grad Student 60.0
title year name type \
5784 The In-Laws 2003 Aaron Abrams actor
5785 The Visual Bible: The Gospel of John 2003 Aaron Abrams actor
5778 Resident Evil: Apocalypse 2004 Aaron Abrams actor
5780 Siblings 2004 Aaron Abrams actor
5779 Sabah 2005 Aaron Abrams actor
5769 Cinderella Man 2005 Aaron Abrams actor
5787 Zoom 2006 Aaron Abrams actor
5786 Young People Fucking 2007 Aaron Abrams actor
5772 Firehouse Dog 2007 Aaron Abrams actor
5773 Flash of Genius 2008 Aaron Abrams actor
5767 Amelia 2009 Aaron Abrams actor
5768 At Home by Myself... with You 2009 Aaron Abrams actor
5776 Jesus Henry Christ 2011 Aaron Abrams actor
5775 Jesus Henry Christ 2011 Aaron Abrams actor
5766 388 Arletta Avenue 2011 Aaron Abrams actor
5781 Take This Waltz 2011 Aaron Abrams actor
5782 The Chicago 8 2011 Aaron Abrams actor
5774 It Was You Charlie 2013 Aaron Abrams actor
5777 Regression 2015 Aaron Abrams actor
5770 Closet Monster 2015 Aaron Abrams actor
5783 The Go-Getters 2017 Aaron Abrams actor
5765 #FromJennifer 2017 Aaron Abrams actor
5771 Code 8 2018 Aaron Abrams actor
character n
5784 Student 17.0
5785 Man in Temple Crowd #3 NaN
5778 Assistant 20.0
5780 Pastor 9.0
5779 Paramedic 8.0
5769 1928 Fan 67.0
5787 Corporal Lipscombe NaN
5786 Matt 1.0
5772 Policeman at Bridge 32.0
5773 Ian Meillor 44.0
5767 Slim Gordon 8.0
5768 Guy 2.0
5776 Nurse Stewart 23.0
5775 Malcolm's Dad 23.0
5766 Alex 4.0
5781 Aaron 10.0
5782 Lee Weiner 9.0
5774 Tom 3.0
5777 Farrell 12.0
5770 Peter Madly 1.0
5783 Owen 1.0
5765 Ralph Sinclair NaN
5771 Actor NaN
title year name type character n
7639 Night and Day 2003 Aaron Acker actor Teenager 15.0
title year name type character n
13552 Director 2008 Aaron Addicott actor Cop #6 NaN
title year name type character n
18754 Slammed 2004 Aaron Aguilera actor The Eradicator 19.0
18755 The Dead Sleep Easy 2007 Aaron Aguilera actor El Tezca NaN
18752 Avenge 2014 Aaron Aguilera actor Vinnie NaN
18753 Minutes to Midnight 2017 Aaron Aguilera actor Angus NaN
title year name type character n
18893 Vale Tudo Project 2009 Aaron Aguirre actor Lupo NaN
18892 Know Thy Enemy 2009 Aaron Aguirre actor Snaps 24.0
title year name type character n
19186 Babagwa 2013 Aaron Agustin actor Boatman 17.0
title year name type character n
23895 Split Decisions 1988 Aaron Akins actor Man #2 at Bar 34.0
title year name type character n
26879 George Takei's Allegiance 2016 Aaron Albano actor Tom Maruyama NaN
title year name type character n
27706 Del Playa 2015 Aaron Alberti actor High School Student NaN
title year name type character n
28478 Missouri Trippin' 2016 Aaron Albright actor Trail Snakes Leader NaN
title year name type character n
28894 Troubadours 2010 Aaron Alcala-Mosley actor Jesse NaN
title year name type character n
30807 Pornography 2009 Aaron Aldorisio actor Video Store Customer 36.0
title year name type \
31409 Gospel of Wonderland 2008 Aaron Aleiner actor
character n
31409 Second Plainclothesman NaN
title year name type character n
31439 Deep in the Heart 2012 Aaron Alejandro actor Himself 32.0
title year name type character n
32554 The In-Laws 2003 Aaron Alexander actor Frat Brother #1 48.0
title year name type character n
36180 Mischief Night 2006 Aaron Ali actor Ifzah 63.0
title year name type character n
38401 Mr. Dungbeetle 2005 Aaron Allen actor Tony NaN
title year name type character n
41468 In the Dead of Winter 2013 Aaron Allister actor Guard #1 20.0
title year name type character n
43792 Use Your Head 1996 Aaron Alpern actor Duncan NaN
title year name type character n
44433 Die Unsichtbaren 2017 Aaron Altaras actor Eugen Friede NaN
title year name type character n
47791 Swamper 2005 Aaron Amaral actor Loud Man NaN
title year name type \
32562 Fall to Grace 2005 Aaron D. Alexander actor
32561 Cherry Bomb 2011 Aaron D. Alexander actor
32563 Fess Up 2015 Aaron D. Alexander actor
32565 Last Girl Standing 2015 Aaron D. Alexander actor
32566 Second Impression 2016 Aaron D. Alexander actor
32564 Fun with Hackley: Axe Murderer 2017 Aaron D. Alexander actor
character n
32562 Basketball Player #3 NaN
32561 Ed Randall NaN
32563 Alan Chambers NaN
32565 Police Officer 2 NaN
32566 Store Manager NaN
32564 Sugar Duke NaN
title year name type character \
587 Rose of Santa Rosa 1947 Aaron Gonzales' Orchestra actor Orchestra
n
587 11.0
title year name type \
27947 Terror Tract 2000 Aaron J. Alberts actor
character n
27947 Lawnmower Man (segment "Make Me An Offer") 6.0
title year name type character n
4026 I Before Thee 2016 Aaron M. Abelto actor Jeffery Douglas NaN
4025 Fight Within 2017 Aaron M. Abelto actor Omar NaN
title year name type character n
4306 LBJ 2016 Aaron Michael Abeyta actor Senate Aide NaN
title year name type character \
43883 Under the Blood-Red Sun 2014 Aaron Scott Alpeter actor Principal
n
43883 37.0
The problem is that the data is no longer sorted by year, and the header (title, year, name, type, ...) is printed multiple times, so the output is not as tidy as the initial data (the variable data). How can I keep the data sorted by year and have the header shown only once, as in the initial data?
CodePudding user response:
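import pandas as pd
data = pd.read_csv('cast.csv')
# Keep only the actor rows and sort them by year
data_2 = data[data['type'] == 'actor'].sort_values('year')
# Build the mask from data_2 itself so its index matches the frame being filtered;
# na=False treats missing names as non-matches
output = data_2[data_2['name'].str.startswith('Aaron', na=False)]
print(output)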
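Note that a prefix test also matches names that merely begin with "Aaron". If the match should be on the first name exactly, as in the question's split(" ") check, a minimal sketch along these lines may help. It filters with a boolean mask instead of looping over groups, so the result is a single DataFrame that prints one header and can be sorted by year. The helper name actors_with_first_name and the last sample row are hypothetical; the other sample rows are copied from the question's output and stand in for cast.csv.

```python
import pandas as pd

# Small stand-in for cast.csv, using the question's column names.
# The final row is hypothetical, added to show that a prefix-only match is excluded.
data = pd.DataFrame({
    "title": ["Code 8", "The In-Laws", "Steve Jobs", "Surrender",
              "Toxic Tutu", "Hypothetical Film"],
    "year": [2018, 2003, 2015, 2003, 2016, 1990],
    "name": ["Aaron Abrams", "Aaron Abrams", "$hutter", "Aaron (XV)",
             "Zoe Anastassiou", "Aaronson Example"],
    "type": ["actor", "actor", "actor", "actor", "actress", "actor"],
})


def actors_with_first_name(df, first_name):
    """Return actor rows whose first name equals first_name, sorted by year."""
    actors = df[df["type"] == "actor"]
    # First space-separated token of each name; missing names become ""
    first = actors["name"].fillna("").str.split(" ").str[0]
    # mergesort is stable, so rows from the same year keep their original order
    return actors[first == first_name].sort_values("year", kind="mergesort")


output = actors_with_first_name(data, "Aaron")
print(output)
```

Because everything stays in one DataFrame, there is no per-group printing, which is what caused the repeated headers and the loss of the year ordering in the loop version.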