I'm fairly new to Python, and I have a question about for loops and subsetting variables. I'm working with high-dimensional data from official polls. It is a panel dataset covering five years, and variables from different years can be identified by their last two digits. For example, "codpeople_16" belongs to 2016. To work with observations at different times, I need a list of all the variables for a given year: say, the variables for 2016.
I have a list of all the variables in my dataframe. It looks like this:
names = ("a200_16", "a201_16", "a202_16", ..., "a200_17", "a201_17", "a202_17", ..., "a200_18", ...)
Then I have defined a function as follows:
def itercolumn(names):
    result = ""
    final = []
    for i in names:
        result = i[-2:]
        if result == "16":
            print(i)
It works and prints all the variables from 2016. However, I need a list of all the 2016 variables, so the function has to return them instead of printing them. I've tried this:
def itercolumn(names):
    result = ""
    final = []
    for i in names:
        result = i[-2:]
        if result == "16":
            final = i
            return final
It returns only the value from the first iteration, but I need all the variables that meet the condition. How can I get a list of all of them?
Regards
CodePudding user response:
Just replace return with yield, and then you can print the whole list with:
print(list(itercolumn(names)))
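To make that concrete, here is a minimal sketch of what the generator version might look like. The sample names list and the optional year parameter are assumptions added for illustration; they are not in the original question.

```python
def itercolumn(names, year="16"):
    """Yield every name whose last two characters equal `year`."""
    for name in names:
        if name[-2:] == year:
            # yield hands back one match and then resumes the loop,
            # whereas return would end the function at the first match
            yield name


# Hypothetical sample data, shaped like the names in the question
names = ["a200_16", "a201_16", "a200_17", "a201_17", "a200_18"]

cols_2016 = list(itercolumn(names))
print(cols_2016)
```

Note that calling itercolumn(names) on its own gives you a generator object, so you need list(...) (or a for loop) to actually collect the matches.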
CodePudding user response:
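def itercolumn(names):
    result = ""
    final = []
    for i in names:
        result = i[-2:]  # last two characters, e.g. "16"
        if result == "16":
            final.append(i)  # collect every match instead of overwriting final
    return final  # return once, after the loop has finished
Change your code to this. It appends each matching name to final and returns the full list only after the loop ends.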
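If you want something shorter, the same filter can probably be written as a list comprehension. This is only a sketch: the function name columns_for_year, the year parameter, and the sample names are made up for illustration, and it assumes every name ends in an underscore followed by the two-digit year.

```python
def columns_for_year(names, year="16"):
    """Return a list of the names that end with '_<year>'."""
    suffix = "_" + year
    # str.endswith checks the trailing characters of each name
    return [name for name in names if name.endswith(suffix)]


# Hypothetical sample data, shaped like the names in the question
names = ["a200_16", "a201_16", "a200_17", "a201_17", "a200_18"]

vars_2016 = columns_for_year(names)
vars_2017 = columns_for_year(names, "17")
print(vars_2016)
print(vars_2017)
```

Taking the year as a parameter means you don't need one function per year. If the names come straight from a pandas DataFrame, passing df.columns should work the same way, assuming all the column labels are strings.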