Dictionary comprehension for counting items in a list doesn't work, but the normal version works

Time:12-26

Edit 2:

Thanks to the guidance of @john-gordon, I found the problem.

I was searching through the wrong list, which I fixed by changing the first case to:

splitted_subs = []  # holds the file extensions, split out before searching in them

for sub in self.subList:  # splitting the subtitles
    splitted_subs.append(sub.rsplit(".",1)[-1])

subtitle_counter = {sub.rsplit(".",1)[-1]: splitted_subs.count(sub.rsplit(".",1)[-1]) for sub in self.subList}
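As a side note, the same tally can probably be written more directly with collections.Counter from the standard library. This is only a sketch: it uses the module-level subtitle_list from the question rather than self.subList, and it is an alternative technique, not the fix described above.

```python
from collections import Counter

# Same sample data as in the question.
subtitle_list = ['10_Dutch.srt', '11_Spanish.idx', '12_Finnish.sub',
                 '13_French.srt', '4_English.idx', '5_French.sub']

# Counter tallies each extension in a single pass, so there is no
# second list to keep in sync with the keys being counted.
subtitle_counter = Counter(sub.rsplit(".", 1)[-1] for sub in subtitle_list)

print(dict(subtitle_counter))  # {'srt': 2, 'idx': 2, 'sub': 2}
```

Because the keys and the counted values come from the same expression, the mismatch between full file names and extensions cannot occur here.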

I'm trying to count how many subtitles of each type (file extension) I have in a folder.

How it works:

  • It takes subtitle_list and stores it in self.subList.
  • Then a for loop iterates over each sub in self.subList.
  • In the loop, sub is split with rsplit(".",1) and its last part becomes the dictionary key, while self.subList.count(sub) counts the occurrences of that subtitle extension, e.g. {srt : 2}.
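The intended behaviour of the steps above can be sketched roughly as follows. The function name count_extensions and the sample file names are hypothetical, chosen only for illustration; the sketch keeps a running total per extension instead of calling .count() on a list.

```python
def count_extensions(file_names):
    """Return a dict mapping each file extension to its number of occurrences."""
    counts = {}
    for name in file_names:
        # Everything after the last dot is treated as the extension.
        extension = name.rsplit(".", 1)[-1]
        # Start at 0 the first time an extension is seen, then add one.
        counts[extension] = counts.get(extension, 0) + 1
    return counts


# Hypothetical sample input.
result = count_extensions(['a.srt', 'b.srt', 'c.idx'])
print(result)  # {'srt': 2, 'idx': 1}
```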

There is a strange problem during the splitting phase, where I extract the file extension from the subtitle's file name.

The code below only works if countType = "Multi-line_2"; the two other cases don't work.

subtitle_list = ['10_Dutch.srt', '11_Spanish.idx', '12_Finnish.sub', '13_French.srt','4_English.idx','5_French.sub']

countType = "Multi-line_2"

class MovieSub:
    def __init__(self):
        self.subList = subtitle_list
        
    def subtitle_counter(self):
        
        match countType:
            case "oneLiner":
                # One-liner way of counting the different subtitle types in subList, which somehow doesn't work.
                # (It's not because of the self reference to subList; it's because of the split, but I don't know which part.)
                subtitle_counter = {sub.rsplit(".",1)[-1]: self.subList.count(sub.rsplit(".",1)[-1]) for sub in self.subList}
            
            case "Multi-line_1":
                
                subtitle_counter = {}
                splitted_subs = []
                for sub in self.subList:
                    splitted_subs.append(sub)

                
                for sub in splitted_subs:
                    splitted_sub = sub.rsplit(".",1)[-1]
                    subtitle_counter[splitted_sub] = splitted_subs.count(splitted_sub)
                
            case "Multi-line_2":
                
                subtitle_counter = {}
                splitted_subs = []
                
                for sub in self.subList: # Splitting the subtitles
                    splitted_subs.append(sub.rsplit(".",1)[-1])
                    # print(sub.rsplit(".",1)[-1])
                
                for sub in splitted_subs: # Counting the Splitted Subtitles
                    subtitle_counter[sub] = splitted_subs.count(sub)

        print(subtitle_counter)
        #> the multi-line version of the code (lame version)
        
        
movie = MovieSub()

movie.subtitle_counter()

The result of the "Multi-line_1" and "oneLiner" cases:

>> {'srt': 0, 'idx': 0, 'sub': 0}

The result of the "Multi-line_2" case:

>> {'srt': 2, 'idx': 2, 'sub': 2}

I tried to understand how this is possible and only found that the problem appears when I split the file name in the same scope in which I count them (the Multi-line_1 case). I don't know whether that is relevant to the problem or not.

I would be thankful if anyone could help me figure out what I'm missing.

Edit 1:

I think an explanation is needed. First of all, my variable names are a bit misleading: splitted_subs and splitted_sub are different variables.

Second, the story of this match-case system is that my first case, a dictionary comprehension, didn't work, so I tried to debug it by expanding the code into the Multi-line_1 case. That didn't work either, so I moved the split to before appending to the list, which is the Multi-line_2 case. I understood that the problem was with the placement of my split call. But why? That's my question.

So if you add a print statement before the final line of Multi-line_1, like below:

print(splitted_sub)
subtitle_counter[splitted_sub] = splitted_subs.count(splitted_sub)

and another before the final line of Multi-line_2, like:

print(sub)
subtitle_counter[sub] = splitted_subs.count(sub)

they will print the same values but not produce the same results.

Multi-line_1 results:

>> srt
>> idx
>> sub
>> srt
>> idx
>> sub
>> {'srt': 0, 'idx': 0, 'sub': 0}

Multi-line_2 results:

>> srt
>> idx
>> sub
>> srt
>> idx
>> sub
>> {'srt': 2, 'idx': 2, 'sub': 2}

CodePudding user response:

case "Multi-line_1":
            
    subtitle_counter = {}
    splitted_subs = []
    for sub in self.subList:
        splitted_subs.append(sub)
            
    for sub in splitted_subs:
        splitted_sub = sub.rsplit(".",1)[-1]
        subtitle_counter[splitted_sub] = splitted_subs.count(splitted_sub)

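splitted_sub is just the file extension, e.g. "srt".

But the items in splitted_subs are the full filenames, e.g. "10_Dutch.srt". (The variable name is misleading -- those values are not split.)

list.count() only counts elements that are equal to its argument; it does not look for substrings. No full filename equals a bare extension, so of course .count() returns zero.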
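A small stand-alone check may make the difference easier to see. The names full_names and extensions are hypothetical, introduced only for this illustration; the data is the subtitle_list from the question.

```python
# Full file names, as stored in splitted_subs in the Multi-line_1 case.
full_names = ['10_Dutch.srt', '11_Spanish.idx', '12_Finnish.sub',
              '13_French.srt', '4_English.idx', '5_French.sub']

# Extensions only, as stored in splitted_subs in the Multi-line_2 case.
extensions = [name.rsplit(".", 1)[-1] for name in full_names]

# list.count() compares whole elements with ==, not substrings.
count_in_full_names = full_names.count("srt")  # no element equals "srt"
count_in_extensions = extensions.count("srt")  # two elements equal "srt"

print(count_in_full_names, count_in_extensions)  # 0 2
```

The oneLiner case has the same issue: it calls self.subList.count() with an extension while self.subList holds full file names.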
