Some files don't have an extension at all, and I need to figure out how to count them as "no extension" types.
import os
extension_count = {}
# no_extention_count = {}
for filename in os.listdir('/Users/saraAlbertt/Downloads'):
    pieces = filename.split('.')
    # print(pieces)
    extension = pieces[-1]
    # print(extension)
    if extension not in extension_count:
        extension_count[extension] = 1
    else:
        extension_count[extension] += 1
pieces is the list of strings that each filename is split into; when a file has an extension, it is the last element of the list. The lists have different lengths, and I need to figure out how to avoid counting a list with only one element as an extension.
{'dmg': 2, 'png': 6, 'MP4': 3, 'vtt': 4, 'docx': 12, 'DS_Store': 1, 'dots-game': 1, 'localized': 1, 'download': 1, 'pptx': 2, 'pkg': 1, 'txt': 4, 'World': 1, 'JPEG': 1, 'crdownload': 3, 'm4a': 1, 'app': 1, 'ppt': 1, 'jpg': 2, 'zip': 2, 'mp4': 1}
This is the result of printing extension_count, but some entries, like DS_Store, are not extensions, and their pieces look like ['DS_Store']. I want to separate the extension from pieces to find the files without one: if the length of pieces equals 1, count the file separately in no_extention_count = {}.
CodePudding user response:
I suggest taking a look at the os.path.splitext function, which returns a pair (root, ext); ext is an empty string when the path contains no extension. Consider the following simple example:
import collections
import os
files = ['file.mp4','file.txt','file']
count = collections.defaultdict(int)
for f in files:
    # ext is '' when the name has no extension
    count[os.path.splitext(f)[-1]] += 1
print(dict(count))
gives output
{'.mp4': 1, '.txt': 1, '': 1}
Observe that I have used collections.defaultdict(int), which lets me increase a value by 1 without first checking whether the key exists.
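For a Downloads folder it helps to know how os.path.splitext treats names such as .DS_Store or archive.tar.gz. Here is a minimal sketch, assuming a hard-coded list of example names; both the names and the helper count_extensions are hypothetical and not taken from the question:

```python
import collections
import os


def count_extensions(names):
    """Count names per extension; names without one are counted under ''."""
    count = collections.defaultdict(int)
    for name in names:
        # splitext returns (root, ext); ext keeps its leading dot or is ''.
        ext = os.path.splitext(name)[1]
        count[ext] += 1
    return dict(count)


# Hypothetical example names, chosen to show the edge cases.
names = ['.DS_Store', 'notes', 'clip.MP4', 'archive.tar.gz']
result = count_extensions(names)
print(result)
```

splitext ignores leading dots, so a hidden file like .DS_Store ends up under the empty-string key, and only the last suffix of archive.tar.gz is counted.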
CodePudding user response:
Do you mean something like this?
import os
extension_count = {}
no_extension_count = 0
for filename in os.listdir('/Users/saraAlbertt/Downloads'):
    pieces = filename.split('.')
    # print(pieces)
    # If a file had an extension, len(pieces) would equal 2.
    # That only holds if you are sure that filenames do not include other dots.
    if len(pieces) == 1:
        no_extension_count += 1
        continue
    extension = pieces[-1]
    # print(extension)
    if extension not in extension_count:
        extension_count[extension] = 1
    else:
        extension_count[extension] += 1
CodePudding user response:
You could write a simple if check on the length of pieces, such as:
if len(pieces) == 1:
    pass  # add to "no extension"
else:
    pass  # add to the associated extension
This checks whether or not the file has an extension. Alternatively, you could check filename for "." before you split it:
if "." not in filename:
    pass  # add to "no extension"
Either of those should work.
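The two checks above are equivalent, because str.split always returns a single-element list when the separator does not occur in the string. Here is a small sketch that confirms this on a few made-up file names; the helper names and the sample list are assumptions for illustration only:

```python
def has_no_extension_by_split(filename):
    # A name without any dot splits into exactly one piece.
    return len(filename.split('.')) == 1


def has_no_extension_by_membership(filename):
    # Same condition, tested without building the list.
    return '.' not in filename


# Hypothetical file names used only for illustration.
samples = ['World', 'report.docx', 'dots-game', 'movie.final.mp4']
for name in samples:
    assert has_no_extension_by_split(name) == has_no_extension_by_membership(name)

no_extension = [name for name in samples if '.' not in name]
print(no_extension)  # ['World', 'dots-game']
```

The membership test skips building a list, which makes it the slightly cheaper choice when the pieces are not needed afterwards.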
CodePudding user response:
Using os.path.splitext, stripping the leading dot, and checking for an empty extension will do the trick:
import os
extension_count = {}
# no_extention_count = {}
for filename in os.listdir('/Users/saraAlbertt/Downloads'):
    _, extension = os.path.splitext(filename)
    # Files without an extension are grouped under the key 'no extension'.
    extension = 'no extension' if not extension else extension.lstrip('.')
    if extension not in extension_count:
        extension_count[extension] = 1
    else:
        extension_count[extension] += 1
CodePudding user response:
You can let Counter from collections do the hard work for you in only one line of code.
import os
from collections import Counter
c = Counter(list(map(lambda x: os.path.splitext(x)[1], os.listdir('path/here'))))
print(c)
Counter({'.py': 14, '.html': 5, '': 5, '.csv': 3})
From this you can read off the number of files with no extension (the count under the key '').
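To pull that number out of the Counter, index it with the empty string. A Counter returns 0 for keys it has not seen, so the lookup is safe even when every file has an extension. Here is a minimal sketch, assuming a fixed list of made-up names in place of os.listdir:

```python
import os
from collections import Counter

# Hypothetical names standing in for os.listdir('path/here').
names = ['a.py', 'b.py', 'README', 'data.csv', 'Makefile']

# A generator expression is enough; Counter accepts any iterable.
c = Counter(os.path.splitext(name)[1] for name in names)

no_extension = c['']    # files without an extension
missing = c['.html']    # Counter returns 0 for keys it has not seen
print(no_extension, missing)  # 2 0
```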
CodePudding user response:
The str.rsplit('.', maxsplit=1) method returns a list whose length is either one (no extension) or two (has an extension).
>>> import os
>>> os.system('ls /var/log')
alternatives.log
apt
btmp
dpkg.log
faillog
fontconfig.log
lastlog
wtmp
0
>>> from collections import defaultdict
...
... extension_count = defaultdict(lambda: 0)
... no_extension_count = defaultdict(lambda: 0)
...
... for filename in os.listdir('/var/log'):
...     filename_n_ext = filename.rsplit('.', maxsplit=1)
...     if len(filename_n_ext) == 1:
...         no_extension_count[filename_n_ext[0]] += 1
...     else:  # len(filename_n_ext) == 2
...         extension_count[filename_n_ext[1]] += 1
>>> no_extension_count
defaultdict(<function <lambda> at 0x7f33d4d90ee0>, {'faillog': 1, 'apt': 1, 'wtmp': 1, 'lastlog': 1, 'btmp': 1})
>>> extension_count
defaultdict(<function <lambda> at 0x7f33d4d90e50>, {'log': 3})
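The difference from a plain split('.') shows up on names that contain more than one dot. Here is a short sketch with made-up names (not taken from the /var/log listing): rsplit with maxsplit=1 splits only at the last dot, so the result never has more than two elements.

```python
# Hypothetical names chosen to show the difference.
names = ['dpkg.log', 'backup.tar.gz', 'faillog', '.DS_Store']

for name in names:
    # Compare splitting at every dot with splitting at the last dot only.
    print(name, name.split('.'), name.rsplit('.', maxsplit=1))

pieces = 'backup.tar.gz'.rsplit('.', maxsplit=1)   # ['backup.tar', 'gz']
hidden = '.DS_Store'.rsplit('.', maxsplit=1)       # ['', 'DS_Store']
```

A hidden file such as .DS_Store still yields two elements, an empty root followed by 'DS_Store', so it needs an extra check if it should count as having no extension.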