How do I count files with no extension separately?

Time:02-04

Some files don't have an extension at all, and I need to figure out how to count them as "no extension" types.

import os
extension_count = {}
# no_extention_count = {}
for filename in os.listdir('/Users/saraAlbertt/Downloads'):
    pieces = filename.split('.')
    # print(pieces)
    extension = pieces[-1]
    # print(extension)
    if extension not in extension_count:
        extension_count[extension] = 1
    else:
        extension_count[extension] += 1

pieces is the list of strings I get by splitting each filename on '.'. When a file has an extension, it is the last element of that list. The lists have different lengths, and I need to figure out how to avoid counting a list with only one element as an extension.

{'dmg': 2, 'png': 6, 'MP4': 3, 'vtt': 4, 'docx': 12, 'DS_Store': 1, 'dots-game': 1, 'localized': 1, 'download': 1, 'pptx': 2, 'pkg': 1, 'txt': 4, 'World': 1, 'JPEG': 1, 'crdownload': 3, 'm4a': 1, 'app': 1, 'ppt': 1, 'jpg': 2, 'zip': 2, 'mp4': 1}

This is the result of printing extension_count, but some entries, such as DS_Store, are not extensions and look like this: ['DS-store']. I want to separate the extension from pieces to find the files with no extension and, when the length equals 1, count them separately in no_extension_count = {}.

CodePudding user response:

I suggest taking a look at the os.path.splitext function, which returns a pair (root, ext); ext is an empty string when the path has no extension. Consider the following simple example:

import collections
import os
files = ['file.mp4','file.txt','file']
count = collections.defaultdict(int)
for f in files:
    count[os.path.splitext(f)[-1]] += 1
print(dict(count))

gives output

{'.mp4': 1, '.txt': 1, '': 1}

Note that I used collections.defaultdict(int), which lets me increase a value by 1 without first checking whether the key exists.
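It may also help to see how os.path.splitext treats the awkward kinds of names that show up in a Downloads folder. This is a minimal sketch; the file names below are made up for illustration and are not taken from the asker's folder.

```python
import os

# Hypothetical file names chosen to show the edge cases.
names = ['report.docx', 'archive.tar.gz', '.DS_Store', 'README']

# Map each name to the (root, ext) pair that splitext returns.
results = {name: os.path.splitext(name) for name in names}

for name, (root, ext) in results.items():
    print(f'{name!r}: root={root!r}, ext={ext!r}')
```

Only the last dot counts, so 'archive.tar.gz' yields '.gz', and a leading dot is treated as part of the name, so '.DS_Store' ends up with an empty extension.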

CodePudding user response:

Do you mean something like this?

import os

extension_count = {}
no_extension_count = 0

for filename in os.listdir('/Users/saraAlbertt/Downloads'):
    pieces = filename.split('.')
    # print(pieces)
    
    # If a file has an extension, pieces has a length of 2.
    # This only works if you are sure that filenames do not include other dots.
    if len(pieces) == 1:
        no_extension_count += 1
        continue

    extension = pieces[-1]
    # print(extension)
    if extension not in extension_count:
        extension_count[extension] = 1
    else:
        extension_count[extension] += 1
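The caveat about dots in filenames is easier to see with a few concrete names. A small sketch follows; the names are hypothetical and only chosen to illustrate what split('.') produces.

```python
# Hypothetical file names, chosen only to illustrate the caveat.
names = ['notes', 'photo.jpg', 'my.holiday.photo.jpg', '.DS_Store']

# Split every name on '.' the same way the loop above does.
pieces_by_name = {name: name.split('.') for name in names}

for name, pieces in pieces_by_name.items():
    print(name, '->', pieces, 'length', len(pieces))
```

Only 'notes' has a length of 1. A hidden file such as '.DS_Store' splits into an empty string plus 'DS_Store', which would explain why 'DS_Store' appears as an extension in the output shown in the question.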

CodePudding user response:

You could write a simple if check to check the length of pieces, such as:

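if len(pieces) == 1:
    no_extension_count += 1  # add to "no extension"
else:
    extension = pieces[-1]  # add to the associated extension
    extension_count[extension] = extension_count.get(extension, 0) + 1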

This checks whether or not the file has an extension. Alternatively, you could check filename for "." before you split it:

if "." not in filename:
    no_extension_count += 1  # add to "no extension"

Either of those should work.
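Either check can be dropped into a small function. The sketch below uses the "." test; the function name count_extensions and the sample names are assumptions made for illustration, and it takes a list of names rather than reading a folder so it can be tried anywhere.

```python
def count_extensions(filenames):
    """Return (extension_count, no_extension_count) for an iterable of names."""
    extension_count = {}
    no_extension_count = 0
    for filename in filenames:
        if "." not in filename:
            # No dot at all: count it as "no extension" and move on.
            no_extension_count += 1
            continue
        extension = filename.split(".")[-1]
        # dict.get supplies 0 the first time an extension is seen.
        extension_count[extension] = extension_count.get(extension, 0) + 1
    return extension_count, no_extension_count


# Made-up names for illustration only.
by_extension, without_extension = count_extensions(
    ["a.txt", "b.txt", "c.png", "Makefile"]
)
print(by_extension)
print(without_extension)
```

To run it on a real folder, pass os.listdir(path) as the argument.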

CodePudding user response:

Using os.path.splitext, stripping a leading dot and checking for an empty extension will do the trick:

import os

extension_count = {}
# no_extention_count = {}
for filename in os.listdir('/Users/saraAlbertt/Downloads'):
    _, extension = os.path.splitext(filename)
    extension = 'no extension' if not extension else extension.lstrip('.')
    if extension not in extension_count:
        extension_count[extension] = 1
    else:
        extension_count[extension] += 1

CodePudding user response:

You can let Counter from collections do the hard work for you, in only one line of code.

import os
from collections import Counter

c = Counter(os.path.splitext(name)[1] for name in os.listdir('path/here'))

print(c)
Counter({'.py': 14, '.html': 5, '': 5, '.csv': 3})

From this you can see the number with no extension (from the key of '').
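Reading the no-extension count back out of the Counter could look like the sketch below. To keep it runnable without a folder, the Counter is rebuilt from the numbers printed above; the 'no extension' label and the '.xyz' key are assumptions for illustration.

```python
from collections import Counter

# Stand-in for the Counter built from os.listdir(); same numbers as printed above.
c = Counter({'.py': 14, '.html': 5, '': 5, '.csv': 3})

# Files without an extension are stored under the empty-string key.
no_extension_count = c['']

# A Counter returns 0 for a key it has never seen, so no existence check is needed.
missing = c['.xyz']

# Same counts with the leading dot stripped and the empty key relabelled.
labelled = {(ext.lstrip('.') or 'no extension'): n for ext, n in c.items()}

print(no_extension_count, missing)
print(labelled)
```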

CodePudding user response:

str.rsplit('.', maxsplit=1) returns a list whose length is one (no extension) or two (has an extension).

>>> import os
>>> os.system('ls /var/log')
alternatives.log
apt
btmp
dpkg.log
faillog
fontconfig.log
lastlog
wtmp
0

>>> from collections import defaultdict
...
... extension_count = defaultdict(lambda: 0)
... no_extension_count = defaultdict(lambda: 0)
...
... for filename in os.listdir('/var/log'):
...     filename_n_ext = filename.rsplit('.', maxsplit=1)
...     if len(filename_n_ext) == 1:
...         no_extension_count[filename_n_ext[0]] += 1
...     else:  # len(filename_n_ext) == 2
...         extension_count[filename_n_ext[1]] += 1

>>> no_extension_count
defaultdict(<function <lambda> at 0x7f33d4d90ee0>, {'faillog': 1, 'apt': 1, 'wtmp': 1, 'lastlog': 1, 'btmp': 1})
>>> extension_count
defaultdict(<function <lambda> at 0x7f33d4d90e50>, {'log': 3})
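Because this version stores one entry per file name in no_extension_count, a single total needs one more step. A minimal sketch, with the values copied from the session above (they will differ on another machine) and defaultdict(int) used in place of the lambda for brevity:

```python
from collections import defaultdict

# Values copied from the session above; on another machine they will differ.
no_extension_count = defaultdict(
    int, {'faillog': 1, 'apt': 1, 'wtmp': 1, 'lastlog': 1, 'btmp': 1}
)
extension_count = defaultdict(int, {'log': 3})

# Sum the per-name and per-extension counts to get overall totals.
total_without_extension = sum(no_extension_count.values())
total_with_extension = sum(extension_count.values())

print(total_without_extension, total_with_extension)
```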