How to get the record count on an hourly basis in Unix?

I have thousands of files on my Unix server and I want to count the records in those files on an hourly basis, in the format below. What is the easiest way to do it?

Date       Hour records
2022-07-08 00   5565
2022-07-08 01   77878
2022-07-08 02   545
.
.
2022-07-08 23   656
2022-07-09 00   787
2022-07-09 01   54547

CodePudding user response：

You can try rquery; it can parse, group, and sort the records.

[oracle@oem-web-ss ~]$ ls -lrt /var/local/logs/ --time-style=" %Y/%m/%d:%H:%M:%S" | rq -q "parse /(\S+)[ ]{1,}(\S+)[ ]{1,}(\S+)[ ]{1,}(\S+)[ ]{1,}(\S+)[ ]{1,}(?P<datetime>\S+)[ ]{1,}(\S+)/ | select truncdate(datetime,3600), count(1) | group truncdate(datetime,3600) | sort truncdate(datetime,3600)"
2022/07/08:05:00:00     17
2022/07/09:04:00:00     2
2022/07/09:05:00:00     18
2022/07/10:03:00:00     1
2022/07/10:04:00:00     1
2022/07/10:05:00:00     18
2022/07/10:22:00:00     1
2022/07/11:04:00:00     2
2022/07/11:05:00:00     20
...

You can download rquery from here: https://github.com/fuyuncat/rquery/releases

CodePudding user response：

Counting (recursively) all files in current dir, by hour

`find` is the command to use for locating filesystem entries by almost any criterion. The command below prints one modification timestamp, truncated to the hour, for each file found; `sort | uniq -c` then counts how many files share each hour.

find . -type f -printf '%TY-%Tm-%Td %TH\n' | sort | uniq -c

Output could look like:

    851 2022-07-13 00
    849 2022-07-13 01
    855 2022-07-13 02
    858 2022-07-13 03
...
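
If "records" means the lines inside the files rather than the files themselves, the same grouping idea can be extended to sum line counts per hour. This is only a sketch under stated assumptions: one record per line, the file's modification time decides the hour, and GNU `find` and GNU `date` (for `date -r FILE`) are available. The function name `records_by_hour` is made up for illustration.

```shell
# Sketch: sum line counts per modification hour.
# Assumptions: GNU find/date, one record per line.
# records_by_hour is a hypothetical helper name.
records_by_hour() {
    dir=${1:-.}
    find "$dir" -type f -exec sh -c '
        for f do
            # Print the modification date and hour, then the line count
            printf "%s %s\n" "$(date -r "$f" "+%Y-%m-%d %H")" "$(wc -l < "$f")"
        done' sh {} + |
        awk '{ sum[$1 " " $2] += $3 }
             END { for (k in sum) print k, sum[k] }' |
        sort
}
```

Calling `records_by_hour /var/local/logs` would print lines of the form `date hour total`. It starts one `date` and one `wc` per file, which should be acceptable for thousands of files but is not tuned for speed.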

Some cosmetic formatting, using `sed`:

find . -type f -printf '%TY-%Tm-%Td %TH\n' |
    sort |
    uniq -c |
    sed 's/^\( *[0-9]\+\) \([0-9-]\+\) \([0-9]\+\)/    \2  \3  \1/;
         1i\    Date        Hour  Count'

Will produce:

    Date        Hour  Count
...
    2022-07-13  00      851
    2022-07-13  01      849
    2022-07-13  02      855
    2022-07-13  03      858
...
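
As an alternative to the `sed` step, `awk` can reorder the columns and print the header in the layout asked for in the question. This is a minimal sketch, assuming GNU `find`; the function name `hourly_report` is hypothetical.

```shell
# Sketch: same counting pipeline, formatted with awk instead of sed.
# hourly_report is a hypothetical helper name; assumes GNU find.
hourly_report() {
    dir=${1:-.}
    find "$dir" -type f -printf '%TY-%Tm-%Td %TH\n' |
        sort |
        uniq -c |
        awk 'BEGIN { printf "%-10s %-4s %s\n", "Date", "Hour", "records" }
             # uniq -c puts the count first: $1=count, $2=date, $3=hour
             { printf "%-10s %-4s %d\n", $2, $3, $1 }'
}
```

Run it as `hourly_report .` (or with any directory). Because `awk` splits on whitespace, the leading padding that `uniq -c` adds does not need a regular expression to strip it.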

By using `ls` instead of `find`?

ls -ARlrt  --time-style=" |%Y-%m-%d:%H|" |
    grep -a ^-|
    cut -d \| -f 2 |
    sort |
    uniq -c

Will produce nearly the same result:

...
    851 2022-07-13:00
    849 2022-07-13:01
    855 2022-07-13:02
    858 2022-07-13:03
...

But since `ls` prints file names, which may contain special characters (a newline, for example), `grep` can be fooled into producing wrong output. This way is not recommended!
