Home > Blockchain >  How do I catch the count in this recursive loop?
How do I catch the count in this recursive loop?

Time:01-18

I have a recursive function that counts the number of occourances in a file.

A common task I like to do is report the outcome of a function with a format:


(defun csv-counter (list)
  (let ((counter 0)
    (email (first list)))
    (if (null list)
    nil
    (progn
      (  1 (count email list :test #'string=))
      (incf counter)
      (csv-counter (rest list))))
    (format t "count for email ~a is ~a~%" email counter)))


The counter number in the format function doesnt actually accumulate the total number, instead it reports each occurance as 1

...
count for email [email protected] is 1
count for email [email protected] is 1
count for email [email protected] is 1
... 

What am I doing wrong?

CodePudding user response:

It is unclear what your function is meant to do or what you are trying to achieve. Even so it's possible to say some things about it. Below I have reindented it and annotated some points with numbers

(defun csv-counter (list)
  (let ((counter 0)
        (email (first list)))
    ;; (0)
    (if (null list)
        nil
      (progn
        (  1 (count email list :test #'string=)) ;(1)
        (incf counter)                           ;(2)
        (csv-counter (rest list))))
    ;; (3)
    (format t "count for email ~a is ~a~%" email counter)))

At (0) counter will be zero, on every call.

At (1) is an expression, ( 1 (count email list :test #'string=)) whose value is not used. So this expression does not do anything useful at all: it merely serves to make the time complexity quadratic rather than linear.

At (2) counter is incremented by 1, which means it will now be 1. The result of (2) is that, if the list is not empty, the value of counter will be 1.

At (3) this value is reported: it will be 1 if the list is not empty, 0 otherwise.

So we get:

count for email nil is 0
count for email fish@bat is 1
count for email foo@bar is 1
count for email foo@bar is 1

Now, as I said above, it is not clear what you are trying to achieve. However, it might be to count the number of distinct occurrences of each email address (represented as a string) in a list of them. So for instance, given ("foo@bar" "foo@bar" "fish@bat") you want a count of 2 for "foo@bar and of 1 for "fish@bat".

In order to do this you need two things: a count for each email, and a notion of which emails you have seen. The second is crucial.

So here is an initial approach to doing this:

(defun count-distinct-emails (emails)
  (labels ((cde-loop (tail seen counts)
             (cond
              ((null tail)
               counts)
              ((member (first tail) seen :test #'string=)
               ;; already counted this one
               (cde-loop (rest tail) seen counts))
              (t
               ;; a new email
               (let ((email (first tail))
                     (more (rest tail)))
                 (cde-loop more
                           (cons email seen)
                           (acons email (  1 (count email more :test #'string=)) counts)))))))
    (cde-loop emails '() '())))

This function is not itself recursive, but it has a recursive helper function, cde-loop, which is written as an internal definition. It is written as an internal function to avoid the nightmare of needing all sorts of weird extra, perhaps optional, arguments to the function you actually call and because it is not called by any other function than its parent. In cde-loop you can see that it maintains a table (a list) of emails it has seen, and builds up another table (an alist) of addresses with counts.

For this function:

> (count-distinct-emails '("foo@bar" "foo@bar" "fish@bat"))
(("fish@bat" . 1) ("foo@bar" . 2))

And you can then write a little reporter function:

(defun report-emails (table)
  (dolist (email&count table)
    (format t "~&count for ~A: ~D~%"
            (car email&count) (cdr email&count))))

So:

> > (report-emails (count-distinct-emails '("foo@bar" "foo@bar" "fish@bat")))
count for fish@bat: 1
count for foo@bar: 2
nil

Now count-distinct-emails is horrible: not because it's recursive (any reasonable implementation will turn that into a loop) but because it's repeatedly probing the list of things it has seen and the list of emails it is looking for. A much better approach is to unify these two things into one thing, and use a hashtable which has better search performance:

(defun count-distinct-emails (emails)
  (labels ((cde-loop (tail table)
             (if (null tail)
                 table
               (progn
                 (incf (gethash (first tail) table 0))
                 (cde-loop (rest tail) table)))))
    (cde-loop emails (make-hash-table :test #'equal))))

And then the reporter function needs to know to use a hashtable as well:

(defun report-emails (table)
  (maphash (lambda (email count)
             (format t "~&count for ~A: ~D~%"
                     email count))
           table))

Note that cde-loop uses a nice trick: it says (incf (gethash (first tail) table 0)): incf knows how to increment the value of an entry in a hashtable, and using the default of 0 when the entry is not present means that the entry will spring into being so you don't have to do the awkward 'check if entry is present, increment if so' thing yourself.

Finally, once you've given in and used a hashtable, this is a case where a straightforward iterative solution is probably clearer:

(defun count-distinct-emails (emails)
  (let ((table (make-hash-table :test #'equal)))
    (dolist (email emails table)
      (incf (gethash email table 0)))))
  • Related