This simple function counts lines of code (LOC) in Python files in two ways: maximal size (source LOC), which includes blank lines and comments, and minimal size (logical LOC), which strips them. It includes a simplified version of my directory tree walker from recipe 52664.

Python
import os
import fnmatch

def Walk(root='.', recurse=True, pattern='*'):
    """
        Generator for walking a directory tree.
        Starts at specified root folder, returning files
        that match our pattern. Optionally will also
        recurse through sub-folders.
    """
    for path, subdirs, files in os.walk(root):
        for name in files:
            if fnmatch.fnmatch(name, pattern):
                yield os.path.join(path, name)
        if not recurse:
            break

def LOC(root='.', recurse=True):
    """
        Counts lines of code in two ways:
            maximal size (source LOC) with blank lines and comments
            minimal size (logical LOC) stripping same

        Sums all Python files in the specified folder.
        By default recurses through subfolders.
    """
    count_mini, count_maxi = 0, 0
    for fspec in Walk(root, recurse, '*.py'):
        skip = False
        # Close each file promptly instead of leaking the handle.
        with open(fspec) as source:
            for line in source:
                count_maxi += 1
                line = line.strip()
                if line:
                    if line.startswith('#'):
                        continue
                    if line.startswith('"""'):
                        # A one-line docstring opens and closes on the
                        # same line, so only an odd count toggles skip.
                        if line.count('"""') % 2:
                            skip = not skip
                        continue
                    if not skip:
                        count_mini += 1

    return count_mini, count_maxi
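
The walker above could also be sketched with the standard `pathlib` module. This is only an alternative under stated assumptions, not part of the recipe: `walk_paths` is a hypothetical name, and it leans on `Path.rglob` for the recursive case and `Path.glob` for the top folder only.

```python
from pathlib import Path

def walk_paths(root='.', recurse=True, pattern='*'):
    """Yield files under root whose names match pattern (hypothetical helper)."""
    base = Path(root)
    # rglob descends into sub-folders; glob stays in the top folder.
    matches = base.rglob(pattern) if recurse else base.glob(pattern)
    for path in matches:
        # Patterns can match directories too, so keep regular files only.
        if path.is_file():
            yield str(path)
```

It yields path strings just as `Walk` does, so something like `walk_paths(root, recurse, '*.py')` should be usable in its place, though the order in which files are produced may differ.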

What can be learned from counting lines of code? Should we assume that the programmer who produces more bulk is more productive, or that the language that needs fewer lines is more efficient? I wouldn't want to (re)start any of those flame wars!

Despite that, for basic metrics in the most general sense, this code might be useful.

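The logic that skips docstrings works line by line: any line that starts with triple double quotes switches skipping on or off. A triple-quoted multiline string used as data can therefore confuse the count. To prevent that, do not start such a line with the quotes.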
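
One way around this limitation, sketched here as an alternative rather than as the recipe's method, is to let the standard `tokenize` module decide what is a comment or a string. The name `logical_loc` is hypothetical. The sketch assumes that "logical LOC" means physical lines carrying at least one code token, and that a string standing alone as a statement (a docstring, typically) does not count as code.

```python
import io
import tokenize

# Token types that never count as code on their own.
NON_CODE = {tokenize.NEWLINE, tokenize.INDENT,
            tokenize.DEDENT, tokenize.ENDMARKER}
# Token types after which a new statement begins.
STATEMENT_START = {tokenize.NEWLINE, tokenize.INDENT, tokenize.DEDENT}

def logical_loc(source):
    """Count lines of source text that carry code (hypothetical helper)."""
    readline = io.StringIO(source).readline
    # Drop comments and blank-line markers before looking at neighbours.
    tokens = [tok for tok in tokenize.generate_tokens(readline)
              if tok.type not in (tokenize.COMMENT, tokenize.NL)]
    code_lines = set()
    for i, tok in enumerate(tokens):
        if tok.type in NON_CODE:
            continue
        if tok.type == tokenize.STRING:
            prev_type = tokens[i - 1].type if i else tokenize.NEWLINE
            # A string that is a whole statement by itself is treated
            # as a docstring, not as code.
            if (prev_type in STATEMENT_START
                    and tokens[i + 1].type == tokenize.NEWLINE):
                continue
        # A multiline token marks every line it spans.
        code_lines.update(range(tok.start[0], tok.end[0] + 1))
    return len(code_lines)
```

With this approach a multiline string assigned to a variable counts as code wherever its quotes fall, while docstrings are skipped. The trade-off is that the tokenizer raises an error on source it cannot tokenize, which the simple line-based check never does.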

I know there are common cross-language tools to perform the same task, but whipping up a pure-Python implementation was quick and saves me installing YAT (Yet Another Tool). Also, this code is freely available under the Python license.