
Function to auto-strip indentation and whitespace from triple-quoted multi-line strings in Python code. Useful when you need to emit blocks of HTML/TCL/etc. from Python, but don't want to mess up the visual flow of your Python code.

import re
...
def formatBlock(block):
        '''Format the given block of text, trimming leading/trailing
        blank lines and removing the first line's indentation from
        every line (so the first line should be the least indented).
        The purpose is to let us list a code block as a multiline,
        triple-quoted Python string, taking care of indentation concerns.'''
        # separate block into lines
        lines = str(block).split('\n')
        # remove leading/trailing blank (empty or whitespace-only) lines
        while lines and not lines[0].strip():  del lines[0]
        while lines and not lines[-1].strip(): del lines[-1]
        # nothing left to format if the block was blank
        if not lines:
                return ''
        # look at first line to see how much indentation to trim
        ws = re.match(r'\s*', lines[0]).group(0)
        if ws:
                # build a list (not a lazy map object) so this also works on
                # Python 3, and strip ws only where it really is the line's prefix
                lines = [x[len(ws):] if x.startswith(ws) else x for x in lines]
        return '\n'.join(lines)+'\n'

# Discussion:
# No one likes to read code that goes
 
            # ...
            htmlFrag = '''
<hr>
<p>Several lines of text
for example's sake.</p>
<hr>
'''
            # do stuff with htmlFrag
            # ...

# This function lets you write the block instead as:

            # ...
            htmlFrag = formatBlock('''
                <hr>                            # this block can be
                <p>Several lines of text        # indented to wherever
                for example's sake.</p>         # looks pleasing to you
                <hr>                            #
            ''')
            # do stuff with htmlFrag
            # ...

See discussion in code, above, due to variable-width font here.
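
To make the effect concrete, here is a minimal sketch that prints the string the recipe produces. The `format_block` below is a condensed Python 3 restatement written for this illustration, not the author's exact code, and `build_fragment` is a made-up caller that stands in for "code indented a few levels deep".

```python
import re

def format_block(block):
    """Condensed Python 3 restatement of the recipe, for illustration only."""
    lines = str(block).split('\n')
    # drop blank (empty or whitespace-only) lines at both ends
    while lines and not lines[0].strip():
        del lines[0]
    while lines and not lines[-1].strip():
        del lines[-1]
    if not lines:
        return ''
    # the indentation of the first remaining line decides how much to trim
    ws = re.match(r'\s*', lines[0]).group(0)
    lines = [line[len(ws):] if line.startswith(ws) else line for line in lines]
    return '\n'.join(lines) + '\n'

def build_fragment():
    # the literal is indented to match the surrounding code
    html_frag = format_block('''
        <hr>
        <p>Several lines of text
        for example's sake.</p>
        <hr>
    ''')
    return html_frag

print(repr(build_fragment()))
```

Everything between the quotes is part of the string, so annotations such as the `# this block can be` notes in the listing above would end up in the emitted text if they were typed inside the literal.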

7 comments

Hamish Lawson 21 years, 8 months ago

Control how much indentation to trim? You might not want all the common indentation to be trimmed from a block. Perhaps a marker for each block could indicate where to trim up to.
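
One way to read the "marker" suggestion is a margin character at the start of each line, with everything up to and including it discarded. The sketch below assumes that reading; `trim_to_marker` and the `|` marker are hypothetical names chosen for illustration and are not part of the recipe.

```python
def trim_to_marker(block, marker='|'):
    """Hypothetical helper: cut each line at its leading margin marker."""
    lines = block.split('\n')
    # drop blank (empty or whitespace-only) lines at both ends
    while lines and not lines[0].strip():
        del lines[0]
    while lines and not lines[-1].strip():
        del lines[-1]
    trimmed = []
    for line in lines:
        stripped = line.lstrip()
        if stripped.startswith(marker):
            # keep everything after the marker, including any indentation
            trimmed.append(stripped[len(marker):])
        else:
            # unmarked lines are passed through untouched
            trimmed.append(line)
    return '\n'.join(trimmed) + '\n'

html_frag = trim_to_marker('''
        |  <ul>
        |    <li>item</li>
        |  </ul>
    ''')
print(html_frag, end='')
```

Because the marker fixes the left margin explicitly, whitespace typed after it survives, which gives the per-block control over how much indentation is kept.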

Brett Levin (author) 21 years, 8 months ago

Re: Control how much indentation to trim? What I do is prefix the trimmed block with as much indentation as is needed wherever I'm emitting the code. The Script object that I submit trimmed fragments to keeps track of an internal indentation level, so each line submitted becomes

'\t'*indentLevel + lineText

The idea of formatBlock() is that you free text from the indentation of the surrounding Python code, not that you place it into some new indent level, since that depends on where you emit the text.

Note, though, that relative indentation IS preserved after the first line; you can have

htmlFrag = formatBlock('''
....<p>
....::::Paragraph text goes here.
....</p>
''')

Where '.' whitespace gets trimmed but ':' is left intact. Does that address what you were thinking of?
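
The Script object itself is not shown in the thread, so the following is only a guess at its shape: a small emitter that keeps an indent level and prefixes each submitted line with that many tabs. The class name matches the comment, but every method and attribute name here is an assumption.

```python
class Script:
    """Hypothetical emitter that tracks its own indent level."""

    def __init__(self):
        self.indent_level = 0
        self.lines = []

    def indent(self):
        self.indent_level += 1

    def dedent(self):
        self.indent_level = max(0, self.indent_level - 1)

    def add(self, fragment):
        """Add an already-trimmed fragment, one line at a time."""
        for line_text in fragment.splitlines():
            if line_text:
                # same rule as in the comment above: '\t'*indentLevel + lineText
                self.lines.append('\t' * self.indent_level + line_text)
            else:
                # keep blank lines blank instead of emitting bare tabs
                self.lines.append('')

    def render(self):
        return '\n'.join(self.lines) + '\n'

script = Script()
script.add("<div>\n")
script.indent()
# relative indentation inside the fragment is kept after the tab prefix
script.add("<p>\n    Paragraph text goes here.\n</p>\n")
script.dedent()
script.add("</div>\n")
print(script.render(), end='')
```

Keeping the indent level in the emitter means a trimmed fragment carries only its own relative indentation and can be reused at any nesting depth.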

gyro funch 21 years, 4 months ago

Re: Control how much indentation to trim? I made a small modification to this useful recipe to suit my indentation needs. I added an 'nlspaces' (number of leading spaces) parameter so that the caller can specify the number of spaces to prepend to each line of the returned string.

The recipe then becomes

def format_block(block,nlspaces=0):
    '''Format the given block of text, trimming leading/trailing
    empty lines and any leading whitespace that is common to all lines.
    The purpose is to let us list a code block as a multiline,
    triple-quoted Python string, taking care of
    indentation concerns.'''

    import re

    # separate block into lines
    lines = str(block).split('\n')

    # remove leading/trailing empty lines
    while lines and not lines[0]:  del lines[0]
    while lines and not lines[-1]: del lines[-1]

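    # nothing left to format if the block was empty
    if not lines:
        return ''

    # look at first line to see how much indentation to trim
    ws = re.match(r'\s*',lines[0]).group(0)
    if ws:
        # a list (not a lazy map object) so lines[0] below still works on
        # Python 3; strip ws only where it really is the line's prefix
        lines = [x[len(ws):] if x.startswith(ws) else x for x in lines]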

    # remove leading/trailing blank lines (after leading ws removal)
    # we do this again in case there were pure-whitespace lines
    while lines and not lines[0]:  del lines[0]
    while lines and not lines[-1]: del lines[-1]

    # account for user-specified leading spaces
    flines = ['%s%s' % (' '*nlspaces,line) for line in lines]

    return '\n'.join(flines)+'\n'

Bjorn Pettersen 21 years, 2 months ago

Recipe doesn't handle text where the first line is not the least-indented one. As part of a code generator, I often need to output fragments like:

txt = """
            x = a[i] + b[i];
        }
    }
    """
method.addlines(codeblock(2, txt)) # put at tabstop 2

The code is split up into one function that removes the whitespace, and one that outputs the result in the right position.

def ltrimBlock(s):
    w = len(s)
    s = s.strip('\n') # leading/trailing empty lines
    lines = s.expandtabs(4).split('\n')

    # find w, the smallest indent of a line with content
    for line in lines:
        line2 = line.lstrip()
        if line2:
            w = min(w, len(line)-len(line2))

    return [line[w:] for line in lines]

def codeblock(ntabs, text):
    """Return list of correctly indented lines."""
    tabs = '\t' * ntabs
    return [tabs+line for line in ltrimBlock(text)]

-- bjorn

Dev Player 12 years, 11 months ago

Of course, by Python 2.7, eight years later, the standard library has the textwrap module (see textwrap.dedent).

But here is a slightly altered version of Bjorn Pettersen's recipe above.

def ltrimBlock(s, i=0, tz=4, sp=' ', n='\n'):
    p = sp * i  # p = line prefix

    # break the lines down
    st = s.strip(n).expandtabs(tz).split(n)   # list of lines (tz = tab size)

    # get the smallest indent
    w = min([len(l)-len(l.lstrip()) for l in st if l.lstrip()])

    # rebuild block with new indent
    b = n.join([p + l[w:] for l in st])

    return b

I kept the variable names short to mirror Bjorn's example. Personally I'd use longer names and add a docstring. I probably wouldn't turn it into a generator, since I'd mostly use this on block text such as docstrings.
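
For comparison, the standard library route mentioned above can be sketched as follows. `textwrap.dedent` removes the whitespace common to all non-blank lines, and `textwrap.indent` (available from Python 3.3) re-indents the result. The wrapper name `dedent_block` and its `prefix` parameter are made up for this example, and the input is the fragment from Bjorn Pettersen's comment.

```python
import textwrap

def dedent_block(text, prefix=''):
    """Hypothetical wrapper: dedent, drop outer blank lines, then re-indent."""
    # dedent uses the smallest indentation of any non-blank line,
    # so the first line does not have to be the least indented one
    body = textwrap.dedent(text).strip('\n')
    return textwrap.indent(body, prefix) + '\n'

txt = """
            x = a[i] + b[i];
        }
    }
    """

print(dedent_block(txt), end='')
print(dedent_block(txt, '\t' * 2), end='')   # put at tab stop 2
```

Unlike the variants above, this version does not expand tabs before measuring indentation, so input that mixes tabs and spaces may need `str.expandtabs` first.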

Created by Brett Levin on Mon, 19 Aug 2002 (PSF)