Slurping the Indent Hadouken


The Indent Hadouken, a skill that every programmer has unconsciously mastered at some point, is devastating to the readability of code. Linus Torvalds once remarked, “if you need more than 3 levels of indentation, you’re screwed anyway, and should fix your program.”

While I agree with this principle, it is hard to follow in practice. Language features like lambda expressions, exception handling, and pattern matching can easily lead to deep nesting. To alleviate the pain, this post proposes a simple, universal syntax to flatten these nested structures.

The “Slurp” Syntax

Consider this example from the Rust by Example book (comments removed):

fn parse_csv_document(src: impl std::io::BufRead) -> std::io::Result<Vec<Vec<String>>> {
    src.lines()
        .map(|line| {
            line.map(|line| {
                line.split(',')
                    .map(|entry| String::from(entry.trim()))
                    .collect()
            })
        })
        .collect()
}

The logic is straightforward, at least to someone familiar with Rust. But the deep nesting creates a visible inward drift. Now, let’s refactor it with the proposed 🤤 (slurp) operator:

fn parse_csv_document(src: impl std::io::BufRead) -> std::io::Result<Vec<Vec<String>>> {
    src.lines().map(|line| 🤤).collect()
    line.map(|line| 🤤)
    line.split(',').map(|entry| 🤤).collect()
    String::from(entry.trim())
}

The 🤤 (slurp) operator acts as a placeholder for a block. It “slurps” the subsequent statements into its current position. “Subsequent” means those after the current statement, but within the same block.

In general, a program in this form

{
    AAAAAAAA 🤤 AAAAAAAA;
    BBBBBBBB 🤤 BBBBBBBB;
    CCCCCCCC 🤤 CCCCCCCC;
    DDDDDDDD 🤤 DDDDDDDD;
    EEEEEEEE;
    FFFFFFFF;
}

is equivalent to

{
    AAAAAAAA {
        BBBBBBBB {
            CCCCCCCC {
                DDDDDDDD {
                    EEEEEEEE;
                    FFFFFFFF;
                } DDDDDDDD
            } CCCCCCCC
        } BBBBBBBB
    } AAAAAAAA
}

That’s all. Straight and simple. Essentially, slurp is an alternative syntax for blocks that allows us to write the block’s content after the current statement, at a lower indent level.
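The rewriting rule above is mechanical enough to sketch in a few lines of Rust. The following is only an illustration under simplifying assumptions, not a real parser: statements are plain strings, each contains at most one 🤤, and the names `SLURP` and `desugar` are hypothetical.

```rust
/// Placeholder token proposed in the post.
const SLURP: &str = "🤤";

/// Desugar the statements of ONE block into the equivalent nested lines.
/// `stmts` holds the block's statements in order; `depth` is the current
/// indent level (4 spaces per level). Outer braces are not emitted.
fn desugar(stmts: &[&str], depth: usize) -> Vec<String> {
    let pad = "    ".repeat(depth);
    let mut out = Vec::new();
    for (i, stmt) in stmts.iter().enumerate() {
        match stmt.split_once(SLURP) {
            // No placeholder: the statement is emitted unchanged.
            None => out.push(format!("{}{}", pad, stmt)),
            // Placeholder: all remaining statements of this block move
            // into a new block at the placeholder's position.
            Some((before, after)) => {
                out.push(format!("{}{}{{", pad, before));
                out.extend(desugar(&stmts[i + 1..], depth + 1));
                out.push(format!("{}}}{}", pad, after));
                return out;
            }
        }
    }
    out
}
```

Under these assumptions, calling `desugar(&["A 🤤 A;", "B;"], 0)` yields the three lines `A {`, `    B;`, and `} A;`.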

Slurp is purely syntactical, which provides several advantages:

  • It is universal and consistent. It can be used in any language construct requiring a block. It is a one-for-all solution instead of inventing special syntax for each case.
  • It adds no new semantics. By contrast, the let-else statement, for example, requires a diverging block for the else branch, which is intuitive but is still one more thing to remember.
  • It is as expressive as the original construct. By contrast, even after Rust gained the let-else statement, people still want a more powerful let-else-match statement. With slurp, that need does not arise.
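For readers who have not met let-else, here is a minimal sketch in current Rust of the diverging-else rule mentioned in the list above. The function name `first_number` and the `-1` sentinel are hypothetical, chosen only for illustration.

```rust
/// Parse the first whitespace-separated token of `s` as an i32.
/// Returns -1 (an illustrative sentinel) when there is no token
/// or the token is not a number.
fn first_number(s: &str) -> i32 {
    // The else branch must diverge (return, break, continue, panic, ...).
    let Some(tok) = s.split_whitespace().next() else {
        return -1;
    };
    let Ok(n) = tok.parse::<i32>() else {
        return -1;
    };
    // The main logic stays at the outer indent level.
    n
}
```

The main path stays flat, but the else branch can only bail out. It cannot produce a value and continue, which is the limitation that motivates the let-else-match wish.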

I must confess: at first I found this slurp syntax ugly, sometimes hard to parse, and utterly controversial. That’s why this post took me a month to finish. I wrote down a few examples and was disgusted by them. But after several rounds of revisions, it looks more natural to me. Our brains are excellent, extensible parsers that can be trained to recognize new patterns. Now I can genuinely advocate for it.

Where to Slurp

Slurp is designed to unindent the “main” logic block from the enclosing construct. It can be useful in the following situations:

  • The if-else statement. Although most if not all unwanted indentation can be eliminated by else-if, early returns, and negating the condition, sometimes slurp might be more pleasant to write and read. For example,

    let x = {
        let precondition1 = ...; // Multi-line expression
        if precondition1 🤤 else 1
        let precondition2 = ...; // Multi-line expression
        if precondition2 🤤 else 2
        let precondition3 = ...; // Multi-line expression
        if precondition3 🤤 else 3
        // Ten lines of main logic
    }
  • The match statement. It is a major source of the Indent Hadouken in Rust, because one match statement brings two levels of indentation. Just think about the fact that we already have if-let, while-let, let-else, and potentially let-else-match statements to alleviate the pain of the match statement. Slurp solves them all. See the last section for example.
  • Closures. Slurp can be especially useful when the main logic is nested in multiple closures. See the example at the beginning of this post.
  • The impl block. When the rest of the file implements a data type or a trait, indenting every method can feel tedious.

    impl Animal for Sheep 🤤

    fn ...

    fn ...

    fn ...
    // EOF

  • Similarly, a module, or any other top-level block.

    #[cfg(test)]
    mod tests 🤤

    #[test]
    fn test1() ...

    #[test]
    fn test2() ...

  • Exception handling.

    fn ... {
        try 🤤
        catch ...
        finally ...
        // main logic in try
    }
  • The with statement.

    with ... as cursor: 🤤
    with ... as file: 🤤
    with ... as pool: 🤤
    // main logic
  • Other binding constructs. In many Lisps we can see let, if-let, when-let, and-let, cond-let, multiple-value-bind, with-something, along with cond and match, contributing to numerous indentations and parentheses. The problem could be addressed by the nest macro in Common Lisp. But slurp can be more flexible.

One might want to use slurp in for/while/loop. I have no opinion.
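To make the if-else item in the list above concrete, here is a sketch in current Rust of what the slurped chain would expand to, next to a flat form that already works today using a labeled block with early `break`. The names `nested`, `flat`, `p1` to `p3`, and the value `0` standing in for the main logic are all hypothetical. As a simplification, the preconditions are passed in as arguments rather than computed between the checks.

```rust
/// Nested form: the expansion of the slurped if-else chain
/// under the rewriting rule described in the post.
fn nested(p1: bool, p2: bool, p3: bool) -> i32 {
    let x = {
        if p1 {
            if p2 {
                if p3 {
                    0 // stand-in for the main logic
                } else {
                    3
                }
            } else {
                2
            }
        } else {
            1
        }
    };
    x
}

/// Flat form in today's Rust: negate each condition and leave the
/// labeled block early with the fallback value.
fn flat(p1: bool, p2: bool, p3: bool) -> i32 {
    let x = 'check: {
        if !p1 { break 'check 1; }
        if !p2 { break 'check 2; }
        if !p3 { break 'check 3; }
        0 // stand-in for the main logic
    };
    x
}
```

Both functions return the same value for every input. The flat form relies on negating each condition, which is the workaround the list item mentions. Slurp would keep the conditions as written.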

Refactor Challenge

Challenge: Refactor the following hypothetical example to eliminate the Indent Hadouken, without introducing helper functions. The hypothetical let-else-match statement is allowed.

fn indent_Hadouken(y: i32) -> i32 {
    let something = match foo(y) {
        FooA(a) => f1(a),
        FooB(b) => f2(b),
        FooC(x0) => {
            let x1 = long_function_name(x0, ...);
            let x2 = another_long_function(x0, x1, ...);
            match bar(x0, x1, x2, ...) {
                BarA(a) => f3(a),
                BarB(b) => f4(b),
                BarC(x3) => {
                    let x4 = long_long_fn(x0, x1, x2, x3, ...);
                    let x5 = long_long_long_fn(x0, x1, x2, x3, x4, ...);
                    match baz(x0, x1, x2, x3, x4, x5, ...) {
                        BazA(a) => f5(a),
                        BazB(b) => f6(b),
                        BazC(x6) => {
                            multiline(x0);
                            verbose(x1, x2);
                            logic(x0, x3);
                            involving(x1, x4, x5);
                            x0_to_x6(x0, x1, x2, x3, x4, x5, x6)
                        }
                    }
                }
            }
        }
    };
    do_something(something, y)
}

With slurp it is easy:

fn indent_Hadouken(y: i32) -> i32 {
    let something = {
        match foo(y) {
            FooA(a) => f1(a),
            FooB(b) => f2(b),
            FooC(x0) => 🤤
        };
        let x1 = long_function_name(x0, ...);
        let x2 = another_long_function(x0, x1, ...);
        match bar(x0, x1, x2, ...) {
            BarA(a) => f3(a),
            BarB(b) => f4(b),
            BarC(x3) => 🤤
        };
        let x4 = long_long_fn(x0, x1, x2, x3, ...);
        let x5 = long_long_long_fn(x0, x1, x2, x3, x4, ...);
        match baz(x0, x1, x2, x3, x4, x5, ...) {
            BazA(a) => f5(a),
            BazB(b) => f6(b),
            BazC(x6) => 🤤
        };
        multiline(x0);
        verbose(x1, x2);
        logic(x0, x3);
        involving(x1, x4, x5);
        x0_to_x6(x0, x1, x2, x3, x4, x5, x6)
    };
    do_something(something, y)
}

Hope it looks straightforward. And may the world be free of the Indent Hadouken.