Useful things to know about recursion
Published on: 2025-10-13
Reading time: 47 minutes
An attempt to document the different kinds of recursion and highlight several common recursive patterns.
By Tom Hulton-Harrop
An enormous thank you to David Greer, who graciously proof-read this post and pointed out several typos along with providing some very helpful feedback.
Topic
The aim of this post is to capture a bunch of useful information about recursion. There’ll be examples to help solidify things, and some suggestions/ideas on when to reach for particular patterns/techniques.
Motivation
This post is for programmers (primarily myself) who don’t regularly use recursion. It’s for those who all too often find it to be rather confusing and mysterious. In my experience, this mystery comes from not having the vocabulary and familiarity to easily read and write recursive code. The more time I’ve spent with recursion, the more I feel we should treat it as an entirely separate paradigm. Writing a recursive function can feel like suddenly switching to a new and completely alien language, which doesn’t behave as you’d expect.
What follows is an attempt to catalog the sorts of recursion that exist, when to use each type, and a handful of common patterns (with examples) that show up repeatedly when solving recursive problems.
This comes off the back of working through a sizeable chunk of Leetcode style problems while preparing for various interviews. The following may be useful in the context of coding interviews/challenges. I make no guarantees about the performance or robustness of these techniques for any kind of production code.
Starting point
This post assumes you’re vaguely familiar with the concept of recursion (a function calling itself), and are aware of the importance of both a base case and a recursive case. There are lots of good resources available that provide an introduction to recursion; if the concept is completely new to you, it might be worth seeking them out before continuing. The idea behind this post is to explore recursion in a bit more detail, and gather related concepts and patterns together in one place.
Recursive mindset
Before we begin, there is a recursive concept, known by a few different names, that is worth acknowledging here. The Recursive Book of Recursion refers to it as the ‘leap-of-faith technique’. I’ve also seen it referred to as the ‘Big Recursive Idea (BRI)’ in Think Like a Programmer.
The idea is to assume that the recursive call you make inside your recursive function already works. By making this jump, you write the rest of the code around the call, filling in the remaining code needed to stitch together the smaller result with the partially completed current result. Another way to think about it, which I also learned from Think Like a Programmer, is to imagine replacing the recursive call with an iterative solution (you can actually do this in some simple cases).
The idea is you should treat the recursive call as a black box. Forget it is the same call as the one you’re actually writing. If you do this, it can make writing a recursive solution easier than trying to manually walk the call chain all the way through (though seemingly a good idea, this often doesn’t give great results, and can lead to more confusion and frustration).
Contents
1 Structural and generative recursion
1.1 Structural recursion
1.2 Generative recursion
1.3 Identifying structural and generative recursion
2 Recursive control flow
2.1 Single-path recursion
2.1.1 Linear recursion
2.1.2 Branching recursion
2.2 Multi-path recursion (tree recursion)
2.2.1 Multi-branch recursion
2.2.2 Looping recursion
2.3 Recursive control flow summary
3 Pure and impure recursive functions
4 Recursive function return patterns
4.1 Accumulator pattern
4.2 Result aggregation
5 Interfaces to recursion
5.1 Recursive helper function pattern
5.2 Mutual recursion
6 Traversal and reduction/search
6.1 Backtracking
6.1.1 Implicit backtracking
6.1.2 Explicit backtracking (visited)
6.2 Early return/short-circuiting
6.3 Include/exclude
7 Recursive optimisations
7.1 Memoization
7.2 Tail call optimisation
8 Closing
9 Bonus
9.1 head_tail implementation
9.2 More examples
9.2.1 Returning all partitions
1 Structural and generative recursion
At the topmost level, there are two distinct kinds of recursion: structural recursion and generative recursion. These terms relate to how a recursive function modifies its inputs at each recursive call.
1.1 Structural recursion
Structural recursion is when we process the data we’re consuming based purely on the structure of the data itself. How we process the data is deterministic. It’s easy to assume ‘structural’ only applies to recursion using what we’d think of as an explicit data structure, but it also applies to numeric (scalar) data types. To make this concrete, let’s see some examples.
The first form of structural recursion is with numeric data. We can use structural recursion to make an incredibly simple function print out a string a set number of times.
void print_n_times(std::string_view message, int n) {
  if (n == 0) {
    return;
  }
  std::println("{}", message);
  print_n_times(message, n - 1);
}

What makes this recursive function structural is that a deterministic step is taken at each recursive call (n - 1). No matter what number is passed in, we’re always going to decrement it by the same amount. The input is being systematically reduced until it reaches the known base case. We can choose to either increment or decrement the number, but we’re doing so by a fixed amount that is not driven by the value of the data we’re visiting.
The next form of structural recursion is that which deals with a linear data structure (such as a list) as input. The distinction that makes the recursive algorithm structural is how it processes the list, which in our case is one element at a time. This can be achieved in one of two ways: by using the head/tail technique (more on this later), or by effectively recreating a for loop using recursion by passing an index to the list as a parameter.
Let’s create a similar recursive function to before, which accepts a list of strings and prints them out. The first function uses the head/tail approach (for a quick explainer on how the head_tail function is implemented, see the bonus 9.1 head_tail implementation section at the end).
void print_words(std::span<std::string> words) {
  if (words.empty()) {
    return;
  }
  auto [head, tail] = head_tail(words);
  std::println("{}", head);
  print_words(tail);
}

The terminating condition for the above function is when the list of words is empty. This condition is reached by repeatedly splitting off the head of the list and passing the tail (the remaining elements) to the recursive call, leaving a smaller and smaller list each time. The key thing to realize here is we’re processing the list sequentially and deterministically. The values of the items in the list are not going to affect how we process the list.
The following implementation is functionally equivalent to the head/tail approach, we’re just using an explicit index to track our position in the list, and stopping when it reaches the size of the list.
void print_words(std::span<std::string> words, int i) {
  if (i == words.size()) {
    return;
  }
  std::println("{}", words[i]);
  print_words(words, i + 1);
}

You can think of the above as a direct transformation from a for loop to a recursive approach.
The last form of structural recursion is tree traversal. This is where we process nodes in a tree and recursively visit the root and all children. Processing terminates when we reach a leaf node. A tree will have a set structure, and as with lists, two trees with the same structure and different values will be processed in the exact same way.
Below is a snippet showing the code for processing a binary tree and printing each node’s value.
struct binary_tree_node_t {
  std::string word;
  binary_tree_node_t* left = nullptr;
  binary_tree_node_t* right = nullptr;
};

void print_tree(binary_tree_node_t* node) {
  if (node == nullptr) {
    return;
  }
  std::println("{}", node->word);
  print_tree(node->left);
  print_tree(node->right);
}

This more or less captures structural recursion. It maps closely to the structure of the input data, and is the kind of recursion we think of when doing things like processing a file/folder hierarchy (though as we saw at the outset, it applies to numeric inputs as well).
One other helpful way to think about structural recursion is whether its parameters are resource based (how much of something is left) vs positional (how far through the input data are you). It’s a subtle distinction, but can help guide the design of a recursive function’s interface (its parameters) based on the problem to be solved. Numeric data tends to fit resource based better, whereas lists are naturally aligned with positional (the head/tail pattern can be thought of as a way to reframe a positional approach as resource based).
1.2 Generative recursion
Generative recursion covers cases where a recursive function produces new data for each recursive call. The input chosen for the next level of recursion will depend on the value of the data being processed. The most common examples of this are divide and conquer algorithms such as quicksort and binary search.
void quicksort(std::span<int> numbers) {
  if (numbers.size() <= 1) {
    return;
  }
  int pivot = numbers.back();
  auto mid = std::partition(
    numbers.begin(), numbers.end(),
    [pivot](int n) { return n < pivot; });
  std::swap(*mid, numbers.back());
  quicksort(numbers.subspan(0, mid - numbers.begin()));
  quicksort(numbers.subspan((mid - numbers.begin()) + 1));
}

If we take the above quicksort implementation as an example, passing it the list {30, 50, 15, 45, 10, 2, 36} to sort, the first partitioned range will be {30, 2, 15, 10}, with the corresponding half being {50, 45} (36 is the pivot in the center).
[30, 2, 15, 10, 36, 50, 45]

However, if a different pivot was selected, either by selecting a random pivot in the implementation, or if the list was in a different starting configuration, the first and second half of the range could be quite different (for example, if the input data was {30, 50, 36, 45, 10, 2, 15}, and 15 was selected as the pivot, the first partitioned range would be {2, 10}, with the other half of the range being {45, 50, 30, 36}).

[2, 10, 15, 45, 50, 30, 36]

This is what makes this recursive call generative, as opposed to structural. The same reasoning applies to binary search, which generates different inputs based on the target it’s searching for. Generative recursion also applies in cases where we’re traversing a graph structure and certain neighbors may or may not be reachable depending on the data (for example, a grid with blocked cells).
1.3 Identifying structural and generative recursion
Now we’ve defined both structural and generative recursion, why is this actually useful to know, and what does it tell us about the algorithms we’re building or trying to understand? The most important factor when dealing with structural vs generative recursion is how to handle the base case. Structural recursion tends to be much simpler, as a base case will naturally arise from the shape of the data being processed (we reach zero, the end of a list, or a leaf node). Generative recursion is more complicated: it’s necessary to prove and validate the terminating condition, as it is not guaranteed by the structure of the data alone. Time complexity can also vary more with generative recursion. With structural recursion, complexity (or performance) is usually tied to the input size, whereas with generative recursion there is no guaranteed relationship between the two. This is why quicksort’s performance can degenerate to O(n²) in the worst case, rather than the average case of O(n log n).
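A compact way to feel this difference in base-case handling is a function whose next input is computed from the current value (a hypothetical collatz_steps sketch; for Collatz, termination for every input is famously unproven, so nothing about the ‘structure’ of n guarantees we ever reach the base case):

```cpp
#include <cassert>

// Generative: the next input (n / 2 or 3n + 1) is derived from the value
// of n, not its structure, so the terminating condition has to be argued
// separately rather than falling out of the shape of the data.
long collatz_steps(long n) {
  if (n == 1) {
    return 0; // base case, not guaranteed by the input's structure
  }
  if (n % 2 == 0) {
    return 1 + collatz_steps(n / 2);
  }
  return 1 + collatz_steps(3 * n + 1);
}
```

Contrast this with print_n_times, where n - 1 marches to zero no matter what value was passed in.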
2 Recursive control flow
There unfortunately does not appear to be a well agreed upon term for this, but in this context what is meant by recursive control flow is how many times a recursive function is called at each level of recursion.
2.1 Single-path recursion
Single-path recursion is when a recursive function is called no more than once at each level of recursion. There are two varieties of single-path recursion: linear and branching.
2.1.1 Linear recursion
Linear recursion applies to several of the earlier examples we saw (numerical decrement and head/tail). This is where a function calls itself a single time at each level of recursion, and the recursive call appears once in the source code. This leads to a simple call structure; the callstack grows linearly until the base case is reached, and then each nested function call returns.
void print_n_times(std::string_view message, int n) {
  if (n == 0) {
    return;
  }
  std::println("{}", message);
  print_n_times(message, n - 1);
}

Even in this simple case, there’s one additional subtlety which arises from whether we choose to perform the logic in our function before or after the recursive call (this is often referred to as preorder and postorder traversal when discussing tree structures; you may also hear it referred to as head recursion and tail recursion in the context of scalar or list types). There is no observable behavior change when running each example, however if we add some additional output we can see a little more clearly what’s happening internally.
The preorder case looks as follows.
void print_n_times_preorder(
  std::string_view message, int n, int indent) {
  if (n == 0) {
    std::println("{}returning", std::string(indent, ' '));
    return;
  }
  std::println("{}\"{}\"", std::string(indent, ' '), message);
  print_n_times_preorder(message, n - 1, indent + 2);
  std::println("{}returning", std::string(indent, ' '));
}

print_n_times_preorder("hello", 3, 0);

// "hello"
//   "hello"
//     "hello"
//       returning
//     returning
//   returning
// returning

From the output, we can see the action this function takes (printing "hello") happens on the way down the callstack. The recursive function then returns after reaching the base case.
The postorder case is shown below.
void print_n_times_postorder(
  std::string_view message, int n, int indent) {
  std::println("{}calling", std::string(indent, ' '));
  if (n == 0) {
    return;
  }
  print_n_times_postorder(message, n - 1, indent + 2);
  std::println("{}\"{}\"", std::string(indent, ' '), message);
}

print_n_times_postorder("hello", 3, 0);

// calling
//   calling
//     calling
//       calling
//     "hello"
//   "hello"
// "hello"

In this scenario, we see the recursive calls happen first, and on the way back up the callstack the action is taken.
This has wide-ranging implications for how we handle return values with recursive functions. We’ll cover recursive function return patterns in much more detail later; this is just a hint as to what’s to come, so we don’t completely gloss over the point now.
2.1.2 Branching recursion
Branching single-path recursion is when more than one recursive call exists in the body of a recursive function, but it is guaranteed only one of those calls will be made at each level of recursion.
A prime example of this is binary search, shown below (simplified to not return anything).
// low and high represent a half-open range -> [0, numbers.size())
void binary_search(
  std::span<const int> numbers, int target, int low, int high) {
  if (low >= high) {
    std::println("\"{}\" not found", target);
    return;
  }
  const int mid = low + (high - low) / 2;
  if (numbers[mid] == target) {
    std::println("\"{}\" exists at index: {}", target, mid);
  } else if (numbers[mid] < target) {
    binary_search(numbers, target, mid + 1, high);
  } else {
    binary_search(numbers, target, low, mid);
  }
}

Even though the above has more than one recursive call in the function body, we only ever make one recursive call at each level. The callstack will still be linear, and it will grow proportional to O(log n).
2.2 Multi-path recursion (tree recursion)
Multi-path recursion (or tree recursion) applies to any recursive function that makes more than one recursive call at each level of recursion. Tree recursion (confusingly) has nothing to do with the tree data structure. The ‘tree’ in the name refers expressly to the call graph that is generated when more than one recursive call is made. This creates a tree shape when the callstack is visualised (this is why the name multi-path recursion can be clearer).
There are two main kinds of multi-path/tree recursion.
2.2.1 Multi-branch recursion
Multi-branch recursion refers to two sorts of cases, conditional and unconditional branches. Unconditional branches are simpler and we’ve previously seen an example of this with our tree traversal function. This is where the recursive calls always happen, regardless of the current state of the callstack (discounting the base case).
void print_tree(binary_tree_node_t* node) {
  if (node == nullptr) {
    return;
  }
  std::println("{}", node->word);
  print_tree(node->left);
  print_tree(node->right);
}

Conditional branches are only taken based on the current state of recursion. They explore choices or decisions that need to be made at each level. Conditional multi-branch recursive functions do tend to be more complex, but are also more powerful and flexible.
Below is an example illustrating this concept.
void count_down_repeatedly(int n, int& top) {
  if (n <= 0) {
    top--;
    return;
  }
  std::println("{}", n);
  if (n < top) {
    count_down_repeatedly(n - 1, top);
  } else {
    count_down_repeatedly(n - 1, top);
    count_down_repeatedly(n - 1, top);
  }
}

This function will count down from the input number n, and then count down from the next largest number after it reaches the base case (this behavior relies on n and top being set to the same value initially). This recursive function isn’t pure (explained later), as we mutate one of the function arguments. Its only purpose is to demonstrate conditional multi-branch recursion in a limited setting. For a more representative example, see the parenthesis permutation generation algorithm presented later when backtracking is introduced (see 6.1.1 Implicit backtracking).
2.2.2 Looping recursion
The last kind of tree recursion is looping. This is where we call a recursive function from inside a loop in the body of a recursive function. Loop-based recursion most commonly iterates through a recursive structure. It can also be combined with multi-branch recursion. For example, when checking neighbors in a grid, we iterate over all possible directions, but first check if our next position both falls inside the grid (isn’t out of bounds), and is reachable (not blocked or already visited).
If we revisit the earlier tree example, and update the data structure to contain a variable number of children, we can see a short example of what this looks like.
struct tree_node_t {
  std::string value;
  std::vector<tree_node_t*> children;
};

void print_tree(const tree_node_t* node) {
  if (node == nullptr) {
    return;
  }
  std::println("{}", node->value);
  for (auto* child : node->children) {
    print_tree(child);
  }
}

This only really applies in imperative languages where looping constructs are available. It is possible to mimic this kind of looping using the head/tail approach we touched on earlier, or with an index and a technique called mutual recursion (see 5.2 Mutual recursion for more details).
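The grid scenario described above can be sketched as a small reachability count (a hypothetical count_reachable helper; '#' marks a blocked cell, and visited cells are marked 'x' in place, which also makes this an impure function):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Looping recursion combined with conditional branches: iterate over the
// four neighbor directions, but only recurse into positions that are in
// bounds, open ('.'), and not yet visited.
int count_reachable(std::vector<std::string>& grid, int row, int col) {
  if (row < 0 || row >= static_cast<int>(grid.size()) || col < 0 ||
      col >= static_cast<int>(grid[0].size()) || grid[row][col] != '.') {
    return 0; // out of bounds, blocked, or already visited
  }
  grid[row][col] = 'x'; // mark as visited
  int count = 1;
  const int row_offsets[] = {-1, 1, 0, 0};
  const int col_offsets[] = {0, 0, -1, 1};
  for (int d = 0; d < 4; ++d) { // recursive call inside a loop
    count += count_reachable(grid, row + row_offsets[d], col + col_offsets[d]);
  }
  return count;
}
```

Each loop iteration may or may not recurse depending on the data, so this combines looping recursion with conditional multi-branch recursion.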
2.3 Recursive control flow summary
These are all the kinds of control flow you’ll encounter when dealing with recursive functions (excluding the slightly more exotic mutual recursion that we’ll cover later).
The general structure breaks down as follows.
Recursive control flow
├── Single-path recursion
│   ├── Linear recursion
│   └── Branching recursion
└── Multi-path recursion (tree recursion)
    ├── Multi-branch recursion
    │   ├── Conditional
    │   └── Unconditional
    └── Looping recursion

3 Pure and impure recursive functions
We’re going to briefly touch on this topic before looking at how we return values from recursive functions. So far, the recursive functions we’ve seen haven’t returned a value, they’ve either modified global state (in our case IO when printing values), or mutated the input arguments of the function (as with quicksort). The second case, where input data is modified, is more representative of impure functions out in the wild.
Pure functions, by comparison, do not modify external state, and simply return a new value. There are cases where parameters are modified as part of the recursive call, but these tend to be an implementation detail, and the caller’s input data is not mutated. Another way to define pure functions is that they do not have side effects; calling them does not change external state in any way. They have what is called referential transparency, where the same input (parameters) is guaranteed to produce the same output (return value).
Both pure and impure recursive functions have their place, some well known examples of impure functions are maze generation (where a grid has passages ‘carved’ into it as the recursive algorithm runs), flood fill (think of the paint bucket in a 2D drawing application) to update a bitmap image of some kind, in-place sort algorithms, and problems such as N-Queens and Sudoku solver. Pure recursive functions include examples such as Fibonacci, factorial, generation of permutations, subsets and combinations, and binary search.
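A minimal sketch of the two styles side by side (factorial from the list above, plus a hypothetical double_all to stand in for the in-place examples):

```cpp
#include <cassert>
#include <vector>

// Pure: no side effects; the same input always produces the same output.
int factorial(int n) {
  if (n <= 1) {
    return 1;
  }
  return n * factorial(n - 1);
}

// Impure: mutates its argument in place and returns nothing, in the same
// spirit as the flood fill and in-place sorting examples mentioned above.
void double_all(std::vector<int>& values, int i) {
  if (i == static_cast<int>(values.size())) {
    return;
  }
  values[i] *= 2;
  double_all(values, i + 1);
}
```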
4 Recursive function return patterns
The following section describes the internal recursive function computation strategy for building results to be returned. There are two main ways recursive functions build values to return: top down, widely known as the accumulator pattern, and bottom up, which we’ll call result aggregation. There are pros and cons to each approach, and both are worth knowing about. Mixed into this discussion are the subtleties around returning multiple values from a function, which differ depending on whether we’re using the accumulator pattern or result aggregation. We’ll also touch on how the accumulator pattern and tail recursion are closely related.
4.1 Accumulator pattern
The accumulator pattern introduces an additional parameter to a recursive function which is used to build up the return value as recursion proceeds down the callstack. The computation happens during the descent, with the modified accumulator passed forward at each level of the recursion. When the base case is reached, we simply return the accumulated value.
A simple way to demonstrate this is with a recursive implementation of the sum function.
int sum(std::span<const int> collection, int acc) {
  if (collection.empty()) {
    return acc;
  }
  auto [head, tail] = head_tail(collection);
  return sum(tail, head + acc);
}

We use the same head/tail pattern from print_words, and at each recursive call we take the current head and add it to the accumulator. One advantage of this approach is that in certain contexts (depending on the language and compiler), we can take advantage of a technique called tail call optimisation (see 7.2 Tail call optimisation). This is an optimisation technique, and there is no guarantee it will be implemented by the compiler/interpreter for your language.
The values of head and acc are shown below as the recursion descends.
# head + acc
  5 + 0
  4 + 5
  3 + 9
  2 + 12
  1 + 14
      15

When dealing with a collection of scalar values, it’s possible to swap the accumulator with an out parameter (a ‘collector’), that can have values appended to it (say if finding all even numbers). The tricky part is when dealing with a list of list values, where we expect different configurations for each internal value. This is common in such problems as combinations, subsets and permutations. In this scenario, an accumulator (the parameter building the internal value) and an out parameter (to hold the combined results) are needed.
One way to demonstrate this is with a recursive function to generate all binary strings of length n.
void binary_strings_acc(
  int n, std::string acc, std::vector<std::string>& result) {
  if (n == 0) {
    result.push_back(acc);
    return;
  }
  binary_strings_acc(n - 1, acc + "0", result);
  binary_strings_acc(n - 1, acc + "1", result);
}

Here we can see we keep the accumulator value as before, but have an additional out parameter, result. The recursive function now doesn’t actually return anything directly. The computation happens as we move down the callstack (the acc + "0" and acc + "1" expressions). We remember to decrement n to reduce our input parameter to ensure the base case condition is met, and when it reaches zero, the function returns. It’s at this point that we append the accumulated value to the result out parameter.
This style of recursion definitely feels more natural to programmers familiar with imperative programming, but it can sometimes lead to more convoluted solutions when compared to the alternative means of returning from recursive functions below.
It’s worth also briefly mentioning the interface to this kind of recursion can be improved by following the helper function pattern (see 5.1 Recursive helper function pattern section).
4.2 Result aggregation
The alternative to the accumulator pattern, which tends to be a more natural fit with recursion, is result aggregation (also known as head recursion). In this case, we proceed all the way down the callstack first, before performing any computation on the return value. When we reach the base case and return, computation occurs as the recursive calls unwind. On the way back up, at each level a partial result is computed from the results of the previous recursive calls. The final return value will include all previous results aggregated together.
For comparison, we can see the equivalent sum implementation using result aggregation (or head recursion).
int sum(std::span<const int> collection) {
  if (collection.empty()) {
    return 0;
  }
  auto [head, tail] = head_tail(collection);
  return head + sum(tail);
}

We use the head/tail pattern again, only this time the recursive call is made before the sum is computed. This has the effect of recursing until we hit the base case and return 0. We then return whatever the current head value is added to 0, and then the previous head added to that value, and so on, until we reach the top of the callstack. This is what we mean when we say the computation happens during the ascent.
The values of head and sum(tail) are shown below as the recursive function returns.
# head + sum(tail)
  5 + 0
  4 + 5
  3 + 9
  2 + 12
  1 + 14
      15

Similar to the accumulator pattern above, we can see what a result aggregation version of binary_strings would look like below.

std::vector<std::string> binary_strings(int n) {
  if (n == 0) {
    return {{}};
  }
  auto partials = binary_strings(n - 1);
  std::vector<std::string> result;
  for (const auto& partial : partials) {
    result.push_back("0" + partial);
    result.push_back("1" + partial);
  }
  return result;
}

In this version, we make the recursive call before performing any computation. In the base case, we return a vector containing a single empty string, to ensure the vector has a size of 1 ({{}} is just shorthand for std::vector<std::string>(1, "")). If we wanted to ignore this particular base case, we’d simply return an empty vector, {}. We then iterate over the returned vector of strings, and build a new results vector by prepending "0" and "1" to each element. This bubbles up to the next invocation, which then prepends "0" and "1" to the previous partial strings, resulting in "00", "10", "01", "11". This repeats until all permutations are generated.
Note: When returning collections, think of {{}} as the identity. Combining with it preserves what you have. This is like 1 in multiplication, x × 1 = x (identity). In contrast, {} (an empty list) is the empty set, or zero element. Combining with it produces nothing. This is akin to 0 in multiplication.

prepend(group, {{}}) = [group] // identity
prepend(group, {})   = []      // zero

Result aggregation naturally fits recursion, and provides the ability to write a pure function with no side effects. There’s often no need for a wrapper/helper function as well. Result aggregation can be a little more difficult to understand if it’s not something you encounter often (this is definitely my experience), though the patterns relating to traversal and reduction show up again and again with different recursive problems, so gaining familiarity with the technique makes solving these problems easier.
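The identity/zero behavior can be checked directly by making the base-case value a parameter of the binary-strings shape (a hypothetical binary_strings_base sketch):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Same shape as the result aggregation binary_strings, but with the
// base-case value passed in so the identity ({""}) and zero ({})
// behaviors can be compared directly.
std::vector<std::string> binary_strings_base(
  int n, const std::vector<std::string>& base) {
  if (n == 0) {
    return base;
  }
  std::vector<std::string> result;
  for (const auto& partial : binary_strings_base(n - 1, base)) {
    result.push_back("0" + partial);
    result.push_back("1" + partial);
  }
  return result;
}
```

Returning the identity from the base case yields all four strings for n = 2, while returning the zero element collapses every level into an empty result.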
5 Interfaces to recursion
There are several useful patterns that control how the outside world interacts with recursive functions, and how they interact with one another. Some are outside the scope of this article, but two useful ones to know about are the helper function pattern, and mutual recursion.
5.1 Recursive helper function pattern
More often than not, a recursive call will need more information as part of its interface to pass down to subsequent calls than we’d like to expose to the caller. To work around this, it’s often advised to write a wrapper function around a recursive function to provide a clean interface to the outside world, and set up the arguments of the actual recursive call inside the helper.
For example, if we take a variation of the sum function we looked at earlier, we can rewrite it to use an index instead of the head/tail approach (this is essentially a recursive way of mimicking a for loop).
int sum(std::span<const int> collection, int i) {
  if (i == collection.size()) {
    return 0;
  }
  return collection[i] + sum(collection, i + 1);
}

The problem is that to use this, the caller needs to pass 0 along with the collection they’d like to sum.
std::vector<int> some_numbers = {1, 2, 3, 4, 5};
int total = sum(some_numbers, 0);

In this particular case we could add a default parameter, but it still risks users passing a value other than 0. A much better approach is simply to wrap the call in a helper function, and make that the public interface.
int sum_internal(std::span<const int> collection, int i) {
  if (i == collection.size()) {
    return 0;
  }
  return collection[i] + sum_internal(collection, i + 1);
}

int sum(std::span<const int> collection) {
  return sum_internal(collection, 0);
}

This technique is particularly effective when applied to recursive functions using the accumulator pattern, or those using memoization (see 7.1 Memoization section).
5.2 Mutual recursion
Mutual recursion is when two separate recursive functions call one another. This pattern is more relevant for functional languages that do not support loops, but it can crop up from time to time in other recursive algorithms. It’s particularly common in parsing implementations.
A quick example is shown below using the same tree_node_t type from the looping recursion example.
// forward declaration needed, as each function calls the other
void process_tree(const tree_node_t* tree);

void process_children(std::span<const tree_node_t*> children) {
  if (children.empty()) {
    return;
  }
  auto [head, tail] = head_tail(children);
  process_tree(head);
  process_children(tail);
}

void process_tree(const tree_node_t* tree) {
  if (tree == nullptr) {
    return;
  }
  std::println("{}", tree->value);
  process_children(tree->children);
}

In the example above, instead of using a for loop to iterate over the tree’s child nodes, we use a separate recursive function (process_children), which uses the head/tail pattern to work its way through the children, processing each one before moving on to the next.
6 Traversal and reduction/search
There are a handful of really useful traversal (all matches) and reduction (single match) patterns that show up over and over again in recursive algorithms. These are different to the purely structural techniques we encountered at the start (head/tail, index, etc.). The focus of traversal and reduction is on the generation of the results (return values), rather than simply exploring a data structure recursively.
6.1 Backtracking
Backtracking is very common in recursive functions, and comes about when recursive functions make decisions about which branches to explore. This implies that for a recursive function to use backtracking, it must make use of conditional multi-branch recursion (introduced in 2.2.1 Multi-branch recursion). By extension, this means an algorithm not using conditional multi-branch recursion cannot use backtracking.
Backtracking comes in several forms, with both implicit and explicit tracking of explored branches. We’ll briefly cover these various forms with examples.
6.1.1 Implicit backtracking
Implicit backtracking occurs automatically when branches are explored and the recursive calls return. When the recursive call returns, it automatically ‘undoes’ the action it took on the previous path, so it can explore a different one.
A textbook example of this particular form of backtracking is generating all permutations for a fixed number of open and closed parentheses.
std::vector<std::string> generateParentheses(int n) {
  if (n == 0) {
    return {};
  }
  return generateParentheses_internal(n, n);
}

std::vector<std::string> generateParentheses_internal(int left, int right) {
  if (left == 0 && right == 0) {
    return {{}};
  }
  std::vector<std::string> result;
  if (left > 0) {
    for (const auto& partial : generateParentheses_internal(left - 1, right)) {
      result.push_back("(" + partial);
    }
  }
  if (right > left) {
    for (const auto& partial : generateParentheses_internal(left, right - 1)) {
      result.push_back(")" + partial);
    }
  }
  return result;
}

We make use of the recursive helper function pattern to track two separate variables for the left and right parentheses (we also handle the degenerate case where 0 is passed to the function). The function uses multi-path (tree) recursion with multiple conditional branches. In this recursive function, rather than using a position-based formulation (tracking where we are in the input data), we’re using a resource-based formulation (how many of each kind of parenthesis we have remaining). This algorithm will only produce matched open and closing parentheses (e.g. for generateParentheses(3), we’d get ()()() and ((())) etc., but not )()()()). This is due to the if (right > left) check, which guarantees we can’t place a right parenthesis before the corresponding left parenthesis. If we instead changed the condition to right > 0, we’d generate all permutations of parentheses. This is an example of pruning, a feature of backtracking that prevents the recursion from exploring future branches it knows can’t lead to a valid solution. This is much more efficient than exploring all possible branches and then filtering the results afterwards.
The key to understanding why this algorithm uses backtracking is that we first try a path, check whether it’s valid, and then undo that change to explore other possible alternatives. For example, once we reach what will become (()()), we hit the base case and start returning. When we reach the fourth character (index 3), instead of placing a (, we undo that change and place a ) instead. This allows us to explore a separate branch where we then add () again, resulting in (())() as another valid option. Due to our use of result aggregation (using head recursion to return the values), the order these results get built up is a little counter-intuitive, but it’s still the base case that determines when we’ve found a match.
6.1.2 Explicit backtracking (visited)
The other common form of backtracking is where we explicitly track values or positions we’ve already seen (visited), and then clear or unset those values as we undo the action after a recursive function returns.
Below is an example demonstrating how to find all possible routes from the starting location, (0, 0), to the destination, (size - 1, size - 1), of a grid (U, D, L, R are returned to indicate the movement as up, down, left and right).
struct coord_t {
  int row;
  int col;
};

std::string directions_str = "DRLU";
std::vector<coord_t> directions = {{1, 0}, {0, 1}, {0, -1}, {-1, 0}};

// required by std::set
bool operator<(const coord_t& lhs, const coord_t& rhs) {
  return std::tie(lhs.row, lhs.col) < std::tie(rhs.row, rhs.col);
}
bool is_valid(coord_t coord, int rows, int cols, std::set<coord_t>& visited) {
  return coord.row >= 0 && coord.row < rows && coord.col >= 0
      && coord.col < cols && visited.count(coord) == 0;
}
std::vector<std::string> all_routes(
  std::vector<std::vector<bool>>& grid, coord_t coord,
  std::set<coord_t>& visited) {
  if (!is_valid(coord, grid.size(), grid[0].size(), visited)) {
    return {};
  }
  if (coord.row == grid.size() - 1 && coord.col == grid[0].size() - 1) {
    return {{}};
  }
  visited.insert(coord);
  std::vector<std::string> result;
  for (int i = 0; i < directions.size(); i++) {
    auto next_coord = coord_t{
      .row = coord.row + directions[i].row,
      .col = coord.col + directions[i].col};
    auto partials = all_routes(grid, next_coord, visited);
    for (auto& partial : partials) {
      partial.insert(partial.begin(), directions_str[i]);
    }
    result.insert(result.end(), partials.begin(), partials.end());
  }
  visited.erase(coord);
  return result;
}

The key lines are the visited.insert and visited.erase calls. Each time we enter a cell, we add it to our visited set. The is_valid test includes checking that we are not attempting to add a node we have already visited as part of this branch. If/when the destination is reached, we return a vector with a single empty string, to have the current direction prepended immediately after the recursive call (this is the identity element mentioned before). When the recursive call returns, we backtrack, and undo having visited the cell so it may be included in subsequent paths.
If we run this on a 3x3 grid, we’ll see the following results returned.
["DDRR", "DDRURD", "DDRUURDD", "DRDR", "DRRD", "DRURDD", "RDDR", "RDRD", "RDLDRR", "RRDD", "RRDLDR", "RRDLLDRR"]

In this specific example, we could have forgone the visited set, and instead used the grid itself by updating each cell to true to indicate when it was visited. Depending on the data we’re working with, though, this isn’t always possible. Keeping track of visited cells in-place, or via a separate data structure, are both valid options depending on the scenario.
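To sketch the in-place approach, here is a simplified variant (illustrative, not from the examples above) that answers a yes/no question instead: does any route to the bottom-right corner exist? Instead of a std::set, it flips cells in the grid itself to mark them visited while exploring, and restores them when backtracking.

```cpp
#include <vector>

// Explicit backtracking with in-place marking: visited[row][col] is set to
// true on the way in, and reset to false on the way out. route_exists and
// its signature are illustrative names, not part of the original examples.
bool route_exists(std::vector<std::vector<bool>>& visited, int row, int col) {
  const int rows = static_cast<int>(visited.size());
  const int cols = static_cast<int>(visited[0].size());
  if (row < 0 || row >= rows || col < 0 || col >= cols || visited[row][col]) {
    return false; // out of bounds, or already on the current path
  }
  if (row == rows - 1 && col == cols - 1) {
    return true; // reached the destination
  }
  visited[row][col] = true; // mark in-place
  const bool found = route_exists(visited, row + 1, col)
                  || route_exists(visited, row, col + 1)
                  || route_exists(visited, row, col - 1)
                  || route_exists(visited, row - 1, col);
  visited[row][col] = false; // backtrack: unmark so other paths may use this cell
  return found;
}
```

Because the function returns a bool rather than collecting all routes, the || chain stops exploring as soon as any direction succeeds.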
6.2 Early return/short-circuiting
Early return/short-circuiting is a technique used across many kinds of recursive algorithms (it applies to both traversal and reduction/search kinds). The general idea is to cease execution as soon as a success condition is met, and to avoid wasted work (pruning in backtracking is one such use of early return/short-circuiting). Examples of the former include algorithms such as binary search (where we return as soon as we find the target value) and path finding in graph traversal (stop as soon as you reach the target node).
6.3 Include/exclude
Include/exclude¹ is an incredibly useful technique. The pattern is a form of binary recursion (we usually, though not always, have two recursive calls). The idea is to recursively process the input data one element at a time, by either including or excluding the foremost value. When the input data is a list this is easy: it’s just the head of the list (the first element).
A good example to demonstrate this technique is finding which numbers from a list sum together to produce a given target value.
std::vector<std::vector<int>> sum_to_total_without_repeats(
  std::span<int> numbers, int target) {
  if (target == 0) {
    return {{}};
  } else if (numbers.empty() || target < 0) {
    return {};
  }

  auto [head, tail] = head_tail(numbers);

  const auto including = sum_to_total_without_repeats(tail, target - head);

  std::vector<std::vector<int>> result(including.size());
  std::transform(
    including.begin(), including.end(), result.begin(),
    [head](auto included) {
      included.insert(included.begin(), head);
      return included;
    });

  const auto excluding = sum_to_total_without_repeats(tail, target);

  result.insert(result.end(), excluding.begin(), excluding.end());
  return result;
}

Let’s walk through the sections in the above code. The first base case condition checks for success, target == 0, which indicates that the numbers from the input list processed so far sum to the target value. We invert things by subtracting instead of adding, so we do not require a secondary accumulator value; we can then just test target against 0.
The second base case condition is when the recursive call fails to produce the target value. The first failure case is if we ran out of numbers to add (with the input {1, 2, 3, 4} and target value 5, in the case where we visit 4 first, there are then no more numbers to process/add, so we hit this case). The second failure case is if we overshoot our target (if we proceed from 1, and add 2 and 3, we’ll have a total of 6, leaving target at -1, so target will be less than 0).
We next have the familiar head/tail pattern (ideal for handling lists of data), immediately followed by the first recursive call to sum_to_total_without_repeats. We pass the tail of the list as the first argument, to ensure our input shrinks, but we also decrement the value of head from our running total. By doing this, we say we include head in this recursive branch, we are factoring it into the computation.
The following section is where we build our intermediate results. We take the result of the recursive call, and iterate over each vector of integers contained, prepending our current head value. This is the crux of most recursive algorithms. We assume sum_to_total_without_repeats has returned to us the correct value for the smaller list and smaller target, and we then add the current head we have to the front of each element in the list. Because of our base case, if there wasn’t a match for the particular combination, we’ll return the zero element, so we’ll never perform the logic in the std::transform call as including will be empty.
Last but not least, we have the second recursive call, which counts as the exclude call. The reason we refer to this as exclude is because we’re not including the current head value we’ve pulled off our list at this level of the recursion. We’re effectively skipping it, and passing in the remaining tail, without using it at all.
We finally append the excluded results from the second recursive call with our previous results variable, and then return the concatenated result. This pattern allows us to visit all combinations of the numbers from our input. It turns out to be super useful and can be used for generating permutations, combinations and subsets which comes in handy when solving a range of recursive problems.
7 Recursive optimisations
There are two main approaches to speeding up recursive solutions. Each method applies in slightly different scenarios, but both are useful to be aware of.
7.1 Memoization
Memoization is the process of caching previously computed values. It allows a recursive function to avoid recomputing subproblems it has already solved. The most famous example of this is Fibonacci. The recursive solution to Fibonacci is as follows.
int fibonacci(int n) {
  if (n <= 1) {
    return n;
  }
  return fibonacci(n - 1) + fibonacci(n - 2);
}

This recursive function is an example of multi-path, unconditional branch recursion, resulting in tree recursion (many branching paths). The issue with this naive solution is we end up recomputing the same values over and over again (e.g. fibonacci(5) calculates fibonacci(2) three times, and fibonacci(3) twice).
├── Fib 5
├── Fib 4
│   ├── Fib 3
│   │   ├── Fib 2
│   │   │   ├── Fib 1
│   │   │   └── Fib 0
│   │   └── Fib 1
│   └── Fib 2
│       ├── Fib 1
│       └── Fib 0
└── Fib 3
    ├── Fib 2
    │   ├── Fib 1
    │   └── Fib 0
    └── Fib 1

The runtime of this function is O(2ⁿ), which gets out of hand very quickly.
To avoid this repeated work, we can introduce a cache (essentially just a collection of key/value pairs). In C++ this might look something like the following.
int fibonacci(int n) {
  std::unordered_map<int, int> cache;
  return fibonacci_internal(n, cache);
}

int fibonacci_internal(int n, std::unordered_map<int, int>& cache) {
  if (n <= 1) {
    return n;
  }
  if (auto it = cache.find(n); it != cache.end()) {
    return it->second;
  }
  int partial =
    fibonacci_internal(n - 1, cache) + fibonacci_internal(n - 2, cache);
  cache.insert({n, partial});
  return partial;
}

If we log invocations to the recursive function, we now get the following.
├── Fib 5
├── Fib 4
│   ├── Fib 3
│   │   ├── Fib 2
│   │   │   ├── Fib 1
│   │   │   └── Fib 0
│   │   └── Fib 1
│   └── Fib 2
└── Fib 3

This takes the algorithm from a time complexity of O(2ⁿ) to O(n) (the space complexity is also O(n)).
It’s possible to apply this technique to any problem where there are overlapping subproblems (where the same subproblem is solved multiple times). The key used for the cache must be based exclusively on the parameters to the function. One small detail to be aware of is using the head/tail technique where the tail is passed as the reduced parameter isn’t a great fit for memoization, as it doesn’t work well as a key for a map/hash-table data structure. It’s instead better to use an offset, or position through the list. It may be necessary to also include the end (offset + size) or target, depending on the problem, as the key to guarantee uniqueness based on the state of the recursion.
7.2 Tail call optimisation
The accumulator pattern discussed earlier in this article can be combined with a compiler optimisation to eliminate the call stack overhead of recursive functions. To achieve this, the last operation in the recursive case of a function must be the recursive call, and the function must be able to return its result immediately upon reaching the base case (as no stack unwinding will take place); this is why an accumulator parameter is required. Tail call optimisation is something often employed in purely functional languages, and may not be supported in certain imperative languages (e.g. Java, JavaScript and Python, though most modern C++ compilers do implement some form of tail call optimisation), so your mileage may vary.
The sum function we saw earlier in the accumulator pattern section is a perfect example of this.
int sum(std::span<const int> collection, int acc) {
  if (collection.empty()) {
    return acc;
  }
  auto [head, tail] = head_tail(collection);
  return sum(tail, head + acc);
}

Notice how the last operation in the recursive part of the function is a recursive call (no logic at the current level happens after it executes), and it passes through an accumulator value which it returns directly in the base case.
If we take a quick look at both implementations in Compiler Explorer, we can see the big difference in assembly code. The smoking gun that the tail recursive version has had the tail call optimisation applied, is that we see the call to sum in the tail recursive version is using a jmp instruction, as opposed to the head recursive version which allocates space on the callstack and then uses call to invoke the function again (there’s some clever loop unrolling going on in the accumulator version too).
The kinds of problems that adapt well to tail recursion are often not the ones that benefit from a recursive approach (e.g. backtracking), so chances are writing the algorithm iteratively will make the most sense in an imperative language; functional languages, on the other hand, benefit greatly from this approach and optimisation.
8 Closing
Recursion is an enormous topic. Even though this post is relatively long, it only scratches the surface of all the different sorts of problems recursion can be used to solve. Recursion still routinely gives me pause, and I’m still learning new and interesting approaches to apply. To help cement this understanding and get more comfortable using recursion, I can highly recommend attempting Leetcode-style problems which can be solved using recursion. When you do, see if it’s possible to build a solution around one or more of the techniques we’ve covered. It can be tough, but as with everything it gets easier with practice.
Thanks for reading, and if you’re an expert on recursion and feel I’ve missed something (or got something horribly wrong), please do let me know. I’m still learning, and I love discovering new tips and tricks I haven’t come across before.
9 Bonus
Some additional material related to the content above, which didn’t fit neatly anywhere else.
9.1 head_tail implementation
This section is off-topic when it comes to recursion, but is included for completeness to briefly explain the head_tail implementation for those who are curious.
Throughout this article, a number of the examples use the incredibly convenient C++20 library type std::span. It allows us to create a non-owning view of a contiguous sequence (most likely an array or std::vector), without needing to use a pointer and a size. By using front and subspan, we can get a reference to the first element, and then a new range (or span) of the remaining elements (the tail). No copies are made, we’re simply adjusting some offsets to alter the view of the elements.
This can be a little verbose to type, and it can be easy to forget to take a reference to the head of the list and accidentally make a copy. Thankfully, this is where our little helper function called head_tail comes in.
template<typename Range>
auto head_tail(Range&& r) {
  std::span s{r};
  return std::pair<decltype(s.front()), decltype(s.subspan(1))>(
    s.front(), s.subspan(1));
}

The function template uses a forwarding reference to pass a range to an internal std::span, and then ensures the types of the std::pair to be returned match those of the span (this is so we don’t inadvertently make a copy when returning the head). It’s essentially syntactic sugar for the following.
auto& head = list.front();
auto tail = list.subspan(1);

auto [head, tail] = head_tail(list);

This helps prevent errors and makes the examples a little more succinct. We can also add an overload for std::string_view which can be quite useful as well.
auto head_tail(std::string_view s) {
  return std::pair<decltype(s.front()), decltype(s.substr(1))>(
    s.front(), s.substr(1));
}

This is very much bonus/auxiliary content and not super relevant when it comes to everything else recursion related mentioned above.
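As a quick usage sketch of the std::string_view overload (reproduced below so the example is self-contained), here is a toy recursive string reverse. The reverse_string function is an assumed example for illustration, not something from the article above.

```cpp
#include <string>
#include <string_view>
#include <utility>

// The std::string_view head_tail overload: first character plus the
// remaining view. The view refers to the caller's buffer, so the returned
// reference stays valid.
auto head_tail(std::string_view s) {
  return std::pair<decltype(s.front()), decltype(s.substr(1))>(
    s.front(), s.substr(1));
}

// Toy example: reverse a string by reversing the tail, then appending
// the head at the end.
std::string reverse_string(std::string_view s) {
  if (s.empty()) {
    return {}; // base case: the reverse of "" is ""
  }
  auto [head, tail] = head_tail(s);
  return reverse_string(tail) + head;
}
```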
9.2 More examples
What follows are some more examples of how to apply recursion to solve various problems.
9.2.1 Returning all partitions
Suppose we are given a collection of some sort (this could be a vector of ints or a string), and we want to return all possible ways to partition the collection. For example, given "aab", we’d return [["a", "a", "b"], ["a", "ab"], ["aa", "b"], ["aab"]] (a vector of vectors of strings). Because this problem requires us to return all possible combinations, given what we now know, we can infer it is an application of combinations/subsets, which will require backtracking.
To start solving this sort of problem, it’s often easiest to start sketching out the basic shape first.
std::vector<std::vector<std::string>> partitions(std::string_view s) {
  // base case
  // todo

  // recursive case
  std::vector<std::vector<std::string>> result;
  // todo
  return result;
}

If we think about the base case first, the simplest possible input we can hope to partition is an empty string. There is one way to partition an empty string, and that is with the empty partition, []. To support this case we need to return our identity type, which is a vector containing an empty vector.
std::vector<std::vector<std::string>> partitions(std::string_view s) {
  if (s.empty()) {
    return {{}};
  }

  // recursive case
  std::vector<std::vector<std::string>> result;
  // todo
  return result;
}

Next comes the interesting part, which is deciding how we cut the string to partition it into different groups. Every partition must start with an initial unpartitioned group; we then take the remaining elements and partition them recursively. For the string "xyz", this would be the first character, 'x', and then all subsequent characters ("yz"), followed by "xy" and all subsequent characters ('z'), and then finally "xyz" and the empty string (""). This pattern is shown below.
[x] [yz]
[xy] [z]
[xyz] []

We have to decide what the first group is (the one we use to prepend), and then solve the smaller problem (the remaining characters). We can move through the string this way using substr (if we were using std::span we could use subspan in the same way). The following code will print the same output as was shown above.
for (int i = 1; i <= s.size(); i++) {
  auto left = s.substr(0, i);
  auto right = s.substr(i);
  std::println("[{}], [{}]", left, right);
}

Note: We start at index 1, and iterate until i <= s.size(). This is to ensure the initial substring length is 1 and not 0.
Next comes the recursive leap of faith. We assume that our recursive function partitions is already working, and will return all groupings for the smaller string we asked for.
for (int i = 1; i <= s.size(); i++) {
  auto left = s.substr(0, i);
  auto right = s.substr(i);
  auto groups = partitions(right);
  // todo
}
for (int i = 1; i <= s.size(); i++) {
  auto left = s.substr(0, i);
  auto right = s.substr(i);
  auto groups = partitions(right);
  for (auto& group : groups) {
    group.insert(group.begin(), std::string(left));
    result.push_back(group);
  }
}

And that’s it!
We can streamline the code a bit to remove some of the local variables, and then see the full implementation of our recursive function below.
std::vector<std::vector<std::string>> partitions(std::string_view s) {
  if (s.empty()) {
    return {{}};
  }
  std::vector<std::vector<std::string>> result;
  for (int i = 1; i <= s.size(); i++) {
    for (auto& group : partitions(s.substr(i))) {
      group.insert(group.begin(), std::string(s.substr(0, i)));
      result.push_back(group);
    }
  }
  return result;
}

Note: We’ve moved the substr calls to happen inline, and used the return value of the recursive partitions call directly in the range based for loop.
Calling partitions("xyz") now returns [["x", "y", "z"], ["x", "yz"], ["xy", "z"], ["xyz"]].
The key is to not think about how instances of partitions("...") work, just trust they return the partitioned groups we expect. We delegate partitioning the rest of the data (and don’t care how this is done), and then combine the results with our start of the string.
If we walk through things at the topmost level of recursion, we can see how the prepending logic works, and how we obtain our final list of results.
partitions("xyz"):
  Choice 1: first group = "x"
    Trust: partitions("yz") returns [["y", "z"], ["yz"]]
    Prepend "x": [["x", "y", "z"], ["x", "yz"]]

  Choice 2: first group = "xy"
    Trust: partitions("z") returns [["z"]]
    Prepend "xy" to ["z"]: [["xy", "z"]]

  Choice 3: first group = "xyz"
    Trust: partitions("") returns [[]]
    Prepend "xyz" to []: [["xyz"]]

Returns: [["x", "y", "z"], ["x", "yz"], ["xy", "z"], ["xyz"]]

Last but not least, as an added bonus, we can take the above algorithm and transform it to fit the include/exclude style pattern. We can hoist the index from the for loop to become an additional parameter to the recursive function, and call the recursive function twice with different inputs. The use of head and tail variables is perhaps a bit of a cheat, as head is not a single element and instead the left split of the string, but it still fits relatively cleanly.
std::vector<std::vector<std::string>> partitions(std::string_view s, int offset) {
  if (s.empty()) {
    return {{}};
  }
  if (offset > s.size()) {
    return {};
  }
  std::vector<std::vector<std::string>> result;
  auto head = s.substr(0, offset);
  auto tail = s.substr(offset);
  for (auto& group : partitions(tail, 1)) {
    group.insert(group.begin(), std::string(head));
    result.push_back(group);
  }
  auto remaining = partitions(s, offset + 1);
  result.insert(result.end(), remaining.begin(), remaining.end());
  return result;
}

The first call to partitions always starts at index 1, just with a smaller string (we’re passing tail as the first parameter), and the second call reuses the existing string, but decides to ‘cut’ it at a different point (the offset + 1 expression). Because we no longer have the terminating condition of the for loop, we have a separate base case which checks for the offset exceeding the string length. This branch returns the zero element ({}), which prevents the left of the string (head) from being prepended.
This isn’t a direct application of the include/exclude pattern, as we’re not explicitly choosing to include or exclude the current value for each branch (binary recursion). We’re instead moving where the ‘cut’ point is. Though this is the case, the general shape/feel of the function is very close, and this style can be used to solve similar kinds of problems. The general pattern is backtracking through conditional multi-branch recursion.
Hopefully this worked example has been useful, and might help illuminate how to approach solving other recursive problems of different shapes and sizes.
Footnotes
-
This is unrelated to the inclusion-exclusion principle which some readers may be familiar with. ↩