Useful things to know about recursion
Published on: 2025-10-13
An attempt to document the different kinds of recursion and highlight several common recursive patterns.
By Tom Hulton-Harrop
Topic
The aim of this post is to capture a bunch of useful information about recursion. There’ll be examples to help solidify things, and some suggestions/ideas on when to reach for particular patterns/techniques.
Motivation
This post is for programmers (primarily myself) who don’t regularly use recursion. It’s for those who all too often find it to be rather confusing and mysterious. In my experience, this mystery comes from not having the vocabulary and familiarity to easily read and write recursive code. The more time I’ve spent with recursion, the more I feel we should treat it as an entirely separate paradigm. Writing a recursive function can feel like suddenly switching to a new and completely alien language, which doesn’t behave as you’d expect.
What follows is an attempt to catalog the sorts of recursion that exist, when to use each type, and a handful of common patterns (with examples) that show up repeatedly when solving recursive problems.
This comes off the back of working through a sizeable chunk of Leetcode style problems while preparing for various interviews. The following may be useful in the context of coding interviews/challenges. I make no guarantees about the performance or robustness of these techniques for any kind of production code.
Starting point
This post assumes you’re vaguely familiar with the concept of recursion (a function calling itself), and are aware of the importance of both a base case and a recursive case. There are lots of good resources available that provide an introduction to recursion; if the concept is completely new to you, it might be worth seeking them out before continuing. The idea behind this post is to explore recursion in a bit more detail, and gather related concepts and patterns together in one place.
Recursive mindset
Before we begin, there is a recursive concept known by a few different names worth acknowledging here. The Recursive Book of Recursion refers to it as the ‘leap-of-faith technique’. I’ve also seen it referred to as the ‘Big Recursive Idea (BRI)’ in Think Like a Programmer.
The idea is to assume that the recursive call you make inside your recursive function already works. By making this jump, you write the rest of the code around the call, filling in the remaining code needed to stitch together the smaller result with the partially completed current result. Another way to think about it (which I also learned from Think Like a Programmer) is to imagine replacing the recursive call with an iterative solution (you can actually do this in some simple cases).
The idea is to treat the recursive call as a black box. Forget that it’s the same function as the one you’re actually writing. If you do this, it can make writing a recursive solution easier than trying to manually walk the call chain all the way through (though seemingly a good idea, this often doesn’t give great results, and can lead to more confusion and frustration).
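As a tiny illustration (my example, not from the sections below), here’s the leap-of-faith technique applied to reversing a string: we trust that the recursive call already reverses the smaller input, and only write the glue around it.

```cpp
#include <string>

std::string reverse_str(const std::string& s) {
  if (s.empty()) {
    return s; // base case: an empty string is its own reverse
  }
  // leap of faith: assume reverse_str(s.substr(1)) already returns the
  // tail reversed, then stitch the first character onto the end
  return reverse_str(s.substr(1)) + s.front();
}
```

Notice we never reason about the full call chain; we only check that the base case is right and that one level of stitching is right.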
Contents
1 Structural and generative recursion
1.1 Structural recursion
1.2 Generative recursion
1.3 Identifying structural and generative recursion
2 Recursive control flow
2.1 Single-path recursion
2.1.1 Linear recursion
2.1.2 Branching recursion
2.2 Multi-path recursion (tree recursion)
2.2.1 Multi-branch recursion
2.2.2 Looping recursion
2.3 Recursive control flow summary
3 Pure and impure recursive functions
4 Recursive function return patterns
4.1 Accumulator pattern
4.2 Result aggregation
5 Interfaces to recursion
5.1 Recursive helper function pattern
5.2 Mutual recursion
6 Traversal and reduction/search
6.1 Backtracking
6.1.1 Implicit backtracking
6.1.2 Explicit backtracking (visited)
6.2 Early return/short-circuiting
6.3 Include/exclude
7 Recursive optimisations
7.1 Memoization
7.2 Tail call optimisation
8 Bonus
8.1 head_tail implementation
1 Structural and generative recursion
At the topmost level, there are two distinct kinds of recursion: structural recursion and generative recursion. These terms relate to how a recursive function modifies its inputs at each recursive call.
1.1 Structural recursion
Structural recursion is when we process the data we’re consuming based purely on the structure of the data itself. How we process the data is deterministic. It’s easy to assume ‘structural’ only applies to recursion using what we’d think of as an explicit data structure, but it also applies to numeric (scalar) data types. To make this concrete, let’s see some examples.
The first form of structural recursion is with numeric data. We can use structural recursion to make an incredibly simple function print out a string a set number of times.
void print_n_times(std::string_view message, int n) {
  if (n == 0) {
    return;
  }
  std::println("{}", message);
  print_n_times(message, n - 1);
}
What makes this recursive function structural is the fact that a deterministic step is being taken at each recursive call (n - 1). No matter what number is passed in, we’re always going to decrement it by the same amount. The input is being systematically reduced until it reaches the known base case. We can choose to either increment or decrement the number, but we’re doing so by a fixed amount that is not driven by the value of the data we’re visiting.
The next form of structural recursion is that which deals with a linear data structure (such as a list) as input. The distinction that makes the recursive algorithm structural is how it processes the list, which in our case is one element at a time. This can be achieved in one of two ways: by using the head/tail technique (more on this later), or by effectively recreating a for loop using recursion, passing an index into the list as a parameter.
Let’s create a similar recursive function to before, which accepts a list of strings and prints them out. The first function uses the head/tail approach (for a quick explainer on how the head_tail function is implemented, see the bonus 8.1 head_tail implementation section at the end).
void print_words(std::span<std::string> words) {
  if (words.empty()) {
    return;
  }
  auto [head, tail] = head_tail(words);
  std::println("{}", head);
  print_words(tail);
}
The terminating condition for the above function is when the list of words is empty. This condition is reached by repeatedly splitting off the head of the list and passing the tail (the remaining elements) to the recursive call, leaving a smaller and smaller list each time. The key thing to realize here is that we’re processing the list sequentially and deterministically. The values of the items in the list do not affect how we process the list.
The following implementation is functionally equivalent to the head/tail approach; we’re just using an explicit index to track our position in the list, and stopping when it reaches the size of the list.
void print_words(std::span<std::string> words, int i) {
  if (i == words.size()) {
    return;
  }
  std::println("{}", words[i]);
  print_words(words, i + 1);
}
You can think of the above as a direct transformation from a for loop to a recursive approach.
The last form of structural recursion is tree traversal. This is where we process nodes in a tree and recursively visit the root and all children. Processing terminates when we reach a leaf node. A tree will have a set structure, and as with lists, two trees with the same structure and different values will be processed in the exact same way.
Below is a snippet showing the code for processing a binary tree and printing each node’s value.
struct binary_tree_node_t {
  std::string word;
  binary_tree_node_t* left = nullptr;
  binary_tree_node_t* right = nullptr;
};
void print_tree(binary_tree_node_t* node) {
  if (node == nullptr) {
    return;
  }
  std::println("{}", node->word);
  print_tree(node->left);
  print_tree(node->right);
}
This more or less captures structural recursion. It maps closely to the structure of the input data, and is the kind of recursion we think of when doing things like processing a file/folder hierarchy (though as we saw at the outset, it applies to numeric inputs as well).
One other helpful way to think about structural recursion is whether its parameters are resource based (how much of something is left) or positional (how far through the input data you are). It’s a subtle distinction, but it can help guide the design of a recursive function’s interface (its parameters) based on the problem to be solved. Numeric data tends to fit the resource based style better, whereas lists are naturally aligned with the positional style (the head/tail pattern can be thought of as a way to reframe a positional approach as resource based).
1.2 Generative recursion
Generative recursion covers cases where a recursive function produces new data for each recursive call. The input chosen for the next level of recursion will depend on the value of the data being processed. The most common examples of this are divide and conquer algorithms such as quicksort or binary search.
void quicksort(std::span<int> numbers) {
  if (numbers.size() <= 1) {
    return;
  }
  int pivot = numbers.back();
  auto mid = std::partition(
    numbers.begin(), numbers.end(),
    [pivot](int n) { return n < pivot; });
  std::swap(*mid, numbers.back());
  quicksort(numbers.subspan(0, mid - numbers.begin()));
  quicksort(numbers.subspan((mid - numbers.begin()) + 1));
}
If we take the above quicksort implementation as an example, passing it the list {30, 50, 15, 45, 10, 2, 36} to sort, the first partitioned range will be {30, 2, 15, 10}, with the corresponding half being {50, 45} (36 is the pivot in the center).
[30, 2, 15, 10, 36, 50, 45]
However, if a different pivot was selected, either by selecting a random pivot in the implementation, or if the list was in a different starting configuration, the first and second halves of the range would be quite different (for example, if the input data was {30, 50, 36, 45, 10, 2, 15} and 15 was selected as the pivot, the first partitioned range would be {2, 10}, with the other half of the range being {45, 50, 30, 36}).
[2, 10, 15, 45, 50, 30, 36]
This is what makes this recursive call generative, as opposed to structural. The same reasoning applies to binary search, which generates different inputs based on the target it’s searching for. Generative recursion also applies in cases where we’re traversing a graph structure and certain neighbors may or may not be reachable depending on the data (for example a grid with blocked cells).
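Euclid’s greatest common divisor algorithm (not covered above, but a classic) is perhaps the most compact example of generative recursion: the next input, a % b, is computed from the values themselves rather than from the data’s structure.

```cpp
// Euclid's algorithm: the next input (a % b) is generated from the values
int gcd(int a, int b) {
  if (b == 0) {
    return a;
  }
  // a % b is strictly smaller than b, which is what guarantees termination
  return gcd(b, a % b);
}
```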
1.3 Identifying structural and generative recursion
Now we’ve defined both structural and generative recursion, why is this actually useful to know, and what does it tell us about the algorithms we’re building or trying to understand? The most important factor when dealing with structural vs generative recursion is how to handle the base case. Structural recursion tends to be much simpler, as a base case will naturally arise from the shape of the data being processed (we reach zero, the end of a list, or a leaf node). Generative recursion is more complicated; it’s necessary to prove and validate the terminating condition, as it is not guaranteed by the structure of the data alone. Time complexity can also vary more with generative recursion. With structural recursion, complexity (or performance) is usually tied to the input size, whereas with generative recursion there is no guaranteed relationship between the two. This is why quicksort’s performance can degenerate to O(n²) in a worst case scenario, rather than the average case of O(n log n).
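An extreme illustration of this point (my example, not a practical technique) is the Collatz sequence. The recursion below terminates for every value anyone has ever tried, but proving it terminates for all inputs is a famous open problem, precisely because the next input is generated from the data’s value rather than its structure.

```cpp
// counts the steps the Collatz sequence takes to reach 1; termination is
// conjectured, not guaranteed by the structure of the input
int collatz_steps(long long n) {
  if (n == 1) {
    return 0;
  }
  return 1 + collatz_steps(n % 2 == 0 ? n / 2 : 3 * n + 1);
}
```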
2 Recursive control flow
There unfortunately does not appear to be a well-agreed-upon term for this, but in this context what is meant by recursive control flow is how many times a recursive function is called at each level of recursion.
2.1 Single-path recursion
Single-path recursion is when a recursive function is called no more than once at each level of recursion. There are two varieties of single-path recursion: linear and branching.
2.1.1 Linear recursion
Linear recursion applies to several of the earlier examples we saw (numerical decrement and head/tail). This is where a function calls itself a single time at each level of recursion, and the recursive call appears once in the source code. This leads to a simple call structure; the callstack grows linearly until the base case is reached, and then each nested function call returns.
void print_n_times(std::string_view message, int n) {
  if (n == 0) {
    return;
  }
  std::println("{}", message);
  print_n_times(message, n - 1);
}
Even in this simple case, there’s one additional subtlety which arises from whether we choose to perform the logic in our function before or after the recursive call (this is often referred to as preorder and postorder traversal when discussing tree structures; you may also hear it referred to as head recursion and tail recursion in the context of scalar or list types). There is no observable behavior change when running each example, however if we add some additional output we can see a little more clearly what’s happening internally.
The preorder case looks as follows.
void print_n_times_preorder(
  std::string_view message, int n, int indent) {
  if (n == 0) {
    std::println("{}returning", std::string(indent, ' '));
    return;
  }
  std::println("{}\"{}\"", std::string(indent, ' '), message);
  print_n_times_preorder(message, n - 1, indent + 2);
  std::println("{}returning", std::string(indent, ' '));
}
print_n_times_preorder("hello", 3, 0);
// "hello"
//   "hello"
//     "hello"
//       returning
//     returning
//   returning
// returning
From the output, we can see the action this function takes (printing "hello") happens on the way down the callstack. The recursive function then returns after reaching the base case.
The postorder case is shown below.
void print_n_times_postorder(
  std::string_view message, int n, int indent) {
  std::println("{}calling", std::string(indent, ' '));
  if (n == 0) {
    return;
  }
  print_n_times_postorder(message, n - 1, indent + 2);
  std::println("{}\"{}\"", std::string(indent, ' '), message);
}
print_n_times_postorder("hello", 3, 0);
// calling
//   calling
//     calling
//       calling
//     "hello"
//   "hello"
// "hello"
In this scenario, we see the recursive calls happen first, and on the way back up the callstack the action is taken.
This has wide ranging implications for how we handle return values with recursive functions. We’ll cover recursive function return patterns in much more detail later, but this is just a hint as to what’s to come so we don’t completely gloss over this detail at this point.
2.1.2 Branching recursion
Branching single-path recursion is when more than one recursive call exists in the body of a recursive function, but it is guaranteed that only one of those calls will be made at each level of recursion.
A prime example of this is binary search, shown below (simplified to not return anything).
// low and high represent a half-open range -> [0 - numbers.size())
void binary_search(
  std::span<const int> numbers, int target, int low, int high) {
  if (low >= high) {
    std::println("\"{}\" not found", target);
    return;
  }
  const int mid = low + (high - low) / 2;
  if (numbers[mid] == target) {
    std::println("\"{}\" exists at index: {}", target, mid);
  } else if (numbers[mid] < target) {
    binary_search(numbers, target, mid + 1, high);
  } else {
    binary_search(numbers, target, low, mid);
  }
}
Even though the above has more than one recursive call in the function body, we only ever make one recursive call at each level. The callstack will still be linear, and it will grow in proportion to O(log n).
2.2 Multi-path recursion (tree recursion)
Multi-path recursion (or tree recursion) applies to any recursive function that makes more than one recursive call at each level of recursion. Tree recursion, confusingly, has nothing to do with the tree data structure. The ‘tree’ in the name refers expressly to the call graph that is generated when more than one recursive call is made. This creates a tree shape when the callstack is visualised (this is why the name multi-path recursion can be clearer).
There are two main kinds of multi-path/tree recursion.
2.2.1 Multi-branch recursion
Multi-branch recursion refers to two sorts of cases, conditional and unconditional branches. Unconditional branches are simpler and we’ve previously seen an example of this with our tree traversal function. This is where the recursive calls always happen, regardless of the current state of the callstack (discounting the base case).
void print_tree(binary_tree_node_t* node) {
  if (node == nullptr) {
    return;
  }
  std::println("{}", node->word);
  print_tree(node->left);
  print_tree(node->right);
}
Conditional branches are only called based on the current state of recursion. They explore choices or decisions that need to be made at each level. Multi-branch recursive functions do tend to be more complex, but are also more powerful and flexible.
Below is an example illustrating this concept.
void count_down_repeatedly(int n, int& top) {
  if (n <= 0) {
    top--;
    return;
  }
  std::println("{}", n);
  if (n < top) {
    count_down_repeatedly(n - 1, top);
  } else {
    count_down_repeatedly(n - 1, top);
    count_down_repeatedly(n - 1, top);
  }
}
This function will count down from the input number n, and then count down from the next largest number after it reaches the base case (this behavior relies on n and top being set to the same value initially). This recursive function isn’t pure (explained later), as we mutate one of the function arguments. Its only purpose is to demonstrate conditional multi-branch recursion in a limited setting. For a more representative example, see the parenthesis permutation generation algorithm presented later when backtracking is introduced (see 6.1.1 Implicit backtracking).
2.2.2 Looping recursion
The last kind of tree recursion is looping. This is where we call a recursive function from inside a loop in the body of a recursive function. Loop based recursion most commonly iterates through a recursive structure. It can also be combined with multi-branch recursion. For example, when checking neighbors in a grid, we iterate over all possible directions, but first check if our next position both falls inside the grid (isn’t out of bounds), and is reachable (not blocked or already visited).
If we revisit the earlier tree example, and update the data structure to contain a variable number of children, we can see a short example of what this looks like.
struct tree_node_t {
  std::string value;
  std::vector<tree_node_t*> children;
};
void print_tree(const tree_node_t* node) {
  if (node == nullptr) {
    return;
  }
  std::println("{}", node->value);
  for (auto* child : node->children) {
    print_tree(child);
  }
}
This only really applies in imperative languages where looping constructs are available. It is possible to mimic this kind of looping using the head/tail approach we touched on earlier, or with an index and a technique called mutual recursion (see 5.2 Mutual recursion for more details).
2.3 Recursive control flow summary
These are all the kinds of control flow you’ll encounter when dealing with recursive functions (excluding the slightly more exotic mutual recursion that we’ll cover later).
The general structure breaks down as follows.
Recursive control flow
├── Single-path recursion
│   ├── Linear recursion
│   └── Branching recursion
└── Multi-path recursion (tree recursion)
    ├── Multi-branch recursion
    │   ├── Conditional
    │   └── Unconditional
    └── Loop recursion
3 Pure and impure recursive functions
We’re going to briefly touch on this topic before looking at how we return values from recursive functions. So far, the recursive functions we’ve seen haven’t returned a value, they’ve either modified global state (in our case IO when printing values), or mutated the input arguments of the function (as with quicksort). The second case, where input data is modified, is more representative of impure functions out in the wild.
Pure functions, by comparison, do not modify external state, and simply return a new value. There are cases where arguments are modified as part of the recursive call, but these tend to be an implementation detail, and the caller’s data is not mutated. Another way to define pure functions is that they do not have side effects; calling them does not change external state in any way. They have what is called referential transparency, where the same input (parameter) is guaranteed to produce the same output (return value).
Both pure and impure recursive functions have their place, some well known examples of impure functions are maze generation (where a grid has passages ‘carved’ into it as the recursive algorithm runs), flood fill (think of the paint bucket in a 2D drawing application) to update a bitmap image of some kind, in-place sort algorithms, and problems such as N-Queens and Sudoku solver. Pure recursive functions include examples such as Fibonacci, factorial, generation of permutations, subsets and combinations, and binary search.
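As a sketch of what an impure recursive function looks like in practice, here’s a minimal flood fill over a grid of ints (names and grid representation are my own choices for this example): the whole point of the function is its side effect of mutating the caller’s grid.

```cpp
#include <vector>

// impure: the grid argument is mutated in place (the side effect is the point)
void flood_fill(std::vector<std::vector<int>>& grid, int row, int col,
                int from, int to) {
  if (row < 0 || row >= static_cast<int>(grid.size()) ||
      col < 0 || col >= static_cast<int>(grid[0].size())) {
    return; // out of bounds
  }
  if (grid[row][col] != from || from == to) {
    return; // wrong color to replace, or nothing to do
  }
  grid[row][col] = to;
  flood_fill(grid, row + 1, col, from, to);
  flood_fill(grid, row - 1, col, from, to);
  flood_fill(grid, row, col + 1, from, to);
  flood_fill(grid, row, col - 1, from, to);
}
```

Note the function returns nothing; the result of the computation is only observable through the mutated grid.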
4 Recursive function return patterns
The following section describes the internal computation strategies recursive functions use to build the results they return. There are two main ways recursive functions build values to return: top down, widely known as the accumulator pattern, and bottom up, which we’ll call result aggregation. There are pros and cons to each approach, and both are worth knowing about. Mixed into this discussion are the subtleties around returning multiple values from a function, which also differ depending on whether we’re using the accumulator pattern or result aggregation. We’ll also touch on how the accumulator pattern and tail recursion are closely related.
4.1 Accumulator pattern
The accumulator pattern introduces an additional parameter to a recursive function which is used to build up the return value as recursion proceeds down the callstack. The computation happens during the descent, with the modified accumulator passed forward at each level of the recursion. When the base case is reached, we simply return the accumulated value.
A simple way to demonstrate this is with a recursive implementation of the sum function.
int sum(std::span<const int> collection, int acc) {
  if (collection.empty()) {
    return acc;
  }
  auto [head, tail] = head_tail(collection);
  return sum(tail, head + acc);
}
We use the same head/tail pattern from print_words, and at each recursive call we take the current head and add it to the accumulator. One advantage of this approach is that in certain contexts (this depends on the language and compiler), we can take advantage of a technique called tail call optimisation (see 7.2 Tail call optimisation). This is an optimisation technique and there is no guarantee the compiler will implement it.
The values of head and acc are shown below as the recursion descends.

# head + acc
5 + 0
4 + 5
3 + 9
2 + 12
1 + 14
15
When dealing with a collection of scalar values, it’s possible to swap the accumulator for an out parameter (a ‘collector’), which can have values appended to it (say if finding all even numbers). The tricky part is when dealing with a list of list values, where we expect different configurations for each internal value. This is common in such problems as combinations, subsets and permutations. In this scenario, both an accumulator (the parameter building the internal value) and an out parameter (to hold the combined results) are needed.
One way to demonstrate the combined case is with a recursive function to generate all binary strings of length n.
void binary_strings_acc(
  int n, std::string acc, std::vector<std::string>& result) {
  if (n == 0) {
    result.push_back(acc);
    return;
  }
  binary_strings_acc(n - 1, acc + "0", result);
  binary_strings_acc(n - 1, acc + "1", result);
}
Here we can see we keep the accumulator value as before, but have an additional out parameter, result. The recursive function now doesn’t return anything directly. The computation happens as we move down the callstack (the acc + "0" and acc + "1" lines). We decrement n at each call to ensure the base case condition is met, and when it reaches zero, the function returns. It’s at this point that we append the accumulated value to the result out parameter.
This style of recursion definitely feels more natural to imperative programmers, but it can sometimes lead to more convoluted solutions when compared to the alternative means of returning from recursive functions below.
It’s also worth briefly mentioning that the interface to this kind of recursion can be improved by following the helper function pattern (see the 5.1 Recursive helper function pattern section).
4.2 Result aggregation
The alternative to the accumulator pattern, which tends to be a more natural fit with recursion, is result aggregation (also known as head recursion). In this case, we proceed all the way down the callstack first, before performing any computation on the return value. When we reach the base case and return, computation occurs as the recursive calls unwind. On the way back up, at each level a partial result is computed from the results of the previous recursive calls. The final return value will include all previous results aggregated together.
For comparison, we can see the equivalent sum implementation using result aggregation (or head recursion).
int sum(std::span<const int> collection) {
  if (collection.empty()) {
    return 0;
  }
  auto [head, tail] = head_tail(collection);
  return head + sum(tail);
}
We use the head/tail pattern again, only this time the recursive call is made before the sum is computed. This has the effect of recursing until we hit the base case and return 0. We then return whatever the current head value is added to 0, then the previous head added to that value, and so on, until we reach the top of the callstack of the recursive function. This is what we mean when we say the computation happens during the ascent.
The values of head and sum(tail) are shown below as the recursive function returns.

# head + sum(tail)
5 + 0
4 + 5
3 + 9
2 + 12
1 + 14
15
Similar to the accumulator pattern above, we can see what a result aggregation version of binary_strings would look like.
std::vector<std::string> binary_strings(int n) {
  if (n == 0) {
    return {{}};
  }
  auto partials = binary_strings(n - 1);
  std::vector<std::string> result;
  for (const auto& partial : partials) {
    result.push_back("0" + partial);
    result.push_back("1" + partial);
  }
  return result;
}
In this version, we make the recursive call before performing any computation. In the base case, we return a vector containing a single empty string; this is to ensure the vector has a size of 1 ({{}} is just shorthand for std::vector<std::string>(1, "")). If we wanted to ignore this particular base case, we’d simply return an empty vector. We then iterate over the returned vector of strings, and build a new result vector by prepending "0" and "1" to each element. This bubbles up to the next invocation, which prepends "0" and "1" to each of the previous results, turning "0" and "1" into "00", "10", "01" and "11". This repeats until all permutations are generated.
Result aggregation naturally fits recursion, and provides the ability to write a pure function with no side effects. There’s often no need for a wrapper/helper function either. Result aggregation can be a little more difficult to understand if it’s not something you encounter often (this is definitely my experience), though the patterns relating to traversal and reduction show up again and again with different recursive problems, so gaining familiarity with the technique makes solving these problems easier.
5 Interfaces to recursion
There are several useful patterns that control how the outside world interacts with recursive functions, and how they interact with one another. Some are outside the scope of this article, but two useful ones to know about are the helper function pattern, and mutual recursion.
5.1 Recursive helper function pattern
More often than not, a recursive call will need more information as part of its interface to pass down to subsequent calls than we’d like to expose to the caller. To work around this, it’s generally advised to write a wrapper around the recursive function to provide a clean interface to the outside world, and set up the arguments of the actual recursive call inside the helper.
For example, if we take a variation of the sum function we looked at earlier, we can rewrite it to use an index instead of the head/tail approach (this is essentially a recursive way of mimicking a for loop).
int sum(std::span<const int> collection, int i) {
  if (i == collection.size()) {
    return 0;
  }
  return collection[i] + sum(collection, i + 1);
}
The problem is that to use this, the caller would need to pass 0 along with the collection they’d like to sum.
std::vector<int> some_numbers = {1, 2, 3, 4, 5};
int total = sum(some_numbers, 0);
In this particular case we could add a default parameter, but it still risks users passing a value other than 0. A much better approach is simply to wrap the call in a helper function, and make that the public interface.
int sum_internal(std::span<const int> collection, int i) {
  if (i == collection.size()) {
    return 0;
  }
  return collection[i] + sum_internal(collection, i + 1);
}

int sum(std::span<const int> collection) {
  return sum_internal(collection, 0);
}
This technique is particularly effective when applied to recursive functions using the accumulator pattern, or those using memoization (see 7.1 Memoization section).
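To sketch how the two combine (a hypothetical example, not code from the sections above): a memoized Fibonacci can hide its cache behind the public function, with the internal function threading the cache through each call.

```cpp
#include <cstdint>
#include <unordered_map>

std::uint64_t fib_internal(
  int n, std::unordered_map<int, std::uint64_t>& memo) {
  if (n <= 1) {
    return n; // base cases: fib(0) == 0, fib(1) == 1
  }
  if (const auto it = memo.find(n); it != memo.end()) {
    return it->second; // already computed, return the cached value
  }
  const auto result = fib_internal(n - 1, memo) + fib_internal(n - 2, memo);
  memo[n] = result;
  return result;
}

// public interface: callers never see the cache
std::uint64_t fib(int n) {
  std::unordered_map<int, std::uint64_t> memo;
  return fib_internal(n, memo);
}
```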
5.2 Mutual recursion
Mutual recursion describes when two separate recursive functions call one another. This pattern is more relevant for functional languages that do not support loops, but it can crop up from time to time in other recursive algorithms. It’s particularly common in parsing.
A quick example is shown below using the same tree_node_t
type from the looping recursion example.
// forward declaration needed, as process_children calls process_tree
void process_tree(const tree_node_t* tree);

void process_children(std::span<tree_node_t* const> children) {
  if (children.empty()) {
    return;
  }
  auto [head, tail] = head_tail(children);
  process_tree(head);
  process_children(tail);
}

void process_tree(const tree_node_t* tree) {
  if (tree == nullptr) {
    return;
  }
  std::println("{}", tree->value);
  process_children(tree->children);
}
In the example above, instead of using a for loop to iterate over the tree’s child nodes, we use a separate recursive function (process_children), which uses the head/tail pattern to work its way through the children, calling process_tree on each one in turn.
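The textbook minimal demonstration of mutual recursion (included here purely as an illustration) is a pair of even/odd functions defined in terms of one another:

```cpp
bool is_odd(int n); // forward declaration so is_even can call it

bool is_even(int n) {
  if (n == 0) {
    return true; // zero is even
  }
  return is_odd(n - 1);
}

bool is_odd(int n) {
  if (n == 0) {
    return false; // zero is not odd
  }
  return is_even(n - 1);
}
```

Note the forward declaration: in C++, one of the two functions always needs declaring up front so the pair can refer to each other.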
6 Traversal and reduction/search
There are a handful of really useful traversal (all matches) and reduction (single match) patterns that show up over and over again in recursive algorithms. These are different from the purely structural techniques we encountered at the start (head/tail, index, etc.). The focus of traversal and reduction is on the generation of the results (return values), rather than simply exploring a data structure recursively.
6.1 Backtracking
Backtracking is very common in recursive functions, and comes about when recursive functions make decisions about which branches to explore. This implies that for a recursive function to use backtracking, it must make use of conditional multi-branch recursion (introduced in 2.2.1 Multi-branch recursion). By extension, this means an algorithm not using conditional multi-branch recursion cannot use backtracking.
Backtracking comes in several forms, with both implicit and explicit tracking of explored branches. We’ll briefly cover these various forms with examples.
6.1.1 Implicit backtracking
Implicit backtracking occurs automatically when branches are explored and the recursive calls return. When the recursive call returns, it automatically ‘undoes’ the action it took on the previous path, so it can explore a different one.
A textbook example of this particular form of backtracking is generating all permutations for a fixed number of open and closed parentheses.
// forward declaration, as the helper is defined below its caller
std::vector<std::string> generateParentheses_internal(int left, int right);

std::vector<std::string> generateParentheses(int n) {
  if (n == 0) {
    return {};
  }
  return generateParentheses_internal(n, n);
}
std::vector<std::string> generateParentheses_internal(int left, int right) { if (left == 0 && right == 0) { return {{}}; } std::vector<std::string> result; if (left > 0) { for (const auto& partial : generateParentheses_internal(left - 1, right)) { result.push_back("(" + partial); } } if (right > left) { for (const auto& partial : generateParentheses_internal(left, right - 1)) { result.push_back(")" + partial); } } return result;}
We make use of the recursive helper function pattern to track two separate variables for the left and right parentheses (we also handle the degenerate case where 0 is passed to the function). The function uses multi-path (tree) recursion with multiple conditional branches. Rather than using a position-based formulation (tracking where we are in the input data), we’re using a resource-based formulation (how many of each kind of parenthesis we have remaining). This algorithm will only produce matched open and closing parentheses (e.g. for generateParentheses(3), we’d get ()()() and ((())) etc., but not )()()(). This is due to the if (right > left) check, which guarantees we can’t place a right parenthesis before the corresponding left parenthesis. If we instead changed the condition to right > 0, we’d generate all permutations of parentheses. This is an example of pruning, a feature of backtracking that prevents the recursion from exploring future branches it knows can’t lead to a valid solution. This is much more efficient than exploring all possible branches and then filtering the results afterwards.
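To make the pruning concrete, here’s a condensed restatement of the function above (gen is just shorthand for generateParentheses_internal) that can be checked directly: gen(3, 3) produces exactly the 5 balanced strings, rather than all 20 arrangements of three open and three closing parentheses.

```cpp
#include <string>
#include <vector>

// Condensed sketch of the pruned generator: gen(left, right) returns all
// valid completions given `left` '(' and `right` ')' still to place.
std::vector<std::string> gen(int left, int right) {
  if (left == 0 && right == 0) {
    return {""};  // one valid (empty) completion
  }
  std::vector<std::string> result;
  if (left > 0) {  // we still have a '(' to place
    for (const auto& partial : gen(left - 1, right)) {
      result.push_back("(" + partial);
    }
  }
  if (right > left) {  // pruning: a ')' is only legal if a '(' is outstanding
    for (const auto& partial : gen(left, right - 1)) {
      result.push_back(")" + partial);
    }
  }
  return result;
}
```

Relaxing the pruning condition to `right > 0` would instead yield all 20 arrangements.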
The key to understanding why this algorithm uses backtracking is that we try a path, check whether it’s valid, and then undo that change to explore other possible alternatives. For example, if we reach what will become (()()), we hit the base case and start returning; when we reach the fourth character (index 3), instead of placing a (, we undo that change and place a ), which allows us to explore a separate branch where we then add () again, resulting in (())() as another valid option. Due to our use of result aggregation (using head recursion to return the values), the order these results get built up in is a little counterintuitive, but it’s still the base case that determines when we’ve found a match.
6.1.2 Explicit backtracking (visited)
The other common form of backtracking is where we explicitly track values or positions we’ve already seen (visited), and then clear or unset those values as we undo the action after a recursive function returns.
Below is an example demonstrating how to find all possible routes from the starting location, (0, 0), to the destination, (size - 1, size - 1), of a grid (U, D, L, R are returned to indicate movement up, down, left and right).
struct coord_t {
  int row;
  int col;
};

std::string directions_str = "DRLU";
std::vector<coord_t> directions = {{1, 0}, {0, 1}, {0, -1}, {-1, 0}};

// required by std::set
bool operator<(const coord_t& lhs, const coord_t& rhs) {
  return std::tie(lhs.row, lhs.col) < std::tie(rhs.row, rhs.col);
}

bool is_valid(coord_t coord, int rows, int cols, std::set<coord_t>& visited) {
  return coord.row >= 0 && coord.row < rows && coord.col >= 0
      && coord.col < cols && visited.count(coord) == 0;
}

std::vector<std::string> all_routes(
  std::vector<std::vector<bool>>& grid, coord_t coord,
  std::set<coord_t>& visited) {
  if (!is_valid(coord, grid.size(), grid[0].size(), visited)) {
    return {};
  }
  if (coord.row == grid.size() - 1 && coord.col == grid[0].size() - 1) {
    return {{}};
  }
  visited.insert(coord);
  std::vector<std::string> result;
  for (int i = 0; i < directions.size(); i++) {
    auto next_coord = coord_t{
      .row = coord.row + directions[i].row,
      .col = coord.col + directions[i].col};
    auto partials = all_routes(grid, next_coord, visited);
    for (auto& partial : partials) {
      partial.insert(partial.begin(), directions_str[i]);
    }
    result.insert(result.end(), partials.begin(), partials.end());
  }
  visited.erase(coord);
  return result;
}
The key lines are the visited.insert(coord) and visited.erase(coord) calls. Each time we enter a cell, we add it to our visited set. The is_valid test includes checking that we are not attempting to visit a node we have already visited as part of this branch. If/when the destination is reached, we return a vector with a single empty string, so the current direction can be prepended immediately after the recursive call. When the recursive call returns, we backtrack and undo having visited the cell, so it may be included in subsequent paths.
If we run this on a 3x3 grid, we’ll see the following results printed.
["DDRR", "DDRURD", "DDRUURDD", "DRDR", "DRRD", "DRURDD", "RDDR", "RDRD", "RDLDRR", "RRDD", "RRDLDR", "RRDLLDRR"]
In this specific example, we could have forgone the visited set, and instead used the grid itself, by updating each cell to true to indicate it has been visited. Depending on the data we’re working with, though, this isn’t always possible. Both are valid options depending on the scenario.
6.2 Early return/short-circuiting
Early return/short-circuiting is a technique used across many kinds of recursive algorithms (it applies to both traversal and reduction/search). The general idea is to cease execution as soon as a success condition is met, or to avoid wasted work (pruning in backtracking is one such use of early return/short-circuiting). Examples of the former include algorithms such as binary search (where we return as soon as we find the target value) and path finding in graph traversal (where we stop as soon as we reach the target node).
6.3 Include/exclude
Include/exclude is an incredibly useful technique. The pattern is a form of binary recursion (we usually, though not always, have two recursive calls). The idea is to recursively process the input data one element at a time, by either including or excluding the foremost value. When the input data is a list this is easy, it’s just the head of the list (the first element).
A good example to demonstrate this technique is finding which numbers from a list sum together to produce a given target value.
std::vector<std::vector<int>> sum_to_total_without_repeats(
  std::span<int> numbers, int target) {
  if (target == 0) {
    return std::vector<std::vector<int>>(1, std::vector<int>());
  } else if (numbers.empty() || target < 0) {
    return std::vector<std::vector<int>>();
  }
  auto [head, tail] = head_tail(numbers);
  const auto including = sum_to_total_without_repeats(tail, target - head);
  std::vector<std::vector<int>> result(including.size());
  std::transform(
    including.begin(), including.end(), result.begin(),
    [head](auto included) {
      included.insert(included.begin(), head);
      return included;
    });
  const auto excluding = sum_to_total_without_repeats(tail, target);
  result.insert(result.end(), excluding.begin(), excluding.end());
  return result;
}
Let’s walk through the sections in the above code. The first base case is the success case, target == 0, which indicates the numbers from the input list processed so far sum to the target value. We invert things by subtracting instead of adding so we do not require a secondary accumulator value; we can then just test target against 0.
The second base case is when the recursive call fails to produce the target value. The first failure case is if we ran out of numbers to add (with the input {1, 2, 3, 4} and target value 5, in the case where we visit 4 first, there are then no more numbers to process/add, so we hit this case). The second failure case is if we overshoot our target (if we proceed from 1, and add 2 and 3, we’ll have a total of 6; because we subtract as we go, target ends up at -1, so target will be less than 0).
We next have the familiar head/tail pattern (ideal for handling lists of data), immediately followed by the first recursive call to sum_to_total_without_repeats. We pass the tail of the list as the first argument, to ensure our input shrinks, but we also subtract the value of head from our remaining target. By doing this, we include head in this recursive branch; we are factoring it into the computation.
The following section is where we build our intermediate results. We take the result of the recursive call and iterate over each vector of integers contained, prepending our current head value. This is the crux of most recursive algorithms: we assume sum_to_total_without_repeats has returned the correct result for the smaller list and smaller target, and we then add the current head to the front of each element in the list. Because of our base cases, if there wasn’t a match for the particular combination, we’ll never perform the logic in the std::transform call, as including will be empty.
Last but not least, we have the second recursive call, which counts as the exclude call. The reason we refer to this as exclude is because we’re not including the current head value we’ve pulled off our list at this level of the recursion. We’re effectively skipping it, passing in the remaining tail without using it at all.
We finally append the excluded results from the second recursive call to our previous results variable, and then return the concatenated result. This pattern allows us to visit all combinations of the numbers from our input. It turns out to be super useful, and can be used for generating permutations, combinations and subsets, which comes in handy when solving a range of recursive problems.
7 Recursive optimisations
There are two main approaches to speed up the running of recursive solutions. Each applies in slightly different scenarios, but both are useful to be aware of.
7.1 Memoization
Memoization is the process of caching previously computed values so a recursive function does not need to make repeated calls to calculate solutions it already knows the answer to. The most famous example of this is Fibonacci. The recursive solution to Fibonacci is as follows.
int fibonacci(int n) {
  if (n <= 1) {
    return n;
  }
  return fibonacci(n - 1) + fibonacci(n - 2);
}
This recursive function is an example of multi-path, unconditional-branch recursion, resulting in tree recursion (many branching paths). The issue with this naive solution is we end up recomputing the same values over and over again (e.g. fibonacci(5) calculates fibonacci(2) three times, and fibonacci(3) twice).
Fib 5
├── Fib 4
│   ├── Fib 3
│   │   ├── Fib 2
│   │   │   ├── Fib 1
│   │   │   └── Fib 0
│   │   └── Fib 1
│   └── Fib 2
│       ├── Fib 1
│       └── Fib 0
└── Fib 3
    ├── Fib 2
    │   ├── Fib 1
    │   └── Fib 0
    └── Fib 1
The runtime of this function is O(2ⁿ), which gets out of hand insanely quickly.
To avoid this repeated work, we can introduce a cache (essentially just a collection of key/value pairs). In C++ this might look something like the following.
int fibonacci_internal(int n, std::unordered_map<int, int>& cache);

int fibonacci(int n) {
  std::unordered_map<int, int> cache;
  return fibonacci_internal(n, cache);
}

int fibonacci_internal(int n, std::unordered_map<int, int>& cache) {
  if (n <= 1) {
    return n;
  }
  if (auto it = cache.find(n); it != cache.end()) {
    return it->second;
  }
  int partial =
    fibonacci_internal(n - 1, cache) + fibonacci_internal(n - 2, cache);
  cache.insert({n, partial});
  return partial;
}
If we log invocations to the recursive function, we now get the following.
Fib 5
├── Fib 4
│   ├── Fib 3
│   │   ├── Fib 2
│   │   │   ├── Fib 1
│   │   │   └── Fib 0
│   │   └── Fib 1
│   └── Fib 2
└── Fib 3
This takes the algorithm from a time complexity of O(2ⁿ) to O(n) (space complexity is also O(n)).
It’s possible to apply this technique to any problem where there are overlapping subproblems (where the same subproblem is solved multiple times). The key used for the cache must be based exclusively on the parameters of the function. One small detail to be aware of: the head/tail technique, where the tail is passed as the reduced parameter, isn’t a great fit for memoization, as a span doesn’t work well as a key for a map/hash-table data structure. It’s instead better to use an offset, or position through the list. Depending on the problem, it may be necessary to also include the size (offset + size) or target in the key, to guarantee uniqueness based on the state of the recursion.
7.2 Tail call optimisation
The accumulator pattern discussed earlier in this article can be combined with a compiler optimisation to eliminate the call stack overhead of recursive functions. To achieve this, the last operation in the recursive case of a function must be the recursive call, and the function must be able to return its result immediately when reaching the base case (as no stack unwinding will take place); this is why an accumulator parameter is required. Tail call optimisation is often employed in purely functional languages, but it may not be supported in certain imperative languages (e.g. Java, JavaScript, Python), though most modern C++ compilers do implement it, so your mileage may vary.
The sum function we saw earlier in the accumulator pattern section is a perfect example of this.
int sum(std::span<const int> collection, int acc) {
  if (collection.empty()) {
    return acc;
  }
  auto [head, tail] = head_tail(collection);
  return sum(tail, head + acc);
}
Notice how the last operation in the recursive part of the function is a recursive call (no logic at the current level happens after it executes), and it passes through an accumulator value which it returns directly in the base case.
If we take a quick look at both implementations in Compiler Explorer, we can see the big difference in the assembly code. The smoking gun that the tail recursive version has had the tail call optimisation applied is that the call to sum in the tail recursive version uses a jmp instruction, as opposed to the head recursive version, which allocates space on the call stack and then uses call to invoke the function again (there’s some clever loop unrolling going on in the accumulator version too).
The kinds of problems that lend themselves to tail recursion are often not the ones that benefit most from a recursive approach (e.g. backtracking), so chances are writing the algorithm iteratively will make the most sense in an imperative language, though functional languages benefit greatly from this approach and optimisation.
8 Bonus
8.1 head_tail implementation
This section is off-topic when it comes to recursion, but is included for completeness to briefly explain the head_tail implementation for those who are curious.
Throughout this article, a number of the examples use the incredibly convenient C++20 library type std::span. It allows us to create a non-owning view of a contiguous sequence (most likely an array or std::vector), without needing to use a pointer and a size. By using front and subspan, we can get a reference to the first element, and then a new range (or span) of the remaining elements (the tail). No copies are made; we’re simply adjusting some offsets to alter the view of the elements.
This can be a little verbose to type, and it can be easy to forget to take a reference to the head of the list and accidentally make a copy. Thankfully, this is where our little helper function called head_tail comes in.
template<typename Range>
auto head_tail(Range&& r) {
  std::span s{r};
  return std::pair<decltype(s.front()), decltype(s.subspan(1))>(
    s.front(), s.subspan(1));
}
The function template uses a forwarding reference to pass a range to an internal std::span, and then ensures the types of the std::pair to be returned match those of the span (this is so we don’t inadvertently make a copy when returning the head). It’s essentially syntactic sugar for the following.
auto& head = collection.front();
auto tail = collection.subspan(1);
With the helper, this becomes:
auto [head, tail] = head_tail(list);
This helps prevent errors and makes the examples a little more succinct. We can also add an overload for std::string_view, which can be quite useful as well.
auto head_tail(std::string_view s) {
  return std::pair<decltype(s.front()), decltype(s.substr(1))>(
    s.front(), s.substr(1));
}
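As a quick usage sketch of the std::string_view overload (reverse_str is a hypothetical example of my own, with the overload reproduced so the snippet stands alone), recursing over strings becomes pleasantly terse:

```cpp
#include <string>
#include <string_view>
#include <utility>

// The std::string_view head_tail overload from the article, reproduced here
// so the example is self-contained.
auto head_tail(std::string_view s) {
  return std::pair<decltype(s.front()), decltype(s.substr(1))>(
    s.front(), s.substr(1));
}

// Recursively reverse a string: reverse the tail, then append the head.
std::string reverse_str(std::string_view s) {
  if (s.empty()) {
    return "";  // base case: nothing left to reverse
  }
  auto [head, tail] = head_tail(s);
  return reverse_str(tail) + head;
}
```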
This is very much bonus/auxiliary content and not super relevant when it comes to everything else recursion related mentioned above.