Cracking the Coding Interview - Remove Duplicates From Unsorted Linked List

2.1 Write code to remove duplicates from an unsorted linked list. FOLLOW UP: How would you solve this if a temporary buffer is not allowed?

In this post, I’ll cover deduping both sorted and unsorted linked lists.

First, here’s a node class that I’ll use:

class Node:
    def __init__(self, v=None, next_node=None):
        self.v = v
        self.next_node = next_node

    def __repr__(self):
        return f"{self.v} -> {repr(self.next_node)}"

    def __eq__(self, other):
        return repr(self) == repr(other)

Deduping a sorted list

If a list is sorted, the duplicate elements will be grouped together.

For example:

1 -> 1 -> 2 -> 3 -> 3 -> 3

With the solution below, we maintain two pointers (prev_node and curr_node). prev_node is our left pointer and will represent the value of the current “run”. The right pointer curr_node acts as a runner that we’ll use to compare with the left. If they are duplicates, remove the dupe and only move the current pointer. Once there are no more dupes in the current “run”, we can safely update the prev_node value.

def dedupe_linkedlist(head):
    if not head:
        return head
    prev_node = head
    curr_node = head.next_node
    while curr_node:
        curr_value = curr_node.v
        next_node = curr_node.next_node
        if curr_value == prev_node.v:
            prev_node.next_node = next_node
        else: 
            prev_node = curr_node # only update the left pointer if the right pointer is a different element
        curr_node = next_node
    return head

This only works for sorted lists, because in an unsorted list, a duplicate of a left pointer can appear anywhere in the list (and not just in a consecutive sequence of nodes). The extra pointer, however, is unecessary.

Here’s the same approach without using two pointers. Rather than having a separate left pointer, we can just keep a single pointer in place when there are dupes.

def dedupe_linkedlist(head):
    if not head:
        return head
    curr_node = head
    while curr_node and curr_node.next_node:
        value = curr_node.v
        next_node = curr_node.next_node
        if curr_node.v == next_node.v:
            curr_node.next_node = next_node.next_node
        else:
            curr_node = next_node
    return head

Deduping an unsorted list

Here’s two versions: one that uses a temporary “visited” structure and one that doesn’t (in order to meet the “FOLLOW UP” requirements).

Here’s an example using a set datastructure to remember visited nodes:

def dedupe_linkedlist((head):
    curr_node = head 
    visited = {curr_node.v}
    while curr_node and curr_node.next_node:
        next_node = curr_node.next_node
        next_value = next_node.v
        visited.add(next_value) # we can also optimize this by only calling it if the next_value is not visited yet
        if next_value in visited: 
            curr_node.next_node = next_node.next_node 
        else:
            curr_node = next_node
    return head

If we want to do this without any additional memory, we need to have an inner loops / runner that checks for dupes:

def dedupe_linkedlist(head):
    curr = head 
    while curr:
        runner = curr
        while runner and runner.next_node:
            if runner.next_node.v == curr.v:
                runner.next_node = runner.next_node.next_node
            else:
                runner = runner.next_node
        curr = curr.next_node 
    return head

If you’re relatively new to working with singly linked list structures, tackling both traversal and deletion can feel tricky. Practice traversing and deleting separately in isolation, and make sure you verify the mutations with examples during interviews because it’s easy to make pointer mistakes (null pointers, updating to wrong node).

Hope this helps.