12 essential data structures for Python developers. Each one explained with Big O complexity and real code samples you can copy.
## array.array

A typed, contiguous block of memory, Python's closest built-in to a C array. Elements are stored sequentially and accessed by index in constant time. The foundation of most other data structures.
import array
# Fixed-type array (like C arrays)
nums = array.array("i", [10, 20, 30, 40, 50])
nums[0] = 99 # O(1) - direct index access
val = nums[2] # O(1) - read by index
nums.append(60) # O(1) amortized
nums.insert(1, 15) # O(n) - shifts elements right
idx = nums.index(30) # O(n) - linear search
nums.remove(15) # O(n) - shifts elements left
sorted_nums = sorted(nums) # O(n log n)

## list

Resizable array that automatically grows when capacity is exceeded. The most commonly used data structure in most languages. CPython over-allocates the internal storage when it fills up (roughly proportional growth rather than strict doubling), giving amortized O(1) appends.
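The amortized-append claim can be checked directly with sys.getsizeof — a quick sketch (exact byte counts vary by CPython version and platform):

```python
import sys

# Watch CPython over-allocate as a list grows.
sizes = []
lst = []
for i in range(20):
    lst.append(i)
    sizes.append(sys.getsizeof(lst))

# The reported size stays flat between reallocations, then jumps:
# appends are O(1) amortized because reallocation happens rarely.
print(sizes)
```

The plateaus in the output are the spare capacity; only the occasional jump pays the O(n) copying cost.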
names = ["Alice", "Bob", "Charlie"]
names.append("Diana") # O(1) amortized
names.insert(1, "Eve") # O(n) - shifts elements
names.pop() # O(1) - remove last
names.pop(0) # O(n) - shifts elements
has = "Bob" in names # O(n) - linear scan
idx = names.index("Bob") # O(n) - first occurrence
names.sort() # O(n log n) - Timsort
squares = [x * x for x in range(10)] # O(n) list comprehension

## list (as stack)

Last-In-First-Out (LIFO) collection. Only the top element is accessible. Used for tracking state that must be unwound in reverse order.
stack: list[int] = []
stack.append(10) # O(1) - push
stack.append(20)
stack.append(30)
top = stack[-1] # O(1) - peek -> 30
val = stack.pop() # O(1) - pop -> 30
# Interview pattern: valid parentheses
def is_valid(s: str) -> bool:
    pairs = {"(": ")", "[": "]", "{": "}"}
    st: list[str] = []
    for c in s:
        if c in pairs:
            st.append(pairs[c])
        elif not st or st.pop() != c:
            return False
    return len(st) == 0

## collections.deque (as queue)

First-In-First-Out (FIFO) collection. Elements are added at the back and removed from the front. Fundamental for breadth-first processing.
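A short timing sketch of why deque beats a plain list as a queue (absolute times are machine-dependent; only the ratio matters):

```python
from collections import deque
import timeit

n = 20_000

def drain_list():
    q = list(range(n))
    while q:
        q.pop(0)        # O(n) each time: shifts every remaining element

def drain_deque():
    q = deque(range(n))
    while q:
        q.popleft()     # O(1) each time

t_list = timeit.timeit(drain_list, number=1)
t_deque = timeit.timeit(drain_deque, number=1)
print(f"list.pop(0): {t_list:.3f}s  deque.popleft(): {t_deque:.3f}s")
```

Draining the list is quadratic overall, while draining the deque is linear, so the gap widens quickly as n grows.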
from collections import deque
queue: deque[str] = deque()
queue.append("Task A") # O(1) - enqueue
queue.append("Task B")
queue.append("Task C")
first = queue[0] # O(1) - peek front
val = queue.popleft() # O(1) - dequeue -> "Task A"
# Interview pattern: BFS
def bfs(root):
    q = deque([root])
    while q:
        node = q.popleft() # O(1)
        print(node.val, end=" ")
        if node.left: q.append(node.left)
        if node.right: q.append(node.right)

## dict

Maps keys to values using a hash function for near-constant-time lookups. The single most important data structure for coding interviews. Nearly every language has a built-in or standard-library implementation.
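Beyond Two Sum, the other dict pattern worth memorizing is frequency counting; the example text here is arbitrary:

```python
from collections import Counter

# Frequency counting: tally occurrences with get() to avoid KeyError.
text = "the quick brown fox jumps over the lazy dog the end"
freq: dict[str, int] = {}
for word in text.split():
    freq[word] = freq.get(word, 0) + 1   # O(1) per word

print(freq["the"])  # 3

# collections.Counter does the same in one line
assert freq == Counter(text.split())
```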
prices = {"apple": 3, "banana": 5}
prices["cherry"] = 2 # O(1) - add
prices["apple"] = 10 # O(1) - update
has = "banana" in prices # O(1) - key check
del prices["cherry"] # O(1) - remove
count = prices.get("mango", 0) # O(1) - safe lookup
# Interview pattern: Two Sum
def two_sum(nums: list[int], target: int) -> list[int]:
    seen: dict[int, int] = {}
    for i, num in enumerate(nums):
        need = target - num
        if need in seen: # O(1) lookup
            return [seen[need], i]
        seen[num] = i # O(1) insert
    return []

## set

Unordered collection of unique elements. Uses hashing internally for O(1) membership testing. Supports mathematical set operations like union, intersection, and difference.
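One caveat: sets are unordered, so list(set(nums)) loses the original order. Since Python 3.7, dict keys preserve insertion order, giving an O(n) order-preserving dedupe:

```python
# Deduplicate while keeping first-seen order: dict keys are hashed
# (O(1) membership per element) but iterate in insertion order.
nums = [3, 1, 3, 2, 1, 4]
deduped = list(dict.fromkeys(nums))
print(deduped)  # [3, 1, 2, 4]
```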
s = {1, 2, 3, 4, 5}
s.add(6) # O(1)
s.discard(1) # O(1) - no error if missing
has = 4 in s # O(1) -> True
# Set operations
other = {4, 5, 6, 7}
s & other # intersection -> {4, 5, 6}
s | other # union
s - other # difference
# Interview pattern: contains duplicate
def contains_duplicate(nums: list[int]) -> bool:
    return len(nums) != len(set(nums))

## collections.deque (as linked list)

Sequence of nodes where each node points to the next (singly linked) or to both next and previous (doubly linked). Efficient insertion and deletion at any known position, but no constant-time index-based access.
from collections import deque
# deque is a doubly linked list of fixed-size blocks under the hood
dll: deque[int] = deque()
dll.append(10) # O(1) - add right
dll.appendleft(5) # O(1) - add left
dll.pop() # O(1) - remove right
dll.popleft() # O(1) - remove left
# Interview pattern: reverse a linked list
class ListNode:
    def __init__(self, val=0, next=None):
        self.val, self.next = val, next

def reverse(head: ListNode | None) -> ListNode | None:
    prev, curr = None, head
    while curr:
        curr.next, prev, curr = prev, curr, curr.next
    return prev

## SortedList (sortedcontainers)

Collection of elements (duplicates allowed) maintained in sorted order. Despite the classic association with balanced binary search trees, sortedcontainers uses a list of small sorted sublists, which gives the same O(log n) operations with better constant factors. Supports range queries and fast min/max.
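If a third-party dependency is not an option, the standard-library bisect module approximates this on a plain list (the binary search is O(log n), though insertion still shifts elements in O(n)):

```python
import bisect

sl = [1, 3, 5, 8, 9]            # must already be sorted
bisect.insort(sl, 4)            # O(log n) search + O(n) shift
print(sl)                       # [1, 3, 4, 5, 8, 9]

# Range query: elements in [4, 8]
lo = bisect.bisect_left(sl, 4)  # O(log n)
hi = bisect.bisect_right(sl, 8) # O(log n)
print(sl[lo:hi])                # [4, 5, 8]
```

This is fine for modest sizes or infrequent inserts; switch to SortedList when inserts dominate.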
# pip install sortedcontainers
from sortedcontainers import SortedList
sl = SortedList([5, 3, 8, 1, 9])
# Maintains sorted order: [1, 3, 5, 8, 9]
sl.add(4) # O(log n)
sl.discard(3) # O(log n)
minimum = sl[0] # O(log n) -> 1
maximum = sl[-1] # O(log n) -> 9
# Range query
left = sl.bisect_left(4) # O(log n)
right = sl.bisect_right(8) # O(log n)
between = list(sl[left:right]) # [4, 5, 8]

## SortedDict (sortedcontainers)

Key-value pairs maintained in sorted key order. Enables ordered iteration and range lookups that hash maps cannot provide. Internally it pairs a regular dict with a sorted list of keys rather than a balanced BST.
# pip install sortedcontainers
from sortedcontainers import SortedDict
sd = SortedDict({"banana": 2, "apple": 5, "cherry": 1})
sd["date"] = 3 # O(log n)
val = sd["apple"] # O(1) - dict lookup under the hood -> 5
del sd["cherry"] # O(log n)
# Iterates in sorted key order
for key, value in sd.items():
print(f"{key}: {value}")
# apple: 5, banana: 2, date: 3
first_key = sd.keys()[0] # O(log n) -> "apple"
last_key = sd.keys()[-1] # O(log n) -> "date"

## heapq

Priority queue: elements are dequeued by priority rather than insertion order. heapq implements a binary min-heap on top of a plain list. Essential for shortest-path algorithms and top-K problems.
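For one-shot top-K queries, heapq also ships convenience helpers, shown here on made-up scores:

```python
import heapq

scores = [71, 94, 58, 88, 99, 62, 85]

# nlargest/nsmallest avoid manual heap management; roughly O(n log k).
print(heapq.nlargest(3, scores))   # [99, 94, 88]
print(heapq.nsmallest(2, scores))  # [58, 62]
```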
import heapq
heap: list[int] = []
heapq.heappush(heap, 3) # O(log n)
heapq.heappush(heap, 1)
heapq.heappush(heap, 2)
top = heap[0] # O(1) - peek min -> 1
val = heapq.heappop(heap) # O(log n) - pop min -> 1
# Max-heap: negate values
max_heap: list[int] = []
heapq.heappush(max_heap, -10)
largest = -heapq.heappop(max_heap) # -> 10
# Interview pattern: K closest points
def k_closest(points, k):
    heap = [(x*x + y*y, [x, y]) for x, y in points]
    heapq.heapify(heap) # O(n)
    return [heapq.heappop(heap)[1] for _ in range(k)]

## threading.Lock + dict

Thread-safe hash map for concurrent read/write access from multiple threads. Languages like Java ship concurrent maps that use fine-grained locking or lock-free techniques; in Python, the simplest approach is a plain dict guarded by a single threading.Lock.
import threading
class ThreadSafeDict:
    def __init__(self):
        self._data: dict = {}
        self._lock = threading.Lock()

    def get(self, key, default=None):
        with self._lock:
            return self._data.get(key, default)

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def get_or_add(self, key, factory):
        with self._lock:
            if key not in self._data:
                self._data[key] = factory(key)
            return self._data[key]

cache = ThreadSafeDict()
cache.set("counter", 0)

## memoryview

Zero-copy view over a contiguous region of memory. Lets you reference a portion of a bytes-like buffer without allocating new memory. Critical for performance-sensitive parsing and processing.
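Memoryviews pair naturally with struct for binary parsing; the three-field header layout below is invented for illustration:

```python
import struct

# Parse a fixed binary header without copying: unpack_from reads
# directly from the buffer at the given offset.
packet = bytearray(struct.pack("<IHH", 0xDEADBEEF, 80, 443))
view = memoryview(packet)

msg_id, src_port, dst_port = struct.unpack_from("<IHH", view, 0)
print(hex(msg_id), src_port, dst_port)  # 0xdeadbeef 80 443
```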
# memoryview - zero-copy view over bytes/bytearray
data = bytearray(b"Hello, World!")
view = memoryview(data)
# Zero-copy slicing
chunk = view[7:12] # O(1) - view of "World"
chunk[0] = ord("E") # mutates original
print(data) # bytearray(b"Hello, Eorld!")
# Efficient binary parsing
header = view[:5] # O(1) - no copy
body = view[7:] # O(1) - no copy
view.release() # release the view so the bytearray can be resized again

## Complexity cheat sheet

Average-case time complexity. * = amortized.
| Structure | Access | Search | Insert | Delete |
|---|---|---|---|---|
| Array | O(1) | O(n) | O(n) | O(n) |
| Dynamic Array | O(1) | O(n) | O(1)* | O(n) |
| Stack | O(n) | O(n) | O(1)* | O(1) |
| Queue | O(n) | O(n) | O(1)* | O(1) |
| Hash Map | O(1) | O(1) | O(1)* | O(1) |
| Hash Set | N/A | O(1) | O(1)* | O(1) |
| Linked List | O(n) | O(n) | O(1) | O(1) |
| Sorted Set | O(log n) | O(log n) | O(log n) | O(log n) |
| Sorted Map | O(log n) | O(log n) | O(log n) | O(log n) |
| Priority Queue | O(1) (peek) | O(n) | O(log n) | O(log n) |
| Concurrent Map | O(1) | O(1) | O(1)* | O(1) |
| Memory View | O(1) | O(n) | N/A | N/A |
## Which one should I use?

| I need to... | Use |
|---|---|
| Store items by index, resize dynamically | List / Dynamic Array |
| Map keys to values with O(1) lookup | HashMap / Dictionary |
| Track unique items, check existence in O(1) | HashSet / Set |
| Last-in-first-out (undo, DFS, brackets) | Stack |
| First-in-first-out (BFS, task queues) | Queue |
| Keep elements sorted at all times | SortedSet / TreeSet |
| Process items by priority (Dijkstra, top-K) | PriorityQueue / Heap |
| Insert/delete at a known position in O(1) | LinkedList |
| Sorted key-value pairs | SortedDictionary / TreeMap |
| Thread-safe shared cache | ConcurrentDictionary |
| Slice arrays/strings without copying | Span / Slice / memoryview |
## FAQ

**Which data structures are used most often?** The most commonly used are dynamic arrays (list/ArrayList/vector), hash maps (dict/HashMap/Dictionary), and hash sets. For interviews, also know stacks, queues, trees, and priority queues. These cover 90%+ of coding interview problems.

**Where should I start?** Start with the dynamic array and hash map. Together they solve the majority of interview problems. Then learn stacks (for DFS, bracket matching) and queues (for BFS). After that, tackle trees, heaps, and graphs.

**Does Big O depend on the language?** No. Big O measures algorithmic complexity, not language-specific performance. A hash map lookup is O(1) whether you use a Python dict, Java HashMap, or C# Dictionary. Constant factors differ (C++ is faster than Python in wall-clock time), but Big O is the same.

**Does Python have a built-in priority queue?** Yes. The heapq module provides a min-heap. Push with heapq.heappush() and pop with heapq.heappop(). For a max-heap, negate the values.
Practice with an AI interviewer that asks you to implement, optimize, and explain data structure choices in real time.
Try a free mock interview