By Language
Data Structures in Python
12 essential data structures for Python developers. Each one explained with Big O complexity, animated visuals, and real code samples you can copy.
Array
listFixed-size, contiguous block of memory. Elements are stored sequentially and accessed by index in constant time. The foundation of most other data structures.
arr[0] = 7
Complexity
When to use
- +You know the exact size at creation time
- +You need the fastest possible index-based access
- +Working with fixed-length data like matrices or buffers
import array
# Fixed-type array (like C arrays)
nums = array.array("i", [10, 20, 30, 40, 50])
nums[0] = 99 # O(1) - direct index access
val = nums[2] # O(1) - read by index
nums.append(60) # O(1) amortized
nums.insert(1, 15) # O(n) - shifts elements right
idx = nums.index(30) # O(n) - linear search
nums.remove(15) # O(n) - shifts elements left
sorted_nums = sorted(nums) # O(n log n)Dynamic Array
listResizable array that automatically grows when capacity is exceeded. The most commonly used data structure in most languages. Doubles its internal storage when full, giving amortized O(1) appends.
arr[0] = 7
Complexity
When to use
- +You need a resizable collection (most common case)
- +You frequently access elements by index
- +You mostly add or remove at the end
names = ["Alice", "Bob", "Charlie"]
names.append("Diana") # O(1) amortized
names.insert(1, "Eve") # O(n) - shifts elements
names.pop() # O(1) - remove last
names.pop(0) # O(n) - shifts elements
has = "Bob" in names # O(n) - linear scan
idx = names.index("Bob") # O(n) - first occurrence
names.sort() # O(n log n) - Timsort
squares = [x * x for x in range(10)] # O(n) list comprehensionStack
list (as stack)Last-In-First-Out (LIFO) collection. Only the top element is accessible. Used for tracking state that must be unwound in reverse order.
Push 10
Complexity
When to use
- +Undo/redo functionality
- +Expression parsing and evaluation
- +DFS traversal of trees and graphs
- +Matching brackets, parentheses validation
stack: list[int] = []
stack.append(10) # O(1) - push
stack.append(20)
stack.append(30)
top = stack[-1] # O(1) - peek -> 30
val = stack.pop() # O(1) - pop -> 30
# Interview pattern: valid parentheses
def is_valid(s: str) -> bool:
pairs = {"(": ")", "[": "]", "{": "}"}
st: list[str] = []
for c in s:
if c in pairs:
st.append(pairs[c])
elif not st or st.pop() != c:
return False
return len(st) == 0Queue
collections.dequeFirst-In-First-Out (FIFO) collection. Elements are added at the back and removed from the front. Fundamental for breadth-first processing.
Enqueue 10
Complexity
When to use
- +BFS traversal of trees and graphs
- +Task scheduling and job queues
- +Message passing between components
- +Rate limiting, buffering
from collections import deque
queue: deque[str] = deque()
queue.append("Task A") # O(1) - enqueue
queue.append("Task B")
queue.append("Task C")
first = queue[0] # O(1) - peek front
val = queue.popleft() # O(1) - dequeue -> "Task A"
# Interview pattern: BFS
def bfs(root):
q = deque([root])
while q:
node = q.popleft() # O(1)
print(node.val, end=" ")
if node.left: q.append(node.left)
if node.right: q.append(node.right)Hash Map
dictMaps keys to values using a hash function for near-constant-time lookups. The single most important data structure for coding interviews. Every language has a built-in implementation.
hash("age") = 0
Complexity
When to use
- +Two Sum and frequency counting patterns
- +Caching computed results (memoization)
- +Grouping data by a key
- +Any problem requiring O(1) lookup by key
prices = {"apple": 3, "banana": 5}
prices["cherry"] = 2 # O(1) - add
prices["apple"] = 10 # O(1) - update
has = "banana" in prices # O(1) - key check
del prices["cherry"] # O(1) - remove
count = prices.get("mango", 0) # O(1) - safe lookup
# Interview pattern: Two Sum
def two_sum(nums: list[int], target: int) -> list[int]:
seen: dict[int, int] = {}
for i, num in enumerate(nums):
need = target - num
if need in seen: # O(1) lookup
return [seen[need], i]
seen[num] = i # O(1) insert
return []Hash Set
setUnordered collection of unique elements. Uses hashing internally for O(1) membership testing. Supports mathematical set operations like union, intersection, and difference.
hash("age") = 0
Complexity
When to use
- +Checking if an element exists in O(1)
- +Removing duplicates from a collection
- +Set operations: union, intersection, difference
- +Visited tracking in graph traversal
s = {1, 2, 3, 4, 5}
s.add(6) # O(1)
s.discard(1) # O(1) - no error if missing
has = 4 in s # O(1) -> True
# Set operations
other = {4, 5, 6, 7}
s & other # intersection -> {4, 5, 6}
s | other # union
s - other # difference
# Interview pattern: contains duplicate
def contains_duplicate(nums: list[int]) -> bool:
return len(nums) != len(set(nums))Linked List
collections.dequeSequence of nodes where each node points to the next (singly linked) or both next and previous (doubly linked). Efficient insertion and deletion at any known position, but no index-based access.
traversing: 5
Complexity
When to use
- +Frequent insertion/deletion in the middle
- +Implementing LRU cache (with a hash map)
- +When you need a deque (double-ended queue)
- +Problems involving pointer manipulation
from collections import deque
# deque is a doubly-linked list under the hood
dll: deque[int] = deque()
dll.append(10) # O(1) - add right
dll.appendleft(5) # O(1) - add left
dll.pop() # O(1) - remove right
dll.popleft() # O(1) - remove left
# Interview pattern: reverse a linked list
class ListNode:
def __init__(self, val=0, next=None):
self.val, self.next = val, next
def reverse(head: ListNode | None) -> ListNode | None:
prev, curr = None, head
while curr:
curr.next, prev, curr = prev, curr, curr.next
return prevSorted Set (BST)
SortedList (sortedcontainers)Collection of unique elements maintained in sorted order, typically backed by a balanced binary search tree (red-black tree). Supports range queries and O(log n) min/max.
search(8)
Complexity
When to use
- +Maintaining a sorted collection of unique items
- +Range queries (all elements between X and Y)
- +Sliding window problems needing sorted order
- +Leaderboards, ranking systems
# pip install sortedcontainers
from sortedcontainers import SortedList
sl = SortedList([5, 3, 8, 1, 9])
# Maintains sorted order: [1, 3, 5, 8, 9]
sl.add(4) # O(log n)
sl.discard(3) # O(log n)
minimum = sl[0] # O(1) -> 1
maximum = sl[-1] # O(1) -> 9
# Range query
left = sl.bisect_left(4) # O(log n)
right = sl.bisect_right(8) # O(log n)
between = list(sl[left:right]) # [4, 5, 8]Sorted Map (BST)
SortedDict (sortedcontainers)Key-value pairs maintained in sorted key order, typically backed by a balanced BST. Enables ordered iteration and range lookups that hash maps cannot provide.
search(8)
Complexity
When to use
- +You need sorted key-value pairs
- +Ordered iteration over entries
- +Range lookups by key
- +When insertion order or sorted order matters
# pip install sortedcontainers
from sortedcontainers import SortedDict
sd = SortedDict({"banana": 2, "apple": 5, "cherry": 1})
sd["date"] = 3 # O(log n)
val = sd["apple"] # O(log n) -> 5
del sd["cherry"] # O(log n)
# Iterates in sorted key order
for key, value in sd.items():
print(f"{key}: {value}")
# apple: 5, banana: 2, date: 3
first_key = sd.keys()[0] # O(1) -> "apple"
last_key = sd.keys()[-1] # O(1) -> "date"Priority Queue (Heap)
heapqCollection where elements are dequeued by priority rather than insertion order. Typically implemented as a binary heap. Essential for shortest-path algorithms and top-K problems.
min-heap
Complexity
When to use
- +Dijkstra's shortest path algorithm
- +Merge K sorted lists/streams
- +Top-K / Kth largest element problems
- +Event-driven simulation, scheduling
import heapq
heap: list[int] = []
heapq.heappush(heap, 3) # O(log n)
heapq.heappush(heap, 1)
heapq.heappush(heap, 2)
top = heap[0] # O(1) - peek min -> 1
val = heapq.heappop(heap) # O(log n) - pop min -> 1
# Max-heap: negate values
max_heap: list[int] = []
heapq.heappush(max_heap, -10)
largest = -heapq.heappop(max_heap) # -> 10
# Interview pattern: K closest points
def k_closest(points, k):
heap = [(x*x + y*y, [x, y]) for x, y in points]
heapq.heapify(heap) # O(n)
return [heapq.heappop(heap)[1] for _ in range(k)]Concurrent Hash Map
threading.Lock + dictThread-safe hash map designed for concurrent read/write access from multiple threads. Uses fine-grained locking or lock-free techniques instead of a single global lock.
hash("age") = 0
Complexity
When to use
- +Multi-threaded caching
- +Shared state across threads or async tasks
- +Producer-consumer patterns with keyed data
- +When you need concurrent reads and writes
import threading
class ThreadSafeDict:
def __init__(self):
self._data: dict = {}
self._lock = threading.Lock()
def get(self, key, default=None):
with self._lock:
return self._data.get(key, default)
def set(self, key, value):
with self._lock:
self._data[key] = value
def get_or_add(self, key, factory):
with self._lock:
if key not in self._data:
self._data[key] = factory(key)
return self._data[key]
cache = ThreadSafeDict()
cache.set("counter", 0)Memory View / Slice
memoryviewZero-copy view over a contiguous region of memory. Lets you reference a portion of an array or buffer without allocating new memory. Critical for performance-sensitive parsing and processing.
Span[0..3] = [1, 2, 3]
Complexity
When to use
- +Parsing strings or binary data without copies
- +Processing sub-arrays without allocation
- +High-performance, zero-allocation code paths
- +Interop with native or unmanaged memory
# memoryview - zero-copy view over bytes/bytearray
data = bytearray(b"Hello, World!")
view = memoryview(data)
# Zero-copy slicing
chunk = view[7:12] # O(1) - view of "World"
chunk[0] = ord("E") # mutates original
print(data) # bytearray(b"Hello, Eorld!")
# Efficient binary parsing
header = view[:5] # O(1) - no copy
body = view[7:] # O(1) - no copy
view.release() # free the buffer lockBig O Comparison
Average-case time complexity. * = amortized.
| Structure | Access | Search | Insert | Delete |
|---|---|---|---|---|
| Array | O(1) | O(n) | O(n) | O(n) |
| Dynamic Array | O(1) | O(n) | O(1)* | O(n) |
| Stack | O(n) | O(n) | O(1)* | O(1) |
| Queue | O(n) | O(n) | O(1)* | O(1) |
| Hash Map | O(1) | O(1) | O(1)* | O(1) |
| Hash Set | N/A | O(1) | O(1)* | O(1) |
| Linked List | O(n) | O(n) | O(1) | O(1) |
| Sorted Set | O(n) | O(log n) | O(log n) | O(log n) |
| Sorted Map | O(log n) | O(log n) | O(log n) | O(log n) |
| Priority Queue | O(n) | O(n) | O(log n) | O(log n) |
| Concurrent Map | O(1) | O(1) | O(1)* | O(1) |
| Memory View | O(1) | O(n) | N/A | N/A |
Which collection should I use?
| I need to... | Use |
|---|---|
| Store items by index, resize dynamically | List / Dynamic Array |
| Map keys to values with O(1) lookup | HashMap / Dictionary |
| Track unique items, check existence in O(1) | HashSet / Set |
| Last-in-first-out (undo, DFS, brackets) | Stack |
| First-in-first-out (BFS, task queues) | Queue |
| Keep elements sorted at all times | SortedSet / TreeSet |
| Process items by priority (Dijkstra, top-K) | PriorityQueue / Heap |
| Insert/delete at a known position in O(1) | LinkedList |
| Sorted key-value pairs | SortedDictionary / TreeMap |
| Thread-safe shared cache | ConcurrentDictionary |
| Slice arrays/strings without copying | Span / Slice / memoryview |
Frequently Asked Questions
What are the most important data structures in Python?add
The most commonly used are dynamic arrays (List/ArrayList/vector), hash maps (Dictionary/HashMap/dict), and hash sets. For interviews, also know stacks, queues, trees, and priority queues. These cover 90%+ of coding interview problems.
Which Python data structure should I learn first?add
Start with the dynamic array and hash map. Together they solve the majority of interview problems. Then learn stacks (for DFS, bracket matching) and queues (for BFS). After that, tackle trees, heaps, and graphs.
Does Big O complexity change between languages?add
No. Big O measures algorithmic complexity, not language-specific performance. A hash map lookup is O(1) whether you use Python dict, Java HashMap, or C# Dictionary. Constant factors differ (C++ is faster than Python in wall-clock time), but Big O is the same.
Is there a built-in priority queue in Python?add
Yes. The heapq module provides a min-heap. Push with heapq.heappush() and pop with heapq.heappop(). For a max-heap, negate the values.
Know the structures.
Now use them under pressure.
Practice with an AI interviewer that asks you to implement, optimize, and explain data structure choices in real time.
Try a free mock interviewarrow_forward