What is a Container?

In Python, a Container is a data structure that can hold other objects.

# Various containers
my_list = [1, 2, 3, 4, 5]           # List
my_tuple = (1, 2, 3)                # Tuple
my_set = {1, 2, 3}                  # Set
my_dict = {"name": "Python"}        # Dictionary
my_string = "hello"                 # String is also a container!

They all share the common trait of holding multiple pieces of data.

Container Characteristics

Containers have two important characteristics.

1. Membership Test

numbers = [1, 2, 3, 4, 5]

# Check "is this value in the container?"
print(3 in numbers)      # True
print(10 in numbers)     # False
print(10 not in numbers) # True

Being able to use the in operator is the first characteristic of containers.
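
The in operator is backed by the __contains__ method. As a minimal sketch (EvenNumbers is a hypothetical class invented for illustration, not part of the standard library), any object that defines __contains__ supports membership tests and is recognized by collections.abc.Container:

```python
from collections.abc import Container


class EvenNumbers:
    """Hypothetical container-like class: 'holds' every even integer."""

    def __contains__(self, value):
        # Called by the `in` operator
        return isinstance(value, int) and value % 2 == 0


evens = EvenNumbers()
print(4 in evens)                     # True
print(7 in evens)                     # False
print(isinstance(evens, Container))   # True
print(isinstance([1, 2], Container))  # True
```

Note that this class is not iterable, so it only satisfies the first of the two characteristics described here.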

2. Iterable

# Can be accessed one by one with a for loop
for num in numbers:
    print(num)

You can traverse container elements one by one.

Python's Main Containers

Container  Ordered     Duplicates  Mutable  Purpose
list       Yes         Yes         Yes      Ordered data collection
tuple      Yes         Yes         No       Immutable data
set        No          No          Yes      Unique data
dict       Yes (3.7+)  No (keys)   Yes      Key-value pairs
str        Yes         Yes         No       Character sequence
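
A few of the properties in this table can be checked directly. The snippet below is only an illustrative sketch; the variable names are chosen here for the example:

```python
# Duplicates: a list keeps them, a set drops them
with_duplicates = [1, 1, 2]
unique = set(with_duplicates)
print(len(with_duplicates), len(unique))  # 3 2

# Mutability: a tuple rejects item assignment
point = (1, 2)
try:
    point[0] = 99
except TypeError:
    print("tuple is immutable")

# dict keeps insertion order (guaranteed since Python 3.7) and keys are unique
d = {"b": 1, "a": 2}
d["b"] = 3       # Overwrites the value but keeps the key's original position
print(list(d))   # ['b', 'a']
```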

Among these, the most commonly used and important is List!

List - The Representative Container

Lists are the most frequently used container in Python.

numbers = [1, 2, 3, 4, 5]
fruits = ["apple", "banana", "orange"]
mixed = [1, "hello", 3.14, True]  # Can hold different types!

List Internal Structure

How Lists Look in Memory

How does Python store a list in memory?

my_list = [10, 20, 30]

In CPython, a list is implemented as an array, but unlike a typical array it stores pointers (references) to objects rather than the values themselves.

Memory structure:

List object:
┌─────────────┐
│ size: 3     │  ← Current element count
│ capacity: 4 │  ← Allocated space size
│ items: ───┐ │  ← Pointer to actual elements
└───────────┼─┘
            │
            ▼
Element array:
┌────┬────┬────┬────┐
│ •  │ •  │ •  │    │
└─┼──┴─┼──┴─┼──┴────┘
  │    │    │
  ▼    ▼    ▼
  10   20   30

Lists maintain two pieces of information.
- size: Actual number of elements
- capacity: Allocated memory space size

Why is Capacity Larger Than Size?

Reallocating memory every time an element is added would be too slow.
So Python pre-allocates extra space.

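# The capacities below are what current CPython allocates (an implementation detail)
my_list = []  # capacity: 0
my_list.append(1)  # capacity: 4 (allocates 4 slots at once)
my_list.append(2)  # capacity: 4 (still has room)
my_list.append(3)  # capacity: 4
my_list.append(4)  # capacity: 4
my_list.append(5)  # capacity: 8 (out of space, so a larger block is allocated)
# Later growth is proportional to the current size, but not a strict doubling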

This approach is called a Dynamic Array.
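
The over-allocation can be observed indirectly with sys.getsizeof. This is a rough sketch that assumes CPython; the exact byte counts and growth steps are implementation details and may differ between versions:

```python
import sys

my_list = []
sizes = []  # Byte size of the list object after each append
for i in range(20):
    my_list.append(i)
    sizes.append(sys.getsizeof(my_list))

# The size changes only occasionally: most appends reuse spare capacity
resize_count = len(set(sizes))
print(f"{len(sizes)} appends, {resize_count} distinct sizes")
```

If every append reallocated, the number of distinct sizes would equal the number of appends.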

Time Complexity of List Operations

Let's see how fast each list operation is.

Fast Operations (O(1) - constant time)

# Index access
value = my_list[2]  # Very fast

# Append to end
my_list.append(10)  # Fast on average

# Remove from end
my_list.pop()  # Fast

Slow Operations (O(n) - proportional to list size)

# Insert at the front or middle
my_list.insert(0, 5)  # Slow (must shift every later element back)

# Remove from the front or middle
my_list.pop(0)  # Slow (must shift every later element forward)

# Search for element
if 10 in my_list:  # Slow (searches from start to end)
    pass
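
When these slow operations dominate, other standard containers may fit better. As a hedged sketch: collections.deque offers O(1) operations at both ends, and a set offers average O(1) membership tests (the variable names are illustrative):

```python
from collections import deque

# deque supports fast appends and pops at both ends
queue = deque([1, 2, 3])
queue.appendleft(0)        # Fast, unlike list.insert(0, ...)
first = queue.popleft()    # Fast, unlike list.pop(0)
print(first, list(queue))  # 0 [1, 2, 3]

# set membership uses hashing instead of scanning every element
allowed = set(range(100_000))
print(99_999 in allowed)   # True
```

The trade-off is that a deque is slower for index access in the middle, and a set keeps neither order nor duplicates.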

The Secret of List Copying

Be careful when copying lists.

# Assignment does not copy
original = [1, 2, 3]
copy1 = original  # Just another name for the same list!

copy1.append(4)
print(original)  # [1, 2, 3, 4] - original changed too!

# Shallow copy (a new list object)
copy2 = original.copy()  # or original[:]
copy2.append(5)
print(original)  # [1, 2, 3, 4] - original unchanged

Be extra careful with nested lists.

# 2D list
matrix = [[1, 2], [3, 4]]
shallow = matrix.copy()

shallow[0].append(99)
print(matrix)  # [[1, 2, 99], [3, 4]] - inner lists are shared!

# Deep copy
import copy
deep = copy.deepcopy(matrix)
deep[0].append(100)
print(matrix)  # [[1, 2, 99], [3, 4]] - original is safe
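
One way to see which objects are shared is to compare identities with the is operator. A small sketch (the variable names are chosen for illustration):

```python
import copy

matrix = [[1, 2], [3, 4]]
alias = matrix                # Same object, no copy at all
shallow = matrix.copy()       # New outer list, shared inner lists
deep = copy.deepcopy(matrix)  # New outer list and new inner lists

print(alias is matrix)          # True
print(shallow is matrix)        # False
print(shallow[0] is matrix[0])  # True  (inner list is shared)
print(deep[0] is matrix[0])     # False (inner list was copied)
```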

Iterator and Iterable

To traverse containers, you need to understand Iterator and Iterable.

What is an Iterable?

An Iterable is an object that can be iterated. Simply put, anything you can put in a for loop.

# These are all Iterable
for x in [1, 2, 3]:        # List
    print(x)

for x in (1, 2, 3):        # Tuple
    print(x)

for x in "hello":          # String
    print(x)

for x in {1, 2, 3}:        # Set
    print(x)

What is an Iterator?

An Iterator is an object that actually retrieves values one by one.

numbers = [1, 2, 3]

# Create Iterator with iter()
iterator = iter(numbers)

# Retrieve values one by one with next()
print(next(iterator))  # 1
print(next(iterator))  # 2
print(next(iterator))  # 3
print(next(iterator))  # StopIteration error!
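
If the exception is not wanted, next() accepts a default as a second argument, or StopIteration can be caught explicitly. A minimal sketch:

```python
iterator = iter([1, 2])
print(next(iterator, "empty"))  # 1
print(next(iterator, "empty"))  # 2
print(next(iterator, "empty"))  # empty (default returned instead of raising)

# Without a default, catch StopIteration explicitly
exhausted = False
try:
    next(iterator)
except StopIteration:
    exhausted = True
print(exhausted)  # True
```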

The Secret of for Loops

The for loops we write actually work like this.

# Code we write
for num in [1, 2, 3]:
    print(num)

# What Python actually does
iterator = iter([1, 2, 3])  # Convert Iterable to Iterator
while True:
    try:
        num = next(iterator)  # Get next value
        print(num)
    except StopIteration:     # Stop when no more values
        break

Iterable vs Iterator

┌──────────────┐
│   Iterable   │  "Iterable object" (list, tuple, str, etc.)
└──────┬───────┘
       │ iter()
       ▼
┌──────────────┐
│   Iterator   │  "Object that retrieves values one by one"
└──────┬───────┘
       │ next()
       ▼
    Returns value

Key points:
- Iterable: Calling iter() returns an Iterator
- Iterator: Calling next() returns the next value
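
The distinction can be checked with the abstract base classes in collections.abc. This sketch also shows that a list can be traversed repeatedly while an Iterator is used up after one pass:

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]
iterator = iter(numbers)

print(isinstance(numbers, Iterable))   # True
print(isinstance(numbers, Iterator))   # False (a list has no __next__)
print(isinstance(iterator, Iterator))  # True
print(iter(iterator) is iterator)      # True (an Iterator returns itself)

# A list hands out a fresh Iterator each time
print(list(numbers), list(numbers))    # [1, 2, 3] [1, 2, 3]

# An Iterator is exhausted after one pass
print(list(iterator), list(iterator))  # [1, 2, 3] []
```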

Creating Your Own

class CountUp:
    """Iterable that counts from 1 to n"""
    def __init__(self, limit):  # "limit" avoids shadowing the built-in max()
        self.limit = limit

    def __iter__(self):
        """Returns a new Iterator on every call"""
        return CountUpIterator(self.limit)

class CountUpIterator:
    """Iterator that actually returns values"""
    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __iter__(self):
        """An Iterator is itself Iterable: it returns itself"""
        return self

    def __next__(self):
        """Returns next value"""
        if self.current >= self.limit:
            raise StopIteration
        self.current += 1
        return self.current

# Usage
counter = CountUp(5)
for num in counter:
    print(num)
# 1, 2, 3, 4, 5

Looks complex? There's a simpler way: Generators!

Generator - A Smart Iterator

Problem: Memory Waste

We want to process a million numbers.

# Creating as a list (Container)
numbers = [i for i in range(1000000)]  # All million in memory!

for num in numbers:
    print(num)

Problems:
- Stores all million numbers in memory
- Only needs one at a time - inefficient

Enter Generators

A Generator is a smart Iterator that creates values one at a time as needed.

# Generator version
def number_generator():
    for i in range(1000000):
        yield i  # Returns values one by one

for num in number_generator():
    print(num)

Differences:

        List (Container)               Generator (Iterator)
Memory  Stores all values (1 million)  Creates one at a time (1)
Speed   Slow initial creation          Starts immediately
Reuse   Can iterate multiple times     Can iterate once

Generators are an easy way to create Iterators!
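
For instance, the CountUp class from the "Creating Your Own" section could be sketched without a separate iterator class by writing __iter__ as a generator method. The parameter name limit is chosen here for illustration:

```python
class CountUp:
    """Iterable that counts from 1 to limit, using a generator for __iter__."""

    def __init__(self, limit):
        self.limit = limit

    def __iter__(self):
        # Each call creates a fresh generator, so the object stays reusable
        for value in range(1, self.limit + 1):
            yield value


counter = CountUp(3)
print(list(counter))  # [1, 2, 3]
print(list(counter))  # [1, 2, 3]
```

Because __iter__ returns a new generator on every call, the CountUp object itself can be iterated more than once even though each generator is single-use.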

The yield Keyword

yield is similar to return, but remembers the function's state.

def simple_generator():
    print("Creating first value")
    yield 1
    print("Creating second value")
    yield 2
    print("Creating third value")
    yield 3

gen = simple_generator()
print(next(gen))  # "Creating first value" → 1
print(next(gen))  # "Creating second value" → 2
print(next(gen))  # "Creating third value" → 3

When encountering yield:
1. Returns a value
2. Pauses the function
3. Resumes from where it left off on next call
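
Two consequences of these steps are easy to miss: calling the generator function does not run any of its body, and once the body finishes, next() raises StopIteration. A small sketch (two_values is a hypothetical name):

```python
def two_values():
    print("body started")
    yield "a"
    yield "b"


gen = two_values()    # Nothing is printed yet: the body has not started
first = next(gen)     # Prints "body started", then pauses at the first yield
second = next(gen)    # Resumes right after the first yield
print(first, second)  # a b

# When the function body ends, next() raises StopIteration
finished = False
try:
    next(gen)
except StopIteration:
    finished = True
print(finished)       # True
```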

Generator Advantages

# Can even create infinite sequences!
def infinite_numbers():
    num = 0
    while True:
        yield num
        num += 1

gen = infinite_numbers()
print(next(gen))  # 0
print(next(gen))  # 1
print(next(gen))  # 2
# Can continue indefinitely...

Impossible with lists - would require infinite memory!
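
Since an infinite generator never stops on its own, the consumer has to decide when to stop. Two possible approaches are sketched below: itertools.takewhile with a condition, and a plain for loop with break:

```python
from itertools import takewhile


def infinite_numbers():
    num = 0
    while True:
        yield num
        num += 1


# takewhile stops pulling values as soon as the condition fails
below_five = list(takewhile(lambda n: n < 5, infinite_numbers()))
print(below_five)  # [0, 1, 2, 3, 4]

# A for loop over an infinite generator needs its own exit condition
for n in infinite_numbers():
    if n * n > 50:
        break
print(n)  # 8
```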

for Loops and Generators

The Principle of for Loops

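for loops do not depend on generators; they rely on the iterator protocol (iter() and next()). A generator is simply an Iterator, so a for loop consumes it the same way it consumes a list.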

# Code we write
for num in [1, 2, 3]:
    print(num)

# What actually happens
iterator = iter([1, 2, 3])
while True:
    try:
        num = next(iterator)
        print(num)
    except StopIteration:
        break

Consuming Generators with for Loops

def count_up_to(max):
    count = 1
    while count <= max:
        yield count
        count += 1

for num in count_up_to(5):
    print(num)
# 1
# 2
# 3
# 4
# 5

Using yield inside a while loop lets you keep generating values.

Real Example 1: Reading Files

Generators are useful for reading large files.

def read_large_file(file_path):
    """Generator that reads file line by line"""
    with open(file_path, encoding="utf-8") as file:
        for line in file:
            yield line.strip()

# Only one line is held in memory at a time, even for a 100GB file
for line in read_large_file("huge_file.txt"):
    print(line)  # Replace with your own processing
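
To try this pattern without a real large file, a small temporary file can stand in for it. The following is a self-contained sketch; the file contents and variable names are made up for the example:

```python
import os
import tempfile


def read_large_file(file_path):
    """Yield the lines of a text file one at a time."""
    with open(file_path, encoding="utf-8") as file:
        for line in file:
            yield line.strip()


# Create a small temporary file to stand in for a large one
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("first\nsecond\nthird\n")
    path = tmp.name

try:
    # Each pass streams the file again; no list of lines is ever built
    line_count = sum(1 for _ in read_large_file(path))
    longest = max(read_large_file(path), key=len)
finally:
    os.remove(path)

print(line_count, longest)  # 3 second
```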

Real Example 2: Fibonacci Sequence

def fibonacci():
    """Infinite Fibonacci sequence generator"""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Get first 10 numbers
fib = fibonacci()
for _ in range(10):
    print(next(fib))
# 0, 1, 1, 2, 3, 5, 8, 13, 21, 34

Real Example 3: Data Filtering

def even_numbers(numbers):
    """Generator that returns only even numbers"""
    for num in numbers:
        if num % 2 == 0:
            yield num

# Usage
for num in even_numbers(range(10)):
    print(num)
# 0, 2, 4, 6, 8

Generator Expressions

A generator expression looks like a list comprehension but uses () instead of [].

# List comprehension (uses lots of memory)
squares_list = [x**2 for x in range(1000000)]

# Generator expression (saves memory)
squares_gen = (x**2 for x in range(1000000))

# Computed only when needed
print(next(squares_gen))  # 0
print(next(squares_gen))  # 1
print(next(squares_gen))  # 4
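
The memory difference can be estimated with sys.getsizeof. This is a rough sketch: getsizeof reports only the size of the list object itself (its pointer array), not the integers it refers to, so the real gap is even larger:

```python
import sys

squares_list = [x**2 for x in range(100_000)]
squares_gen = (x**2 for x in range(100_000))

list_bytes = sys.getsizeof(squares_list)  # Grows with the number of elements
gen_bytes = sys.getsizeof(squares_gen)    # Small and independent of the range

print(list_bytes > gen_bytes)  # True

# A generator expression can be passed straight to a consuming function
total = sum(x**2 for x in range(10))
print(total)  # 285
```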

while Loops and Generators

You can create more complex generators with while loops.

Conditional Generation

def numbers_until_condition(limit):
    """Generate numbers while their running sum stays within limit"""
    total = 0
    num = 1
    while total + num <= limit:
        yield num
        total += num
        num += 1

for n in numbers_until_condition(20):
    print(n)
# 1, 2, 3, 4, 5 (1+2+3+4+5=15, adding 6 would make 21, so stop)

Stateful Generators

def countdown(start):
    """Countdown generator"""
    current = start
    while current > 0:
        yield current
        current -= 1
    yield "Liftoff!"

for count in countdown(5):
    print(count)
# 5, 4, 3, 2, 1, Liftoff!

Generator Chaining

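def first_n(iterable, n):
    """Take first n items from an iterable"""
    iterator = iter(iterable)
    count = 0
    while count < n:
        try:
            item = next(iterator)
        except StopIteration:
            return  # Source ran out early; an escaping StopIteration would become RuntimeError
        yield item
        count += 1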

# Infinite generator + limit
def all_numbers():
    num = 0
    while True:
        yield num
        num += 1

limited = first_n(all_numbers(), 5)
print(list(limited))  # [0, 1, 2, 3, 4]
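
The standard library already provides this kind of limiter. As a sketch of the same idea, itertools.islice takes the first n items lazily and itertools.count is a ready-made infinite counter:

```python
from itertools import count, islice

# count() yields 0, 1, 2, ... forever; islice() stops after 5 items
limited = list(islice(count(), 5))
print(limited)  # [0, 1, 2, 3, 4]

# islice also ends quietly when the source is shorter than requested
short = list(islice([1, 2], 5))
print(short)    # [1, 2]
```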

Containers vs Generators

When should you use what?

Use Containers (Lists)

# 1. Need multiple accesses
data = [1, 2, 3, 4, 5]
print(data[2])  # Index access
print(len(data))  # Length check
print(data[:3])  # Slicing

# 2. Small data size
small_list = [x for x in range(100)]  # OK

# 3. Need all data
sorted_data = sorted([3, 1, 2])  # Sorting needs all data

# 4. Need multiple iterations
for x in data:
    print(x)
for x in data:  # Can iterate again!
    print(x * 2)

Use Generators (Iterators)

# 1. Large data
huge_data = (x for x in range(10000000))  # Saves memory

# 2. Single iteration
for item in huge_data:
    process(item)  # process() stands in for your own handler

# 3. Infinite sequences
def infinite():
    while True:
        yield get_next_value()  # get_next_value() stands in for your own data source

# 4. Pipeline processing (chaining)
data = (x for x in range(100))
filtered = (x for x in data if x % 2 == 0)
squared = (x**2 for x in filtered)

# 5. Need lazy evaluation
# Defers computation until the moment it's needed
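
Lazy evaluation in a pipeline can be made visible by recording which source values are actually requested. In this sketch, source and pulled are hypothetical names used only for the demonstration:

```python
pulled = []  # Records which source values were actually requested


def source():
    for x in range(100):
        pulled.append(x)
        yield x


filtered = (x for x in source() if x % 2 == 0)
squared = (x**2 for x in filtered)

print(pulled)         # [] (nothing has run yet)
print(next(squared))  # 0
print(next(squared))  # 4
print(pulled)         # [0, 1, 2] (only as many values as needed)
```

Each next() call pulls just enough values through the chain to produce one result; the remaining 97 source values are never computed unless something asks for them.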

Summary

Concept Hierarchy

┌──────────────┐
│  Container   │  Most general concept: data structures (list, tuple, set, dict, str)
└──────┬───────┘
       │ All Containers are...
       ▼
┌──────────────┐
│   Iterable   │  Iterable objects: can be used in a for loop; calling iter() returns an Iterator
└──────┬───────┘
       │ iter()
       ▼
┌──────────────┐
│   Iterator   │  Retrieves values one by one: next() returns the next value; can iterate only once
└──────┬───────┘
       │ Easy way to create
       ▼
┌──────────────┐
│  Generator   │  Convenient Iterator: creates values with yield; memory efficient, infinite sequences
└──────────────┘

Container

  1. Definition: Data structure that can hold multiple objects
  2. Features:
     • Membership test (in operator)
     • Iterable
  3. Types: list, tuple, set, dict, str

List

  1. Internal structure: Implemented as dynamic array
  2. capacity: Pre-allocates space for efficiency
  3. Time complexity:
     • Fast: index access, append/remove from end
     • Slow: insert/remove from middle, search
  4. Copy caution: shallow copy vs deep copy

Iterator and Iterable

  1. Iterable: Calling iter() returns an Iterator
  2. Iterator: Calling next() returns the next value
  3. for loops: Internally use iter() and next()
  4. Difference: An Iterable such as a list can be iterated multiple times; an Iterator only once

Generator

  1. Definition: Easy way to create Iterators
  2. yield: Returns values one by one and maintains state
  3. Advantages: Memory efficient, infinite sequences possible
  4. Usage:
     • for loop with yield
     • while loop with yield (conditional generation)
     • Generator expressions (x for x in ...)

When to Use What?

Situation            Use               Reason
Small data           Container (List)  Fast and convenient
Multiple iterations  Container (List)  Reusable
Index access         Container (List)  Random access
Large data           Generator         Memory savings
Single iteration     Generator         Efficient
Infinite sequences   Generator         Required
Pipelines            Generator         Chain composition