Python Iterator Use Cases Explained: A Tutorial with Examples

FangShe
Python
2025-08-10

Five Use Cases for Python Iterators, with Worked Examples

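Iterators are a core Python tool for processing data streams efficiently: values are computed lazily, only when requested, so memory use stays low. This article looks at five practical use cases, each with a concrete code example.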
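To make the memory claim concrete, the following minimal sketch compares an eagerly built list with a lazy generator over the same values. The variable names are chosen for illustration only, and the exact byte counts vary by Python version and platform.

```python
import sys

# One million squares, built eagerly as a list and lazily as a generator.
eager_squares = [n * n for n in range(1_000_000)]
lazy_squares = (n * n for n in range(1_000_000))

# getsizeof reports the container only, not the integers it references,
# so the real gap is even larger than these numbers suggest.
list_bytes = sys.getsizeof(eager_squares)
generator_bytes = sys.getsizeof(lazy_squares)
print(f"list: {list_bytes} bytes, generator: {generator_bytes} bytes")

# Both produce the same values when consumed.
eager_total = sum(eager_squares)
lazy_total = sum(lazy_squares)
```

The generator stores only its current state, which is why its size does not grow with the number of values it will eventually produce.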

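Use Case 1: Processing Large Files

When processing multi-gigabyte log files, an iterator avoids loading the whole file at once and running out of memory: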

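def read_large_file(file_path):
    # The file object is itself a lazy iterator over lines
    with open(file_path, 'r', encoding='utf-8') as file:
        for line in file:
            yield line.strip()

# Usage example (process_error is assumed to be defined elsewhere)
log_parser = read_large_file('server.log')
for entry in log_parser:
    if 'ERROR' in entry:
        process_error(entry)  # handle one line at a time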

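Key advantage: only the current line (plus the file object's read buffer) is in memory at any moment, so memory use stays flat no matter how large the file is.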
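The same line-by-line pattern combines well with aggregate functions that consume an iterator. The sketch below is self-contained: it writes a tiny temporary file as a stand-in for server.log (the contents are made up), and count_matching is a hypothetical helper introduced for illustration.

```python
import os
import tempfile

def read_large_file(file_path):
    """Yield lines one at a time without the trailing newline."""
    with open(file_path, 'r', encoding='utf-8') as file:
        for line in file:
            yield line.rstrip('\n')

def count_matching(lines, keyword):
    """Count lines containing keyword, consuming the iterator lazily."""
    return sum(1 for line in lines if keyword in line)

# Build a small stand-in for server.log (hypothetical contents).
with tempfile.NamedTemporaryFile('w', suffix='.log', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write("INFO start\nERROR disk full\nINFO retry\nERROR timeout\n")
    sample_path = tmp.name

error_count = count_matching(read_large_file(sample_path), 'ERROR')
print(error_count)  # 2
os.remove(sample_path)
```

Because sum pulls one line at a time from the generator, no list of matching lines is ever built.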

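Use Case 2: Generating Infinite Sequences

A generator can represent an infinite sequence without computing any element ahead of time: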

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

fib_gen = fibonacci()
print(next(fib_gen))  # 0
print(next(fib_gen))  # 1
print(next(fib_gen))  # 1
# next() can be called indefinitely

Typical uses: mathematical sequences, real-time data streams, polling loops
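An infinite generator must never be passed directly to list(), so in practice it is usually bounded with the standard itertools helpers. A short sketch, repeating the fibonacci generator so the block runs on its own:

```python
from itertools import islice, takewhile

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Take a fixed number of elements.
first_ten = list(islice(fibonacci(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

# Take elements until a condition fails.
below_100 = list(takewhile(lambda n: n < 100, fibonacci()))
print(below_100)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```

islice limits by count and takewhile limits by value; both stop pulling from the generator as soon as their limit is reached.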

Use Case 3: Iterating over Custom Objects

Implement the __iter__ and __next__ methods in a class:

class Inventory:
    def __init__(self, products):
        self.products = products
        self.index = 0

    def __iter__(self):
        # Reset the position so the object can be looped over more than once
        self.index = 0
        return self

    def __next__(self):
        if self.index >= len(self.products):
            raise StopIteration  # signals the for loop to stop
        result = self.products[self.index]
        self.index += 1
        return result

# Usage example
store = Inventory(['apple', 'banana', 'orange'])
for item in store:
    print(f"Checking: {item}")

Output order: Checking: apple → Checking: banana → Checking: orange
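A class that returns self from __iter__ keeps a single shared position, so two loops over the same object interfere with each other. One common alternative, sketched here as an illustration rather than as the article's original class, is to write __iter__ as a generator so that every loop gets its own independent iterator:

```python
class Inventory:
    """An iterable whose __iter__ is a generator function."""

    def __init__(self, products):
        self.products = list(products)

    def __iter__(self):
        # Each call creates a new generator with its own position.
        yield from self.products

store = Inventory(['apple', 'banana', 'orange'])

# Nested loops work because the two iterators do not share an index.
pairs = [(a, b) for a in store for b in store if a < b]
print(pairs)  # [('apple', 'banana'), ('apple', 'orange'), ('banana', 'orange')]
```

With this design no __next__ method is needed on the class itself; the generator object returned by __iter__ provides it.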

Use Case 4: Pipeline-Style Data Processing

Chain several iterators together to build a data-processing pipeline:

def filter_positive(numbers):
    for n in numbers:
        if n > 0:
            yield n

def square(numbers):
    for n in numbers:
        yield n ** 2

# Build the processing pipeline
data = [-2, 5, 0, 13, -8]
pipeline = square(filter_positive(data))

print(list(pipeline))  # [25, 169]

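Memory efficiency: each element travels through the entire pipeline before the next one is read, so no intermediate list is ever stored.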
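The element-by-element flow can be observed directly by recording what each stage does. In this sketch the trace list and its messages are additions for illustration; the two stage functions otherwise match the ones above.

```python
trace = []

def filter_positive(numbers):
    for n in numbers:
        if n > 0:
            trace.append(f"filter passes {n}")
            yield n

def square(numbers):
    for n in numbers:
        trace.append(f"square receives {n}")
        yield n ** 2

pipeline = square(filter_positive([-2, 5, 0, 13, -8]))
print(trace)  # [] because nothing runs until a value is requested

results = list(pipeline)
print(results)  # [25, 169]
print(trace)
# ['filter passes 5', 'square receives 5',
#  'filter passes 13', 'square receives 13']
```

The interleaved trace shows that 5 is filtered and squared before 13 is even examined, which is what keeps intermediate storage at a single element.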

Use Case 5: Batched Database Queries

A core pattern for working through very large sets of database records:

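import os
import psycopg2

# Assumption: the connection string is supplied via an environment variable
DATABASE_URL = os.environ["DATABASE_URL"]

def batch_query(query, batch_size=1000):
    conn = psycopg2.connect(DATABASE_URL)
    try:
        # A named cursor is server-side: rows stay on the server until fetched.
        # The default client-side cursor loads the whole result on execute().
        with conn.cursor(name='batch_query_cursor') as cursor:
            cursor.execute(query)
            while True:
                records = cursor.fetchmany(batch_size)
                if not records:
                    break
                yield from records
    finally:
        conn.close()  # always release the connection

# Usage example (process_user is assumed to be defined elsewhere)
user_generator = batch_query("SELECT * FROM users")
for user in user_generator:
    process_user(user)  # each user is one row tuple

Advantage: client memory use is bounded by the batch size, because a result with millions of rows is never held in memory all at once.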
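To try the batching pattern without a PostgreSQL server, the sketch below uses sqlite3 from the standard library as a stand-in. The table, its contents, and the choice to pass the connection in as a parameter are assumptions made for this illustration.

```python
import sqlite3

def batch_query(conn, query, batch_size=1000):
    """Yield rows one by one while fetching them in batches."""
    cursor = conn.cursor()
    try:
        cursor.execute(query)
        while True:
            records = cursor.fetchmany(batch_size)
            if not records:
                break
            yield from records
    finally:
        cursor.close()

# In-memory database with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)",
                 [(f"user{i}",) for i in range(2500)])

row_count = 0
for user_id, name in batch_query(conn, "SELECT id, name FROM users"):
    row_count += 1
print(row_count)  # 2500
conn.close()
```

The caller sees a flat stream of rows, while the generator fetches 1000, 1000, and then 500 rows behind the scenes.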