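Python Basics

Chapter 10 · The Python Standard Library

Companion to the Machine Learning course series

Chapter contents

1. The time library

Timestamps, formatting, timers, sleeping

2. The random library

Random number generation, random selection, shuffling

3. The collections library

Counter, defaultdict, deque, namedtuple

4. The itertools library

Infinite iterators, permutations and combinations, grouping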

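time library: working with time

time.time() returns the current timestamp (seconds since the epoch, as a float)
time.sleep(n) suspends execution for n seconds
time.localtime() converts a timestamp to a local-time struct_time
time.strftime() formats a time as a string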

    import time

    # Get the current timestamp
    now = time.time()
    print(now)  # e.g. 1714381200.12345

    # Format the current local time as a string
    formatted = time.strftime("%Y-%m-%d %H:%M:%S")
    print(formatted)  # e.g. 2025-04-23 18:20:00

    # Simple timer
    start = time.time()
    for i in range(1000000):
        pass
    print(f"Elapsed: {time.time() - start:.4f} s")

    # Sleep
    print("Starting in 3 seconds...")
    time.sleep(3)
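The timer pattern above reads the system clock with time.time(). As a hedged alternative, the sketch below uses time.perf_counter(), a standard-library clock intended for measuring short durations. The helper name `timed` is hypothetical and not part of the course material.

```python
import time


def timed(func, *args):
    """Run func(*args) and return (result, elapsed_seconds).

    `timed` is a hypothetical helper used only for illustration.
    """
    start = time.perf_counter()  # monotonic, high-resolution clock
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed


total, elapsed = timed(sum, range(1_000_000))
print(f"sum = {total}, elapsed: {elapsed:.4f} s")
```

Because perf_counter() is monotonic, the measured interval is not affected if the system clock is adjusted while the code runs.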

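time library: struct_time and string conversion

The struct_time structure

tm_year, tm_mon, tm_mday

tm_hour, tm_min, tm_sec

tm_wday (Monday is 0), tm_yday

Common format codes

%Y: four-digit year, %m: month

%d: day of the month, %H: hour (24-hour clock)

%M: minute, %S: second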

    import time

    # Timestamp -> struct_time
    ts = time.time()
    st = time.localtime(ts)
    print(st.tm_year, st.tm_mon, st.tm_mday)

    # String -> struct_time
    str_time = "2025-04-23 18:30:00"
    st = time.strptime(str_time, "%Y-%m-%d %H:%M:%S")
    print(st)

    # struct_time -> timestamp (interpreted as local time)
    ts = time.mktime(st)
    print(ts)
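The three conversions can be chained into a round trip. The following is a minimal sketch, assuming the chosen time does not fall on a daylight-saving transition in the local time zone; it also reads the tm_wday and tm_yday fields that strptime fills in.

```python
import time

fmt = "%Y-%m-%d %H:%M:%S"
text = "2025-04-23 18:30:00"

st = time.strptime(text, fmt)   # string -> struct_time
print(st.tm_wday, st.tm_yday)   # 2 113 (a Wednesday, day 113 of the year)

ts = time.mktime(st)            # struct_time -> timestamp (local time)
back = time.strftime(fmt, time.localtime(ts))  # timestamp -> string
print(back == text)
```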

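random library: generating random numbers

random.random(): random float in [0, 1)
random.randint(a, b): random integer in [a, b], both ends included
random.uniform(a, b): random float between a and b (whether b itself can be returned depends on floating-point rounding)
random.choice(seq): picks one random element from a non-empty sequence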

    import random

    # Random floats
    print(random.random())          # e.g. 0.372...
    print(random.uniform(10, 20))   # e.g. 14.56...

    # Random integers
    print(random.randint(1, 6))         # simulate a die: 1 to 6
    print(random.randrange(0, 100, 2))  # an even number from 0 to 98

    # Random selection
    cards = ["A", "2", "3", "4", "5"]
    print(random.choice(cards))

    # Set the seed for reproducible results
    random.seed(42)
    print(random.random())  # same value on every run
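To make the effect of seeding concrete, here is a small sketch. It uses random.Random, the standard-library generator class behind the module-level functions; the variable names are illustrative only.

```python
import random

# Two independent generators seeded with the same value
rng_a = random.Random(42)
rng_b = random.Random(42)

rolls_a = [rng_a.randint(1, 6) for _ in range(5)]
rolls_b = [rng_b.randint(1, 6) for _ in range(5)]
print(rolls_a == rolls_b)  # True: same seed, same sequence

# Re-seeding the module-level generator replays its sequence too
random.seed(42)
first = random.random()
random.seed(42)
second = random.random()
print(first == second)  # True
```

Using a separate Random instance keeps an experiment reproducible even when other code also draws from the shared module-level generator.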

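random library: sampling and shuffling

random.shuffle(list)

Shuffles the list in place

Returns None, not a new list

Useful for shuffling cards or randomizing order

random.sample(pop, k)

Draws k elements from the population without replacement

Returns a new list and leaves the original sequence unchanged

Useful for sampling and prize draws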

    import random

    # Shuffle a list (in place)
    deck = list(range(1, 53))  # 52 cards
    random.shuffle(deck)
    print(deck[:5])  # the first 5 cards

    # Random sampling (without replacement)
    students = ["Alice", "Bob", "Charlie", "David", "Eve"]
    winners = random.sample(students, 3)
    print(winners)

    # Weighted random selection (with replacement)
    choices = ["A", "B", "C"]
    weights = [0.7, 0.2, 0.1]
    result = random.choices(choices, weights=weights, k=10)
    print(result)
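The difference between sampling without replacement (sample), sampling with replacement (choices), and in-place shuffling can be checked directly. This is a short sketch with an arbitrary fixed seed, not a definitive test of the generators.

```python
import random

random.seed(0)  # arbitrary fixed seed so the run is repeatable
students = ["Alice", "Bob", "Charlie", "David", "Eve"]

# sample: without replacement, so no name can appear twice
winners = random.sample(students, 3)
print(len(set(winners)) == 3)  # True

# choices: with replacement, so repeats are possible and k may exceed len(students)
draws = random.choices(students, k=8)
print(len(draws))  # 8

# shuffle works in place and returns None
deck = list(range(1, 53))
result = random.shuffle(deck)
print(result is None, sorted(deck) == list(range(1, 53)))  # True True
```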

collections: Counter

Counter is a dict subclass for counting hashable objects
Looking up a missing key returns 0 instead of raising KeyError
Supports arithmetic and set-like operators: +, -, &, |

    from collections import Counter

    # Count character frequencies
    text = "hello world"
    counts = Counter(text)
    print(counts)
    # Counter({'l': 3, 'o': 2, 'h': 1, 'e': 1, ' ': 1, 'w': 1, 'r': 1, 'd': 1})

    # Count words
    words = ["apple", "banana", "apple", "cherry", "banana", "apple"]
    word_counts = Counter(words)
    print(word_counts.most_common(2))  # [('apple', 3), ('banana', 2)]

    # Update the counts
    word_counts.update(["apple", "date"])
    print(word_counts["apple"])  # 4
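The four operators listed above are easiest to understand on a pair of small counters. The sketch below uses made-up counts purely for illustration.

```python
from collections import Counter

a = Counter(apple=3, banana=1)
b = Counter(apple=1, banana=2)

print(a + b)  # Counter({'apple': 4, 'banana': 3})
print(a - b)  # Counter({'apple': 2})  (non-positive counts are dropped)
print(a & b)  # Counter({'apple': 1, 'banana': 1})  (minimum of each count)
print(a | b)  # Counter({'apple': 3, 'banana': 2})  (maximum of each count)

print(a["cherry"])  # 0, and no KeyError
```

Reading a missing key returns 0 without adding that key to the counter.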

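collections: defaultdict

defaultdict automatically supplies a default value for a missing key
Pass a factory callable: list, dict, set, int, and so on
Removes the boilerplate of checking whether a key exists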

    from collections import defaultdict

    # list as the default: automatic grouping
    groups = defaultdict(list)
    students = [("A", 85), ("B", 92), ("A", 78), ("B", 88)]
    for name, score in students:
        groups[name].append(score)
    print(dict(groups))  # {'A': [85, 78], 'B': [92, 88]}

    # int as the default: automatic counting
    counts = defaultdict(int)
    for char in "hello":
        counts[char] += 1
    print(dict(counts))  # {'h': 1, 'e': 1, 'l': 2, 'o': 1}
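For comparison, the sketch below builds the same index twice, once with a plain dict and an explicit key check, and once with defaultdict(set). The word list is invented for the example. It also shows one caveat worth knowing: reading a missing key inserts it.

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "apple", "blueberry"]

# Plain dict: the key must be checked before adding to its set
plain = {}
for word in words:
    if word[0] not in plain:
        plain[word[0]] = set()
    plain[word[0]].add(word)

# defaultdict(set): the empty set is created on first access
index = defaultdict(set)
for word in words:
    index[word[0]].add(word)

print(dict(index) == plain)  # True
print(sorted(index["a"]))    # ['apple', 'avocado']

# Caveat: merely reading a missing key also inserts it
_ = index["z"]
print("z" in index)          # True
```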

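collections: deque (double-ended queue)

Advantages of deque

Insertion and removal at either end are O(1)

The same operations at the head of a list are O(n)

Well suited to queues and sliding windows

Common methods

append() / pop(): right end

appendleft() / popleft(): left end

rotate(n): rotates the deque n steps to the right (to the left if n is negative)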

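    from collections import deque

    # Create a double-ended queue
    dq = deque([1, 2, 3])

    # Operations at both ends
    dq.append(4)      # add on the right: [1, 2, 3, 4]
    dq.appendleft(0)  # add on the left:  [0, 1, 2, 3, 4]
    dq.pop()          # remove from the right: returns 4
    dq.popleft()      # remove from the left:  returns 0

    # Sliding-window maximum (window size 3); the deque holds indices
    nums = [1, 3, -1, -3, 5, 3, 6, 7]
    window = deque()
    maxima = []
    for i, num in enumerate(nums):
        # Drop the front index once it has left the window
        if window and window[0] <= i - 3:
            window.popleft()
        # Drop indices whose values are smaller than the new value
        while window and nums[window[-1]] < num:
            window.pop()
        window.append(i)
        # The front of the deque is the maximum of the current window
        if i >= 2:
            maxima.append(nums[window[0]])
    print(maxima)  # [3, 3, 5, 5, 6, 7]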
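A bounded deque gives a simpler sliding window when only the most recent items matter. The sketch below relies on the maxlen argument of deque, which is part of the standard library; the function name `moving_average` is a hypothetical helper. It also demonstrates rotate.

```python
from collections import deque


def moving_average(values, size):
    """Yield the mean of each full window of `size` consecutive values."""
    window = deque(maxlen=size)  # the oldest item is discarded automatically
    for value in values:
        window.append(value)
        if len(window) == size:
            yield sum(window) / size


averages = list(moving_average([1, 2, 3, 4, 5], 3))
print(averages)  # [2.0, 3.0, 4.0]

dq = deque([1, 2, 3, 4, 5])
dq.rotate(2)     # positive n rotates to the right
print(dq)        # deque([4, 5, 1, 2, 3])
```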

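collections: namedtuple

namedtuple creates a tuple subclass with named fields
Combines the immutability of a tuple with the readability of a class
Instances need no more memory than a regular tuple and are more compact than a dict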

    from collections import namedtuple

    # Define a Point type
    Point = namedtuple("Point", ["x", "y"])

    # Create an instance
    p = Point(10, 20)
    print(p.x, p.y)    # 10 20
    print(p[0], p[1])  # indexing still works

    # Practical use: representing a data record
    Student = namedtuple("Student", ["name", "age", "score"])
    s = Student("Alice", 20, 92)
    print(f"{s.name}'s score is {s.score}")

    # Build from a dict
    data = {"x": 1, "y": 2}
    p2 = Point(**data)
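Since instances are immutable, updates go through helper methods rather than assignment. The following sketch shows the standard _replace and _asdict methods; the printed dict form assumes Python 3.8 or later, where _asdict returns a regular dict.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(10, 20)

# Fields cannot be reassigned
try:
    p.x = 99
except AttributeError:
    print("namedtuple fields are read-only")

# _replace returns a new instance with selected fields changed
p2 = p._replace(x=5)
print(p, p2)  # Point(x=10, y=20) Point(x=5, y=20)

# _asdict converts back to a dict; unpacking works as for any tuple
print(p._asdict())  # {'x': 10, 'y': 20}
x, y = p
print(x + y)  # 30
```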

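itertools: an iterator toolbox

Infinite iterators

count(start, step)

cycle(iterable)

repeat(elem, n): repeats forever only when n is omitted; with n it yields elem n times

Combinatoric iterators

permutations(p, r): r-length orderings

combinations(p, r): r-length selections, order ignored

product(p, q): Cartesian product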

    from itertools import count, cycle, permutations, combinations

    # Infinite counter
    for i in count(10, 2):
        if i > 20:
            break
        print(i, end=" ")
    # 10 12 14 16 18 20

    # Permutations
    items = ["A", "B", "C"]
    for p in permutations(items, 2):
        print(p, end=" ")
    # ('A', 'B') ('A', 'C') ('B', 'A') ('B', 'C') ('C', 'A') ('C', 'B')

    # Combinations
    for c in combinations(items, 2):
        print(c, end=" ")
    # ('A', 'B') ('A', 'C') ('B', 'C')
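The functions listed above but not exercised in the example (product, cycle, repeat) can be sketched as follows. islice is used here only to cut the infinite cycle down to a finite prefix.

```python
from itertools import cycle, islice, product, repeat

# product: every pairing of an element from each input
pairs = list(product("AB", [1, 2]))
print(pairs)  # [('A', 1), ('A', 2), ('B', 1), ('B', 2)]

# cycle never ends on its own, so islice takes just the first five items
first_five = list(islice(cycle("AB"), 5))
print(first_five)  # ['A', 'B', 'A', 'B', 'A']

# repeat with a count is finite; without one it would run forever
print(list(repeat("x", 3)))  # ['x', 'x', 'x']
```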

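itertools: grouping and zipping

groupby(): groups consecutive elements that share a key (sort by that key first)
zip(): built-in function (not part of itertools) that iterates over several sequences in parallel
chain(): joins several iterables into one
islice(): slices an iterator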

    from itertools import groupby, chain, islice, count

    # groupby (note: sort by the key first!)
    data = [("A", 1), ("A", 2), ("B", 3), ("B", 4)]  # already sorted by key
    for key, group in groupby(data, key=lambda x: x[0]):
        print(f"{key}: {list(group)}")
    # A: [('A', 1), ('A', 2)]
    # B: [('B', 3), ('B', 4)]

    # chain: join iterables end to end
    all_nums = list(chain([1, 2], [3, 4], [5, 6]))
    print(all_nums)  # [1, 2, 3, 4, 5, 6]

    # islice: slice an iterator
    for x in islice(count(), 5, 10):
        print(x, end=" ")
    # 5 6 7 8 9
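Why groupby needs sorted input becomes clear when it is run on unsorted data. This is a small sketch with an invented word list; `first_letter` is a hypothetical key function.

```python
from itertools import groupby

words = ["apple", "banana", "avocado", "blueberry", "apricot"]


def first_letter(word):
    return word[0]


# Unsorted input: the same key shows up in several separate groups
unsorted_keys = [key for key, _ in groupby(words, key=first_letter)]
print(unsorted_keys)  # ['a', 'b', 'a', 'b', 'a']

# Sorting by the same key first yields one group per key
grouped = {
    key: list(group)
    for key, group in groupby(sorted(words, key=first_letter), key=first_letter)
}
print(grouped)
# {'a': ['apple', 'avocado', 'apricot'], 'b': ['banana', 'blueberry']}
```

groupby only compares each element with the previous one, so a new group starts whenever the key changes.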

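Chapter summary

▸ time: timestamps, formatting, timers; know strftime / strptime

▸ random: random numbers, sampling, shuffling; set a seed for reproducibility

▸ collections: Counter for counting, defaultdict for automatic defaults, deque for efficient queues, namedtuple for named fields

▸ itertools: infinite iterators, permutations and combinations, grouping, chaining; lazy evaluation keeps memory use low on large data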