Filtering pandas DataFrame Rows by Condition: An Option No-Arbitrage Bounds Example

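Problem description

Dropping rows by a per-row condition inside a function applied row by row

I have the df shown in the figure. The df passed in holds option data: the index is the strike price, call and put are the call and put option prices, and diff is the absolute difference between the call and put prices. I want to compute upper and lower bounds on the option prices from the existing data, and delete the calls and puts that violate the no-arbitrage principle. bound_filter_sub() is as follows: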

def bound_filter_sub(x: pd.DataFrame, rf_rate: float, maturity: float) -> pd.DataFrame:
    # No-arbitrage lower and upper bounds for the call
    if x['call'] - x['put'] >= 0.0:
        x['c_down_bound'] = x['call'] - x['put']
    else:
        x['c_down_bound'] = 0.0
    x['c_upper_bound'] = x['call'] - x['put'] + x['K'] * np.exp(-rf_rate * maturity)
    # No-arbitrage lower and upper bounds for the put
    if x['put'] - x['call'] >= 0.0:
        x['p_down_bound'] = x['put'] - x['call']
    else:
        x['p_down_bound'] = 0.0
    x['p_upper_bound'] = x['K'] * np.exp(-rf_rate * maturity)
    # Drop option rows that fall outside the bounds (not implemented yet)

    return x
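The input df is:
[Image: input DataFrame]
Currently I call df.apply(lambda x: bound_filter_sub(x, rf_rate, maturity1), axis=1), which computes the upper and lower bounds for the calls and puts. The returned df is:
[Image: returned DataFrame with the bound columns]
I would now like bound_filter_sub() to do one more thing per row: delete every row of df where 'call' < 'c_down_bound', or 'call' > 'c_upper_bound', or 'put' < 'p_down_bound', or 'put' < 'p_down_bound'.
How should this be written inside bound_filter_sub()?
Any help would be appreciated.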

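Solution

In pandas, df.apply(..., axis=1) processes the data row by row (the x passed in is actually a pd.Series representing the current row, not a DataFrame). The function only sees that one row, and its return value only determines the corresponding row of the result, so "deleting" rows of the original DataFrame from inside the applied function is neither appropriate nor possible.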
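
A minimal sketch of that behaviour, using a made-up three-row frame (the data and the helper name `describe_row` are illustrative only, not from the question): the function receives one row at a time as a Series, and the result has one entry per input row regardless of what the function does.

```python
import pandas as pd

# Toy data, invented for illustration only.
df = pd.DataFrame({"call": [5.0, 3.0, 1.0], "put": [1.0, 3.0, 5.0]},
                  index=[90, 100, 110])

seen_types = []

def describe_row(x):
    # x is the current row as a pd.Series, not the DataFrame itself.
    seen_types.append(type(x).__name__)
    return x["call"] - x["put"]

result = df.apply(describe_row, axis=1)

print(set(seen_types))         # type(s) of the object handed to the function
print(len(result) == len(df))  # one result per input row
```

Because the function never holds a reference to the whole frame, any row removal has to happen outside it, on the object that apply returns.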

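There are two main ways to get what you want:

Method 1: the strongly recommended "vectorized" approach (drop apply)

In pandas, avoid iterating row by row with apply where you can, because it is slow. Your logic can be expressed entirely with vectorized pandas and NumPy operations; the code is shorter and, on large frames, often runs tens or even hundreds of times faster.

You can operate on whole columns of the DataFrame and filter with boolean indexing: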

import pandas as pd
import numpy as np

def filter_options_arbitrage(df: pd.DataFrame, rf_rate: float, maturity: float) -> pd.DataFrame:
    # Copy so the original data is not modified (optional)
    df_res = df.copy()
    
    # 1. Compute the bounds with vectorized operations
    # np.maximum replaces the if-else
    df_res['c_down_bound'] = np.maximum(df_res['call'] - df_res['put'], 0.0)
    df_res['c_upper_bound'] = df_res['call'] - df_res['put'] + df_res['K'] * np.exp(-rf_rate * maturity)
    
    df_res['p_down_bound'] = np.maximum(df_res['put'] - df_res['call'], 0.0)
    df_res['p_upper_bound'] = df_res['K'] * np.exp(-rf_rate * maturity)
    
    # 2. Build the keep conditions (note: the question repeats the put lower-bound
    #    condition; it is corrected here to put <= p_upper_bound)
    valid_call = (df_res['call'] >= df_res['c_down_bound']) & (df_res['call'] <= df_res['c_upper_bound'])
    valid_put = (df_res['put'] >= df_res['p_down_bound']) & (df_res['put'] <= df_res['p_upper_bound'])
    
    # 3. Keep only the rows that satisfy both conditions
    df_filtered = df_res[valid_call & valid_put]
    
    return df_filtered

# Usage:
# df_final = filter_options_arbitrage(df, rf_rate, maturity1)
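
The same filter can be written more compactly with `Series.between`, which is inclusive on both ends by default. The sketch below is one possible variant, not the only way to do it: the function name `filter_with_between` and the toy quotes are invented for illustration, and it assumes the frame has `K`, `call` and `put` columns as in the code above. Unlike the function above, it does not add the bound columns to the result.

```python
import numpy as np
import pandas as pd

def filter_with_between(df, rf_rate, maturity):
    # Discounted strike K * exp(-r * T), computed once for all rows.
    disc_k = df["K"] * np.exp(-rf_rate * maturity)
    spread = df["call"] - df["put"]

    # Series.between is inclusive on both ends by default.
    call_ok = df["call"].between(np.maximum(spread, 0.0), spread + disc_k)
    put_ok = df["put"].between(np.maximum(-spread, 0.0), disc_k)
    return df[call_ok & put_ok]

# Toy quotes, invented for illustration; rf_rate=0 keeps the arithmetic easy.
quotes = pd.DataFrame(
    {"K": [100.0, 100.0, 100.0],
     "call": [10.0, 5.0, 2.0],
     "put": [4.0, 5.0, 120.0]},
    index=[0, 1, 2],
)
kept = filter_with_between(quotes, rf_rate=0.0, maturity=1.0)
print(kept.index.tolist())
```

In the toy data the third row has a put price above the discounted strike, so it fails the put upper bound and is dropped; the first two rows are kept.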

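Method 2: if you insist on using `apply`

If the work has to happen inside the function passed to apply, have rows that fail the check return None (or an all-NaN Series), then call dropna() outside to remove those rows. The all-NaN Series is the safer of the two, since pandas infers the shape of the apply result from the first row's return value.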

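def bound_filter_sub(x: pd.Series, rf_rate: float, maturity: float) -> pd.Series:
    # Work on a copy so the row handed over by apply is not mutated
    x = x.copy()

    # No-arbitrage lower and upper bounds for the call
    if x['call'] - x['put'] >= 0.0:
        x['c_down_bound'] = x['call'] - x['put']
    else:
        x['c_down_bound'] = 0.0
    x['c_upper_bound'] = x['call'] - x['put'] + x['K'] * np.exp(-rf_rate * maturity)
    
    # No-arbitrage lower and upper bounds for the put
    if x['put'] - x['call'] >= 0.0:
        x['p_down_bound'] = x['put'] - x['call']
    else:
        x['p_down_bound'] = 0.0
    x['p_upper_bound'] = x['K'] * np.exp(-rf_rate * maturity)
    
    # If the row violates any bound, return an all-NaN row. An all-NaN Series is
    # used rather than None: if the first row returned None, apply may produce
    # a Series of objects instead of a DataFrame.
    if (x['call'] < x['c_down_bound']) or (x['call'] > x['c_upper_bound']) or \
       (x['put'] < x['p_down_bound']) or (x['put'] > x['p_upper_bound']):  # note: the typo from the question is fixed here
        return pd.Series(np.nan, index=x.index)
        
    return x

# Usage:
# 1. apply: rows that fail the check become all-NaN
# 2. dropna(how='all') removes those all-NaN rows
df_final = df.apply(lambda x: bound_filter_sub(x, rf_rate, maturity1), axis=1).dropna(how='all')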
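
A variant that stays with apply but avoids the NaN round trip is to have the row function return a plain boolean and use the resulting Series as a mask. This is a sketch under the same assumptions as above (`K`, `call` and `put` columns); `row_is_valid` and the toy quotes are hypothetical names and data, and this version does not add the bound columns.

```python
import numpy as np
import pandas as pd

def row_is_valid(x: pd.Series, rf_rate: float, maturity: float) -> bool:
    # Same bounds as above, evaluated for a single row.
    disc_k = x["K"] * np.exp(-rf_rate * maturity)
    spread = x["call"] - x["put"]
    call_ok = max(spread, 0.0) <= x["call"] <= spread + disc_k
    put_ok = max(-spread, 0.0) <= x["put"] <= disc_k
    return bool(call_ok and put_ok)

# Toy quotes, invented for illustration.
quotes = pd.DataFrame(
    {"K": [100.0, 100.0, 100.0],
     "call": [10.0, 5.0, 2.0],
     "put": [4.0, 5.0, 120.0]},
    index=[0, 1, 2],
)

# Extra positional arguments are forwarded to the function through args=.
mask = quotes.apply(row_is_valid, axis=1, args=(0.0, 1.0))
kept = quotes[mask]
print(kept.index.tolist())
```

The row function stays a pure check, and the actual removal happens outside it through boolean indexing, which is also how Method 1 removes rows.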

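Summary: Method 1 is strongly recommended. In quantitative data work, getting into the habit of replacing loops and apply with vectorized operations matters a great deal once the option data set gets large.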
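
The speed difference is easy to check on your own machine. The rough timing sketch below uses randomly generated data (the column ranges, the row count and the helper name `lower_bound_row` are arbitrary choices for illustration) and compares only the call lower-bound computation; actual timings will vary with hardware and pandas version, so no numbers are quoted here.

```python
import time

import numpy as np
import pandas as pd

# Synthetic data, generated only to have something to time.
rng = np.random.default_rng(0)
n = 20_000
df = pd.DataFrame({
    "call": rng.uniform(0, 50, n),
    "put": rng.uniform(0, 50, n),
})

def lower_bound_row(x):
    # Row-wise version of max(call - put, 0).
    return max(x["call"] - x["put"], 0.0)

t0 = time.perf_counter()
slow = df.apply(lower_bound_row, axis=1)
t1 = time.perf_counter()
fast = np.maximum(df["call"] - df["put"], 0.0)
t2 = time.perf_counter()

print(f"apply: {t1 - t0:.4f}s, vectorized: {t2 - t1:.4f}s")
print(np.allclose(slow, fast))  # both approaches should agree numerically
```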
