The .pipe() method in pandas lets you chain operations in a functional, readable style. It keeps code clean when several transformations must be applied to a DataFrame in sequence, and it combines naturally with other pandas methods such as .assign() and .apply().
How .pipe() Works
.pipe() passes the current DataFrame (or Series) to a function as its first argument and returns whatever that function returns. Because the result is typically another DataFrame, calls can be chained in a functional style, similar to the %>% pipe operator used with R’s dplyr.
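As a minimal sketch of this behaviour, the snippet below checks that df.pipe(f, ...) gives the same result as calling f(df, ...) directly, and that extra positional and keyword arguments are forwarded to the function. The helper names add_constant and scale_into and the sample data are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6]})

def add_constant(data, value, column='col1'):
    """Return a copy of `data` with `value` added to `column`."""
    return data.assign(**{column: data[column] + value})

# df.pipe(func, *args, **kwargs) calls func(df, *args, **kwargs)
piped = df.pipe(add_constant, 10, column='col2')
direct = add_constant(df, 10, column='col2')
print(piped.equals(direct))  # True

# If the function does not take the DataFrame as its first argument,
# pass a (callable, keyword_name) tuple so pipe knows where to put it
def scale_into(factor, frame):
    """Multiply every value in `frame` by `factor`."""
    return frame * factor

scaled = df.pipe((scale_into, 'frame'), 2)
print(scaled)
```

The tuple form is mainly useful when reusing an existing function whose signature you cannot change; for your own helpers, taking the DataFrame as the first parameter is usually simpler.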
Let’s build a complete example where we:
- Use .assign() to create new columns.
- Use .apply() to apply a custom function to each row.
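    import pandas as pd

    # Sample DataFrame
    df = pd.DataFrame({
        'col1': [1, 2, 3, 4],
        'col2': [5, 6, 7, 8],
        'col3': [9, 10, 11, 12]
    })

    # Define a function to sum three columns
    def add_columns(df):
        return df.assign(col_sum=df['col1'] + df['col2'] + df['col3'])

    # Define a function to apply across rows
    # (uses assign so the DataFrame passed in is not modified in place)
    def apply_product(df):
        return df.assign(
            col_product=df.apply(lambda row: row['col1'] * row['col2'] * row['col3'], axis=1)
        )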
    # Chaining operations with pipe
    df_transformed = (
        df
        .pipe(add_columns)    # Add a column using assign
        .pipe(apply_product)  # Apply a custom function using apply
    )
    df_transformed
       col1  col2  col3  col_sum  col_product
    0     1     5     9       15           45
    1     2     6    10       18          120
    2     3     7    11       21          231
    3     4     8    12       24          384
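Row-wise .apply() with axis=1 calls the Python function once per row. For simple arithmetic like this product, the same column can usually be computed with plain column multiplication. The sketch below compares the two approaches on the sample data; the function names add_product_rowwise and add_product_vectorized are hypothetical.

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    'col1': [1, 2, 3, 4],
    'col2': [5, 6, 7, 8],
    'col3': [9, 10, 11, 12],
})

def add_product_rowwise(data):
    """Add col_product by applying a function to each row."""
    return data.assign(
        col_product=data.apply(
            lambda row: row['col1'] * row['col2'] * row['col3'], axis=1
        )
    )

def add_product_vectorized(data):
    """Add col_product by multiplying whole columns at once."""
    return data.assign(col_product=data['col1'] * data['col2'] * data['col3'])

rowwise = df.pipe(add_product_rowwise)
vectorized = df.pipe(add_product_vectorized)

print(vectorized['col_product'].tolist())  # [45, 120, 231, 384]
# Both approaches produce the same values
print((rowwise['col_product'] == vectorized['col_product']).all())  # True
```

Both functions plug into .pipe() in exactly the same way, so switching between them does not change the rest of the chain.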
14.1.2 Pipe Inline
    import pandas as pd

    # Sample DataFrame
    df = pd.DataFrame({
        'col1': [1, 2, 3, 4],
        'col2': [5, 6, 7, 8],
        'col3': [9, 10, 11, 12]
    })

    # Chaining operations using pipe with lambda functions
    df_transformed = (
        df
        # First pipe: Add a new column 'col_sum' using lambda in pipe
        .pipe(lambda df: df.assign(col_sum=df['col1'] + df['col2'] + df['col3']))
        # Second pipe: Apply a row-wise product calculation using apply
        .pipe(lambda df: df.assign(
            col_product=df.apply(lambda row: row['col1'] * row['col2'] * row['col3'], axis=1)
        ))
    )
    df_transformed
       col1  col2  col3  col_sum  col_product
    0     1     5     9       15           45
    1     2     6    10       18          120
    2     3     7    11       21          231
    3     4     8    12       24          384
14.2 Examples
14.2.1 Transformations with .assign() and Conditional Logic
    import pandas as pd

    # Sample DataFrame
    df = pd.DataFrame({
        'col1': [1, 2, 3, 4],
        'col2': [5, 6, 7, 8],
        'col3': [9, 10, 11, 12]
    })

    # Chaining transformations using pipe and lambda
    df_transformed = (
        df
        # Sum of columns
        .pipe(lambda df: df.assign(col_sum=df['col1'] + df['col2'] + df['col3']))
        # Conditional column
        .pipe(lambda df: df.assign(col_flag=df['col_sum'].apply(lambda x: 'high' if x > 20 else 'low')))
        # Cumulative sum of col_sum
        .pipe(lambda df: df.assign(col_cumsum=df['col_sum'].cumsum()))
    )
    df_transformed
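As a possible variation on the conditional step, the flag can be computed for the whole column at once with numpy.where instead of an element-wise .apply(). This is only a sketch of an alternative, not a required change; the variable name result and the lambda parameter d are arbitrary choices.

```python
import numpy as np
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    'col1': [1, 2, 3, 4],
    'col2': [5, 6, 7, 8],
    'col3': [9, 10, 11, 12],
})

result = (
    df
    # Sum of columns
    .pipe(lambda d: d.assign(col_sum=d['col1'] + d['col2'] + d['col3']))
    # np.where evaluates the condition on the whole column in one call
    .pipe(lambda d: d.assign(col_flag=np.where(d['col_sum'] > 20, 'high', 'low')))
    # Cumulative sum of col_sum
    .pipe(lambda d: d.assign(col_cumsum=d['col_sum'].cumsum()))
)

print(result['col_flag'].tolist())    # ['low', 'low', 'high', 'high']
print(result['col_cumsum'].tolist())  # [15, 33, 54, 78]
```

With the sample data, col_sum is 15, 18, 21, 24, so only the last two rows exceed the threshold of 20.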