import numpy as np
import pandas as pd
13 How to Vectorize
There are several ways to vectorize a non-vectorized Python function when working with pandas or numpy, so that it is applied to whole arrays or pandas columns instead of looping through elements one by one. Vectorized operations are generally much faster because they leverage optimized C or Fortran code under the hood.
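As a minimal sketch of the difference, the snippet below computes the same result twice: once with an explicit Python loop and once as a single array expression. The array `values` and the formula are illustrative assumptions, not part of the examples that follow.

```python
import numpy as np

values = np.arange(1, 6)  # illustrative data: 1, 2, 3, 4, 5

# Element-by-element: the Python interpreter handles every item
looped = np.array([v ** 2 + 1 for v in values])

# Vectorized: one expression, evaluated inside numpy's compiled code
vectorized = values ** 2 + 1

print(np.array_equal(looped, vectorized))
```

Both versions give the same values; the speed advantage of the second form typically only becomes noticeable on large arrays.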
13.1 Use numpy.vectorize()
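numpy.vectorize() essentially wraps the function so that it operates element-wise on arrays and behaves like a vectorized function. It is provided mainly for convenience: internally it still calls the Python function once per element, so it does not deliver the speed of true vectorized operations.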
# Non-vectorized function
def my_func(x):
    return f"{x}^2 = {x**2}"

# Use numpy.vectorize() to vectorize the function
vectorized_func = np.vectorize(my_func)

# Sample DataFrame
df = pd.DataFrame({
    'col1': [1, 2, 3, 4],
})
df["new_col"] = vectorized_func(df["col1"])
df
| | col1 | new_col |
|---|---|---|
| 0 | 1 | 1^2 = 1 |
| 1 | 2 | 2^2 = 4 |
| 2 | 3 | 3^2 = 9 |
| 3 | 4 | 4^2 = 16 |
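To see that numpy.vectorize() still invokes the wrapped function once per element, the sketch below counts the calls. The helper `counted_square` and the list `call_log` are hypothetical names introduced only for this illustration; `otypes` is passed so that numpy does not need an extra probing call to infer the output type.

```python
import numpy as np

call_log = []  # hypothetical: records every argument the function receives

def counted_square(x):
    """Square x and remember that we were called."""
    call_log.append(x)
    return x ** 2

data = np.array([1, 2, 3, 4])

# Declaring the output type up front avoids a type-inference call
vec = np.vectorize(counted_square, otypes=[np.int64])
result = vec(data)

print(result, len(call_log))
```

The number of recorded calls matches the number of elements, which is why the wrapper should not be expected to run faster than an ordinary loop.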
13.2 Use pandas .apply() with axis=0
# Non-vectorized function
def my_func(x):
    return f"{x}^2 = {x**2}"

df = pd.DataFrame({
    'col1': [1, 2, 3, 4],
})
# Apply the function element-wise using apply()
df['new_col'] = df['col1'].apply(my_func)
df
| | col1 | new_col |
|---|---|---|
| 0 | 1 | 1^2 = 1 |
| 1 | 2 | 2^2 = 4 |
| 2 | 3 | 3^2 = 9 |
| 3 | 4 | 4^2 = 16 |
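The example above calls .apply() on a single column (a Series), where the function receives one value at a time and no axis is involved. The axis argument matters when .apply() is called on a whole DataFrame. The following sketch contrasts the three cases; the second column `col2` and the lambdas are illustrative assumptions.

```python
import pandas as pd

df_axis = pd.DataFrame({
    "col1": [1, 2, 3, 4],
    "col2": [10, 20, 30, 40],  # illustrative extra column
})

# Series.apply: the function receives one scalar per call
per_element = df_axis["col1"].apply(lambda x: x ** 2)

# DataFrame.apply with axis=0: the function receives each column as a Series
per_column = df_axis.apply(lambda col: col.max(), axis=0)

# DataFrame.apply with axis=1: the function receives each row as a Series
per_row = df_axis.apply(lambda row: row["col1"] + row["col2"], axis=1)

print(per_element.tolist(), per_column.to_dict(), per_row.tolist())
```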
# With Pipe
df = (df
      .pipe(lambda df: df.assign(new_col2 = df['col1'].apply(my_func))))
df
| | col1 | new_col | new_col2 |
|---|---|---|---|
| 0 | 1 | 1^2 = 1 | 1^2 = 1 |
| 1 | 2 | 2^2 = 4 | 2^2 = 4 |
| 2 | 3 | 3^2 = 9 | 3^2 = 9 |
| 3 | 4 | 4^2 = 16 | 4^2 = 16 |
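Like numpy.vectorize(), .apply() on a column still calls my_func once per element. For this particular function, one possible alternative is to build the string from whole-column operations. The sketch below assumes the same `my_func` and sample data as above; the column name `new_col_vec` is hypothetical.

```python
import pandas as pd

def my_func(x):
    return f"{x}^2 = {x**2}"

df_str = pd.DataFrame({"col1": [1, 2, 3, 4]})

# Whole-column arithmetic and string concatenation, no per-element Python call
squares = df_str["col1"] ** 2
df_str["new_col_vec"] = df_str["col1"].astype(str) + "^2 = " + squares.astype(str)

# Compare against the element-wise version
expected = df_str["col1"].apply(my_func)
print((df_str["new_col_vec"] == expected).all())
```

Whether this is faster in practice depends on the data size and the pandas version, so it is worth timing on real data before switching.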
13.3 Use numpy.where() for conditional logic
If your non-vectorized function contains conditional logic, you can often replace it with numpy.where(), which is a vectorized alternative to if-else statements.
def my_func2(x):
    if x > 2:
        return x ** 2
    else:
        return x + 2
You can refactor this into a vectorized version using numpy.where():
# Vectorized conditional logic with numpy.where()
df['new_col'] = np.where(df['col1'] > 2, df['col1'] ** 2, df['col1'] + 2)
df
| | col1 | new_col | new_col2 |
|---|---|---|---|
| 0 | 1 | 3 | 1^2 = 1 |
| 1 | 2 | 4 | 2^2 = 4 |
| 2 | 3 | 9 | 3^2 = 9 |
| 3 | 4 | 16 | 4^2 = 16 |
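A quick way to gain confidence in such a refactoring is to compare the numpy.where() result with the original function applied element by element. The sketch below does this on the same sample data; the names `via_apply` and `via_where` are introduced only for illustration.

```python
import numpy as np
import pandas as pd

def my_func2(x):
    if x > 2:
        return x ** 2
    else:
        return x + 2

df_check = pd.DataFrame({"col1": [1, 2, 3, 4]})

# Reference result: the original function, one element at a time
via_apply = df_check["col1"].apply(my_func2)

# Vectorized result: condition, value if true, value if false
via_where = np.where(df_check["col1"] > 2,
                     df_check["col1"] ** 2,
                     df_check["col1"] + 2)

print(np.array_equal(via_apply.to_numpy(), via_where))
```

One difference to keep in mind: numpy.where() computes both branch arrays in full before selecting from them, whereas the if-else version evaluates only the branch that applies to each element.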