# Universal Functions In Pandas

## 1. UNIVERSAL FUNCTIONS: INDEX PRESERVATION

All NumPy ufuncs work on pandas `Series` and `DataFrame` objects.
First, let's create a pandas `Series` of random integers:
```python
import numpy as np
import pandas as pd

# Create a seeded random state so the results are reproducible
rand = np.random.RandomState(42)

# Create a pandas Series of random integers
ser1 = pd.Series(rand.randint(10, size=4))
print(ser1)
```

```
0    6
1    3
2    7
3    4
dtype: int64
```
Second, create a pandas `DataFrame` of random integers:

```python
# Create a pandas DataFrame of random integers
df1 = pd.DataFrame(rand.randint(10, size=(3, 4)),
                   columns=['a', 'b', 'c', 'd'])
print(df1)
```

```
   a  b  c  d
0  6  9  2  6
1  7  4  3  7
2  7  2  5  4
```
Now, if we apply any NumPy ufunc to one of these objects (`Series` or `DataFrame`), the result is another pandas object with the index preserved:

```python
# Take the exponential of every element in the Series ser1
np.exp(ser1)
```

```
0     403.428793
1      20.085537
2    1096.633158
3      54.598150
dtype: float64
```

```python
# Do arithmetic on every element of the DataFrame df1
print(np.multiply(df1, 10))
```

```
    a   b   c   d
0  60  90  20  60
1  70  40  30  70
2  70  20  50  40
```
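Index preservation is not limited to the default integer index. The following is a minimal sketch, assuming a hypothetical Series named `labelled` with string labels (it is not part of the example above):

```python
import numpy as np
import pandas as pd

# Hypothetical Series with string labels, used only for illustration
labelled = pd.Series([0.0, 1.0, 4.0], index=['x', 'y', 'z'])

# np.sqrt is a NumPy ufunc; the result is a Series carrying the same labels
roots = np.sqrt(labelled)
print(roots)
print(roots.index.equals(labelled.index))
```

The second `print` compares the two indices directly, which is a quick way to confirm that the labels survived the ufunc.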

## 2. UNIVERSAL FUNCTIONS: INDEX ALIGNMENT

### 2.1. Index Alignment in Series

When we add two `Series` whose indices are not identical, pandas aligns the values by index label before adding:

```python
# First, define two Series whose indices are not identical
A = pd.Series([1, 2, 3], index=[0, 1, 2])     # index [0, 1, 2]
B = pd.Series([10, 20, 30], index=[1, 2, 3])  # index [1, 2, 3]

# Second, add the two Series
print(A); print(B)
print(A + B)
```

```
0    1
1    2
2    3
dtype: int64
1    10
2    20
3    30
dtype: int64
0     NaN
1    12.0
2    23.0
3     NaN
dtype: float64
```
As the example shows, the index of the sum is the union of the two input indices.

• When pandas finds no corresponding value at a given index label, it returns `NaN`
• For example, Series `A` has index 0, but Series `B` has no value at index 0
• To handle these `NaN` values, we can pass the keyword argument `fill_value` to the pandas `.add()` method
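```python
# Treat a value missing from one Series as 0 before adding
print(A.add(B, fill_value=0))
```

```
0     1.0
1    12.0
2    23.0
3    30.0
dtype: float64
```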
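Alignment is done by label, not by position. As a small sketch, assuming two hypothetical Series `left` and `right` that hold the same labels in a different order:

```python
import pandas as pd

# Hypothetical Series whose labels appear in different orders
left = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
right = pd.Series([30, 10, 20], index=['c', 'a', 'b'])

# Values are paired by label, so 'a' is added to 'a' regardless of position
total = left + right
print(total['a'], total['b'], total['c'])
```

Because both Series contain exactly the same labels, no `NaN` appears in the result.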

### 2.2. Index Alignment in DataFrame

When we add two `DataFrame` objects whose indices or columns are not identical, pandas aligns the values on both the row index and the column labels:

```python
# First, define two DataFrames whose indices and columns are not identical
C = pd.DataFrame(rand.randint(10, size=(2, 2)),
                 columns=['a', 'b'])
D = pd.DataFrame(rand.randint(10, size=(3, 3)),
                 columns=['a', 'b', 'c'])
print(C); print(D)
```

```
   a  b
0  1  7
1  5  1
   a  b  c
0  4  0  9
1  5  8  0
2  9  2  6
```
```python
# Second, add the two DataFrames and see how the result is handled
print(C + D)
```

```
      a    b   c
0   5.0  7.0 NaN
1  10.0  9.0 NaN
2   NaN  NaN NaN
```

• When pandas finds no corresponding value at a given index and column, it returns `NaN`
• For example, DataFrame `D` has a value at index 0, column ‘c’, but DataFrame `C` has no value at index 0, column ‘c’
• We can pass the keyword argument `fill_value` to the pandas `.add()` method to handle the `NaN` values
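```python
# Treat a value missing from one DataFrame as 0 before adding
print(C.add(D, fill_value=0))
```

```
      a    b    c
0   5.0  7.0  9.0
1  10.0  9.0  0.0
2   9.0  2.0  6.0
```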
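One detail worth knowing: `fill_value` only substitutes for a value that is missing in one of the two operands. If a cell is missing in both, the result stays `NaN`. The sketch below assumes two small hypothetical frames, `left` and `right`, chosen so that some cells exist in neither:

```python
import numpy as np
import pandas as pd

# Hypothetical frames: 'left' has only column 'a', 'right' has only column 'b',
# and their row indices overlap only at row 1
left = pd.DataFrame({'a': [1, 2]}, index=[0, 1])
right = pd.DataFrame({'b': [10, 20]}, index=[1, 2])

result = left.add(right, fill_value=0)
print(result)

# Cell (0, 'b') and cell (2, 'a') exist in neither frame, so they remain NaN
print(np.isnan(result.loc[0, 'b']), np.isnan(result.loc[2, 'a']))
```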

### 2.3. Python Operators and their equivalent Pandas Methods

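| Python operator | Pandas method(s) |
|---|---|
| `+` | `add()` |
| `-` | `sub()`, `subtract()` |
| `*` | `mul()`, `multiply()` |
| `/` | `div()`, `divide()`, `truediv()` |
| `//` | `floordiv()` |
| `%` | `mod()` |
| `**` | `pow()` |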
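As a quick sanity check of this correspondence, the sketch below compares each operator with one of its method forms on a hypothetical Series `s`. It is only an illustration; the method forms additionally accept arguments such as `fill_value` that the operators cannot take.

```python
import pandas as pd

# Hypothetical Series used only to compare operators with their method forms
s = pd.Series([7, 8, 9])

checks = {
    '+': (s + 2).equals(s.add(2)),
    '-': (s - 2).equals(s.sub(2)),
    '*': (s * 2).equals(s.mul(2)),
    '/': (s / 2).equals(s.div(2)),
    '//': (s // 2).equals(s.floordiv(2)),
    '%': (s % 2).equals(s.mod(2)),
    '**': (s ** 2).equals(s.pow(2)),
}
print(checks)
```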

## 3. UNIVERSAL FUNCTIONS: OTHER OPERATIONS

### 3.1. Understanding ‘axis’ keyword argument

#### One way to look at `axis` kwarg:

With `axis=0` (or `axis='index'`), the operation runs down the rows and is therefore performed column-wise. With `axis=1` (or `axis='columns'`), the operation runs across the columns and is therefore performed row-wise.

#### Another way to look at `axis` kwarg:

• `axis=0` or `axis='index'` means the operation is performed over all the rows in each column
• `axis=1` or `axis='columns'` means the operation is performed over all the columns in each row
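The two readings of `axis` can be checked with a reduction such as `sum()`. This is a minimal sketch on a hypothetical frame `df`, not one of the frames defined earlier:

```python
import pandas as pd

# Hypothetical 2x3 frame used only to illustrate the axis argument
df = pd.DataFrame({'a': [1, 2], 'b': [10, 20], 'c': [100, 200]})

# axis=0: combine all the rows in each column, giving one value per column
col_totals = df.sum(axis=0)

# axis=1: combine all the columns in each row, giving one value per row
row_totals = df.sum(axis=1)

print(col_totals)
print(row_totals)

# The string aliases select the same axes as the integers
print(df.sum(axis='index').equals(col_totals))
print(df.sum(axis='columns').equals(row_totals))
```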

### 3.2. Operations on Self

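Let's subtract the values of the first row of `df1` from every row in `df1`. Here the default `axis=1` (or `axis='columns'`) applies, so the Series `df1.iloc[0]` is aligned on the column labels:

```python
print(df1)
# df1.iloc[0] selects the first row as a Series indexed by column label
print(df1.subtract(df1.iloc[0]))
```

```
   a  b  c  d
0  6  9  2  6
1  7  4  3  7
2  7  2  5  4
   a  b  c  d
0  0  0  0  0
1  1 -5  1  1
2  1 -7  3 -2
```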
However, if we would like to align on the row index instead, for example to subtract column `a` from every column, we can pass `axis=0` (or `axis='index'`):

```python
print(df1.subtract(df1['a'], axis=0))
```

```
   a  b  c  d
0  0  3 -4  0
1  0 -3 -4  0
2  0 -5 -2 -3
```
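A common use of this pattern is centering data. The sketch below assumes a hypothetical frame `df` and subtracts the column means with the default axis, then the row means with `axis=0`:

```python
import pandas as pd

# Hypothetical frame used only for illustration
df = pd.DataFrame({'a': [1, 3], 'b': [10, 30]})

# Default axis ('columns'): the Series of column means is matched to column labels
centered_cols = df.subtract(df.mean())

# axis=0: the Series of row means is matched to the row index
centered_rows = df.subtract(df.mean(axis=1), axis=0)

print(centered_cols)
print(centered_rows)
```

In both cases the Series being subtracted is aligned by label, so the choice of `axis` only decides which set of labels (columns or rows) is used for the match.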

### 3.3. Operation between Series and DataFrame

Operations between a `DataFrame` and `Series` object are similar to operations between a two-dimensional and one-dimensional NumPy array
```python
# Series
ser11 = pd.Series(rand.randint(12, size=3))
ser11
```

```
0     2
1     9
2    11
dtype: int64
```

```python
# DataFrame
df11 = pd.DataFrame(rand.randint(10, size=(3, 4)),
                    columns=['a', 'b', 'c', 'd'])
print(df11)
```

```
   a  b  c  d
0  7  5  7  8
1  3  0  0  9
2  3  6  1  2
```
Let's add the `Series` to the `DataFrame` with the kwarg `axis=0` (or `axis='index'`), which matches on the row index. Both `ser11` and `df11` have an identical index.
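The idea can be sketched with fixed values instead of random ones. The names `ser` and `df` below are hypothetical stand-ins that share the index 0, 1, 2:

```python
import pandas as pd

# Hypothetical Series and DataFrame that share the same row index (0, 1, 2)
ser = pd.Series([1, 2, 3])
df = pd.DataFrame({'a': [10, 20, 30], 'b': [100, 200, 300]})

# axis=0 matches the Series labels against the DataFrame's row index,
# so each Series value is added to every column of the matching row
by_index = df.add(ser, axis=0)
print(by_index)
```

Without `axis=0`, the Series labels would instead be matched against the column labels `a` and `b`, which is not what is wanted here.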