# What is the difference between Group By and Pivot Table
>[!cite] What is the difference between `groupby` and `pivot_table`?
>
>**`pivot_table` = `groupby` + `unstack`**
>**`groupby` = `pivot_table` + `stack`**
>
Both identities hold: **`pivot_table` = `groupby` + `unstack`** and **`groupby` = `pivot_table` + `stack`**.
In particular, if the `columns` parameter of `pivot_table()` is not used, `groupby()` and `pivot_table()` produce the same result (provided the same aggregator function is used).
```python
import pandas as pd

# sample data
df = pd.DataFrame({"a": [1,1,1,2,2,2], "b": [1,1,2,2,3,3], "c": [0,0.5,1,1,2,2]})
# same grouping keys and the same aggregator on both sides
gb = df.groupby(['a','b'])[['c']].sum()
pt = df.pivot_table(index=['a','b'], values=['c'], aggfunc='sum')
# equality test
gb.equals(pt) # True
```
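The "same aggregator" condition matters because the two calls do not share a default: `pivot_table()` falls back to `aggfunc='mean'` when none is given, whereas `groupby()` always needs an explicit aggregation. A minimal sketch on the same sample data (the variable names are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({"a": [1,1,1,2,2,2], "b": [1,1,2,2,3,3], "c": [0,0.5,1,1,2,2]})

# pivot_table() without aggfunc aggregates with the mean
pt_default = df.pivot_table(index=['a','b'], values=['c'])

gb_mean = df.groupby(['a','b'])[['c']].mean()
gb_sum = df.groupby(['a','b'])[['c']].sum()

same_as_mean = pt_default.equals(gb_mean)  # aggregators match
same_as_sum = pt_default.equals(gb_sum)    # aggregators differ
print(same_as_mean, same_as_sum)
```

On this data the comparison against the mean succeeds and the comparison against the sum fails, since group `(1, 1)` averages to 0.25 but sums to 0.5.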
---
In general, if we check the [source code](https://github.com/pandas-dev/pandas/blob/main/pandas/core/reshape/pivot.py), `pivot_table()` internally calls `__internal_pivot_table()`. This function builds a single flat list from `index` and `columns` and calls `groupby()` with that list as the grouper. After aggregation, it calls `unstack()` on the list of columns.
If `columns` is never passed, there is nothing to unstack, so `groupby` and `pivot_table` trivially produce the same output.
The following demonstrates this equivalence:
```python
gb = (
    df
    .groupby(['a','b'])[['c']].sum()
    .unstack(['b'])
)
pt = df.pivot_table(index=['a'], columns=['b'], values=['c'], aggfunc='sum')
gb.equals(pt) # True
```
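The flow described above (flatten `index` and `columns` into one key list, group, aggregate, then unstack the column keys) can be sketched as a small helper. `pivot_via_groupby` is a hypothetical name used only for illustration; it is a simplified imitation, not pandas' actual implementation, and it ignores options such as `fill_value`, `margins`, `dropna`, `observed`, and `sort`.

```python
import pandas as pd

def pivot_via_groupby(df, index, columns, values, aggfunc):
    """Hypothetical helper: a simplified groupby-then-unstack pivot."""
    # one flat list of grouping keys, index keys first
    keys = list(index) + list(columns)
    agged = df.groupby(keys)[values].agg(aggfunc)
    # with no column keys there is nothing to unstack
    if not columns:
        return agged
    # move the column keys from the row index into the column index
    return agged.unstack(list(columns))

df = pd.DataFrame({"a": [1,1,1,2,2,2], "b": [1,1,2,2,3,3], "c": [0,0.5,1,1,2,2]})

manual = pivot_via_groupby(df, index=['a'], columns=['b'], values=['c'], aggfunc='sum')
pt = df.pivot_table(index=['a'], columns=['b'], values=['c'], aggfunc='sum')
matches = manual.equals(pt)

# without column keys the helper reduces to a plain groupby
manual_flat = pivot_via_groupby(df, index=['a','b'], columns=[], values=['c'], aggfunc='sum')
pt_flat = df.pivot_table(index=['a','b'], values=['c'], aggfunc='sum')
matches_flat = manual_flat.equals(pt_flat)
print(matches, matches_flat)
```

Both comparisons are expected to agree on this sample data, mirroring the two equality tests in the answer.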
As `stack()` is the inverse operation of `unstack()`, the following holds True as well:
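```python
(
    df
    .pivot_table(index=['a'], columns=['b'], values=['c'], aggfunc='sum')
    .stack(['b'])
    # depending on the pandas version, stack() may keep the all-NaN rows that
    # unstacking introduced; dropna() removes them and is a no-op otherwise
    .dropna()
    .equals(
        df.groupby(['a','b'])[['c']].sum()
    )
) # True
```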
However, the two approaches differ in performance. In short, `pivot_table()` is slower than `groupby().agg().unstack()`. You can [read more about it in this answer](https://stackoverflow.com/a/74048672/19123103).
**TL;DR:** `pivot_table` loops over `aggfunc` no matter what is passed to it, while `groupby` first checks whether a Cython-optimized implementation is available and only loops if there is none.
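To check the performance claim on your own setup, a rough timing sketch like the one below may help. The synthetic data, its size, and the function names are assumptions made for illustration; timings depend on the pandas version, the data shape, and the aggregator, so the gap described in the linked answer may not reproduce exactly.

```python
import timeit

import numpy as np
import pandas as pd

# synthetic data (an assumption for this sketch): 100 x 10 groups, 100k rows
rng = np.random.default_rng(0)
n = 100_000
big = pd.DataFrame({
    "a": rng.integers(0, 100, n),
    "b": rng.integers(0, 10, n),
    "c": rng.random(n),
})

def with_pivot_table():
    return big.pivot_table(index="a", columns="b", values="c", aggfunc="sum")

def with_groupby():
    return big.groupby(["a", "b"])["c"].sum().unstack("b")

# total wall-clock time for 20 runs of each approach
t_pivot = timeit.timeit(with_pivot_table, number=20)
t_groupby = timeit.timeit(with_groupby, number=20)
print(f"pivot_table:     {t_pivot:.3f} s")
print(f"groupby+unstack: {t_groupby:.3f} s")
```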
## Source
- [cottontail](https://stackoverflow.com/a/72933069/20647829) (Stack Overflow), [cottontail](https://stackoverflow.com/questions/44229489/pandas-performance-pivot-table-vs-groupby/74048672#74048672) (Stack Overflow, on performance)