# My Pandas Playbook --- ## Aggregating ```python sales_data.groupby(“month”).agg( {‘purchase_amount’: [sum, np.mean], ‘year’: [max]}) ``` ## Aggregating and renaming ```python sales_data.groupby(“month”).agg( total_sales=pd.NamedAgg(column='purchase_amount', aggfunc=sum), avg_sales=pd.NamedAgg(column='purchase_amount', aggfunc=np.mean), max_year=pd.NamedAgg(column='year', aggfunc=max)) ``` ## Grouping on Multiple Columns ```python sales_data.groupby(["month", "state"]).agg(sum)[['purchase_amount']] ``` ## Access a single value for a row/column label pair ```python df.at[4, 'B'] ``` ## Using `nlargest` to filter only top values ```python df.nlargest(3, 'Score', keep='first') ``` ## Reshape data using melt #todo/today - [x] Reshape data using melt - [x] dataframe querying ## `set_option` to Format Values ```python pd.set_option('precision', 2) ``` ## "Text to Columns" from Excel in Pandas ```python df[['City', 'State']] = df['Location'].str.split(pat=',', expand=True) ``` ## Explode a column ```python df.explode('Cities Visited') ``` ## Pivot table ```python pd.pivot_table( data=df, index='Region', aggfunc='sum' ) pd.pivot_table(data=df, index='Region', aggfunc=['mean', 'sum']) pd.pivot_table( data=df, index='Region', aggfunc={'Sales': 'mean', 'Units': 'sum'} ) df.pivot_table(index='Name', columns='Cities Visited', values = 'count', aggfunc='count', fill_value=0) ``` ## Rolling averages ```python df['Sales'].rolling(7).mean() ``` ## Change to numeric ```python df['Score'].apply(pd.to_numeric, errors='coerce') ``` ## Impute missing values ```python df.fillna(df.mean()) ``` ## Generate a frequency table ```python df['Region'].value_counts() # normalized df['Region'].value_counts(normalize=True) ``` ## Joins vs Merge Lookup on right table: `df1.join(df2)` always joins via the index of `df2`, but `df1.merge(df2)` can join to one or more columns of `df2` (default) or to the index of `df2` (with `right_index=True`). The following two lines are equivalent: ```python left.join(right, on=key_or_keys) pd.merge(left, right, left_on=key_or_keys, right_index=True, how='left', sort=False) ```