我删除了额外的列,只关注必要的列进行演示。
输入:
的 DF1 强>
Home 2012-13 2013-14 2014-15 2015-16 2016-17 0 Cleveland Cavaliers 1 1 2 1 3 1 Los Angeles Lakers 2 1 1 1 0 2 Miami Heat 3 3 2 2 1 3 Chicago Bulls 2 1 2 2 1 4 Detroit Pistons 0 0 0 1 1 5 Los Angeles Clippers 2 2 2 1 1 6 New Orleans Pelicans 0 1 1 1 1 7 Philadelphia 76ers 1 0 0 0 0 8 Phoenix Suns 0 0 0 0 0 9 Portland Trail Blazers 1 2 2 0 0 10 Toronto Raptors 0 1 1 2 2
的 DF2 强>
Visitor Home Season 0 Washington Wizards Cleveland Cavaliers 2012-13 1 Dallas Mavericks Los Angeles Lakers 2012-13 2 Boston Celtics Miami Heat 2012-13 3 Dallas Mavericks Utah Jazz 2012-13 4 San Antonio Spurs New Orleans Pelicans 2012-13
的 第1步:熔化df1以获取allstars列 强>
df3 = pd.melt(df1, id_vars='Home', value_vars = df1.columns[df.columns.str.contains('20')], var_name = 'Season', value_name='H_Allstars')
输出继电器:
Home Season H_Allstars 0 Cleveland Cavaliers 2012-13 1 1 Los Angeles Lakers 2012-13 2 2 Miami Heat 2012-13 3 3 Chicago Bulls 2012-13 2 4 Detroit Pistons 2012-13 0 5 Los Angeles Clippers 2012-13 2 6 New Orleans Pelicans 2012-13 0 7 Philadelphia 76ers 2012-13 1 8 Phoenix Suns 2012-13 0 ...
的 第2步:将此新数据框与df2合并,以获取H_Allstars和V_Allstars列 强>
df4 = pd.merge(df2, df3, how='left', on=['Home', 'Season'])
输出:
Visitor Home Season H_Allstars 0 Washington Wizards Cleveland Cavaliers 2012-13 1.0 1 Dallas Mavericks Los Angeles Lakers 2012-13 2.0 2 Boston Celtics Miami Heat 2012-13 3.0 3 Dallas Mavericks Utah Jazz 2012-13 NaN 4 San Antonio Spurs New Orleans Pelicans 2012-13 0.0
的 第3步:添加V_Allstars列 强>
# renaming column as required df3.rename(columns={'Home': 'Visitor', 'H_Allstars': 'V_Allstars'}, inplace=True) df5 = pd.merge(df4, df3, how='left', on=['Visitor', 'Season'])
Visitor Home Season H_Allstars V_Allstars 0 Washington Wizards Cleveland Cavaliers 2012-13 1.0 NaN 1 Dallas Mavericks Los Angeles Lakers 2012-13 2.0 NaN 2 Boston Celtics Miami Heat 2012-13 3.0 NaN 3 Dallas Mavericks Utah Jazz 2012-13 NaN NaN 4 San Antonio Spurs New Orleans Pelicans 2012-13 0.0 NaN
您可以使用 pandas.melt 。将您的数据df2转换为长格式,即Home和Season作为列,Allstars作为值,然后合并到'Home'和'Season'上的df1。
pandas.melt
import pandas as pd df2['Home'] = df2.index df2 = pd.melt(df2, id_vars = 'Home', value_vars = ['2012-13', '2013-14', '2014-15', '2015-16', '2016-17'], var_name = 'Season', value_name='H_Allstars') df = df1.merge(df2, on=['Home','Season'], how='left')