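Featuretools offers built-in support for handling categorical variables:

    variable_types={"product_id": ft.variable_types.Categorical}

https://docs.featuretools.com/loading_data/using_entitysets.html

However, for the best compatibility with Featuretools, should these columns be plain strings or the pandas `category` dtype?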
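To make the two options concrete, here is a minimal pandas-only sketch of what I mean. The DataFrame and its values are made up for illustration; only the column name `product_id` comes from the snippet above, and nothing here touches Featuretools itself.

```python
import pandas as pd

# Hypothetical data; only the column name product_id is taken from the question.
df = pd.DataFrame({"product_id": ["a", "b", "a", "c"]})

# Option 1: leave the column as plain strings
# (object or string dtype, depending on the pandas version).
as_strings = df["product_id"]

# Option 2: convert the column to the pandas categorical dtype.
as_category = df["product_id"].astype("category")

print(as_strings.dtype)
print(as_category.dtype)
print(list(as_category.cat.categories))
```

Either column could then be passed along with `variable_types={"product_id": ft.variable_types.Categorical}`; the question is which of the two Featuretools handles better.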
Edit: Also, do all columns need to be specified manually, as in https://github.com/Featuretools/predict-appointment-noshow/blob/master/Tutorial.ipynb, or will they be inferred automatically from suitable pandas dtypes?

    import featuretools.variable_types as vtypes

    variable_types = {
        'gender': vtypes.Categorical,
        'patient_id': vtypes.Categorical,
        'age': vtypes.Ordinal,
        'scholarship': vtypes.Boolean,
        'hypertension': vtypes.Boolean,
        'diabetes': vtypes.Boolean,
        'alcoholism': vtypes.Boolean,
        'handicap': vtypes.Boolean,
        'no_show': vtypes.Boolean,
        'sms_received': vtypes.Boolean,
    }