Apprentice Chef, Inc. is an innovative company with a unique spin on cooking at home. Developed for the busy professional who has little to no skill in the kitchen, its service offers a wide selection of daily-prepared gourmet meals delivered directly to your door. Each meal set takes at most 30 minutes to finish cooking at home and comes with Apprentice Chef's award-winning disposable cookware (i.e. pots, pans, baking trays, and utensils), allowing for fast and easy cleanup. Ordering meals is easy thanks to the company's user-friendly online platform and mobile app.
Apprentice Chef is looking to expand its business. To attract new customers, the company has been running a cross-sell promotion for its existing customers. It now wants a classification machine learning model to gain deeper insight into which types of customers are most likely to respond positively to its promotions. To develop this model we will use the dataset the company provided, "Cross_Sell_Success_Dataset_2023.xlsx".
Their data science team assures us that its dataset engineering techniques are statistically sound and that the data represents the true picture of Apprentice Chef’s customers. Taking this assumption as true, we will analyze the dataset only to understand which engineered features could be built to make the model more robust.
To make sure we pick the best model for the task, we will run a model tournament across five models: Logistic Regression, Random Forest Classifier, Gradient Boosting Classifier, K Neighbors Classifier, and Decision Tree Classifier. The winner will be the model with the highest AUC score. AUC is particularly useful when the dataset is imbalanced, i.e., when one class is much rarer than the other. In such cases accuracy alone may not be a good metric, because a model that simply predicts the majority class all the time will have high accuracy but may not be very useful in practice. AUC, in contrast, is computed from both the true positive rate and the false positive rate across all classification thresholds, so it does not reward a model for merely following the class distribution. This makes it a more robust and reliable measure of a model's performance, especially on imbalanced datasets.
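The tournament described above can be sketched as follows. This is a minimal illustration and not the project's actual pipeline: it assumes scikit-learn is available, uses a synthetic imbalanced dataset from `make_classification` as a stand-in for the Apprentice Chef data, and leaves every model at near-default hyperparameters. The `run_tournament` helper and the added majority-class baseline are hypothetical names introduced here only to show why accuracy can look good while AUC does not.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in data (assumption: NOT the Apprentice Chef dataset).
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5,
    weights=[0.8, 0.2], random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# The five candidates named in the text, plus a majority-class baseline
# (the baseline is an addition for illustration only).
candidates = {
    "Majority-class baseline": DummyClassifier(strategy="most_frequent"),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
    "K Neighbors": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
}


def run_tournament(models, X_train, y_train, X_test, y_test):
    """Fit each model and report its test-set AUC and accuracy."""
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # AUC is computed from predicted probabilities of the positive class.
        positive_proba = model.predict_proba(X_test)[:, 1]
        results[name] = {
            "auc": roc_auc_score(y_test, positive_proba),
            "accuracy": accuracy_score(y_test, model.predict(X_test)),
        }
    return results


results = run_tournament(candidates, X_train, y_train, X_test, y_test)

# The winner is the model with the highest AUC.
winner = max(results, key=lambda name: results[name]["auc"])

for name, scores in sorted(results.items(), key=lambda kv: -kv[1]["auc"]):
    print(f"{name:<25} AUC={scores['auc']:.3f}  accuracy={scores['accuracy']:.3f}")
print("Winner by AUC:", winner)
```

On data like this, the majority-class baseline reaches a high accuracy simply by never predicting the rare class, yet its AUC stays at 0.5 because its scores cannot rank positives above negatives. That gap is the reason the tournament ranks models by AUC rather than accuracy.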
Thanks for checking out my data science portfolio! I'm Leonardo Lecci, and I'm passionate about using data to solve complex problems.
Whether you're looking for data-driven insights or want to collaborate on a project, I'd love to hear from you.
Contact me today to get started.