Context
In the 21st century, cars are an important mode of transportation that gives people personal control and autonomy. In day-to-day life, people use cars to commute to work, shop, visit family and friends, and so on. Research shows that more than 76% of people avoid traveling somewhere if they don’t have a car. Most people buy different types of cars based on their day-to-day needs and preferences, so it is essential for automobile companies to analyze their customers’ preferences before launching a car model into the market. Austo, a UK-based automobile company, aspires to expand its business into the US market after successfully establishing its footprint in the European market.
To understand the types of cars customers prefer and the factors influencing car purchase behavior in the US market, Austo has contracted a consulting firm. Based on various market surveys, the consulting firm has created a dataset covering the 3 major types of cars that are extensively used across the US market. The firm has collected various details of the car owners, which can be analyzed to understand the US automobile market.
Objective
Austo’s management team wants to understand buyer demand and trends in the US market. They want to build customer profiles from the analysis to identify new purchase opportunities, so that they can adjust business strategy and production to meet certain demand levels. Further, the analysis will be a good way for management to understand the dynamics of a new market. Suppose you are a Data Scientist working at the consulting firm that Austo has contracted. Your task is to create buyer profiles for the different types of cars using the available data, along with a set of recommendations for Austo. Perform the data analysis to generate useful insights that will help the automobile company grow its business.
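A buyer profile of the kind described above can be sketched as a per-car-type summary. The snippet below is only an illustration and assumes pandas is available. The column names come from the Data Dictionary. The six rows are made-up placeholder values: they are not taken from austo_automobile.csv and do not indicate any real pattern. The helper name `profile_by_make` and the chosen summary statistics are hypothetical choices, not a prescribed method.

```python
import pandas as pd

# Made-up placeholder rows for illustration only; NOT taken from
# austo_automobile.csv and not indicative of any real pattern.
sample = pd.DataFrame(
    {
        "Make": ["SUV", "SUV", "Sedan", "Sedan", "Hatchback", "Hatchback"],
        "Age": [44, 48, 35, 33, 26, 28],
        "Gender": ["Female", "Male", "Male", "Female", "Male", "Male"],
        "Total_salary": [120000, 100000, 80000, 70000, 50000, 60000],
        "Price": [55000, 57000, 34000, 36000, 26000, 28000],
    }
)


def profile_by_make(df: pd.DataFrame) -> pd.DataFrame:
    """Summarise the buyers of each car type (one row per Make).

    Hypothetical helper: the statistics chosen here are one possible
    starting point for a buyer profile, not a required set.
    """
    return df.groupby("Make").agg(
        buyers=("Age", "size"),                          # number of rows per car type
        median_age=("Age", "median"),
        median_total_salary=("Total_salary", "median"),
        mean_price=("Price", "mean"),
        share_female=("Gender", lambda s: (s == "Female").mean()),
    )


profiles = profile_by_make(sample)
print(profiles)
```

With the real data, the sample frame would be replaced by `pd.read_csv("austo_automobile.csv")`, assuming the file sits in the working directory.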
Data Description
austo_automobile.csv: The dataset contains buyer data corresponding to different types of products (cars).
Data Dictionary
Age: Age of the customer
Gender: Gender of the customer
Profession: Indicates whether the customer is a salaried or business person
Marital_status: Marital status of the customer
Education: Refers to the highest level of education completed by the customer
No_of_dependents: Number of dependents (partner/children/spouse) of the customer
Personal_loan: Indicates whether the customer availed a personal loan or not
House_loan: Indicates whether the customer availed a house loan or not
Partner_working: Indicates whether the customer’s partner is working or not
Salary: Annual Salary of the customer
Partner_salary: Annual Salary of the customer’s partner
Total_salary: Annual household income (Salary + Partner_salary) of the customer’s family
Price: Price of the car
Make: Car type (Hatchback/Sedan/SUV)
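The dictionary defines Total_salary as Salary + Partner_salary, which makes a simple consistency check possible before any analysis. The sketch below is only an illustration and assumes pandas is available. The three rows are made-up values, not taken from austo_automobile.csv. The helper name `check_total_salary` is hypothetical, and treating a missing Partner_salary as 0 is an assumption of this sketch rather than something the dictionary states.

```python
import pandas as pd

# Made-up rows for illustration only; NOT taken from austo_automobile.csv.
sample = pd.DataFrame(
    {
        "Salary": [60000, 45000, 80000],
        "Partner_salary": [20000, 0, None],  # None mimics a missing entry
        "Total_salary": [80000, 45000, 80000],
        "Make": ["SUV", "Hatchback", "Sedan"],
    }
)


def check_total_salary(df: pd.DataFrame) -> pd.Series:
    """Return True for each row where Total_salary == Salary + Partner_salary.

    Assumption of this sketch: a missing Partner_salary counts as 0.
    """
    expected = df["Salary"] + df["Partner_salary"].fillna(0)
    return df["Total_salary"] == expected


consistent = check_total_salary(sample)
print(consistent.sum(), "of", len(sample), "rows are consistent")
```

Rows flagged as inconsistent are worth inspecting by hand before deciding how to treat them, since the check alone does not say which of the three columns is wrong.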
Submission Guidelines
There are two ways to work on this project:
i. Full-code way: Write the solution code from scratch and submit only a final Jupyter notebook with all the insights and observations.
ii. Low-code way: Use an existing solution notebook template to build the solution, then submit a business presentation with insights and recommendations.
The primary purpose of providing these two options is to let learners choose the approach that aligns with their individual learning aspirations and outcomes. The table below elaborates on these two options.
| Submission type | Who should choose | What is the same across the two | What is different across the two | Final submission file [IMP] | Submission format |
| --- | --- | --- | --- | --- | --- |
| Full-code | Learners who aspire to be in hands-on coding roles in the future, focused on building solution code from scratch | Perform exploratory data analysis to identify insights and recommendations for the problem | Focus on code writing: 10–20% of the grading is on the quality of the final code submitted | Solution notebook from the full-code template, submitted in .html format | .html |
| Low-code | Learners who aspire to be in managerial roles in the future, focused on solution review, interpretation, recommendations, and communicating with business | Perform exploratory data analysis to identify insights and recommendations for the problem | Focus on business presentation: 10–20% of the grading is on the quality of the final business presentation submitted | Business presentation in .pdf format with problem definition, insights, and recommendations | .pdf |
1. Please follow the steps below to complete the assessment. Note that if you submit a presentation, ONLY the presentation will be evaluated. Please make sure that all the sections mentioned in the rubric are covered in your notebook/presentation.
i. Full-code version
Download the full-code version of the learner notebook.
Follow the instructions provided in the notebook to complete the project.
Clearly write down insights and recommendations for the business problems in the comments.
Submit only the solution notebook prepared from the learner notebook [format: .html]
ii. Low-code version
Download the low-code version of the learner notebook.
Follow the instructions provided in the notebook to complete the project.
Prepare a business presentation with insights and recommendations for the business problem.
Submit only the presentation [format: .pdf]
2. Any assignment found to be copied or plagiarized from other submissions will not be graded and will be awarded zero marks.
3. Please ensure timely submission as any submission post-deadline will not be accepted for evaluation.
4. A submission will not be evaluated if:
it is submitted post-deadline, or,
more than 1 file is submitted.
Best Practices for Full-code submissions
The final notebook should be well-documented, with inline comments explaining the functionality of code and markdown cells containing comments on the observations and insights.
The notebook should be run from start to finish in a sequential manner before submission.
It is important to remove all warnings and errors before submission.
The notebook should be submitted as an HTML file (.html) and NOT as a notebook file (.ipynb).
Please refer to the FAQ page for common project-related queries.
Best Practices for Low-code submissions
The presentation should be made keeping in mind that the audience will be the Data Science lead of a company.
The key points in the presentation should be the following:
Business Overview of the problem and solution approach
Key findings and insights which can drive business decisions
Business recommendations
Focus on explaining the key takeaways in an easy-to-understand manner.
Including the potential benefits of implementing the solution will give you an edge.
Copying and pasting from the notebook is not a good idea; avoid showing code unless it is the focal point of your presentation.
The presentation should be submitted as a PDF file (.pdf) and NOT as a .pptx file.
Please refer to the FAQ page for common project-related queries.
Happy Learning!
Scoring guide (Rubric): Austo
| Criteria | Points |
| --- | --- |
| Understanding the structure of the data: Answer all the key questions asked in this section | 5 |
| Univariate Data Analysis: Explore all the variables and provide observations on the distributions of all the relevant variables in the dataset. Answer all the key questions asked in this section | 15 |
| Multivariate Data Analysis: Perform bivariate/multivariate analysis to explore relationships between the important variables in the dataset. Answer all the key questions asked in this section | 20 |
| Quality & Use of visualizations: Use proper visualizations for the analysis and provide observations on the plots | 6 |
| Conclusion and Recommendations: Conclude with the key insights/observations | 6 |
| Presentation/Notebook – Overall quality: For a presentation: structure and flow; crispness; visual appeal; all key insights and recommendations covered. OR, for a notebook: structure and flow; well-commented code; all key insights and recommendations covered | 8 |