Data Science interviews, like IT interviews in general, are probably among the most challenging ones.
You get asked a wide range of questions: Statistics, ML, Software Engineering, System Design, and more.
I did over 40 interviews for Data Science positions.
In this post, I will show you the toughest, most frequently asked, and most interesting questions and tasks. I also did many take-home assignments, which I can't show here.
1. Imagine you built a classifier for your customer. They are happy with the model but want it to have higher precision. How can you accomplish that easily? What trade-offs do you have to consider?
2. Explain Insertion Sort and discuss its time complexity. When is Insertion Sort preferred over faster solutions like Merge Sort or Quick Sort?
3. Explain the difference between Boosting and Bagging. When would you choose which type of model?
4. Implement Linear Regression from scratch with Gradient Descent.
5. Explain K-Means Clustering. What are the limitations of this model and which alternatives exist?
6. What is a Star Schema?
7. What is a Snowflake Schema?
8. How would you write unit tests that involve an API or Database call?
9. What is Data Drift?
10. Implement a sorting algorithm with O(n log n) time complexity.
11. Which loss functions exist for Demand Forecasting? Discuss their limitations and advantages.
12. Which challenges arise when forecasting intermittent time series?
13. Explain Generative Adversarial Networks for image generation as you would explain them to a child.
14. What are the challenges with using Feature Importance?
15. What is Bootstrapping?
16. What is the difference between a confidence interval and a prediction interval?
17. What is quantile regression?
18. Explain vanishing and exploding gradients.
19. You deployed your ML model as an API. You notice that inference time is slow. What are some techniques to reduce inference time?
20. What techniques do you know to tackle class imbalance?
21. Explain the Query Execution Order of SQL.
22. What is Database Normalization? Explain First Normal Form (1NF), Second Normal Form (2NF), and Third Normal Form (3NF).
23. What is CI/CD?
24. What is pre-commit?
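To give an idea of what a coding task like question 4 can look like, here is a minimal sketch in plain Python. It is only one possible answer, assuming a single input feature, mean squared error as the loss, and batch gradient descent. The function name, the learning rate, the iteration count, and the toy data are illustrative choices, not something an interviewer prescribed.

```python
def fit_linear_regression(xs, ys, learning_rate=0.05, n_iterations=2000):
    """Fit y ~ w * x + b by batch gradient descent on mean squared error.

    Assumes one feature; learning_rate and n_iterations are illustrative
    defaults that work for the small toy data below.
    """
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(n_iterations):
        # Residuals of the current fit: prediction minus target.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(error^2) with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (noise-free, for illustration only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_linear_regression(xs, ys)
```

On this noise-free toy data the fitted `w` and `b` should end up very close to 2 and 1. Typical follow-up questions are how to choose the learning rate, why feature scaling matters, and how to extend the code to several features.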
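For question 10, Merge Sort is one common choice, since it runs in O(n log n) time even in the worst case. The following is a minimal sketch and not the only acceptable answer; the function name is my own, and it returns a new list instead of sorting in place.

```python
def merge_sort(items):
    """Return a new sorted list. O(n log n) time, O(n) extra space."""
    if len(items) <= 1:
        return list(items)

    # Divide: sort each half recursively.
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # Conquer: merge the two sorted halves in linear time.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # "<=" takes from the left half on ties, which keeps the sort stable.
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One of the halves is exhausted; append the rest of the other.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


result = merge_sort([5, 2, 9, 1, 5, 6])
```

The recursion halves the input about log n times, and each level does O(n) work in the merge step, which is where the O(n log n) bound comes from.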
Conclusion
I know, Data Science interviews can be much harder than the actual job. It’s more important to know the fundamentals than to learn a new tool every week.
Hi, thank you for your helpful post. I have been working as a project manager for more than 4 years. This year (2024), I decided to change careers and move into Data Science / ML, so I'm doing a master's degree in that field. Do you have any advice on how I can prepare so that I am able to find my first job as a data scientist after graduation? It does not look like an easy field.
Thanks a bunch in advance.