Updated in 2023
As a data scientist, you might see the word “portfolio” and assume this article is for designers. Portfolios are certainly important for designers, but they are also becoming an important part of data scientists’ applications. As defined by DataCamp Chief Data Scientist David Robinson (on the Mode Analytics blog), a data science portfolio is public evidence of your data science skills.
With the meteoric rise in data science jobs (344% growth since 2013), more and more people are coming into the market to fill these positions. Therefore, you need to do more to establish your experience and stand out from the competition so you can land a great data science job.
This is where the data science portfolio comes in. By including a portfolio, you can highlight the real-world experience that you have and show the hiring manager exactly what impact you have had in your past positions.
How do you start creating a data science portfolio?
The first piece of advice actually comes before you start your job search. Because there is so much that goes into the work that you do on a regular basis, you need to keep pretty good records of what you are doing and why you are doing it. This information will help you create your portfolio pages and ensure that you have everything you need to accurately explain the business case and the work you did on it.
If you have to look back at the work you did years ago and remember the minute details, you will probably miss something. Similarly, if you don’t have the right information about a project, including it in your portfolio will confuse the reader. So, make sure you stay prepared by keeping track of what you are working on and adding each project to your portfolio as you work.
In your portfolio, you should include the code you wrote in a Markdown file, followed by an explanation of each line. What analysis are you performing? What problem are you working to solve? Why does it matter? You want the hiring manager or recruiter to understand the case, so make it as clear as possible.
What should your data science portfolio look like?
Data science portfolios can live in a few different places and in a couple of different formats. The first place is a repository on GitHub. This is a good home for your Markdown files and any other code you have written in past projects.
Here is an example of a portfolio homepage on GitHub.
Once you click on one of the projects in that example, you are brought to a separate GitHub page, which has a longer summary of the project and the code the author used. It also includes graphs, data set analyses, models, and descriptions to go with everything. For more information on how to use GitHub to find a job, check out our guide.
Next, you should create presentations in PowerPoint or Google Slides that act as business cases for the projects in your portfolio. These should be coherent, easy-to-understand explanations of the problems, analyses, and solutions. You can build them on your own or use data-science-specific templates. Practice going through these slides until you are comfortable presenting them in interviews, because you will definitely be asked about them. Try recording and critiquing yourself or presenting to friends, so you can see which parts need more polish.
Checklist: what goes into your portfolio?
- Brief intro, which includes who you are, what you enjoy working on, and what you are looking for (a job, mentor, feedback, more projects, etc.).
- Table of contents with all of your projects and a brief summary of each to pique the reader’s interest.
- Each project should have its own page with:
  - A larger summary of the problem, hypothesis, and solution
  - A breakdown of the code you wrote, data sets analyzed, and models built. Include explanations with everything: assume that the reader has no background information, and provide as much context as you can.
  - Predictions or conclusions, depending on the project
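If you keep your portfolio as Markdown files, the checklist above can be turned into a reusable template. The following is only a minimal sketch: the function names `table_of_contents` and `project_page`, the section headings, and the sample project are all hypothetical choices made for illustration, not a required format.

```python
def table_of_contents(projects):
    """Build a Markdown list with one brief summary line per project.

    `projects` is a list of (title, one_line_summary) tuples.
    """
    lines = ["## Projects"]
    for title, summary in projects:
        lines.append(f"- **{title}**: {summary}")
    return "\n".join(lines) + "\n"


def project_page(title, problem, hypothesis, solution, conclusions):
    """Return a Markdown skeleton for a single project page."""
    sections = [
        f"# {title}",
        "## Summary",
        f"- Problem: {problem}",
        f"- Hypothesis: {hypothesis}",
        f"- Solution: {solution}",
        "## Code, data sets, and models",
        "_Explain each step; assume the reader has no background information._",
        "## Predictions or conclusions",
        conclusions,
    ]
    return "\n\n".join(sections) + "\n"


# Hypothetical example project, used only to show the output shape.
toc = table_of_contents([("Customer churn", "Predicting which users cancel")])
page = project_page(
    title="Customer churn",
    problem="Too many users cancel after the first month.",
    hypothesis="Low early engagement predicts cancellation.",
    solution="A classifier trained on first-week activity.",
    conclusions="Summarize what the model found and what you would do next.",
)
```

Writing `toc` and `page` to files such as a README gives every project the same structure, which makes it easier to add new projects as you finish them.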
How much should you include in your portfolio?
Similar to your resume, you want to have a few items in your portfolio, so recruiters and hiring managers can see that you have a fair amount of experience. But don’t sacrifice quality for quantity. Having 3 really good pieces is better than having 8 low- or medium-quality ones. In general, it is better to have 2 or more, so if you don’t feel like you have enough past experience, consider working on an independent project.
Just make sure that you are presenting the entire data science workflow: discover a problem you want to solve, identify a data set, work through the data files, comb through statistics and establish hypotheses, highlight visualizations, apply any algorithms that might be relevant, explain the metrics, and show the business case presentation.
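To make that workflow concrete, here is a very small sketch of how a project's code might walk from data to a metric. Everything in it is an assumption for illustration: the numbers are made up, the variable names are hypothetical, and a real project would likely use a real data set and established libraries rather than this hand-written least-squares fit.

```python
from statistics import mean

# 1. Problem and data set: toy, made-up numbers (ad spend vs. weekly sales).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

# 2. Summary statistics that motivate the hypothesis
#    "sales increase as ad spend increases".
avg_spend, avg_sales = mean(ad_spend), mean(sales)


# 3. Algorithm: ordinary least-squares fit of a straight line.
def fit_line(x, y):
    x_bar, y_bar = mean(x), mean(y)
    covariance = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    variance = sum((xi - x_bar) ** 2 for xi in x)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    return slope, intercept


# 4. Metric: mean absolute error between actual and predicted values.
def mean_absolute_error(actual, predicted):
    return mean(abs(a - p) for a, p in zip(actual, predicted))


slope, intercept = fit_line(ad_spend, sales)
predictions = [slope * x + intercept for x in ad_spend]
mae = mean_absolute_error(sales, predictions)

# 5. Conclusion in plain language for the business case.
print(f"Each extra unit of ad spend is associated with {slope:.1f} more sales "
      f"(mean absolute error: {mae:.2f}).")
```

In a portfolio page, each numbered step would be accompanied by a short explanation and, where useful, a visualization, so the reader can follow the reasoning without running the code.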
Once you have a data science portfolio with at least 2 projects, a presentation that you are comfortable giving, and a solid resume, you should go into your data science applications and interviews with more confidence. Don’t forget that you should be continually adding to your portfolio with every new project you do, either on your own or in your new work experiences.
If you are finished with your portfolio and ready to begin your job search, check out our guide to getting a great data science job to help you ensure you take all the right steps to success.
Pathrise is a career accelerator that works with students and professionals 1-on-1 so they can land their dream job in tech. If you want to work with any of our mentors to get help with your data science portfolio or with any other aspect of the job search, become a Pathrise fellow.