Friday 19th, Feb 2021 (Published on Wednesday 17th, Feb 2021)

Best Open Source Data Science Projects Overview to Make a Statement

Quick steps to build a strong portfolio with Open Source Data Science Projects

A data story project

Storytelling is one of the most valuable skills a data scientist can have: drawing surprising insights out of data and using them to persuade others to see things the way you do. It doesn’t matter if you just did the best analysis in the world; it’s useless if management can’t understand it or won’t act on it.

A data story project is all about taking your readers on a journey, so they can explore all the routes, mountains, and valleys on their way to reaching the same conclusion you reached, even if they have no background in statistics or coding.

Data visualization isn’t just about data science; it’s also about communication, because you’ll want to explain what’s going on in your code. You could present it all in R Markdown or a Jupyter Notebook, but you’ll stand out more if you customize chart designs or make your data visualization interactive.

End-to-end system 

Many data science jobs in the industry involve not just analyzing data, but building entire systems that analyze data sets as they stream in on a regular basis. You might, for example, be required to help the sales team visualize company sales data via a dashboard that updates as new sales data comes in.

An end-to-end project should show that you can build a system that performs a particular set of analysis procedures on data as it comes in. Others should also be able to understand and use the system easily. The best way to do this is to write code that takes in data from a regularly updated public data set and analyzes it in some way. The code should be well commented, and the README should explain how others can use it. The project should also be easy for others to install and run on their local machines.
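As a rough illustration of what "analyzing data as it comes in" can look like, here is a minimal Python sketch. It assumes each new batch arrives as CSV text with hypothetical `region` and `amount` columns; the function names (`load_rows`, `update_totals`, `run_pipeline`) are made up for this example, and a real project would fetch the batches from whichever public data set you choose.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names; a real public data set will have its own schema.
REGION_COL = "region"
AMOUNT_COL = "amount"


def load_rows(csv_text):
    """Parse one batch of CSV text, skipping rows whose amount is not numeric."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            row[AMOUNT_COL] = float(row[AMOUNT_COL])
        except (KeyError, TypeError, ValueError):
            continue  # drop malformed rows instead of crashing the pipeline
        rows.append(row)
    return rows


def update_totals(totals, rows):
    """Add each row's amount to the running total for its region."""
    for row in rows:
        region = row.get(REGION_COL) or "unknown"
        totals[region] += row[AMOUNT_COL]
    return totals


def run_pipeline(batches):
    """Process batches in arrival order and return total amount per region."""
    totals = defaultdict(float)
    for csv_text in batches:
        update_totals(totals, load_rows(csv_text))
    return dict(totals)


if __name__ == "__main__":
    # Two small in-memory batches stand in for data arriving over time.
    batch_1 = "region,amount\nnorth,100\nsouth,50\n"
    batch_2 = "region,amount\nnorth,25\nsouth,oops\n"
    print(run_pipeline([batch_1, batch_2]))
```

Splitting the work into small load, update, and run steps is one way to keep the code easy to comment, test, and explain in a README; a dashboard or scheduler could call `run_pipeline` whenever new data arrives.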

If you really want to go the extra mile, consider building a full-stack project with a web dashboard or a system that handles real-time data. What you want to show potential employers is that you can build a system that others can easily understand and use.


These are the two best open source projects you can include in your data science portfolio. We tried to keep them as general as possible so you can use your creativity to decide which specific project to work on. Of course, you can consider other kinds of projects, such as writing explanatory articles, doing group projects, or showcasing your contributions to large existing open source projects. However, the projects above are a good place to start.


