Building a compelling Data Science Portfolio with writing
Writing in Data Science can have a transformative effect not only on your journey but also on your career. Made by Parul Pandey using Weights & Biases
As part of the FastBook Reading Sessions organised by Aman Arora, I'll be coming to meet you all on 1st July. Therefore I wrote this piece on why writing matters in data science and how it can be used as a tool to leverage your portfolio.
The best way to learn any concept, especially in data science, is by writing about it. It helps you understand the topic in detail, and your work might, in turn, help others. But this is easier said than done. Even though many people want to write, it takes them months and sometimes years to move past the initial challenge of self-doubt. Rachel Thomas's blog post, Why you (yes, you) should blog, very aptly touches upon this issue. In fact, it covers all the vital points that you should keep in mind when starting to write. My professional journey started with writing, and in this post, I'd like to share how I used my writing skills not only to transition into Data Science but also to create a compelling portfolio.
Who am I & How writing transformed my journey?
I am Parul Pandey, working as a Data Science Evangelist at H2O.ai, an AutoML company based in San Francisco. My work lies at the intersection of Data Science and community, and writing is a core part of what I do. Prior to H2O.ai, I worked as an Electrical Engineer. Even though I have a background in math and programming, my daily work wasn't related to using Machine Learning.
I first started writing publicly in 2018, and to date, I have written for publications like O'Reilly, KDNuggets, ODSC, Neptune.ml, and H2O.ai, to name a few. Apart from this, I also write on Medium. My initial articles were pretty naive, devoid of proofreading and good images. When I look back now, I can find more flaws than the number of words contained in them. But the point is, you can never be perfect, and your writing style improves with time and experience. What is even more interesting is that I took a three-year career break from 2016 to 2019, and it was only during the latter part of 2018 that I could muster enough courage to publish my articles. Since then, I have been writing every single day of my life, even if it is just a paragraph on some days.
Why is writing useful?
Writing, especially in Data Science, is an important skill. It gives you both voice and visibility. In my view, the benefits of writing can be summed up by the four Rs.
Retention: Writing helps you to retain a new concept.
Research: Writing helps to develop a research mindset.
Reputation: Writing helps to build a reputation in the community. People cite your articles, mention you in their talks, etc.
Revenue: Writing can also be a self-sufficient career on its own. With the creator economy booming, writing can lead to potential job offers.
How to begin your Writing journey?
Half the battle is won when you decide to write an article. The other half is deciding on things like what to write about, where to publish, and how long the piece should be. It is best to create your own writing path. Start with the concepts you are most comfortable with; this will give you the confidence you need to begin. Then gradually diversify your writing portfolio by exploring new topics and writing about them. Writing is an iterative process. The more you write, the more you learn, and the better you write.
Things you can write about:
These are just a handful but can act as a great starting point.
Showcase your work
The amount of hard work that goes into writing an article is the same whether you write it for one person or one thousand. As such, make sure to share it on other platforms too, like LinkedIn, Twitter, and the community groups you are part of. Please don't ask people to like or share it, though, because there is a fine line between sharing your work and spamming. If people like it and find it interesting, trust me, they will want to share it themselves.
Another way to showcase your work is to use it in meetups, presentations, and conferences. Content creation is challenging, but once it is done, the content can be reused in multiple ways.
Creating a public portfolio
You can write on open blogging platforms or create your own website. This is entirely up to you. But make sure to start building a good portfolio right from the start. A GitHub page, a Kaggle profile, a Stack Overflow profile, etc., can support your resume.
Contribute to the community
Community is the backbone of Data Science. Get involved with the local Meetup community. Talk at conferences - from local to regional and even national. You could even mentor others who are new to this field. Answer and help others in forums. Try contributing to the documentation of open source libraries.
This is all good but will anyone read my article?
This is by far the most common question that I come across. Self-doubt before even starting is pretty common not only in data science but in almost all fields. Some common doubts that people have are:
There are already tons of articles on the topic. What difference would my article make?
Let's say you want to write about a library. You do a quick search and find that there are already ten good articles on the topic. So would you drop the idea? No, definitely not. Try giving your article a different treatment by using an interesting dataset and focusing on a lesser-known aspect of the library. For instance, I wrote an article about Lux, a Python library, where I showcased Simpson's effect using the library. The library creators loved it and included it as part of the documentation.
Here is another example. I recently wrote an article, Interpretable or Accurate? Why Not Both?, where I explained the concept of Explainable Boosting Machines (EBMs), models designed to have accuracy comparable to Boosted Trees while being highly intelligible and explainable. EBM is an open-source library by Microsoft. I got a personal message from the team for writing the article.
These are just a few examples highlighting the immense value that writing can bring, not only to your resume but also to your networking within the community.
Resources to get started
Now that you have decided to take the plunge, here are some open-source tools that will help you to get started.
- An easy-to-use blogging platform, with support for Jupyter notebooks, Word docs, and Markdown. Another option could be Medium.
Sites such as Pexels provide free images and photos that you can download and use for any project.
for open-source illustrations that can be customized.
Custom Images: Canva for designing and publishing almost anything.
freemium version for spell checking and catching the most basic grammar mistakes. Small SEO Tools, on the other hand, provides a free plagiarism check in addition to checking grammar.
Feedback is essential but don't get disheartened.
Leadership expert Ken Blanchard aptly coined the phrase, "Feedback is the breakfast of champions." Getting critical feedback is imperative to improve, and the same applies to writing. Giving feedback is important, but denouncing someone's work (especially publicly) is totally wrong. Sadly, you will often encounter such situations. Some people leave very distasteful remarks on your articles without giving any reason. During my writing career, I used to get very disheartened, and at times I even decided to stop publishing because of it.
However, my advice for you would be not to let negativity overcome your passion. I resonate with Suzana's tweet above. Creating anything is hard and requires a lot of patience, perseverance, and persistence. The most important thing, in my opinion, is to find a topic that interests you and raises questions you want to answer. So get out there, get your hands dirty, and have fun writing.