November 2, 2022

Learn to Code

Over the past couple of years I’ve taught myself programming and data science. I’d been a big computer nerd and at least programming-nerd-adjacent for a long time (linux user since 2005, emacs user since 2011), but in 2020 I made a conscious decision to pursue learning to code as part of a plan to find an alternative career. Along the way I’ve tried out quite a few different resources for learning programming and data science. Here are some highlights. Note that these are resources I’ve used, not a suggested curriculum: play around and find the tools that work for you. Pretty much all the resources listed here are free. Most of them have been explored by yours truly; some have been recommended by people I trust.

Learn to program

It’s incredible how many good resources there are out there to help you learn to code. There are so many good, completely free resources too! I mean, it’s kind of prima facie surprising, but once you consider how much time and effort people put into the free software ecosystem, it’s not that surprising that a lot of freely available material exists. Still, I think it’s remarkable how much of the material is actually good.

Practice

The best way to learn is by doing, so practice, practice, practice! My favourite site for practicing coding exercises is exercism. You can choose from dozens of languages, and get a list of exercises that teach you various programming concepts and get increasingly difficult as you learn more. Pick a language that has “Learning Mode”: this will give you more guidance and information on the concepts being taught.

There are a number of other websites where you can try coding challenges, including CodeWars and LeetCode.

In a similar vein, there are the series of puzzles from previous years' Advent of Code. Each year consists of a series of increasingly challenging puzzles. A new puzzle unlocks every day during December (up to the 25th), and they’re kind of linked by a goofy story. I got as far as day 18 last year and I had a lot of fun. You can access previous years' puzzles from the website.

I’m not going to weigh in on which language you should learn first, but I will say that from my totally unscientific trawling of the job boards, the languages that appear most often are javascript (and derivatives like typescript), python and C++. Of those, for data science work, python is clearly the best choice.

Web development

There are lots of free courses that get you into programming by teaching you the tools you need to become a web developer, and thus focus on javascript as your main language (along with HTML, CSS and so on). These include, for example, The Odin Project and Full Stack Open. Both of these require you to set up and install some stuff you’ll need to write and test code on your own machine. If you want a course that lets you write code in the browser, you can explore Free Code Camp. I actually think you probably get more out of learning if you set it up yourself and run stuff locally. It (a) gives you more options to explore when you’re debugging and (b) teaches you a little more about the infrastructure required to make your code actually work. But I recognise that local installation might not always be an option, so it’s nice to have alternatives.

General programming

I didn’t really know where to list this, but I think Calm Code deserves a mention. Short tutorials on a bunch of topics.

If you’ve got some money to spend, you could explore, for example, Codecademy. If you’re looking to learn programming as a way in to data science, I’d say you’d probably be better off just looking at the paid options for data science I mention below.

You’ve also got LinkedIn Learning, which has the advantage that the courses will directly help you get badges and skills accreditations on LinkedIn. Believe it or not, that can help you stand out, and it might be a way recruiters filter their searches when looking for job candidates on LinkedIn. LinkedIn Learning has courses on a bunch of topics, including data science, but I’m not going to mention it every time it’s relevant.

Automate the Boring Stuff with Python looks like a good book to learn some beginner programming from. The author has a number of other freely available books that teach programming.

Documentation

Pretty much any language you use will have its documentation available online. For example, the Python Docs. If you don’t know how to do something, look it up! If the docs are too terse, or confusing, or what you’re trying to do isn’t covered by some built-in function, then searching for how to do it will almost certainly lead you to Stack Overflow, a huge question-and-answer website.

A bit of theory

If, like me, you want to understand the theoretical foundations of a topic, as well as the practical applications, you might want to explore Teach Yourself Computer Science, an opinionated reading list that guides you through the theoretical side of computer-poking. Where possible, it links you to freely available versions of the material it recommends.

Data science

I’m using the term “data science” quite loosely to also include AI and Machine Learning.

Data science curriculum

I have been recommended this blog post that lays out a 26 week data science curriculum, though I haven’t yet gone through it. It mostly links to videos, and I honestly don’t find videos a good way to learn, but I know lots of people like to learn from them.

You’ve got DataCamp, DataQuest and SoloLearn. Each of these has a free option, and a much more complete paid option. I honestly wouldn’t bother with just the free version of any of these: you won’t get enough out of the sparse material. You’re better off picking one and paying for it. I have almost finished the DataQuest Data Science with Python track (and I have to finish it before December when my subscription runs out).

Given the number of great resources for ML and deep learning (see below), it’s kind of surprising to me that there isn’t some website filling the “traditional data science” learning niche in the same way. The blog post mentioned at the top is part of a big data science blog network (Towards Data Science), but I haven’t really found anything that rivals the fast.ai course but focused on the basics of clustering, regression and so on.

Practice

Once you have learned a bit, you’ll want to practice. A good place to do this is Kaggle, where you will not only be able to find good data sets and free compute time, but you can test your solutions against other people’s and see how they did things differently.

Databases

You will almost certainly reach a point where you go “Oh, huh, this is why they say to actually learn SQL…” and when that day comes, you can use the Mode SQL tutorial, or the tutorials on SQLZoo. You could also try your hand at putting those skills to use in the neat little SQL Murder Mystery. I guess you could actually learn SQL before you need it, but honestly, until you understand why you need it, it is hard to motivate yourself to figure it out.
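To give a flavour of that "this is why" moment, here is a minimal sketch using Python's built-in sqlite3 module. The tables, column names and values are made up for illustration; the point is only that one short query can join two tables and summarise them, which is the kind of thing that gets fiddly fast with hand-written loops.

```python
import sqlite3

# Hypothetical toy data: table and column names are invented for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.5), (3, 2, 20.0);
""")

# Join each order to its customer, then count and total the orders per customer.
query = """
    SELECT c.name, COUNT(o.id) AS n_orders, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY total DESC
"""
rows = conn.execute(query).fetchall()
for name, n_orders, total in rows:
    print(name, n_orders, total)

conn.close()
```

The same pattern (SELECT, JOIN, GROUP BY) is what the tutorials above build up to, so it is a reasonable thing to try typing out once you have worked through their first few lessons.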

ML and Deep Learning

The fast.ai course is great and does a better job than some other offerings at teaching you the foundations as well as the practical stuff. There’s a book that goes along with it that you can buy if you like. Fast.ai have a couple of other courses too. One thing I like about this one is that they don’t ignore the ethical dimension of deep learning. (Also, one of the founders of fast.ai, Jeremy Howard, studied philosophy, woo!)

There are a number of courses on offer through deeplearning.ai, though I haven’t yet tried these out. Likewise for Full Stack Deep Learning.

Data

You are going to need data to do data science. Among the many places you could get data are data.world, UCI Machine Learning Repository, and reddit.

Other bits and pieces

Version control

You will almost certainly be expected to know about version control, and that will almost certainly come in the form of git. If you want to understand the basic idea of how git functions and what it’s doing, read The Git Parable, and the resources listed at the end of the post.

There is also a neat little visual tutorial: Learn Git Branching.

Regular Expressions

RegEx is a bit like SQL: until you have the “huh I should probably learn this” moment, you aren’t going to make yourself do it. But RegEx101 is a useful website to practice and experiment. You could also try to learn a little RegEx from Regex Crossword.
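If you want a taste before that moment arrives, here is a small sketch using Python's built-in re module. The sample text and the date format are assumptions picked for illustration; the same pattern can be pasted into a tester like RegEx101 to see how each part matches.

```python
import re

# Hypothetical sample text, invented for this example.
text = "Backups ran on 2022-11-02 and 2022-12-25, but not on 2022/01/01."

# Four digits, a hyphen, two digits, a hyphen, two digits.
# The parentheses capture year, month and day as separate groups.
date_pattern = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

# findall returns one tuple of captured groups per match.
dates = date_pattern.findall(text)
print(dates)

# sub rewrites each match; \1, \2, \3 refer back to the captured groups.
reordered = date_pattern.sub(r"\3/\2/\1", text)
print(reordered)
```

Note that the slash-separated date is left alone, because the pattern only matches hyphens. Deciding what a pattern should not match is a large part of the skill.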

Cheat sheets

Terse explanations of a great many concepts can be found at Learn X in Y Minutes. Another great resource is OverAPI which has a big list of programming languages, and for each of them, it houses links to where in their documentation you can find information about specific kinds of things you might want to do (string manipulation, maths functions etc).

Algorithms

Implementations of different algorithms in a bunch of programming languages can be found at the-algorithms. Sometimes you can learn by looking at other people’s solutions to a puzzle on exercism or codewars, and seeing how something is implemented in different programming languages is also sometimes useful.

Code by Charles Petzold

The book that had the biggest influence on my thinking about computers is Code by Charles Petzold. It’s a wonderful gentle walk through how we have managed to make sand think.

What do I do with all this stuff?

OK. There are a lot of links here. What you shouldn’t do is start at the top and work your way through each of these. There’s some redundancy in there, and some stuff that might be irrelevant depending on your interests. Pick and choose which sites you like the look of and which fit with your needs. I personally got the most out of using Exercism to learn and practice the basics of a language, and then trying to use the language to build an actual thing.

The only way you really learn any of this stuff is by using it. Build your own little projects on stuff that interests you. Publish your code on codeberg or github. Once you’re comfortable with your skills, collaborate with people on open source projects. Have a look at Up For Grabs, and this blog post about getting started with open source.

Like any other language, you can’t really claim to know a programming language until you’ve actually used it to do stuff. So go and do stuff: once you’ve learned to make sand think, use that power to make the world a better place.

© Seamus Bradley 2021–2
