My Learning Process So Far
12 May 2020 -
In June 2019 I quit my job as a high school principal to take time off and travel. I had hit a point in my career where continuing down that path, or taking any natural next step, felt like letting my circumstances decide what I did next instead of choosing what would ultimately make me happy. I set off on a long-term international trip that was meant to last about a year, until the virus sent me back to the States to quarantine.
During my trip, I thought that inspiration would find me and I’d start working on something that would help me make a meaningful decision about what to do next in my career. And at times it did. I took an online neuroscience class, I worked through a full Python tutorial (without realizing it was for Python 2 and not 3), and I did what I could to improve my Spanish. But the lure of seeing new places and meeting new people kept me focused on living in the moment and not worrying too much about the future. And I was OK with that. I had no plans to force myself to be productive.
When the virus sent me back to the States in mid-March, I spent two weeks isolating with friends, then moved to my sister’s house in the town I grew up in. The first few weeks of isolation went fine: I binge-watched The Sopranos, played video games, and got to spend time with my niblings once I moved in with my sister. But I started to struggle mentally with the long hours and the lack of structure. I decided it was time to teach myself something. On a whim I googled data science MOOCs and podcasts and started my journey.
Reading - The Elements of Statistical Learning Textbook
One of the first pages I found in my search for a good MOOC also listed this textbook. It said that if you can understand what is in this book, you might not need to take most of the intro MOOCs. With a math degree and some applied math from my master’s program, I thought it was worth a shot. I read the first five or six chapters and was able to understand the algorithms and general approaches, but I could not remember enough linear algebra to do all the matrix operations and follow along with the formulas. I was able to explain what k-nearest neighbors is and what the limitations of that approach are. I was able to understand that using too many parameters leads to overfitting, and that there are several ways to add penalties or limits to the process to avoid fitting too closely to the data. This felt like enough to move to the next step. I knew I’d be programming most of these processes by simply plugging variables into a program and didn’t need to know the deep math behind everything, yet.
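To make the k-nearest neighbors idea and its overfitting limitation concrete, here is a minimal sketch in plain Python. It is not taken from the textbook: the function name `knn_predict` and the toy data are made up for illustration, and it assumes Python 3.8 or later for `math.dist`. With `k=1` the prediction follows a single mislabeled neighbor, while a larger `k` smooths that noise out.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort the training points by Euclidean distance to the query, keep the k closest.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Count the labels of those neighbors and return the most common one.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data as (features, label) pairs.
# The last point sits inside the "a" cluster but is labeled "b" (a noisy label).
train = [
    ((1.0, 1.0), "a"),
    ((1.2, 0.8), "a"),
    ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"),
    ((5.2, 4.9), "b"),
    ((1.4, 1.3), "b"),
]

query = (1.3, 1.2)
print(knn_predict(train, query, k=1))  # follows the single noisy neighbor
print(knn_predict(train, query, k=3))  # majority vote among three neighbors
```

Choosing `k` is one of the "limits" on the fitting process: a very small `k` fits the training data too closely, and a very large `k` ignores local structure.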
Listening - Towards Data Science Podcast
I have been taking long walks every day and decided to listen to at least one episode about data science on each one. I thought this would help me map out what I needed to learn, which lined up with the meta-learning step I was reading about in Ultralearning. I learned a ton really quickly and built up my confidence a lot. I’m thinking about doing a longer post about listening to podcasts as a new learner, so I won’t go into too much detail here. I also kept a Google Keep note and added to it each time I heard something that I wanted to learn more about, or that I thought I should at least be able to define. Eventually, these episodes started to feel a bit redundant for learning new things. The topics and people were still very interesting, and I plan to keep listening to new episodes as they come out, but after fifteen episodes or so I was no longer taking down new notes and my mind would often wander to other things.
Reading - Ultralearning
I saw this pop up in a few Reddit posts and heard it mentioned in some podcast episodes, so I downloaded it and read it. Most of the concepts were review for me as a former educator, but it helped with my overall motivation. The most helpful concept was meta-learning, which has helped me consistently step back and make sure I’m spending enough time reflecting on everything I need to learn and on whether my current habits are focused on the right things. I also wrote down some notes and reflections when I finished so that I have something to look back on when I inevitably get stuck in the future.
Doing - Kaggle
Kaggle Learn came up in a bunch of the podcasts, so I jumped on there and started. So far, I’ve done several courses (Python, Intro to ML, Intermediate ML, Data Visualization, Pandas, Intro to SQL, and Advanced SQL). This was really helpful for building my confidence in writing Python code and for learning the basics of ML and SQL. I plan to do the other seven courses at some point, but I wanted to start working on something more practical, so I’m saving those for when they become relevant to future projects. I did a bunch of SQL exercises on Kaggle to reinforce those skills, but I ended up on HackerRank and thought the exercises there were a little more interesting.
Buying - New Computer
One of the things I realized at that point was that my $200 Chromebook wasn’t going to cut it if I was going to do any real programming and machine learning. I had heard enough about GPUs, and hit enough dead ends trying to get some programs to run on a Chromebook, that I decided it was time to invest in a new machine. I went with a Windows machine out of comfort, which I know isn’t ideal for a lot of the programming I want to do. But I really didn’t want to add Linux to my list of things to learn, at least not yet.
Reading - Data Science from Scratch
This was a quick read that felt similar to The Elements of Statistical Learning. I read it and understood the approaches without spending too much time on the code. At times I would check that the code structure made sense to me. But ultimately, I figured I would be looking it up again when I needed to use it, so I didn’t spend too much time trying to decipher everything.
Doing - HackerRank
I started doing a bunch of SQL exercises on HackerRank, and that helped me develop my skills. But I hit a wall on some of the Medium difficulty exercises. Some felt difficult because of how the exercise was worded, and some felt difficult because the exercise seemed to force a solution in SQL that could have been found much more easily with a different approach. So I stalled out on those and switched to doing only the 10 Days of Statistics and 30 Days of Code exercises each day. Those have been helpful. But some days the exercises feel either less relevant or poorly worded, so I end up trying for a bit and then looking at the discussion to get answers. I can usually understand what I should be doing but don’t know all the options within Python for doing it, so looking at other people’s answers has helped me see those possibilities better.
Doing - Starting my first projects
Once my new computer came in, I quickly downloaded Postgres, R, and a Python interpreter. In the three weeks since, I’ve only used Postgres and haven’t touched R or Python yet. That will likely change soon, though. It took me a little while to get Postgres and the Phish.in setlist database up and running, since I haven’t done much command-line work in the past. But I got it working and started playing with some SQL queries to look at some fun data.
I used a combination of Postgres, Google Sheets, and Flourish to do two projects on Phish setlists: the Day of Week Bias project and a visualization of changes in song frequency over time. Both of these projects tested my SQL skills and brought me to a level that I’m comfortable with for future projects. There are a few other SQL skills I want to develop, but for now I was able to compensate for those gaps in Google Sheets. I also really like Flourish, but I would like to try some of the data visualization libraries in Python too as a way to improve my Python skills.
Doing - GitHub Pages and Jekyll
As I accumulated projects and other writing about my process, it was time to find a way to host it all online and share it. I didn’t want to spend any money yet, and I had heard a handful of people on podcasts mention using GitHub Pages, so I started looking there and quickly learned it would meet my needs while also teaching me the basics of GitHub. I needed to learn the basics of Jekyll, and fortunately its documentation was plenty to get me to a place where I can add posts, links, and photos to my blog. I hit some snags when I accidentally set up two different GitHub Pages paths within my account, and when I didn’t realize that the default layout I had created in the tutorial was overriding the one I wanted from a theme. But once those were sorted out, I was able to publish a bunch of stuff and share it with some friends.