Machine Learning Resource and Coding

Will · 11 February 2020 21:49

As there’s interest in ML (as seen in Liam’s intro thread) I figured this deserved its own topic.

Firstly - excellent local meetup group for machine learning - https://bathml.org

Secondly - this might need a Wiki if we go into explanations, but for now I think we can just go with learning resources people are using/have used or courses people are following right now.

There are a couple of links over on the Coding Resources Thread.

To Kick Off

Over the last couple of years, I’ve been learning a bunch of the foundations of ML, trying to back-fill my Maths knowledge & working through some online courses etc. So I have a decent grounding in a lot of this stuff now, but couldn’t write out a back-propagation formula from memory or anything & importantly I’ve yet to apply any of it outside of coursework for online courses or code-alongs at BathML.

Most recently, I went back to an old Udacity Deep Learning course as there is some decent practical stuff in it (old, but free & still really valuable, though sadly not being maintained).

Due to out-of-date code, missing starter code etc., I got a bit frustrated & decided to try out the FastAI courses, as they too are free but are up to date.

FastAI Course: Practical Deep Learning for Coders, v3

https://course.fast.ai/

These courses use the PyTorch ML framework with their own library/extensions on top to simplify things further. PyTorch appears to be the most popular framework in a research setting, with the older (but still awesome and very actively developed) TensorFlow still in the lead in an industry setting (PyTorch is catching up though).

The course starts with code, then goes back to theory & detail, which seems like a good way to get up to speed.

I had a brief scan through the content a while back but have only started the material today. If anyone wants to join in, we might get some collaboration/discussions going.

p.s. I’m using Google Colab for free (GPU-backed) hosting for Jupyter Notebooks (common in ML/data science)

Will · 11 February 2020 22:19

Note: when using Google Colab, I did hit a snag in the first exercise whereby the FastAI library failed to download a pre-trained weights file.

Solution was to add a cell to download it manually and copy it to the correct location:

!wget https://download.pytorch.org/models/resnet34-333f7ec4.pth
!mkdir -p /root/.cache/torch/checkpoints/
!cp ./resnet34-333f7ec4.pth /root/.cache/torch/checkpoints/resnet34-333f7ec4.pth

before the call to learn = cnn_learner(data, models.resnet34, metrics=error_rate)
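If you’d rather do the same thing from Python than with shell commands, here is a rough sketch. It assumes the same URL and cache folder as the commands above; the function names (cached_weights_path, ensure_weights) are just made up for illustration and are not part of FastAI or PyTorch.

```python
import urllib.request
from pathlib import Path

# Same URL and cache folder as in the shell commands above (assumed, not
# looked up from the library).
WEIGHTS_URL = "https://download.pytorch.org/models/resnet34-333f7ec4.pth"
CACHE_DIR = Path("/root/.cache/torch/checkpoints")


def cached_weights_path(url, cache_dir=CACHE_DIR):
    """Return where the weights file for `url` should live in the cache."""
    return Path(cache_dir) / url.rsplit("/", 1)[-1]


def ensure_weights(url=WEIGHTS_URL, cache_dir=CACHE_DIR):
    """Download the weights into the cache unless they are already there."""
    target = cached_weights_path(url, cache_dir)
    if not target.exists():
        # Create the cache folder (and any missing parents) first.
        target.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(target))
    return target
```

Calling ensure_weights() in a cell before the cnn_learner line should have the same effect as the three shell commands, and it skips the download if the file is already in place.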

Snowy · 12 February 2020 18:57

Excellent; thanks for putting this together Will! I’ll need to add this to the list of things that I need to get around to at some point!

I’ve definitely heard of Jupyter Notebooks before, as my friend from uni works with them as part of his ML stuff for his job.

Ed_S · 13 February 2020 15:54

I wonder if a Study Circle would be appropriate? Two or three motivated people, aiming to bring all up to speed together. If anyone happens to be ahead, they need to explain and help the others to catch up. It might work with up to four; six would probably be too many, and we’d need to split into two groups.

Snowy · 15 February 2020 15:46

That does sound like a good idea, although I would be worried about additional time commitments!

james.griffin · 19 February 2020 15:04

Count me in; it seems like a really good way to get more hands-on experience and theoretical knowledge.

Snowy · 15 March 2020 16:49

Good news Will; I’ve decided to finally start looking at all of this today! So far, I’ve managed to download Miniconda, used that to install the necessary Jupyter Notebook packages, and also installed the Python extension in VS Code (as well as doing lots of related reading on the subject!).

I’ve only looked at Python briefly before on Codecademy, so I really need to learn the syntax and practice that before I start the course proper. Does anyone else have any recommended resources that they have used for getting up to speed with Python?

Will · 15 March 2020 18:09

Nice.

Syntax-wise it’s pretty friendly - I’ve picked most of it up while doing machine learning so far, so I’m not the best person to ask for general Python resources.
Main things syntax-wise - very few brackets, no semicolons, colon at the end of a line starts a code block/scope, indents continue it - so like

is_test = False

# note: `with` is a reserved word in Python, so it can't be a parameter name
def this_is_a_function(some, params):
    # stuff happening in function
    # everything in the scope of the function is indented
    print("called with", some, params)
    if is_test:
        # and nested scopes need indenting too
        return "test stuff"
    else:
        return "normal stuff"

# Stop indenting and you're back to the main scope
# call the function without indent, from the main scope
this_is_a_function("a", "b")

Note: For training models, I’d definitely give one of the below a try for GPUs/speed - both have some generous free stuff (I started out on Colab, but have now switched to Gradient):

https://gradient.paperspace.com/ (slightly more standard notebook setup, but you have to set a bit more up via their UI [not much though])
https://colab.research.google.com/notebooks/intro.ipynb#recent=true - it’s ready to go, but you lose files when it shuts down and I found a few non-standard bits of behaviour

Will · 15 March 2020 20:56

@Snowy I’d just give it a go from the FastAI materials and shout on here (or the FastAI forum) if you need a hand.

Snowy · 17 March 2020 21:43

Yeah, I’ve had a look at that syntax already, and was impressed at how simple it was. I’ve started looking at the semantics of the language a bit more, but I’ll need to spend some more time sitting down and having a read through it all. I just prefer to feel confident with the code I’m looking at first, as opposed to learning both Python and ML at the same time, but that’s just me!

I was a bit confused about all of the hosted services for GPUs, so I’m glad you’ve recommended something!

Ed_S · 8 May 2023 13:06

A book (or booklet) as well as slides, videos and course materials from François Fleuret’s Deep Learning Course

The book is formatted for the small screen - 145 small pages
The Little Book of Deep Learning, François Fleuret (pdf)

via the discussion at HN