Machine Learning on AWS: Getting Started with SageMaker and More

Ready to get started with machine learning (ML) on AWS? ML requires a lot of processing capability, more than you’re likely to have at home. That’s where a cloud platform such as AWS can help. But how do you get started? Here are some tips to add ML to your career.

Machine Learning in General

First, learn as much as you can about ML independent of AWS. To maximize your career opportunities, you want your experience and knowledge to be broad and not focus exclusively on AWS. 

ML is not for the faint of heart. It requires serious study. However, opportunities for those with machine-learning skills abound, with routine six-figure salaries for engineers and developers who focus on deep learning, machine learning, and artificial intelligence (A.I.). According to Burning Glass, which collects and analyzes millions of job postings from across the country, machine-learning engineers with even a few years of experience can unlock pretty healthy compensation, and that’s before you throw in benefits such as stock options.

Job interviews for ML-related positions are often tough and require quite a bit of preparation. Even “everyday” developers and analysts (i.e., those who don’t primarily focus on ML in their work) may very well end up using more ML tools and principles in coming years. If you’re a student specializing in computer science or a related field, that’s as good a reason as any to build out your ML and A.I. knowledge.

Read books, take online classes, and invest as much time as you can in learning the fundamentals. TensorFlow, the open-source deep-learning library created by Google, has a nice page of learning resources.

Next, look at the coding frameworks available. The aforementioned TensorFlow is considered one of the top frameworks, as is PyTorch, which was created by Facebook. Although AWS has great tools for building ML with little coding, you’re still going to want to know how to use ML coding frameworks.

Learning AWS ML

AWS presently has 17 services related to ML, and it’s likely to add more in the years to come. This is too much to learn all at once, so we recommend a couple of things: First, make sure you’re completely familiar with “basic” computing via AWS, including how to provision EC2 servers and, most importantly, how much it’s going to cost you per hour to allocate those servers. You can’t afford surprises, especially when dealing with the kind of processing resources that ML needs.
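To make the per-hour cost point concrete, here is a minimal budgeting sketch in plain Python. The instance labels and hourly rates are hypothetical placeholders, not real AWS prices; look up the actual rates for your region and instance types before relying on any number.

```python
from dataclasses import dataclass


@dataclass
class InstancePlan:
    """One line of a compute budget: what you run and for how long."""
    name: str               # free-form label, not a real AWS instance type
    hourly_rate_usd: float  # HYPOTHETICAL rate; replace with the real price
    count: int              # how many instances run in parallel
    hours: float            # how long each instance stays allocated

    def cost(self) -> float:
        # You pay for every allocated instance for every hour it is up.
        return self.hourly_rate_usd * self.count * self.hours


def total_cost(plans):
    """Sum the estimated cost of every line in the budget."""
    return sum(plan.cost() for plan in plans)


# Illustrative numbers only.
plans = [
    InstancePlan("small-cpu-notebook", hourly_rate_usd=0.05, count=1, hours=40),
    InstancePlan("gpu-training", hourly_rate_usd=1.25, count=2, hours=6),
]

for plan in plans:
    print(f"{plan.name}: ${plan.cost():.2f}")
print(f"total: ${total_cost(plans):.2f}")
```

Running an estimate like this before you provision anything makes it obvious which line dominates the bill, and it is a reminder to shut down instances you are no longer using.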

Second, of the 17 services, the one you want to start with is SageMaker. This is AWS’s flagship ML product and it includes a complete IDE called SageMaker Studio. 

SageMaker Studio offers a Quick Start option. Open the Studio from the main SageMaker page and scroll down to find it.

Fill in the name and choose the permissions. (You’ll likely need to create an IAM role; the AWS documentation explains how.) Then you’ll be asked for your VPC ID and subnet, so make sure you have a basic understanding of those, as well. Click ‘Next,’ and you’ll see your SageMaker Studio dashboard. After a few minutes, you’ll see your new Studio show up in a list with the word “Ready” by it.
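If you have not created a role before, it helps to know that a role SageMaker can use carries a trust policy naming the SageMaker service as the principal allowed to assume it. The sketch below only builds that JSON document and sanity-checks the shape of the IDs the form asks for; it makes no AWS calls. The helper names are made up for illustration, and the ID pattern ("vpc-" or "subnet-" followed by 8 or 17 hexadecimal characters) is an assumption about the usual format, not a validation rule taken from this article.

```python
import json
import re


def sagemaker_trust_policy() -> dict:
    """Trust policy that lets the SageMaker service assume a role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "sagemaker.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def looks_like_id(value: str, prefix: str) -> bool:
    """Rough shape check (hypothetical helper): prefix, dash, 8 or 17 hex chars."""
    pattern = rf"{re.escape(prefix)}-[0-9a-f]{{8}}(?:[0-9a-f]{{9}})?"
    return re.fullmatch(pattern, value) is not None


# Print the policy as JSON, ready to paste into the role-creation form.
print(json.dumps(sagemaker_trust_policy(), indent=2))

# Made-up IDs, used only to exercise the shape check.
print(looks_like_id("vpc-0abc1234", "vpc"))
print(looks_like_id("subnet-xyz", "subnet"))
```

The trust policy only says who may assume the role. What the role is allowed to do (reading from S3, for example) comes from the permissions you attach to it separately.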

Click the “Open Studio” link to go into the Studio. The Studio will open in a new window; the first time, it will take a couple of minutes to load.

In the lower-right corner, you’ll see a pane with a demonstration video and a “video tutorials” link with more information to help you get started. There’s also a link to a tour guide, which provides a complete walkthrough for setting up multiple experiments and trials.

With ML, an experiment is a process that you run many times over as the system learns. Trials are the individual runs within an experiment, each with its own outcome. You provide different inputs to each trial and compare the results. Typically, you change only one thing slightly each time, such as the data or a training setting; this is known as an incremental change. Over time, you gather more and more results and learn from the outcomes.
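The relationship between experiments and trials can be sketched in a few lines of plain Python. This is not the SageMaker Experiments API; the Experiment and Trial classes and the train function are hypothetical names used only to show the idea of repeating a run with one small change at a time and recording each outcome. The toy model fits y = w * x by gradient descent, and the one thing varied between trials is the learning rate.

```python
from dataclasses import dataclass, field


@dataclass
class Trial:
    """One run: the setting that was tried and the outcome it produced."""
    learning_rate: float
    final_loss: float


@dataclass
class Experiment:
    """A named collection of trials that differ by one small change."""
    name: str
    trials: list = field(default_factory=list)

    def best(self) -> Trial:
        # The best trial is the one with the lowest final loss.
        return min(self.trials, key=lambda t: t.final_loss)


def train(xs, ys, learning_rate, steps=50):
    """Fit y = w * x by gradient descent and return the mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / n


# Toy data generated from y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

experiment = Experiment("learning-rate-sweep")
for lr in (0.001, 0.01, 0.05):  # one incremental change per trial
    experiment.trials.append(Trial(lr, train(xs, ys, lr)))

for trial in experiment.trials:
    print(f"lr={trial.learning_rate}: loss={trial.final_loss:.6g}")
print("best learning rate:", experiment.best().learning_rate)
```

Keeping every trial, not just the winner, is the point: the recorded history is what lets you see which change actually helped.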

More ML on AWS

If you’re into pattern and facial recognition and aren’t paranoid, you might try out the AWS DeepLens, which is a hardware camera built to integrate with AWS ML. (You probably want to put tape over its lens when you’re not using it.)

One place where you can stay on top of it all is through the official AWS ML blog. Many of the articles are really advanced, but if you at least skim through them, you’ll pick up tidbits of knowledge here and there—even if you’re just starting out on your machine-learning journey.

Conclusion

Machine learning is a huge topic and there’s a lot to learn. Start slowly, study as much as you can, and just keep practicing with the different tools available. Over time, you’ll become competent, and if you keep at it, you’ll eventually become an expert. Have patience and perseverance!