Python Could Rule the Machine Learning/A.I. World

What do developers actually use Python for?

According to a developer survey by JetBrains (which also introduced Kotlin, the up-and-coming language for Android development), some 49 percent say they use Python for data analytics, ahead of web development (46 percent), machine learning (42 percent), and system administration (37 percent).

Significant numbers of developers also use the language for software testing (25 percent), software prototyping (22 percent), and “educational purposes” (20 percent). Far fewer chose it for graphics, embedded development, or games/mobile development.

This data just reinforces the general idea that Python is swallowing the data-analytics space whole. Although highly specialized languages such as R have their place among academics and more research-centric data analysts, it’s clear that Python’s relative ease of use (not to mention its ubiquity) has made it many friends among those who need to crunch data for some aspect of their jobs.

This trend has also been underway for quite some time: In February 2018, a KDnuggets poll showed a slow decline in R usage in favor of Python among tech pros who utilized both languages. A separate survey from Burtch Works found that Python’s use among analytics professionals grew from 53 percent to 69 percent over the same two-year period, while the R user base shrank by nearly a third.

But you also can’t ignore Python’s use in machine learning, which is widely viewed as an important part of virtually every company’s future tech strategy. If developers are using Python to build out machine learning tools, that means the language will have a big lock on the ML/A.I. ecosystem considered so central to how future software develops.

If you don’t know Python, it’s clearly a vital language to learn. Fortunately, there’s a variety of websites, books, and other resources that can get you up to speed quickly.

Which Version of Python Do You Use?

The JetBrains data suggests that a substantial majority of developers (87 percent) are on Python 3, while 13 percent are still on Python 2. This is pretty rapid growth for Python 3, which had three-quarters of the market last year.

The language is up to version 3.7.3, and new iterations add more useful features for developers. For instance, 3.7.0 incorporated new nanosecond-resolution time functions, a forced UTF-8 mode, the built-in breakpoint() function, data classes, and a development runtime mode. In other words, it’s hard for developers to keep working with an older version when there’s so much in the later versions to make their lives easier, unless, of course, their job requires them to wrestle with legacy code that can’t be updated.
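To give a feel for a couple of the 3.7 additions, here is a minimal sketch that combines data classes with one of the nanosecond time functions. The `SurveyResult` class and the `time_call` helper are hypothetical names made up for this illustration; the percentages are the JetBrains figures quoted earlier in the article.

```python
import time
from dataclasses import dataclass


@dataclass
class SurveyResult:
    """Hypothetical record type, used only for this illustration."""
    use_case: str
    percent: int


def time_call(func, *args, **kwargs):
    """Hypothetical helper: run func and return (result, elapsed nanoseconds).

    time.perf_counter_ns() is one of the nanosecond-resolution clock
    functions added in Python 3.7; it returns an int rather than a float.
    """
    start = time.perf_counter_ns()
    result = func(*args, **kwargs)
    return result, time.perf_counter_ns() - start


# The @dataclass decorator generates __init__, __repr__, and __eq__ for us.
results = [
    SurveyResult("data analytics", 49),
    SurveyResult("web development", 46),
    SurveyResult("machine learning", 42),
]

top, elapsed_ns = time_call(max, results, key=lambda r: r.percent)
print(top, f"(found in {elapsed_ns} ns)")

# Calling the built-in breakpoint() here would drop into the debugger,
# so it is left out of this sketch to keep the script non-interactive.
```

Without data classes, the same record type would need a hand-written `__init__`, `__repr__`, and `__eq__`, which is the kind of boilerplate the newer versions remove.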

Depending on what you want to get done, you may want to use Python 2. Check out this page for a comparison of 2.x and 3.x, which will perhaps simplify your decision-making process.
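For a rough sense of what separates the two lines, the sketch below shows three well-known Python 3 behaviors that differ from Python 2. It is not a complete comparison, and the variable names are illustrative only.

```python
import io

# Division: "/" always returns a float in Python 3; "//" is floor division.
# In Python 2, 7 / 2 between two ints evaluated to 3.
true_quotient = 7 / 2
floor_quotient = 7 // 2

# Text vs. bytes: str is Unicode text in Python 3, and bytes is a separate type.
# In Python 2, a plain string literal was a byte string.
text = "caf\u00e9"              # four characters, the last one accented
encoded = text.encode("utf-8")  # five bytes, since the accented letter takes two

# print is a function in Python 3 (it was a statement in Python 2),
# so it accepts keyword arguments such as sep and file.
buffer = io.StringIO()
print("a", "b", sep="-", file=buffer)

print(true_quotient, floor_quotient, len(text), len(encoded), repr(buffer.getvalue()))
```

Code that depends on the older behaviors, integer division in particular, is a common reason legacy projects need careful testing when they move to 3.x.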

3 Responses to “Python Could Rule the Machine Learning/A.I. World”

  1. Python is single threaded and will not process large amounts of data. Currently TensorFlow, Keras, and Spark are the industry standard, in my opinion. Transfer between Python and Spark is problematic because of serialization bottlenecks, so the jury is still out on an architecture that addresses Python’s throughput issues. Big data analytics is currently Spark and Spark ML, not Python. Data scientists may prefer Python, but if it can’t run at scale it’s going nowhere.

  2. While mostly true, I think your comment is a bit misleading. A lot of Python code actually calls C and Fortran code. And if you are using PySpark, it actually calls Scala and, furthermore, automatically runs this code in a distributed manner! Python is a great interface because it lets you prototype quickly while keeping speed, for the reasons I just mentioned.

  3. Don’t sell R short. Python is great for building libraries and frameworks, but R was designed for interactive data analysis, and python was not. Data analysis is fundamentally an interactive activity, so R continues to have the edge in my book. Of course, people should know both. The real difference between the languages is their libraries. If you want to do traditional statistics then R is superior. If you want to do machine learning then prefer python.