R vs Python: What do Data Scientists prefer?

R and Python are the most common programming languages in the data science world, but what exactly is the difference between the two?

This remains a commonly debated topic within the data science community. Both programming languages have their own strengths and limitations.

If you are a professional looking to start a career in this field, here are some key takeaways on both R and Python, along with trends that we are seeing in the Singapore and Hong Kong markets.

 

History of the two programming languages

R

R is a language and environment for statistical computing and graphics. According to the R Project, it is a GNU project similar to the S language, which was developed at Bell Laboratories. Like S, R provides a wide variety of statistical and graphical techniques.

Its functionalities include but are not limited to:

  • Linear and nonlinear modelling
  • Classical statistical tests
  • Time-series analysis
  • Classification and clustering

According to the R Project, R's strengths include:

  • Free software that runs on a wide variety of UNIX platforms and similar systems (including Linux), as well as Windows and macOS
  • The ease with which well-designed, publication-quality plots can be produced
  • Design choices in graphics over which the user retains full control
  • Integrated facilities for data manipulation, calculation and graphical display
  • Effective data handling and storage

Overall, R is a simple and effective programming language that gives data scientists conditionals, loops, user-defined recursive functions, and input and output facilities.

Python

Python is a widely used, general-purpose, high-level programming language. Created by Guido van Rossum and now stewarded by the Python Software Foundation, it was designed with an emphasis on code readability, allowing programmers to express concepts in fewer lines of code than in Java, C++ or C. The aim is to improve developer productivity.

Its functionalities include:

  • Developing and scripting code
  • Generation of code and software testing

Due to its elegance and simplicity, leading technology-driven organisations such as Dropbox, Google, Quora, Mozilla, Hewlett-Packard, Qualcomm, IBM and Cisco have adopted Python. Python has also inspired many other programming languages, such as Ruby, Cobra, Boo, CoffeeScript, ECMAScript, Groovy, Swift, Go, OCaml and Julia.

 

R vs Python: which is the preferred choice?

Dr Norm Matloff, Professor of Computer Science at the University of California, Davis, wrote a paper on the key differences between the two languages. He compared R and Python across the following domains to determine which programming language was the better choice:

Elegance

Winner: Python

While this is subjective, Python greatly reduces the use of parentheses and braces when coding, which makes it sleeker, according to Matloff.
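As a rough illustration of this point, the short sketch below shows how Python marks blocks with indentation instead of braces. The function name and sample data are made up for this example and do not come from Matloff's paper.

```python
# Hypothetical example: count the even numbers in a list.
# Blocks are delimited by indentation, and the `for` and `if`
# conditions need no surrounding parentheses.
def count_even(numbers):
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += 1
    return total


print(count_even([1, 2, 3, 4, 6]))  # prints 3
```

An equivalent loop in R, as in C or Java, would typically wrap each condition in parentheses and each multi-line body in braces.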

Machine Learning

Winner: Python (but not by much)

Python's massive growth in recent years is partially fuelled by the rise of machine learning and artificial intelligence (AI). Python offers a number of finely-tuned libraries for image recognition.

In Matloff’s words, the Python libraries' power comes from setting certain image-smoothing operations.

Learning curve

Winner: R

According to Matloff, data scientists working with Python must learn a lot of material to get started, including NumPy, Pandas and matplotlib. By contrast, matrix types and basic graphics are already built into base R. Novices can be doing simple data analyses within minutes, because this functionality is available without installing extra packages.
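To make the contrast concrete, here is a minimal sketch of matrix multiplication using only Python's built-in lists, with no third-party packages. The helper name `matmul` is an assumption made for illustration. In base R the same product is written `A %*% B`, while Python users would normally reach for NumPy rather than write this by hand.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows (illustrative helper)."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    # zip(*b) iterates over the columns of b
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))  # prints [[19, 22], [43, 50]]
```

Even this small task needs a hand-written helper in plain Python, which is one reason newcomers are pointed to NumPy early on.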

Statistical correctness

Winner: R (by far)

Matloff argues that advocates for Python, many of whom work in machine learning, sometimes seem to have a poor understanding of the statistical issues involved in their analyses. R, on the other hand, was written by statisticians, for statisticians. This suggests that subject matter experts in R will be able to ensure that the math behind analyses is as accurate as possible.

Parallel computation

Winner: It’s a draw

Matloff suggests that the base versions of R and Python do not have strong support for multicore computation. In his view, neither R’s parallel package nor Python's multiprocessing package is a good workaround for this limitation. Nevertheless, external libraries supporting cluster computation are good in both languages, while Python has better interfaces to GPUs.

Libraries

Winner: Python

Python’s machine learning library, Scikit-learn, is widely regarded as the gold standard. It provides a wide selection of supervised and unsupervised learning algorithms, and Towards Data Science describes it as “by far the easiest and cleanest ML library”. Scikit-learn was created with a software engineering mindset. Its core API design revolves around being easy to use yet powerful, while still maintaining flexibility for research endeavours. This robustness makes it well suited to any end-to-end ML project, from the research phase right down to production deployments.

 

What are the trends in Singapore and Hong Kong markets?

According to Donnie Maclary, Principal Consultant at Huxley Singapore, around 90% of the Data Science and Analytics roles he is filling call for candidates who are well versed in Python. This is because Python offers more flexibility than R.

If you are looking to grow your career in this field, it is therefore best to focus on becoming familiar with the full Python ecosystem. Other in-demand skills for data professionals include SQL, Spark, Hadoop, Java, Amazon Web Services (AWS), Scala and Kafka.

 

Huxley can help!

If you are a Data Science and Analytics professional who is looking to add top-tier talent to your team, please reach out to us via the contact form below. Do keep an eye out for more updates within this space on our LinkedIn page.

 


