R vs Python: What do Data Scientists prefer?

R and Python are the most common programming languages in the data science world, but what exactly is the difference between the two?

This remains a common topic of debate within the Data Science community. Both programming languages have their own strengths and limitations.

If you are a professional who is looking to start a career in this field, here are some key takeaways for both R and Python, along with trends that we are seeing in the Singapore and Hong Kong markets.

 

History of the two programming languages

R

R is a language and environment for statistical computing and graphics. According to the R Project, it is a GNU project similar to the S language and environment, which was developed at Bell Laboratories. Like S, R provides a wide variety of statistical and graphical techniques.

Its functionalities include but are not limited to:

  • Linear and nonlinear modelling
  • Classical statistical tests
  • Time-series analysis
  • Classification and clustering.

According to the R Project, R’s strengths include:

  • Free software that runs on a wide variety of UNIX platforms and similar systems (including Linux), as well as Windows and macOS
  • The ease with which well-designed, publication-quality plots can be produced
  • Design choices in graphics over which the user retains full control
  • Facilities for data manipulation, calculation and graphical display
  • Effective data handling and storage

Overall, it is a simple and effective programming language that includes conditionals, loops, user-defined recursive functions, and input and output facilities.

Python

Python is a widely used, general-purpose, high-level programming language. Created by Guido van Rossum and now managed by the Python Software Foundation, it was designed to emphasise code readability, allowing programmers to express concepts in fewer lines of code than in languages such as Java, C++ and C. The aim is readable code and higher developer productivity.

Its functionalities include:

  • Developing and scripting code
  • Generation of code and software testing

Due to its elegance and simplicity, leading technology-driven organisations such as Dropbox, Google, Quora, Mozilla, Hewlett-Packard, Qualcomm, IBM and Cisco have adopted Python. Python has also influenced the design of many other programming languages, such as Ruby, Cobra, Boo, CoffeeScript, ECMAScript, Groovy, Swift, Go, OCaml and Julia.

 

R vs Python: which is the preferred choice?

Dr Norm Matloff, Professor of Computer Science at the University of California, Davis, wrote a paper on the key differences between the two languages. He compared R and Python across several domains to determine which programming language was the better choice in each:

Elegance

Winner: Python

While this is subjective, Matloff notes that Python greatly reduces the use of parentheses and braces when coding, making it sleeker.
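As a small illustration of this point (the function name and data are made up for the example), Python marks blocks with indentation rather than braces and needs no parentheses around conditions:

```python
def count_above(values, threshold):
    """Count how many values exceed the threshold."""
    count = 0
    for value in values:        # no parentheses around the loop header
        if value > threshold:   # no braces: indentation delimits the block
            count += 1
    return count


print(count_above([3, 8, 1, 9, 4], 4))
```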

Machine Learning

Winner: Python (but not by much)

Python's massive growth in recent years is partially fuelled by the rise of machine learning and artificial intelligence (AI). Python offers a number of finely-tuned libraries for image recognition.

According to Matloff, the power of these Python libraries comes from setting certain image-smoothing operations.

Learning curve

Winner: R

According to Matloff, data scientists working with Python must learn a lot of material to get started, including NumPy, Pandas and matplotlib. By contrast, matrix types and basic graphics are already built into base R, so novices can be doing simple data analyses within minutes without loading any additional packages.
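To make the contrast concrete, core Python has no built-in matrix type, which is one reason NumPy tends to be among the first things a newcomer has to learn. The sketch below uses only the standard library, and the helper name `matmul` is a hypothetical one chosen for this example. It shows roughly what a matrix product looks like without NumPy:

```python
from statistics import mean


def matmul(a, b):
    """Multiply two matrices stored as plain nested lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    # zip(*b) iterates over the columns of b
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul(a, b))     # matrix product written by hand
print(mean([2, 4, 6]))  # basic statistics are in the standard library
```

In base R, matrices are a built-in type and `%*%` is the matrix product operator, so no helper of this kind is needed.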

Statistical correctness

Winner: R (by far)

Matloff argues that advocates for Python, many of whom work in machine learning, sometimes have a poor understanding of the statistical issues involved. R, on the other hand, was written by statisticians, for statisticians. This suggests that subject matter experts in R are well placed to ensure that the math behind their analyses is as accurate as possible.

Parallel computation

Winner: It’s a draw

Matloff suggests that the base versions of R and Python do not have strong support for multicore computation. In his view, neither R’s parallel package nor Python's multiprocessing package is a good workaround for this limitation. External libraries supporting cluster computation are good in both languages, while Python has better interfaces to GPUs.
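For readers unfamiliar with the package mentioned here, the following is a minimal sketch of multicore computation with Python's standard multiprocessing module. The function names, chunk size and worker count are illustrative assumptions and are not taken from Matloff's paper.

```python
from multiprocessing import Pool


def sum_of_squares(chunk):
    """Work done by one worker process: sum the squares in its chunk."""
    return sum(x * x for x in chunk)


def split(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


if __name__ == "__main__":
    numbers = list(range(1_000))
    chunks = split(numbers, 250)
    # Each chunk is pickled and sent to a separate process, and each
    # partial result is sent back; that copying is part of the overhead.
    with Pool(processes=4) as pool:
        partial_sums = pool.map(sum_of_squares, chunks)
    print(sum(partial_sums))
```

The work has to be split by hand, and data is copied between processes rather than shared, which gives a sense of why this approach can feel like a workaround rather than built-in multicore support.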

Libraries

Winner: Python

Python’s machine learning library, Scikit-learn, is widely regarded as the ‘gold standard’. It provides a wide selection of supervised and unsupervised learning algorithms. Towards Data Science describes it as “by far the easiest and cleanest ML library”. Scikit-learn was created with a software engineering mind-set. Its core API design revolves around being easy to use yet powerful, while still maintaining flexibility for research endeavours. This robustness makes it well suited to any end-to-end ML project, from the research phase right down to production deployments.

 

What are the trends in Singapore and Hong Kong markets?

According to Donnie Maclary, Principal Consultant at Huxley Singapore, around 90% of the Data Science and Analytics roles he is filling call for candidates who are well versed in Python. This is because Python offers a lot of flexibility compared to R.

If you are looking to grow your career in this field, it is therefore best to focus on becoming familiar with Python and its full suite of tools. Other in-demand skills for data professionals include SQL, Spark, Hadoop, Java, Amazon Web Services (AWS), Scala and Kafka.

 

Huxley can help!

If you are a Data Science and Analytics professional who is looking to add top-tier talent to your team, please reach out to us via the contact form below. Do keep your eyes peeled for more updates within this space on our LinkedIn page.

 


