Carry out an exploratory data analysis on a trees dataset in R
R is a statistical programming language created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, in 1993. Up until 2015, R was widely regarded as the premier programming language in the field of data science. Even now, Kaggle, a data science website, gives its users a choice of whether to program in R or Python.
With the development of libraries such as NumPy, pandas, scikit-learn, Matplotlib, and seaborn, data science work became much easier in Python, so the language began to overtake R in popularity. By 2020, about 60% of data scientists programmed in Python, while only 40% programmed in R.
Truth be told, however, data scientists are expected to know both Python and R, because each is better suited to different projects. For instance, R has visualisation functions built into the language, while Python has to rely on libraries. In addition, R ships with built-in datasets that people can practise on, while Python does not.
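As a minimal sketch of that last point, the snippet below loads the built-in `trees` dataset and plots it using only base R, with no extra packages. It assumes a stock R installation; the variable names `n_rows` and `col_names` are illustrative and not part of R itself.

```r
# Load the built-in `trees` dataset (part of the datasets package in base R)
data(trees)

# Record the number of rows and the column names
n_rows <- nrow(trees)
col_names <- names(trees)
print(n_rows)
print(col_names)

# Show the first few rows and a per-column numeric summary
print(head(trees))
print(summary(trees))

# Base-graphics scatterplot matrix of every pair of columns
plot(trees)
```

When run as a script rather than in an interactive session, R writes the plot to a file (by default `Rplots.pdf`) instead of opening a window.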
I don’t tend to use R a lot because of the difficulties I have had in finding a free environment to run the code I have written. I have a laptop that runs a Microsoft operating system but, sadly, R is not available in the Microsoft Store. I don’t want to…