In my last post I used the open source library spaCy to make predictions for a sentiment analysis task. In this post I intend to make predictions on the same dataset using sklearn’s natural language processing (NLP) facilities. I am using the same dataset because I would like to compare the two approaches and see which one works best. My most recent post concerning sentiment analysis, where I made predictions using spaCy, can be found at:- https://medium.com/mlearning-ai/conduct-a-twitter-sentiment-analysis-test-using-spacy-7715db4b2c97

Sklearn has its own tools that can be used for sentiment analysis, TfidfVectorizer in this case. TfidfVectorizer converts…
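As a rough illustration of what TfidfVectorizer does, here is a minimal sketch. The three example sentences are made up for this illustration and are not taken from the dataset used in the post; the vectoriser is left on its default settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical example texts, not the dataset from the post
tweets = [
    "I love this phone",
    "I hate this phone",
    "this film is great",
]

# Learn the vocabulary and turn each text into a row of TF-IDF weights
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(tweets)

# One row per text, one column per word in the learned vocabulary.
# With default settings, single-character tokens such as "I" are dropped.
print(X.shape)
print(sorted(vectorizer.vocabulary_))
```

The resulting sparse matrix can then be passed to any sklearn classifier in place of the raw text.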


I have spent the last couple of days working with spaCy, which is an open source library used to convert words to vectors in Natural Language Processing, or NLP.

Today I spent several hours tackling a complex problem. For some reason I could not get spaCy to install and import on Kaggle, and all of my searches for code that would fix this came to nothing.

I decided to try the code that I had been given in Google Colab, which is a free online Jupyter Notebook, and this code worked on…


Ever since I started working on Natural Language Processing (NLP) datasets, I have been using sklearn’s NLP functions to vectorise the data and make predictions on it. Because I wanted to improve my skill set, I began a Kaggle microcourse on NLP and was quite surprised that the course did not cover sklearn, but a new library called spaCy.

Because I wanted to know more about spaCy, I took a break from the microcourse and decided to undertake some research on this library. According to spaCy’s documentation, spaCy is a free open source library for advanced NLP in Python. If…


In my last post I discussed how to use sklearn’s dataset facility to create a bicluster, plot it on a graph, and then make predictions on the dataset. The link to that post can be found here:- https://tracyrenee61.medium.com/how-to-make-a-bicluster-plot-and-predict-on-it-using-sklearn-cce1ac718f94

In this post I make a similar, yet different, configuration: a checkerboard. Sklearn’s dataset facility can generate a checkerboard dataset, which is an array with a block checkerboard structure for biclustering.
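A minimal sketch of generating a checkerboard dataset with sklearn is shown here. The shape, number of clusters, noise level and random seed are illustrative assumptions, not the values used in the post.

```python
from sklearn.datasets import make_checkerboard

# Illustrative settings (assumptions): a 300 x 300 array split into
# 4 row clusters and 3 column clusters, with some Gaussian noise added
data, rows, columns = make_checkerboard(
    shape=(300, 300),
    n_clusters=(4, 3),
    noise=10,
    shuffle=False,
    random_state=0,
)

# data is the generated array; rows and columns are boolean indicator
# arrays with one row per bicluster (4 * 3 = 12 biclusters here)
print(data.shape)
print(rows.shape, columns.shape)
```

With shuffle=False the blocks stay in order, so the checkerboard pattern is visible if the array is plotted directly.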

I have written the script in Google Colab, which is a free online Jupyter Notebook. The great thing about Google Colab is the fact that…


One of the great things about sklearn’s dataset facility is the fact that users can create all manner of datasets and plot them in their experiments. It’s nice because sometimes a dataset that meets certain parameters might not be readily available, so the data scientist only needs to use sklearn to whip one up that meets the requirements. My most recent post on sklearn’s datasets can be found here:- https://medium.com/geekculture/how-to-create-and-plot-a-swiss-roll-using-sklearn-cbd2c81fddc6

In this post I intend to discuss sklearn’s bicluster dataset, which is an array with a constant block diagonal structure for biclustering.
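As a minimal sketch, a bicluster dataset can be generated as follows. The shape, number of clusters, noise level and random seed are illustrative assumptions rather than the values used in the post.

```python
from sklearn.datasets import make_biclusters

# Illustrative settings (assumptions): a 300 x 300 array containing
# 5 biclusters arranged along the diagonal
data, rows, columns = make_biclusters(
    shape=(300, 300),
    n_clusters=5,
    noise=5,
    shuffle=False,
    random_state=0,
)

# rows[i] and columns[i] are boolean masks marking which rows and
# columns of data belong to bicluster i
print(data.shape)
print(rows.shape, columns.shape)
```

Each row and each column of the array belongs to exactly one bicluster, which is what gives the block diagonal pattern.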

I created the…


In my last post I discussed how to create an S curve and plot it on a graph (as well as make predictions on it). The link to that post can be found here:- https://medium.com/geekculture/how-to-make-and-plot-a-s-curve-using-sklearn-17c98ddbeb4d

In this post I intend to discuss how to use Python’s machine learning library, sklearn, to create a swiss roll, plot it on a graph, and make predictions on it.
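A minimal sketch of creating a swiss roll is shown here. The number of samples, noise level and random seed are illustrative assumptions, and the plotting and prediction steps from the post are not reproduced.

```python
from sklearn.datasets import make_swiss_roll

# Illustrative settings (assumptions): 1000 points with a little noise
X, t = make_swiss_roll(n_samples=1000, noise=0.05, random_state=0)

# X holds the 3D coordinates of each point; t holds the position of
# each point along the main dimension of the roll
print(X.shape)
print(t.shape)
```

The t values are convenient for colouring the points in a 3D scatter plot, and they can also serve as a target if one wants to try a regression on the coordinates.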

Both the s curve and swiss roll are part of sklearn’s manifold datasets. Manifold learning is an approach to non-linear dimensionality reduction. …


Python’s machine learning library, sklearn, has facilities that allow users to make datasets. I have written a few posts showing how users can make datasets for their own data science projects, including blobs, circles, half moons, regressors and classifiers. Sklearn has other dataset generators that create shapes as well. The most recent post I have made concerning sklearn’s dataset method is here:- https://medium.com/geekculture/how-to-create-a-bell-curve-using-only-python-9fdca95e7967

The S curve is one of sklearn’s manifold datasets. Manifold learning is an approach to non-linear dimensionality reduction. …


Although I practise the various genres of data science on datasets I have acquired from sites on the internet, such as Kaggle and GitHub, in this post I intend to use a real world dataset that is based on current events. The dataset used in this post concerns the training requirements for an anonymous department in a non-profit organisation. I had initially created the dataset in 2020, but decided to update it to incorporate the values from 2021.

Because this time series dataset covers the period of the Coronavirus pandemic, it will be interesting to see how training requirements have been…


Time series analysis is a specific way of analysing a sequence of data points collected over an interval of time. The data points are recorded at regular intervals, such as every day, week, month or year.

There are many areas where time series analysis can come in handy, such as recording sunspot data, energy consumption, sales, or even the stock market.

In this post I have taken stock market data from Facebook to cover the period from the beginning of the pandemic, being the…


Photo by Johannes Plenio from Pexels

I was so looking forward to taking Coursera’s Machine Learning course tutored by Andrew Ng. When I signed up for the course, however, I discovered that it was not going to be taught in Python, my programming language of choice, but in Octave. Octave is a MATLAB-compatible scientific programming language built around numerical computing. Python, on the other hand, is a very high level programming language that is very close to English, and that is why it is easy to learn and many people, including children, use it. …

Tracyrenee

I have over 46 years’ experience in the world of work, including fast food, the military, business, non-profits, and the healthcare sector.
