Member-only story

Large language models don’t always get it right when writing machine learning scripts

3 min readJul 5, 2024

I have been working on a Kaggle competition problem that requires the data scientist to predict the probability of a certain event occurring. I had previously used sklearn’s logistic regression predict_proba function to predict the probabilities of this very large dataset, but having read on the stackoverflow website that the passive aggressive classifier is a good model to use on large datasets, I decided to give it a try.

Unfortunately sklearn’s passive aggressive classifier does not have a predict_function, so there is no way that I can easily predict on probabilities using this algorithm.

I don’t normally use large language models, but in this instance decided to give it a try, hoping to learn if there as an alternative to the predict_proba function, which is not part of the passive aggressive classifier model.

I asked the question of Microsoft Bing’s Copilot. My question and Copilot’s response can be seen in the screenshot below:-

Copilot was also kind enough to provide me with code in Python, which is the language that I happen to program in. Sadly…

Large language models don’t always get it right when writing machine learning scripts

Written by Crystal X

No responses yet