I am trying to find BigQuery datasets whose queries do not scan too much data, because Kaggle only allows me to query 5 terabytes of data a month. I therefore selected the US census dataset from the Kaggle selection, found here: https://www.kaggle.com/datasets/census/census-bureau-usa
I found this dataset more difficult to work with, because Kaggle returned an error message when I tried to load it into the program. To get the program to work, I had to import a library that I saw in a sample notebook. The libraries that I imported into the Jupyter Notebook I created for this purpose were:
- bigquery (from google.cloud), to make queries on BigQuery datasets, and
- bq_helper, which simplifies common read-only tasks in BigQuery.
I defined the variable client, which is needed to make queries in BigQuery.
I also defined the variable bq_assistant, which is used to call up the BigQuery dataset:
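A minimal sketch of what these imports and the two definitions might look like is given below. It is not the exact code from my notebook. The dataset name census_bureau_usa is an assumption taken from the Kaggle page address, and the make_clients function is a hypothetical wrapper added only so that the cell still runs when the libraries or credentials are missing, for example outside a Kaggle notebook.

```python
# Project that hosts BigQuery's public datasets.
PROJECT = "bigquery-public-data"
# Assumed dataset name, inferred from the Kaggle page slug; check it in your notebook.
DATASET = "census_bureau_usa"


def make_clients(project=PROJECT, dataset=DATASET):
    """Hypothetical helper: return (client, bq_assistant), or (None, None) on failure."""
    try:
        # Both libraries are preinstalled in Kaggle notebooks.
        from google.cloud import bigquery
        from bq_helper import BigQueryHelper

        # client is needed to make queries in BigQuery.
        client = bigquery.Client()
        # bq_assistant calls up one specific BigQuery dataset.
        bq_assistant = BigQueryHelper(project, dataset)
    except Exception as exc:
        # Missing library or missing credentials, e.g. when run outside Kaggle.
        print(f"BigQuery is not available here: {exc}")
        return None, None
    return client, bq_assistant


client, bq_assistant = make_clients()
```

Once bq_assistant exists, calling bq_assistant.list_tables() shows the tables in the dataset, and bq_assistant.estimate_query_size(query) reports roughly how many gigabytes a query would scan, which helps to stay within the monthly allowance.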