The AI I/O PhD Part II: The Data Creator

This article explores using OpenAI's new chatbot, ChatGPT, to generate datasets that let users train large language models on downstream tasks with a few-shot learning approach.

Category: data science, artificial intelligence

The AI I/O PhD Part I: The SME Item Writer

This article explores using OpenAI's ChatGPT to generate high-quality, specific item content from a simple natural language prompt.

Category: data science, artificial intelligence

Data Science on a Chromebook?

I've always argued that a small machine and the power of the cloud are all you need for data science. In this article I put that claim to the test on my new-to-me Google Pixelbook, walking through how to set up a data science environment on my $395 Chromebook.

Category: data science

Loading Multiple Files

In practice, you will often be handed multiple data files that need to be merged. Perhaps it's survey data collected monthly over a year, or performance data collected from a number of locations. Merging by hand is easy with 5-10 files, but what if you have 100? This article walks through the basics of the terminal and how to use Python's os module to load hundreds of files easily.
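As a rough illustration of the idea, here is a minimal sketch that uses the os module to find every CSV file in a folder and stack the rows into one list. The function name `load_csv_folder` and the added `source_file` column are assumptions made for this example, not code from the article, and the sketch uses the standard-library csv module where the article may well use something else.

```python
import csv
import os


def load_csv_folder(folder):
    """Read every .csv file in `folder` and return all rows as one list of dicts."""
    rows = []
    # sorted() keeps the load order stable across operating systems
    for name in sorted(os.listdir(folder)):
        if not name.lower().endswith(".csv"):
            continue  # skip anything that is not a CSV file
        path = os.path.join(folder, name)
        with open(path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                # hypothetical extra column: remember which file each row came from
                row["source_file"] = name
                rows.append(row)
    return rows
```

Calling `load_csv_folder("data")` on a folder of monthly survey exports would return one combined list, with each row tagged by its originating file so the month or location can be recovered later.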

Category: data science, programming

Using Dict Mapping to Predict Personality (SIOP ML Competition) Part II

This article explores a package that maps words to a pre-built dictionary. I use the resulting dictionary word counts as features to predict self-reported Big Five personality, then combine them with the sentiment features created in the previous article to produce a combined mean score that would have placed 18th on the public leaderboard for the 2019 SIOP Machine Learning Competition.
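To make the dictionary-mapping idea concrete, here is a minimal sketch of turning text into per-category word counts. The tiny `CATEGORY_WORDS` dictionary and the `dictionary_features` function are hypothetical stand-ins; the article relies on an existing package and a much larger pre-built dictionary, neither of which is reproduced here.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary mapping a category to its words.
# A real pre-built dictionary would contain many more categories and words.
CATEGORY_WORDS = {
    "social": {"friend", "party", "talk"},
    "anxiety": {"worry", "nervous", "afraid"},
}


def dictionary_features(text, category_words=CATEGORY_WORDS):
    """Count how many tokens in `text` fall into each dictionary category."""
    tokens = re.findall(r"[a-z']+", text.lower())  # simple lowercase word tokenizer
    counts = Counter()
    for token in tokens:
        for category, words in category_words.items():
            if token in words:
                counts[category] += 1
    # Return every category, including those with a zero count,
    # so each text yields a feature vector of the same shape.
    return {category: counts[category] for category in category_words}
```

Each response then becomes a fixed-length row of counts, which can be fed to a regression model alongside other features such as sentiment scores.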

Category: data science

© N. Koenig 2016
