This article uses a pre-built Python sentiment package to create features from open-ended text, which are then used to predict self-reported personality traits. The data come from the 2019 SIOP Machine Learning Competition, and this simple methodology produces a 20th-place score.
As data scientists, we are often looking for ways to speed up our data pipelines. In this article I discuss and briefly test a new dataframe library that lets you leverage multiple CPU cores. I also explore three different ways to process dataframe columns and compare their speeds across both the traditional dataframe (pandas) and the parallel dataframe (modin). This is meant to be a brief article highlighting some research I have been doing lately.
This is the third of three articles in my series on using unsupervised machine learning algorithms in Python to understand open-ended survey responses. I'll revisit the "Cons" responses once more, but this time I will use the K-means clustering algorithm. After the responses are clustered, I will examine the responses within each cluster to identify themes, then look at how those themes break down across companies.
This is the second of three articles in my series on using topic modeling in Python to understand open-ended survey responses. I'll revisit the "Cons" responses, but this time I will use Latent Dirichlet Allocation (LDA). I'll also walk through how to use an interactive visualization library to view the results of the LDA model.
Unsupervised learning is an important part of machine learning, and as I/Os we are often asked to make sense of data that has no target to optimize for. When it comes to NLP in surveys, employee feedback forms, and customer reviews, a common request is to break down all the responses into general categories. This is where a method like topic modeling may be useful. In this article we'll walk through how to leverage Singular Value Decomposition (SVD) to do topic modeling on company reviews.
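To give a flavor of the idea, here is a minimal sketch of SVD-based topic modeling using only NumPy. It is not the code from the article: the four toy reviews, the choice of `k = 2` topics, and the `top_words` helper are all assumptions made up for illustration, and a real pipeline would likely use TF-IDF weighting and stop-word removal first.

```python
import numpy as np

# Hypothetical toy reviews (illustrative only, not the article's data).
reviews = [
    "long hours and low pay",
    "low pay and poor management",
    "poor management and bad communication",
    "long hours and no work life balance",
]

# Build a vocabulary and a document-term count matrix (rows = reviews).
vocab = sorted({word for text in reviews for word in text.split()})
index = {word: i for i, word in enumerate(vocab)}
counts = np.zeros((len(reviews), len(vocab)))
for row, text in enumerate(reviews):
    for word in text.split():
        counts[row, index[word]] += 1

# Decompose the matrix; singular values come back in descending order.
U, S, Vt = np.linalg.svd(counts, full_matrices=False)

# Keep the top k components and treat each one as a "topic" (k is an assumption).
k = 2
topic_terms = Vt[:k]            # topic-by-term loadings
doc_topics = U[:, :k] * S[:k]   # review-by-topic scores


def top_words(topic_row, n=3):
    """Return the n terms with the largest absolute loading on a topic."""
    order = np.argsort(-np.abs(topic_row))[:n]
    return [vocab[i] for i in order]


topics = [top_words(row) for row in topic_terms]
for number, words in enumerate(topics):
    print(f"Topic {number}: {', '.join(words)}")
```

Each row of `doc_topics` shows how strongly a review loads on each topic, which is what makes it possible to group responses into general categories.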
© N. Koenig 2016