Nick Penney

Newsmilk

RISD Class Project

March 2013

Newsmilk is a news aggregation site.

Newsmilk was a semester project I developed at RISD. Each morning it used a machine learning algorithm, latent Dirichlet allocation, to analyze the full text of 5,000 to 10,000 news articles from global sources and compile a list of the words or phrases with the most coverage.

Three to five topics were curated from the list and posted to the site, with at least three reliable citations for each story and a link to the original source, if available.

I used a machine learning library for Python to analyze the article contents, and the site was built in PHP.
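
For readers unfamiliar with latent Dirichlet allocation, here is a minimal sketch of the idea. It is not the project's code: the library used is not named above, so this version relies only on the Python standard library and fits the model with a small collapsed Gibbs sampler, one common approach. The function name, the hyperparameters (alpha, beta, iterations), and the sample headlines are all made up for illustration.

```python
import random


def lda_topics(docs, num_topics, iterations=200, alpha=0.1, beta=0.01,
               top_n=3, seed=0):
    """Fit LDA with collapsed Gibbs sampling; return top words per topic.

    docs is a list of tokenized documents (lists of words). All names and
    default values here are illustrative assumptions, not the project's.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]       # topic counts per doc
    topic_word = [[0] * V for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                     # tokens per topic

    # Start by assigning every token to a random topic.
    assignments = []
    for d, doc in enumerate(docs):
        z = []
        for w in doc:
            k = rng.randrange(num_topics)
            z.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample a topic in proportion to how much this document
                # uses the topic times how much the topic uses this word.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta) / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Report the most frequent words assigned to each topic.
    return [
        [vocab[v] for v in sorted(range(V), key=lambda v: -topic_word[k][v])[:top_n]]
        for k in range(num_topics)
    ]


# Hypothetical headlines standing in for full article text.
headlines = [
    "central bank raises interest rates again",
    "bank rates rise as inflation grows",
    "inflation pushes central bank to act",
    "team wins championship final match",
    "star striker scores in final match",
    "championship team celebrates final win",
]
docs = [h.split() for h in headlines]
topics = lda_topics(docs, num_topics=2)
for n, words in enumerate(topics):
    print(n, words)
```

On a real corpus, the words would typically be lowercased and stripped of stop words first, and the highest-ranked words per topic would then be the candidates for manual curation.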