Introduction to Network Visualization: Part 2 (Cytoscape)

This post is the second half of a two-part beginner’s introduction to network visualization. The first post outlined preparing a dataset for upload into Gephi and covered how to get started with the styling options and layouts available in Gephi. In this half of the tutorial, we’ll do the same for Cytoscape.
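Both halves of the tutorial start from the same kind of input: a simple edge list. As a taste of the data preparation involved, a minimal CSV of the sort that Gephi's spreadsheet importer (and Cytoscape's table import) will accept looks like this — the names and weights are invented for illustration:

```
Source,Target,Weight
Alice,Bob,3
Bob,Carol,1
Alice,Carol,2
```

Gephi's import wizard treats the Source and Target columns as the two endpoints of each edge; Cytoscape lets you designate the source and target columns during table import.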

Continue reading “Introduction to Network Visualization: Part 2 (Cytoscape)”

Introduction to Network Visualization: Part 1 (Gephi)

This introductory tutorial to network visualization is the first of a two-part series. This first post will provide an introduction to generating network visualizations with Gephi. The second post will be an introduction to Cytoscape. Along the way, we will contrast the interfaces and the layouts available for each platform.

Continue reading “Introduction to Network Visualization: Part 1 (Gephi)”

Stanford’s Natural Language Processing Software: Text Tagging and Finding Named Entities


Stanford NLP Logo

The Stanford Natural Language Processing (NLP) Group maintains an open suite of language analysis tools that are freely available to the public. Most of the tools are available only in English, but some have been translated into Chinese, Spanish, German, and Arabic. This tutorial will focus on the English tool sets, specifically the Named Entity Recognizer and the Parts of Speech Tagger. These are helpful for pinpointing and extracting specific locations or organizations from a text, for examining the complexity of sentence structure, or even for finding where English-as-a-second-language learners hesitate longest in transcripts. This technology has a range of applications in research and learning.
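The tools are distributed as Java programs. As a rough sketch of what running them looks like — the folder name below is hypothetical and depends on which CoreNLP release you download — the pipeline can be invoked from the command line with the annotators you want:

```
cd stanford-corenlp-full-2014-01-04   # hypothetical unzipped distribution folder
java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP \
  -annotators tokenize,ssplit,pos,ner -file input.txt
```

This writes an annotated output file alongside `input.txt`, with each token labeled with its part of speech and, where applicable, a named-entity tag such as PERSON, LOCATION, or ORGANIZATION.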

Continue reading “Stanford’s Natural Language Processing Software: Text Tagging and Finding Named Entities”

Visualizing Twitter Status Data with Wordle

Wordle Visual

Visualizing Twitter status data can reveal connections to your brand, or to any topic, that you might not have known existed. For example, when I was going through the Pure Michigan data to make the visual, I had to choose what to include and what to exclude. I kept coming across the word ‘xe2’ over and over in the data. After some digging, I discovered that several photographers were using the new Fujifilm XE2 camera to photograph Michigan’s natural beauty and tweeting at Pure Michigan in the process.
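Wordle sizes words by how often they appear, so the include/exclude decisions described above amount to cleaning the word counts before building the visual. The full tutorial works from real tweet data; here is a minimal Python sketch of that cleaning step, with made-up tweet texts (note how the camera name survives while raw byte escapes like `\xe2` are stripped):

```python
import re
from collections import Counter

# Invented sample tweets standing in for scraped status texts.
tweets = [
    "Shot with my new XE2 at Sleeping Bear Dunes #PureMichigan",
    "Fall colors up north #PureMichigan",
    "XE2 photo walk along Lake Michigan #PureMichigan",
]

def word_counts(texts):
    """Lowercase each text, drop raw byte escapes like '\\xe2', and count words."""
    counts = Counter()
    for text in texts:
        text = re.sub(r"\\x[0-9a-f]{2}", " ", text)  # strip leftover byte escapes
        for word in re.findall(r"[a-z0-9#@']+", text.lower()):
            counts[word] += 1
    return counts

counts = word_counts(tweets)
```

The resulting counts can be pasted into Wordle (each word repeated by its count, or via its weighted-word input) to produce the cloud.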

Continue reading “Visualizing Twitter Status Data with Wordle”

RMarkdown Tutorial

R Markdown is an authoring format that enables easy creation of dynamic documents, presentations, and reports that use R plots and data analysis. It combines the core syntax of Markdown (an easy-to-write plain-text format) with embedded R code chunks that are run so their output can be included in the final document. R Markdown documents are fully reproducible: they can be automatically regenerated whenever the underlying R code or data changes.

Continue reading “RMarkdown Tutorial”
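A minimal R Markdown file shows the idea — prose written in Markdown, with a fenced R chunk whose output is knitted into the final document (`precip` here is one of R's built-in example datasets):

````
---
title: "Precipitation Summary"
output: html_document
---

Average annual precipitation across US cities:

```{r}
summary(precip)
```
````

Knitting this file runs the chunk and inserts the summary table into the rendered HTML, so the report always reflects the current data.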

Getting Started: Scraping Twitter Data


The aim of this blog post is for a beginner-level user to be able to scrape data from Twitter. In this example, I’ll scrape the 20 most recent statuses from @PureMichigan’s Twitter feed. My end goal in scraping these posts is to find out quickly who has been talking about @PureMichigan on Twitter most recently and what they are saying. You can also use the count feature to pull up to 200 statuses at a time and analyze the content.
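The full tutorial does the scraping in R, but the "who is talking" analysis step maps to any language. Here is a hedged Python sketch of that step, assuming the statuses have already been fetched into a list of dictionaries — the user names and texts below are invented:

```python
from collections import Counter

# Stand-ins for fetched statuses; in the R tutorial these would come from
# a call like twitteR's userTimeline("PureMichigan", n = 20).
statuses = [
    {"user": "lakefan", "text": "Loving the dunes this weekend @PureMichigan"},
    {"user": "upnorth", "text": "Fall drive on M-22 @PureMichigan"},
    {"user": "lakefan", "text": "Sunset at Empire Bluff @PureMichigan"},
]

def who_is_talking(statuses):
    """Count how many of the recent statuses each user posted."""
    return Counter(s["user"] for s in statuses)

talkers = who_is_talking(statuses)
```

Sorting the counter with `talkers.most_common()` gives a quick ranking of who has been mentioning the account most often.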
Continue reading “Getting Started: Scraping Twitter Data”

Working with Large Data Sets

This tutorial provides a walk-through of managing a large data set in R. The sample data set, on precipitation in the Great Lakes region, was retrieved from NOAA’s Great Lakes Environmental Research Laboratory (GLERL). It is a multi-tab Excel file that needs to be cleaned up in R before it can be used efficiently. General methods of dealing with large data sets, and the problems one can run into, are included so that the information in this tutorial can be applied to various types of data.
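The tutorial itself works in R, but the shape of the problem — many spreadsheet tabs that must become one tidy table — is the same in any language. A Python/pandas sketch with made-up data (in practice the dictionary of frames would come from `pd.read_excel("file.xlsx", sheet_name=None)`, which reads every tab at once):

```python
import pandas as pd

# Stand-ins for two Excel tabs; real data would come from
# pd.read_excel("precipitation.xlsx", sheet_name=None).
sheets = {
    "Superior": pd.DataFrame({"year": [2012, 2013], "precip_mm": [780, 812]}),
    "Michigan": pd.DataFrame({"year": [2012, 2013], "precip_mm": [820, 790]}),
}

# Stack the tabs into one tidy table, keeping each sheet name as a column.
tidy = (
    pd.concat(sheets, names=["lake", "row"])
      .reset_index(level="lake")
      .reset_index(drop=True)
)
```

Keeping the sheet name as an ordinary column is the key move: once the tabs are stacked, standard grouping and filtering tools apply to the whole data set at once.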

Continue reading “Working with Large Data Sets”