This post is the second half of a two-part beginner’s introduction to network visualization. The first post outlined preparing a dataset for upload into Gephi and covered how to get started with the styling options and layouts available in Gephi. In this half of the tutorial, we’ll do the same for Cytoscape.
This introductory tutorial on network visualization is the first of a two-part series. This first post introduces generating network visualizations with Gephi. The second post will introduce Cytoscape. Along the way, we will contrast the interfaces and the layouts available on each platform.
The Stanford Natural Language Processing (NLP) Group at Stanford University offers an open suite of language analysis tools that are available for the public to use. Most of the tools are only available in English, but some have been translated into Chinese, Spanish, German, and Arabic. This tutorial will focus on the English tool sets, specifically the Named Entity Recognizer and the Part-of-Speech Tagger. These tools are helpful for pinpointing and extracting specific locations or organizations from a text, for examining the complexity of sentence structure, or even for finding hesitations in transcripts of English-as-a-second-language learners and where they pause the longest. This technology has various applications in research and learning.
This is a step-by-step tutorial on using GDAL2Tiles to transform a georeferenced map into a TMS map that can be added as a layer to a Leaflet map.
This post will explore how to analyze multiple texts using Voyant. In the post below I’ll look specifically at word use through the tools that Voyant offers and explain how to use them.
Visualizing Twitter status data can be informative, revealing connections to your brand or topic that you might not have known existed. For example, when I was going through the Pure Michigan data to make the visual, I had to choose what to include and what to exclude. I kept coming across the word ‘xe2’ over and over in the data. After some digging, I discovered that several photographers were using the new Fujifilm XE2 camera to photograph Michigan’s natural beauty and tweeting at Pure Michigan in the process.
This is a companion glossary for a previous post on working with large data sets. Its purpose is to highlight the relevant arguments for reading and working with large data sets.
R Markdown is an authoring format that enables easy creation of dynamic documents, presentations, and reports that use R plots and data analysis. It combines the core syntax of Markdown (an easy-to-write plain text format) with embedded R code chunks that are run so their output can be included in the final document. R Markdown documents are fully reproducible: they can be automatically regenerated whenever the underlying R code or data changes.
The aim of this blog post is to enable a beginner-level user to scrape data from Twitter. In this example, I’ll scrape the 20 most recent statuses from @PureMichigan’s Twitter feed. My end goal in scraping these posts is to find out quickly who has been talking about @PureMichigan on Twitter most recently and what they are saying. You can also use the count feature to pull up to 200 statuses at a time and analyze the content.
This tutorial provides a walk-through of managing a large data set in R. The sample data set covers precipitation in the Great Lakes region and was retrieved from GLERL. It is a multi-tab Excel file that needs to be cleaned up in R before it can be used efficiently. General methods of dealing with large data sets, and the problems one can run into, are included so that the information in this tutorial can be applied to various types of data.