[R] graphically representing frequency of words in a speech?
    Mike Lawrence 
    Mike.Lawrence at dal.ca
       
    Mon Jun  8 02:00:16 CEST 2009
    
    
  
Below are various attempts using ggplot2
(http://had.co.nz/ggplot2/). First I try random positioning, then
random positioning with alpha, then a quasi-random positioning scheme
in polar coordinates:
#this demo has random number generation
# so best to set a seed to make it
# reproducible.
set.seed(1)
#generate some fake data
a = data.frame(
	word = month.name
	, freq = sample(1:10,12,replace=TRUE)
)
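#(a side sketch, not needed for the plots below:
# if you have the actual speech text, base R can
# tally the words into the same word/freq layout.
# `speech`, `words` and `real_counts` are made-up
# names and the sentence is only a placeholder;
# tolower(), strsplit(), table() and
# as.data.frame() are all base R)
speech = "We choose to go to the moon. We choose to go, and we will go."
#lowercase, then split on anything that isn't
# a letter or an apostrophe
words = strsplit(tolower(speech), "[^a-z']+")[[1]]
#drop any empty strings left by leading punctuation
words = words[words != ""]
#tabulate into a data frame with columns word & freq
real_counts = as.data.frame(
	table(word = words)
	, responseName = "freq"
	, stringsAsFactors = FALSE
)
#(real_counts could then stand in for `a`, once
# x & y positions are added as is done next)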
#add arbitrary location information
a$x = sample(1:12,12)
a$y = sample(1:12,12)
#load ggplot2
library(ggplot2)
#initialize a ggplot object
my_plot = ggplot()
#create an object for the text layer
my_text = geom_text(
	data = a
	, aes(
		x = x
		, y = y
		, label = word
		, size = freq
	)
)
#create an object for the text size limits
my_size_scale = scale_size(
	to = c(3,20)
)
#create an object to expand the x-axis limits
# (ensures that text isn't cropped)
my_x_scale = scale_x_continuous(
	expand = c(.5, 0)
)
#ditto for the y axis
my_y_scale = scale_y_continuous(
	expand = c(.5, 0)
)
#create an opts object that removes
# plot elements unnecessary in a tag cloud
my_opts = opts(
	legend.position = 'none'
	, panel.grid.minor = theme_blank()
	, panel.grid.major = theme_blank()
	, panel.background = theme_blank()
	, axis.line = theme_blank()
	, axis.text.x = theme_blank()
	, axis.text.y = theme_blank()
	, axis.ticks = theme_blank()
	, axis.title.x = theme_blank()
	, axis.title.y = theme_blank()
)
#show the plot
print(
	my_plot+
	my_text+
	my_size_scale+
	my_x_scale+
	my_y_scale+
	my_opts
)
#to aid readability amidst overlap, set alpha in
# the call to geom_text
my_text_with_alpha = geom_text(
	data = a
	, aes(
		x = x
		, y = y
		, label = word
		, size = freq
	)
	, alpha = .5
)
#show the version with alpha
print(
	my_plot+
	my_text_with_alpha+
	my_size_scale+
	my_x_scale+
	my_y_scale+
	my_opts
)
#alternatively, in polar coordinates,
# which maps x to angle and y to radius,
# making a nice circle
print(
	my_plot+
	my_text_with_alpha+
	my_size_scale+
	my_opts+
	coord_polar()
)
#(note omission of my_y_scale &
# my_x_scale, which seem to be ignored
# when coord_polar() is called. I'll
# report this possible bug to the ggplot2
# maintainer)
#a possible way to avoid overlap is to
# map radius (y) to frequency so that
# larger text is in the periphery
# where there is more room. This
# necessitates adding some random
# noise to the frequency so that
# the low frequency words don't
# jumble in the center too badly
a$freq2 = a$freq+rnorm(12)
#now map radius (y) to freq2
my_text_with_alpha_and_freq2 = geom_text(
	data = a
	, aes(
		x = x
		, y = freq2
		, label = word
		, size = freq
	)
	, alpha = .5
)
#show the version with alpha & radius mapped to freq2
print(
	my_plot+
	my_text_with_alpha_and_freq2+
	my_size_scale+
	my_opts+
	coord_polar()
)
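#(a hedged sketch for anyone on a later ggplot2
# release than the one used above. It assumes that
# scale_size() takes `range` where the code above
# passes `to`, and that theme_void() plus theme()
# stand in for opts() & theme_blank(). `b` and
# `my_plot_new` are made-up names. The random draws
# are not the same as above, so the layout will
# differ from the earlier plots)
library(ggplot2)
set.seed(1)
#fake data again, with an arbitrary angle (x)
b = data.frame(
	word = month.name
	, freq = sample(1:10,12,replace=TRUE)
	, x = sample(1:12,12)
)
#radius is frequency plus some noise, as before
b$freq2 = b$freq+rnorm(12)
#build the polar version in one go
my_plot_new = ggplot(
	data = b
	, aes(
		x = x
		, y = freq2
		, label = word
		, size = freq
	)
)+
	geom_text(alpha = .5)+
	scale_size(range = c(3,20))+
	coord_polar()+
	theme_void()+
	theme(legend.position = 'none')
#show the plot
print(my_plot_new)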
-- 
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University
Looking to arrange a meeting? Check my public calendar:
http://tr.im/mikes_public_calendar
~ Certainty is folly... I think. ~
    
    