A quick-and-dirty Python word generator for Scrabble

I haven’t had a chance to do much coding this semester, and I’ve been itching to get back into it.  My problem with coding has always been finding interesting projects that aren’t too difficult – everything I think of is either boring or beyond my capabilities.

While I was home on Thanksgiving break, during a family Scrabble game, it occurred to me that there were probably anagram generators available online specifically for playing Scrabble, and after looking at a few I decided I wanted to try to make my own.  I figured it would be a pretty easy project, but would allow me to refresh my rusty Python.

I began by looking for a dictionary to use for checking whether I’ve generated valid words.  I thought I would use the built-in Linux word list in /usr/share/dict, but reading around I learned about Python’s Enchant library.  This library allows a user to define dictionary objects using a language tag (such as “en_US” for American English), and to check strings against them.  The library also contains more advanced spell-checking methods such as “suggest”, which generates a list of suggestions for a misspelled word.  I only touched the surface in my little script, but for more information a complete tutorial can be found here.
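
To make that concrete, here is a minimal sketch of the calls described above (it assumes the pyenchant package and the “en_US” dictionary are installed):

import enchant

d = enchant.Dict("en_US")    # dictionary object for American English
print d.check("hello")       # True  -- a valid word
print d.check("helo")        # False -- not in the dictionary
print d.suggest("helo")      # e.g. ['hello', 'help', ...] -- spelling suggestions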

After setting up an Enchant dictionary, I started writing some loops to go through a set of letters and generate anagrams.  I quickly got bogged down, and during a Google search I stumbled upon the wonderful itertools module.  It’s a standard-library module for working with iterators, and it happens to contain a function called permutations() that does exactly what it sounds like: it generates permutations of an iterable object.  Using this function I was able to accomplish most of the “complicated” logic of my program in one line.
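
For example, a quick sketch of what permutations() gives back:

import itertools

# all two-letter orderings of the letters in "cat"
for p in itertools.permutations("cat", 2):
    print ''.join(p)
# prints: ca ct ac at tc ta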

Having these two libraries in my toolkit, it remained only for me to put together the main logic.  Two for loops and an enchant dict.check() later I had a working program.  In the original version, the user was prompted to enter their set of letters while the program was running.  I later changed this to make the input a command line argument.  This allowed for combination with other command line tools (specifically grep).

The resulting script was only a few lines long.  I left the commented-out prompt in place because I wasn’t sure I wanted to keep the command-line-style input.

import enchant
import itertools
import sys

d = enchant.Dict("en_US")    # American English dictionary object

#combo = raw_input("Enter your letters: ")   # old interactive-prompt input

found = []                   # words already printed, so duplicates are skipped
combo = sys.argv[1]          # the set of letters, passed as a command line argument

# longest candidates first, stopping at three-letter words
for num in range(len(combo), 2, -1):
    for x in itertools.permutations(combo, num):
        test = ''.join(x)
        if d.check(test) and test not in found:
            found.append(test)
            print test

Because the output of this program was intended for a human user, it seemed ok to just print out the resulting word list, instead of storing it somewhere.  Originally, the shortest words were printed first, but I reversed the order because the most important words to a Scrabble player are obviously the longer ones.

Running the program.

The last thing I did with this program was to pipe the output to grep, as I mentioned above, and search it with regular expressions.  I figured this would help a Scrabble player find only words that would fit in a certain location on the board, say a place with an O and a T separated by one blank space (the command would then be "$ python scrabble.py 'asdfghj' | grep o.t").

As soon as I got a working prototype going I put this script to use in my current Scrabble game, but it was too late to save me.  It turned out my gameplay had suffered while I was distracted by coding…  Despite this, it was a really fun, albeit very short, project, and it introduced me to some new Python tools that are definitely worth knowing about.  This was probably my best ‘one-evening project’ yet, and I certainly hope not my last.



Filed under Python

USGS DEM Vertical Scales

The other day I downloaded a few USGS 7.5-minute DEMs to use with GRASS GIS.  I haven’t done any real projects with GRASS, but I’ve been having fun messing around with it.  I loaded the DEMs and displayed them together, but they didn’t match.  At least, not all of them.  It was like they were using different scales or something.  I thought this was strange because they were all standard USGS maps and were downloaded from the same source at the same time.

The four DEMs together. The ones on the left match, and the ones on the right match, but they don’t match each other.

I looked at the histogram for this raster and found that the spike for the right side was about three times further along the axis than the spike for the left side.  This meant the values for the right were, on average, about three times the values for the left.

The histogram for the DEMs, showing the two sides of the image clearly.  Out of curiosity, I looked into the vertical spike at about 800 (in the purple), and found that it corresponded to the surface of a large lake in one of the quadrangles.

This led me to wonder whether the discrepancy could be caused by the DEMs being measured in different vertical units.  I figured the factor of about three could be explained by the difference between feet and meters, or feet and yards.  I then went to the USGS website and found a PDF called the “DEM Data User Guide”.  This document revealed that “Elevations for the continental U.S. are either meters or feet…”.  Well that’s annoying!  Further reading states that “DEM’s of low-relief terrain or generated from contour maps with intervals of 10 ft (3 m) or less are generally recorded in feet. DEM’s of moderate to high-relief terrain or generated from maps with terrain contour intervals greater than 10 ft are generally recorded in meters.”  So that explains the difference, but I still don’t know how to get GRASS to display them using the correct scaling.  Next project: find out how to change the vertical scale of a DEM in GRASS.
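
If it does come down to a simple unit conversion, my first guess would be something like the following, using the GRASS Python scripting interface (untested; dem_ft and dem_m are placeholder map names, and 0.3048 is just the feet-to-meters factor):

import grass.script as grass

# rescale a DEM recorded in feet so its values are in meters (1 ft = 0.3048 m)
grass.mapcalc("dem_m = dem_ft * 0.3048")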


Filed under GRASS GIS

Scientific Python Project

I’m a big fan of Python, and I’ve been having a lot of fun with scientific Python ever since I discovered the IPython shell and the Matplotlib, NumPy, and SciPy libraries.  For anyone who doesn’t know, these libraries can be used to turn Python into a MATLAB-like scientific graphing and analysis platform.  Now that I’ve got the libraries installed and the environment set up, I’ve been exploring the commands and getting a feel for it.  The only problem is that I don’t have any data to work with.  I’ve been using randomly generated normal distributions and other sample data to practice, but it’s a lot more fun to chart meaningful data.

I’ve been trying to think of a project that I could do to plot some real data, and I finally came up with something: internet activity in my dorm.  I live in a dorm at school and access the internet through an ethernet connection in my building.  I’d been noticing differences in my download speeds depending on the time of day (which is no surprise), and I thought I’d try to gather some numbers.  I also decided to use the Windows net view command to get the number of users logged into the network.  (Unfortunately I do use Windows on a regular basis.  I would love to finally get free of it, but that will take some time.  For now I just spend as much time as possible working in my Ubuntu partition or my Debian VM.)

I figured there should be a correlation between the two datasets (download speed and active users).  While this isn’t exactly groundbreaking research, it’s a lot more interesting than someone else’s sample data, so I gave it a shot.

The data collection for this project was a pain.  I didn’t even know where to begin to automate the process of logging the download speed, so I figured I’d do it manually.  Basically, I put a large (~3 GB) file in the Firefox download queue and began the download.  I looked at the speed listed and recorded it in a text file, along with the time and date.  I then canceled the download, and for each subsequent record I hit ‘restart download’ and did the same thing.  The only problem is that in order to record data points I had to be at my computer and logged into the network.  This isn’t ideal, but it worked.

Then I thought about how I could get the user data from net view into a file.  I ended up writing a Windows batch file to take the time, date, and number of users.  Then, due either to my lack of knowledge or Windows’ lack of support for string functions, I wrote a Python script to concatenate the date, time, and users and return the result to the command line, allowing the batch script to write it to a text file.  This already sounds like a mess, but it gets worse.  I wasn’t able to find a way to pass multiple values to my Python script from the batch file, so I had to write everything to a temporary file, the contents of which were read by the concatenation script.  This means that there were three different files strung together just to get some command line output and write it to a file formatted how I wanted.  There has got to be a better way to do this (I take a rough stab at one after the scripts below).  In case you’re interested, here are the scripts I wrote:

The Windows batch:

@echo off
rem grab the current date and time from the shell
for /f "delims=" %%a in ('date /t') do @set vardate=%%a
for /f "delims=" %%a in ('time /t') do @set vartime=%%a
rem count the \\HOSTNAME lines from net view and stash the number in a temp file
net view | find /c "\\" > tempfile.txt
rem let the Python script glue everything together and append it to the log
python conc.py %vardate% %vartime% >> C:\Users\Nat\Documents\Python\netusers.txt

 

And the Python script:

 

import sys

def main(argv):
    # argv holds the date and time strings passed in from the batch file
    output = str(argv[0]) + ' ' + str(argv[1]) + ' ' + str(argv[2])
    # the batch file wrote the "net view" user count to tempfile.txt
    f = open('tempfile.txt', 'r')
    print output + ' ' + f.read()[:-1]   # [:-1] strips the trailing newline from the count

if __name__ == "__main__":
    main(sys.argv[1:])
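
Writing this up, I suspect the whole chain could collapse into a single Python script that calls net view itself.  A rough sketch, assuming Python 2.7 on Windows and that net view prints one \\HOSTNAME line per visible machine:

import subprocess
from datetime import datetime

def count_users():
    # run "net view" and count the lines that start with a UNC machine name
    out = subprocess.check_output("net view", shell=True)
    return sum(1 for line in out.splitlines() if line.startswith("\\\\"))

stamp = datetime.now().strftime("%Y-%m-%d %H:%M")    # 24-hour timestamp
with open("netusers.txt", "a") as f:
    f.write("%s %d\n" % (stamp, count_users()))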

 

The result of running the batch file was satisfactory, but the system had a bad flaw (besides being cumbersome): the Windows time is given in 12-hour cycles, and my plotting scripts can only take 24-hour times.  (Because I don’t know how to use any other format.  There’s probably a way, but working with times in Python is complicated.)  This meant that every time I took a data point after 12 noon I had to go into the text file and manually add 12 hours to the time.  Basically, the data collection was a pain.  I really need to work out a more automated method for doing this if I’m going to keep gathering data.
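
For the record, the datetime module can parse 12-hour times directly, which would have saved the hand editing; a sketch, assuming time /t produces strings like “03:45 PM”:

from datetime import datetime

t = datetime.strptime("03:45 PM", "%I:%M %p")    # parse a 12-hour time
print t.strftime("%H:%M")                        # prints 15:45
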
Once I had some data, I started playing with ways to display and analyze it.  My first graph is just a plot of the download speeds and number of users throughout the day.

The data. It seems like college students tend to stay in bed later on weekends. No surprises here!

My biggest problem here was getting the times into a usable format.  This was done with the Python datetime module and matplotlib.dates.  The code got messy, but it worked.
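
Stripped way down, the date handling and plotting look roughly like this (the times and speeds here are made-up placeholders, not my measurements):

import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from datetime import datetime

times  = [datetime(2012, 12, 1, h) for h in (8, 12, 16, 20)]    # placeholder times
speeds = [950, 400, 350, 600]                                   # placeholder speeds

fig, ax = plt.subplots()
ax.plot_date(times, speeds, '-o')                               # plot values against datetimes
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))     # 24-hour tick labels
plt.show()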

My next step was to plot the two datasets against each other.  I figured there might be a linear relationship going on, so I did a linear regression with the scipy.stats.linregress function.  I got a coefficient of determination of about 0.6228, which isn’t great, but maybe it’ll settle down if I get more data.
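
The call itself is a one-liner; squaring the returned r gives the coefficient of determination (the arrays below are placeholders, not the real measurements):

from scipy import stats

users  = [10, 25, 40, 60, 80]         # placeholder user counts
speeds = [900, 700, 500, 350, 200]    # placeholder download speeds

slope, intercept, r, p, stderr = stats.linregress(users, speeds)
print "r^2 =", r ** 2                 # coefficient of determination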


The download speed plotted against the number of users on the network.

This was an interesting exercise, and it yielded the unprecedented conclusions that college students like to sleep in on weekends, and that when more users log into a network, the download speed drops.  As I said, not really groundbreaking research – just an interesting way to explore scientific Python utilities with real-world data.  Hopefully I’ll be able to get some more points to flesh out these graphs, but maybe it’s time to move on to a new project.  (Hopefully one that doesn’t involve date formatting!)

The strange looks I get when people ask me what I do for fun are priceless.


Filed under Scientific Python