Topic: programming help: randomizing a subset of data
xnera
Member
Member # 187

I am creating a !quote command for my AIM-bot that will return a random quote from a list of quotes said by various people.

That part's easy. The problem is that I want to extend the command so that if you do !quote [person], it returns a random quote by that person, rather than the entire list of quotes.

I'm not sure how to store my data and how to efficiently return the random quote on a subset of the entire list. Right now it's stored as an array in the form of "character: funny quote", and I use perl's split command to divide the data and check to see if it matches the character I am looking for, and if not, run again. Lather, rinse, repeat an unbounded loop. Not good.

There's got to be a better way. It would be much better to first retrieve the list of quotes for that character only, and then find a random quote on the list.

But I don't know how to do this. I have nearly a hundred people, and each has the potential to have many, many quotes associated with them. And the people are also split into two different groups, which we will need to distinguish. I also will likely be adding quotes from time to time, so whatever I use, it's got to be easy to update the list of quotes. Which means storing the quotes in a text file at the very least, and maybe even a database program if they get large enough.

So how would I store my data programmatically, and access it efficiently? An array of arrays? Maybe a hash? My experience with hashes is very limited, but I'm willing to read up on them if that's the best solution. Or should I install MySQL or some other database program? That seems a bit of overkill since this is the only application I will be doing that might require a database.

Thoughts? I don't necessarily need code itself, just a discussion of method. In case it matters, I'm programming in perl on a Windows box.

Posts: 1805 | Registered: Jun 1999
Dagonee
Member
Member # 5818

I'd use MySQL, but I am more comfortable in databases than any other environment.

Make two tables:

Person:
pid: autonumber (primary key)
name: varchar

Quotation:
qid: autonumber (primary key)
pid: integer (foreign key to Person)
quote: text (or whatever MySQL's large text type is)

To run the query, use "SELECT quote FROM quotation WHERE pid = <pid for selected person> ORDER BY RAND() LIMIT 1"

ORDER BY RAND() sorts the rows randomly, LIMIT 1 takes just the top 1.

Leave out the WHERE clause if you want any random quotation.

Dagonee
EDIT: I forgot, you have to join in table Person in order to show the person's name.

[ December 31, 2004, 12:13 PM: Message edited by: Dagonee ]
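
A minimal sketch of how that query might be put together from Perl, with the person's name passed as a placeholder value rather than pasted into the SQL. The table and column names follow the schema above; the helper name `build_quote_query` and the sample name are made up for illustration, and the database connection itself is assumed rather than shown.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the SQL text and bind values for one random
# quote, optionally restricted to a single person. The JOIN brings in the
# person's name, as noted in the edit above.
sub build_quote_query {
    my ($person) = @_;
    my $sql = 'SELECT p.name, q.quote FROM Quotation q '
            . 'JOIN Person p ON p.pid = q.pid';
    my @bind;
    if (defined $person) {
        $sql .= ' WHERE p.name = ?';   # placeholder, filled in by the driver
        push @bind, $person;
    }
    $sql .= ' ORDER BY RAND() LIMIT 1';
    return ($sql, @bind);
}

my ($sql, @bind) = build_quote_query('some character');
print "$sql\n";
print "bind values: @bind\n";

# With a connected DBI handle $dbh (assumed, not created here) the pair
# could be run as:
#   my ($name, $quote) = $dbh->selectrow_array($sql, undef, @bind);
```

Calling `build_quote_query()` with no argument leaves out the WHERE clause, which gives a random quotation from anyone.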

Posts: 26071 | Registered: Oct 2003
xnera
SQL I have a lot of experience with. [Smile] Never installed a SQL database, though, or interfaced it with perl. That might be tricky.
Posts: 1805 | Registered: Jun 1999
Dagonee
Installing MySQL was pretty easy. I've never used it with Perl, though.

Dagonee

Posts: 26071 | Registered: Oct 2003
King of Men
Member
Member # 6684
I don't know how your data are organised, here, so this might be useless. But what I'd do is make an array of arrays: each array contains all the quotes by one person. So you just need to specify the index of your person, and the random number generator picks one of his quotes. You might have to do some sorting at startup, but that shouldn't be a problem on a modern computer.
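
One way this grouping could look in Perl is a hash of arrays, so the "index" of a person is simply the name. This is only a sketch: the sample lines are placeholders in the "character: funny quote" form, and `random_quote_for` is a made-up helper name.

```perl
use strict;
use warnings;

# Placeholder data in the "character: funny quote" form.
my @raw = (
    'alice: first quote',
    'bob: second quote',
    'alice: third quote',
);

# Group once at startup: person name => reference to that person's quotes.
my %quotes_by_person;
for my $line (@raw) {
    # Limit of 2 keeps any later colons inside the quote text.
    my ($person, $quote) = split /:\s*/, $line, 2;
    push @{ $quotes_by_person{$person} }, $quote;
}

# Pick one quote for a person in a single step, with no retry loop.
sub random_quote_for {
    my ($person) = @_;
    my $list = $quotes_by_person{$person} or return;
    return $list->[ int rand @$list ];
}

print random_quote_for('alice'), "\n";
```

The grouping pass runs once, after which each lookup costs one hash access and one random draw, however many quotes other people have.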
Posts: 10645 | Registered: Jul 2004
xnera
I haven't gathered the data yet, but I imagine it will be very unorganized since this is collecting quotes from an ongoing game, which means new quotes will be added all the time.

So if I were going to do an array, I might store the data like so:

game, character, quote

So far there are two games with 40-50 characters each. Character names overlap between the two games, so I need to differentiate by game name.

I suppose instead of using one text file to store ALL the data, I could have a separate text file for each game, and just load the proper text file into the array of arrays when necessary. Hmm.
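
A rough sketch of the single-file variant, assuming each record is one line in the "game, character, quote" layout. The records below are placeholders, and `random_quote` is a hypothetical helper; keying first by game keeps overlapping character names apart.

```perl
use strict;
use warnings;

# Placeholder records; in practice these lines would come from the text file.
my @lines = (
    'game1, alice, a quote, with a comma',
    'game2, alice, same name in another game',
    'game1, bob, another quote',
);

# game => character => list of quotes
my %quotes;
for my $line (@lines) {
    # Limit of 3 keeps any commas inside the quote text intact.
    my ($game, $character, $quote) = split /\s*,\s*/, $line, 3;
    push @{ $quotes{$game}{$character} }, $quote;
}

sub random_quote {
    my ($game, $character) = @_;
    my $by_character = $quotes{$game}               or return;
    my $list         = $by_character->{$character}  or return;
    return $list->[ int rand @$list ];
}

print random_quote('game1', 'alice'), "\n";   # prints "a quote, with a comma"
```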

Dagonee, if I was going to use MySQL, which exact product do I install? I am looking at their product page, and I am very confused as to which product is the proper one for my needs.

Posts: 1805 | Registered: Jun 1999
Dagonee
The one I installed was for Windows, and the filename is mysql-4.0.18-win.zip

I also installed mysqlcc-0.9.4-win32.zip, which is a decent gui front end.

Dagonee

Posts: 26071 | Registered: Oct 2003
xnera
Bah, GUI! We need no GUI. [Wink]

I think I will give MySQL a try. Yeah, there are ways to do what I wish without using a database, but it will be fun to try something new. Good experience, even though I am not planning on getting back into the computer biz. I've at least seen SQL in perl code from poking around the LiveJournal source. And this article doesn't make it sound nearly as scary as I thought it would be.

Posts: 1805 | Registered: Jun 1999
Dagonee
I use databases at the drop of a hat.

It shocks people to see me whip out my laptop because my baseball cap falls to the ground, but go with what works, I always say.

By the way, you'll want to add a table for game as well. [Smile]

Dagonee

Posts: 26071 | Registered: Oct 2003
King of Men
By the way, if you're going to use array-of-arrays, you may want to not use the obvious random number generator, but weight probabilities according to the number of quotes a person has. For example, consider the case of two persons, A with one quote, B with 50 quotes. If you pick a person in the obvious way, A's quote has a 50% probability of being chosen, while B's quotes each have a 1% chance. Clearly this is not ideal. Of course, if your quotes are more evenly distributed, you may not have a problem; still, it's something to think about.
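
A sketch of one way to get that weighting, assuming the quotes are already grouped per person: draw a single index over the total number of quotes and walk the lists until it falls inside one. The data mirrors the A and B example, and `random_quote_uniform` is a made-up name.

```perl
use strict;
use warnings;

# Placeholder data matching the example: A has 1 quote, B has 50.
my %quotes_by_person = (
    A => [ 'A quote 1' ],
    B => [ map { "B quote $_" } 1 .. 50 ],
);

# Every quote is equally likely, so a person is chosen in proportion to
# how many quotes that person has.
sub random_quote_uniform {
    my ($quotes) = @_;
    my $total = 0;
    $total += @$_ for values %$quotes;      # count all quotes
    return unless $total;
    my $pick = int rand $total;             # index into the combined list
    for my $person (sort keys %$quotes) {
        my $list = $quotes->{$person};
        return ($person, $list->[$pick]) if $pick < @$list;
        $pick -= @$list;                    # skip past this person's quotes
    }
    return;
}

my ($person, $quote) = random_quote_uniform(\%quotes_by_person);
print "$person: $quote\n";
```

With this data each of the 51 quotes comes up with probability 1/51, instead of A's single quote taking half of all draws.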
Posts: 10645 | Registered: Jul 2004
WheatPuppet
Member
Member # 5142
Here's how I'd do it:
1. Get the quoter's name and store it
2. Open the file and push it into an array
3. Use a foreach and a regexp to grab all instances of quotes by that person, and use push/unshift to put it into a stack
4. Grab a random quote off of the stack.
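
The four steps above could be sketched roughly like this, assuming one "name: quote" entry per line. An in-memory filehandle stands in for the real file so the example runs on its own; `random_quote_by` and the sample data are made up.

```perl
use strict;
use warnings;

# Stand-in for the quote file; names and quotes are placeholders.
my $data = "alice: first quote\nbob: second quote\nalice: third quote\n";

sub random_quote_by {
    my ($name, $fh) = @_;                      # step 1: the quoter's name
    chomp(my @lines = <$fh>);                  # step 2: file into an array
    my @matches;
    for my $line (@lines) {                    # step 3: collect the matches
        push @matches, $1 if $line =~ /^\Q$name\E:\s*(.*)$/;
    }
    return unless @matches;
    return $matches[ rand @matches ];          # step 4: pick one at random
}

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
print random_quote_by('alice', $fh), "\n";
close $fh;
```

This rereads the whole file on every request, which keeps newly added quotes visible without a restart but does more work per call than grouping once at startup.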

I'd be tempted to XML-ify your flatfile database, so you could use XML parsing modules on it. And it means you can have a separate block for every author, instead of silly database keys.

BTW--I hate databases. I know how to use them, but I don't trust them any farther than I can throw my monitor (which weighs between 45 and 60lbs, I think).

Posts: 903 | Registered: May 2003
Dagonee
Once you have the DB up and running, you'll find a million uses for it.

I like my ACID transactions (not that MySQL is fully acidic).

Dagonee

Posts: 26071 | Registered: Oct 2003
xnera
quote:

Install the DBD::CSV perl module. It lets you access your CSV file just like you would access a MySQL table.

Ooo, adam, thanks! I might do this instead.

You know, this is one of the things I most love AND most hate about perl. There's a module for everything! It's great, because it can make a complicated task be just one line of code, but trying to find the modules you need is a pain. It took me eight hours to find the correct combination of modules to do time/date manipulation. [Roll Eyes]

WheatPuppet, thanks for the suggestion. While I know theoretically how stacks work, the actual use of them is rather mystifying to me. I'll stick to an array or a DB. But that's not a bad idea to XML the file. Problem is, I know nothing about XML.

Posts: 1805 | Registered: Jun 1999
xnera
I did try the DBD::CSV module, but I had problems binding the columns, so gave up on it.

I also thought about it some more, and decided that I really did want to use MySQL to retain data integrity, as there is the potential for multiple users to be adding quotes at the same time. I didn't want any quotes to be overwritten, and DBs have built-in functionality to take care of that.

MySQL was a devil to install and get working with perl, but that was mostly caused by a few .ini files in the wrong place and using the wrong perl modules. Once I realised that loading DBI itself was sufficient (no need to load DBD::mysql directly, since DBI loads the driver named in the connection string), the errors quickly got resolved. [Big Grin] Yay!

So the !quote command now works, and is specific to which bot the person is talking to (if it's bot1, it returns quotes from game1 only; if it's bot2, it returns game2 quotes). And users can also add quotes with !addquote character: quote. [Big Grin] I am pleased.
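
For anyone curious, the command handling might look something like the sketch below. This is not the bot's actual code: `parse_command` and the hash keys are invented for illustration, and it only covers splitting the message, not the database calls.

```perl
use strict;
use warnings;

# Hypothetical parser for the two commands. Returns a hash reference
# describing the request, or nothing if the message is not a command.
sub parse_command {
    my ($message) = @_;
    if ($message =~ /^!addquote\s+([^:]+?)\s*:\s*(.+)$/) {
        return { action => 'add', character => $1, quote => $2 };
    }
    if ($message =~ /^!quote(?:\s+(.+?))?\s*$/) {
        # No name given means "any character"; character is then undef.
        return { action => 'get', character => $1 };
    }
    return;
}

my $cmd = parse_command('!addquote alice: something memorable');
print "$cmd->{action} / $cmd->{character} / $cmd->{quote}\n";
# prints "add / alice / something memorable"
```

Only the first colon separates the character from the quote, so a quote that itself contains a colon comes through whole.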

Posts: 1805 | Registered: Jun 1999
Copyright © 2008 Hatrack River Enterprises Inc. All rights reserved.
Reproduction in whole or in part without permission is prohibited.

