08 November 2007

PL/SQL: Random Data Generation

I recently created a script to generate random data. I did not want random strings for names, so I downloaded name lists from the U.S. Census names file. The data is all uppercase, I preferred to go with sentence case. An easy update statement later and I had names.

In order to randomly retrieve a row:


select
NameLast
into
xNameLast
from
(
select
NameLast
from
NameLast
order by
dbms_random.value
)
where
rownum = 1;
Why does this work? We are limiting the resultset to one row by using 'where rownum = 1'. That row is being randomized by the derived statement's ORDER BY clause referencing a random value... Which I do not understand. ORDER BY needs to reference columns, which in this case would always be one. Yet it does randomly select a row. So it must be sorting it based on a surrogate column or some such. I would ask Tom, but he is busy now.

Random names add some authenticity to the data set and just makes for good reading; Charity Certain and Refugio Cheeseman are two of my favorites.

I thought about loading a dictionary file so that I could randomly generate phrases... Selecting a random number of random words would be an easy enhancement.

In any event, feeding your procedures with random data is easy to do with PL/SQL using the DBMS_RANDOM package. The following statements outline how to generate most values.


-- Retrieve a whole number between x and y
select trunc(dbms_random.value( x, y)) from dual;

-- Retrieve a string of x characters
select dbms_random.string('A', x) from dual;

-- Retrieve a date between now and (now – x)
select sysdate - (trunc(dbms_random.value(0, x))) from dual;

No comments: