
29.2. Perl Functions in Alphabetical Order

Many of the following function names are annotated with, um, annotations. Here are their meanings:

Uses $_ ($ARG) as a default variable.

Sets $! ($OS_ERROR) on syscall errors.

Raises exceptions; use eval to trap $@ ($EVAL_ERROR).

Sets $? ($CHILD_ERROR) when child process exits.

Taints returned data.

Taints returned data under some system, locale, or handle settings.

Raises an exception if given an argument of inappropriate type.

Raises an exception if modifying a read-only target.

Raises an exception if fed tainted data.

Raises an exception if unimplemented on current platform.

Functions that return tainted data when fed tainted data are not marked, since that's most of them. In particular, if you use any function on %ENV or @ARGV, you'll get tainted data.

Functions annotated as raising an exception on an argument of inappropriate type do so when they require, but do not receive, an argument of a particular type (such as filehandles for I/O operations, references for blessing, etc.).

Functions annotated as raising an exception on a read-only target sometimes need to alter their arguments. If they can't modify the argument because it's marked read-only, they'll raise an exception. Examples of read-only variables are the special variables containing data captured during a pattern match and variables that are really aliases to constants.

Functions annotated as unimplemented on some platforms may not be available everywhere. Although many of these are named after functions in the Unix C library, don't assume that just because you aren't running Unix, you can't call any of them. Many are emulated, even those you might never expect to see--such as fork on Win32 systems, which works as of the 5.6 release of Perl. For more information about the portability and behavior of system-specific functions, see the perlport manpage, plus any platform-specific documentation that came with your Perl port.

Functions that raise other miscellaneous exceptions carry the general exception annotation; these include math functions that throw range errors, such as sqrt(-1).
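As a minimal sketch of trapping such an exception with eval, here is the sqrt range error mentioned above. The variable names are illustrative only:

```perl
use strict;
use warnings;

my $n    = -1;
my $root = eval { sqrt($n) };   # sqrt raises an exception for negative input
my $err  = $@;                  # copy $@ right away, before it can be clobbered

if ($err) {
    print "Trapped: $err";
}
else {
    print "Root is $root\n";
}
```

Copying $@ into a lexical immediately after the eval is a cautious habit, since later code may reset it.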

29.2.1. abs    

abs VALUE
abs
This function returns the absolute value of its argument.
$diff = abs($first - $second);
Note: here and in subsequent examples, good style (and the use strict pragma) would dictate that you add a my modifier to declare a new lexically scoped variable, like this:
my $diff = abs($first - $second);
However, we've omitted my from most of our examples for clarity. Just assume that any such variable was declared earlier, if that cranks your rotor.

29.2.2. accept     

accept SOCKET, PROTOSOCKET
This function is used by server processes that wish to listen for socket connections from clients. PROTOSOCKET must be a filehandle already opened via the socket operator and bound to one of the server's network addresses or to INADDR_ANY. Execution is suspended until a connection is made, at which point the SOCKET filehandle is opened and attached to the newly made connection. The original PROTOSOCKET remains unchanged; its only purpose is to be cloned into a real socket. The function returns the connected address if the call succeeds, false otherwise. For example:
unless ($peer = accept(SOCK, PROTOSOCK)) {
    die "Can't accept a connection: $!\n";
}
On systems that support it, the close-on-exec flag will be set for the newly opened file descriptor, as determined by the value of $^F ($SYSTEM_FD_MAX).

See accept(2). See also the example in the section "Sockets" in Chapter 16, "Interprocess Communication".

29.2.3. alarm    

alarm EXPR
alarm
This function sends a SIGALRM signal to the current process after EXPR seconds.

Only one timer may be active at once. Each call disables the previous timer, and an EXPR of 0 may be supplied to cancel the previous timer without starting a new one. The return value is the amount of time remaining on the previous timer.

print "Answer me within one minute, or die: ";
alarm(60);             # kill program in one minute
$answer = <STDIN>;
$timeleft = alarm(0);  # clear alarm
print "You had $timeleft seconds remaining\n";
It is usually a mistake to intermix alarm and sleep calls, because many systems use the alarm(2) syscall mechanism to implement sleep(3). On older machines, the elapsed time may be up to one second less than you specified because of how seconds are counted. Additionally, a busy system may not get around to running your process immediately. See Chapter 16, "Interprocess Communication" for information on signal handling.

For alarms of finer granularity than one second, you might be able to use the syscall function to access setitimer(2) if your system supports it. The CPAN module Time::HiRes also provides functions for this purpose.
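One common use of alarm is to put a time limit on an operation by combining it with eval and a SIGALRM handler. The following is only a sketch, assuming a Unix-like system where alarm is implemented; with_timeout is a hypothetical helper name, and select is used for the delay so that alarm and sleep are not intermixed:

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, giving up after $secs whole seconds.
# Returns the code's value, or undef if the timer expired first.
sub with_timeout {
    my ($secs, $code) = @_;
    my $value = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm($secs);                 # start the timer
        my $v = $code->();
        alarm(0);                     # finished in time: cancel the timer
        $v;
    };
    my $err = $@;
    alarm(0);                         # cancel again in case $code died early
    die $err if $err && $err ne "timeout\n";   # re-raise unrelated errors
    return $err ? undef : $value;
}

my $fast = with_timeout(5, sub { 42 });
my $slow = with_timeout(1, sub { select(undef, undef, undef, 10); "done" });

print "fast returned $fast\n";
print defined $slow ? "slow finished\n" : "slow timed out\n";
```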

29.2.4. atan2    

atan2 Y, X
This function returns the principal value of the arc tangent of Y/X in the range -π to π. A quick way to get an approximate value of π is to say:
$pi = atan2(1,1) * 4;
For the tangent operation, you may use the tan function from either the Math::Trig or the POSIX modules, or just use the familiar relation:
sub tan { sin($_[0]) / cos($_[0]) }
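Because atan2 takes Y and X separately, it can use the signs of both to pick the right quadrant. A small sketch of that behavior follows; heading_deg is a hypothetical helper name:

```perl
use strict;
use warnings;

my $pi = atan2(1, 1) * 4;

# Hypothetical helper: angle of the vector (X, Y) in degrees.
sub heading_deg {
    my ($y, $x) = @_;
    return atan2($y, $x) * 180 / $pi;
}

printf "%6.1f\n", heading_deg( 1,  1);   #   45.0
printf "%6.1f\n", heading_deg( 1, -1);   #  135.0
printf "%6.1f\n", heading_deg(-1, -1);   # -135.0
printf "%6.1f\n", heading_deg(-1,  1);   #  -45.0
```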

29.2.5. bind    

bind SOCKET, NAME
This function attaches an address (a name) to an already opened socket specified by the SOCKET filehandle. The function returns true if it succeeded, false otherwise. NAME should be a packed address of the proper type for the socket.
use Socket;
$port_number = 80;      # pretend we want to be a web server
$sockaddr = sockaddr_in($port_number, INADDR_ANY);
bind SOCK, $sockaddr or die "Can't bind $port_number: $!\n";
See bind(2). See also the examples in the section "Sockets" in Chapter 16, "Interprocess Communication".

29.2.6. binmode    

binmode FILEHANDLE, DISCIPLINES
binmode FILEHANDLE
This function arranges for the FILEHANDLE to have the semantics specified by the DISCIPLINES argument. If DISCIPLINES is omitted, binary (or "raw") semantics are applied to the filehandle. If FILEHANDLE is an expression, the value is taken as the name of the filehandle or a reference to a filehandle, as appropriate.

The binmode function should be called after the open but before any I/O is done on the filehandle. The only way to reset the mode on a filehandle is to reopen the file, since the various disciplines may have treasured up various bits and pieces of data in various buffers. This restriction may be relaxed in the future.

In the olden days, binmode was used primarily on operating systems whose run-time libraries distinguished text from binary files. On those systems, the purpose of binmode was to turn off the default text semantics. However, with the advent of Unicode, all programs on all systems must take some cognizance of the distinction, even on Unix and Mac systems. These days there is only one kind of binary file (as far as Perl is concerned), but there are many kinds of text files, which Perl would also like to treat in a single way. So Perl has a single internal format for Unicode text, UTF-8. Since there are many kinds of text files, text files often need to be translated upon input into UTF-8, and upon output back into some legacy character set, or some other representation of Unicode. You can use disciplines to tell Perl how exactly (or inexactly) to do these translations.[2]

[2]More precisely, you will be able to use disciplines for this, but we're still implementing them as of this writing.

For example, a discipline of ":text" will tell Perl to do generic text processing without telling Perl which kind of text processing to do. But disciplines like ":utf8" and ":latin1" tell Perl which text format to read and write. On the other hand, the ":raw" discipline tells Perl to keep its cotton-pickin' hands off the data. For more on how disciplines work (or will work), see the open function. The rest of this discussion describes what binmode does without the DISCIPLINES argument, that is, the historical meaning of binmode, which is equivalent to:

binmode FILEHANDLE, ":raw";
Unless instructed otherwise, Perl will assume your freshly opened file should be read or written in text mode. Text mode means that \n (newline) will be your internal line terminator. All systems use \n as the internal line terminator, but what that really represents varies from system to system, device to device, and even file to file, depending on how you access the file. In such legacy systems (including MS-DOS and VMS), what your program sees as a \n may not be what's physically stored on disk. The operating system might, for example, store text files with \cM\cJ sequences that are translated on input to appear as \n to your program, and have \n from your program translated back to \cM\cJ on output to a file. The binmode function disables this automatic translation on such systems.

In the absence of a DISCIPLINES argument, binmode has no effect under Unix or Mac OS, both of which use \n to end each line and represent that as a single character. (It may, however, be a different character: Unix uses \cJ and older Macs use \cM. Doesn't matter.)

The following example shows how a Perl script might read a GIF image from a file and print it to the standard output. On systems that would otherwise alter the literal data into something other than its exact physical representation, you must prepare both handles. While you could use a ":raw" discipline directly in the GIF open, you can't do that so easily with pre-opened filehandles like STDOUT:

binmode STDOUT;
open(GIF, "vim-power.gif") or die "Can't open vim-power.gif: $!\n";
binmode GIF;
while (read(GIF, $buf, 1024)) {
    print STDOUT $buf;
}

29.2.7. bless    

bless REF, CLASSNAME
bless REF
This function tells the referent pointed to by reference REF that it is now an object in the CLASSNAME package--or the current package if no CLASSNAME is specified. If REF is not a valid reference, an exception is raised. For convenience, bless returns the reference, since it's often the last function in a constructor subroutine. For example:
$pet = Beast->new(TYPE => "cougar", NAME => "Clyde");

# then in Beast.pm:
sub new {
    my $class  = shift;
    my %attrs = @_;
    my $self   = { %attrs };
    return bless($self, $class);
}
You should generally bless objects into CLASSNAMEs that are mixed case. Namespaces with all lowercase names are reserved for internal use as Perl pragmata (compiler directives). Built-in types (such as "SCALAR", "ARRAY", "HASH", etc., not to mention the base class of all classes, "UNIVERSAL") all have uppercase names, so you may wish to avoid such package names as well.

Make sure that CLASSNAME is not false; blessing into false packages is not supported and may result in unpredictable behavior.

It is not a bug that there is no corresponding curse operator. (But there is a sin operator.) See also Chapter 12, "Objects", for more about the blessing (and blessings) of objects.
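The difference between supplying CLASSNAME and letting bless default to the current package shows up as soon as a constructor is inherited. The sketch below uses made-up classes (Beast::Lazy, Cougar, and Sloth are hypothetical) to illustrate it:

```perl
use strict;
use warnings;

package Beast;
sub new {
    my $class = shift;
    my $self  = { @_ };
    return bless($self, $class);    # two-argument form honors the invocant
}

package Beast::Lazy;
sub new {
    my $class = shift;
    my $self  = { @_ };
    return bless($self);            # one-argument form: always current package
}

package Cougar;
our @ISA = ('Beast');

package Sloth;
our @ISA = ('Beast::Lazy');

package main;

my $clyde = Cougar->new(NAME => "Clyde");
my $sid   = Sloth->new(NAME => "Sid");

print ref($clyde), "\n";    # Cougar
print ref($sid),   "\n";    # Beast::Lazy, not Sloth
```

For that reason, constructors meant to be inherited generally pass the class explicitly.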

29.2.8. caller    

caller EXPR
caller
This function returns information about the stack of current subroutine calls and such. Without an argument, it returns the package name, filename, and line number that the currently executing subroutine was called from:
($package, $filename, $line) = caller;
Here's an example of an exceedingly picky function, making use of the special tokens __PACKAGE__ and __FILE__ described in Chapter 2, "Bits and Pieces":
sub careful {
    my ($package, $filename) = caller;
    unless ($package  eq __PACKAGE__ && $filename eq __FILE__) {
        die "You weren't supposed to call me, $package!\n";
    }
    print "called me safely\n";
}

sub safecall {
    careful();
}
When called with an argument, caller evaluates EXPR as the number of stack frames to go back before the current one. For example, an argument of 0 means the current stack frame, 1 means the caller, 2 means the caller's caller, and so on. The function also reports additional information as shown here:
$i = 0;
while (($package, $filename, $line, $subroutine,
        $hasargs, $wantarray, $evaltext, $is_require,
        $hints, $bitmask) = caller($i++) )
{
    ...
}
If the frame is a subroutine call, $hasargs is true if it has its own @_ array (not one borrowed from its caller). Otherwise, $subroutine may be "(eval)" if the frame is not a subroutine call, but an eval. If so, additional elements $evaltext and $is_require are set: $is_require is true if the frame is created by a require or use statement, and $evaltext contains the text of the eval EXPR statement. In particular, for an eval BLOCK statement, $filename is "(eval)", but $evaltext is undefined. (Note also that each use statement creates a require frame inside an eval EXPR frame.) The $hints and $bitmask are internal values; please ignore them unless you're a member of the thaumatocracy.

In a fit of even deeper magic, caller also sets the array @DB::args to the arguments passed in the given stack frame--but only when called from within the DB package. See Chapter 20, "The Perl Debugger".
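To make the frame-walking loop concrete, here is a rough sketch of a backtrace built on caller. The subroutine names backtrace, inner, and outer are invented for the example:

```perl
use strict;
use warnings;

# Hypothetical helper: one line of text per stack frame, innermost first.
sub backtrace {
    my @frames;
    my $i = 0;
    # caller returns an empty list once we run off the top of the stack.
    while (my ($package, $filename, $line, $subroutine) = caller($i++)) {
        push @frames, "$subroutine called at $filename line $line";
    }
    return @frames;
}

sub inner { return backtrace() }
sub outer { return inner() }

my @trace = outer();
print "$_\n" for @trace;
```

Frame 0 describes the call to backtrace itself, frame 1 the call to inner, and so on outward.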

29.2.9. chdir    

chdir EXPR
chdir
This function changes the current process's working directory to EXPR, if possible. If EXPR is omitted, the caller's home directory is used. The function returns true upon success, false otherwise.
chdir "$prefix/lib" or die "Can't cd to $prefix/lib: $!\n";
See also the Cwd module, described in Chapter 32, "Standard Modules", which lets you keep track of your current directory automatically.

29.2.10. chmod    

chmod LIST
This function changes the permissions of a list of files. The first element of the list must be the numerical mode, as in the chmod(2) syscall. The function returns the number of files successfully changed. For example:
$cnt = chmod 0755, 'file1', 'file2';
will set $cnt to 0, 1, or 2, depending on how many files were changed. Success is measured by lack of error, not by an actual change, because a file may have had the same mode before the operation. An error probably means you lacked sufficient privileges to change its mode because you were neither the file's owner nor the superuser. Check $! to find the actual reason for failure.

Here's a more typical usage:

chmod(0755, @executables) == @executables
        or die "couldn't chmod some of @executables: $!";
If you need to know which files didn't allow the change, use something like this:
@cannot = grep {not chmod 0755, $_} 'file1', 'file2', 'file3';
die "$0: could not chmod @cannot\n" if @cannot;
This idiom makes use of the grep function to select only those elements of the list for which the chmod function failed.

When using nonliteral mode data, you may need to convert an octal string to a number using the oct function. That's because Perl doesn't automatically assume a string contains an octal number just because it happens to have a leading "0".

$DEF_MODE = 0644;   # Can't use quotes here!
PROMPT: {
    print "New mode? ";
    $strmode = <STDIN>;
    exit unless defined $strmode;   # test for eof
    if ($strmode =~ /^\s*$/) {          # test for blank line
        $mode = $DEF_MODE;
    }
    elsif ($strmode !~ /^\d+$/) {
        print "Want numeric mode, not $strmode\n";
        redo PROMPT;
    }
    else {
        $mode = oct($strmode);          # converts "755" to 0755
    }
    chmod $mode, @files;
}
This function works with numeric modes much like the Unix chmod(2) syscall. If you want a symbolic interface like the one the chmod(1) command provides, see the File::chmod module on CPAN.

You can also import the symbolic S_I* constants from the Fcntl module:

use Fcntl ':mode';
chmod S_IRWXU|S_IRGRP|S_IXGRP|S_IROTH|S_IXOTH, @executables;
Some people consider that more readable than 0755. Go figure.

29.2.11. chomp    

chomp VARIABLE
chomp LIST
chomp
This function (normally) deletes a trailing newline from the end of a string contained in a variable. This is a slightly safer version of chop (described next) in that it has no effect upon a string that doesn't end in a newline. More specifically, it deletes the terminating string corresponding to the current value of $/, and not just any last character.

Unlike chop, chomp returns the number of characters deleted. If $/ is "" (in paragraph mode), chomp removes all trailing newlines from the selected string (or strings, if chomping a LIST). You cannot chomp a literal, only a variable.

For example:

while (<PASSWD>) {
    chomp;   # avoid \n on last field
    @array = split /:/;
    ...
}
With version 5.6, the meaning of chomp changes slightly in that input disciplines are allowed to override the value of the $/ variable and mark strings as to how they should be chomped. This has the advantage that an input discipline can recognize more than one variety of line terminator (such as Unicode paragraph and line separators), but still safely chomp whatever terminates the current line.
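The return value of chomp and its dependence on $/ can be sketched as follows. The strings and variable names are illustrative:

```perl
use strict;
use warnings;

my $line    = "hello\n";
my $removed = chomp($line);     # 1: one newline deleted
my $again   = chomp($line);     # 0: nothing left to delete

my $para = "text\n\n\n";
my $count;
{
    local $/ = "";              # paragraph mode
    $count = chomp($para);      # removes all three trailing newlines
}

my $rec = "field1:field2%%";
{
    local $/ = "%%";            # a custom record terminator
    chomp($rec);                # leaves "field1:field2"
}

print "$removed $again $count [$line] [$para] [$rec]\n";
```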

29.2.12. chop    

chop VARIABLE
chop LIST
chop
This function chops off the last character of a string variable and returns the character chopped. The chop operator is used primarily to remove the newline from the end of an input record, and is more efficient than using a substitution. If that's all you're doing, then it would be safer to use chomp, since chop always shortens the string no matter what's there, and chomp is more selective.

You cannot chop a literal, only a variable.

If you chop a LIST of variables, each string in the list is chopped:

@lines = `cat myfile`;
chop @lines;
You can chop anything that is an lvalue, including an assignment:
chop($cwd = `pwd`);
chop($answer = <STDIN>);
This is different from:
$answer = chop($tmp = <STDIN>);  # WRONG
which puts a newline into $answer because chop returns the character chopped, not the remaining string (which is in $tmp). One way to get the result intended here is with substr:
$answer = substr <STDIN>, 0, -1;
But this is more commonly written as:
chop($answer = <STDIN>);
In the most general case, chop can be expressed in terms of substr:
$last_char = chop($var);
$last_char = substr($var, -1, 1, "");   # same thing
Once you understand this equivalence, you can use it to do bigger chops. To chop more than one character, use substr as an lvalue, assigning a null string. The following removes the last five characters of $caravan:
substr($caravan, -5) = "";
The negative subscript causes substr to count from the end of the string instead of the beginning. If you wanted to save the characters so removed, you could use the four-argument form of substr, creating something of a quintuple chop:
$tail = substr($caravan, -5, 5, "");

29.2.13. chown    

chown LIST
This function changes the owner and group of a list of files. The first two elements of the list must be the numeric UID and GID, in that order. A value of -1 in either position is interpreted by most systems to leave that value unchanged. The function returns the number of files successfully changed. For example:
$cnt = chown $uidnum, $gidnum, 'file1', 'file2';
will set $cnt to 0, 1, or 2, depending on how many files got changed (in the sense that the operation succeeded, not in the sense that the owner was different afterward). Here's a more typical usage:
chown($uidnum, $gidnum, @filenames) == @filenames
        or die "can't chown @filenames: $!";
Here's a subroutine that accepts a username, looks up the user and group IDs for you, and does the chown:
sub chown_by_name {
    my($user, @files) = @_;
    chown((getpwnam($user))[2,3], @files) == @files
            or die "can't chown @files: $!";
}

chown_by_name("fred", glob("*.c"));
However, you may not want the group changed as the previous function does, because the /etc/passwd file associates each user with a single group even though that user may be a member of many secondary groups according to /etc/group. An alternative is to pass a -1 for the GID, which leaves the group of the file unchanged. If you pass a -1 as the UID and a valid GID, you can set the group without altering the owner.

On most systems, you are not allowed to change the ownership of the file unless you're the superuser, although you should be able to change the group to any of your secondary groups. On insecure systems, these restrictions may be relaxed, but this is not a portable assumption. On POSIX systems, you can detect which rule applies like this:

use POSIX qw(sysconf _PC_CHOWN_RESTRICTED);
# only try if we're the superuser or on a permissive system
if ($> == 0 || !sysconf(_PC_CHOWN_RESTRICTED) ) {
    chown($uidnum, -1, $filename)
        or die "can't chown $filename to $uidnum: $!";
}

29.2.14. chr    

chr NUMBER
chr

This function returns the character represented by that NUMBER in the character set. For example, chr(65) is "A" in either ASCII or Unicode, and chr(0x263a) is a Unicode smiley face. For the reverse of chr, use ord.

If you'd rather specify your characters by name than by number (for example, "\N{WHITE SMILING FACE}" for a Unicode smiley), see charnames in Chapter 31, "Pragmatic Modules".
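A short sketch of chr and ord used as inverses of each other, with arbitrarily chosen code points:

```perl
use strict;
use warnings;

my $letter = chr(65);                 # "A"
my $code   = ord($letter);            # 65

# Build a string from a list of code points, then take it apart again.
my @points = (80, 101, 114, 108);
my $word   = join '', map { chr($_) } @points;
my @back   = map { ord($_) } split //, $word;

my $smiley = chr(0x263a);             # a single wide character

printf "%s %d %s %04x\n", $letter, $code, $word, ord($smiley);
```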

29.2.15. chroot    

chroot FILENAME
chroot
If successful, FILENAME becomes the new root directory for the current process--the starting point for pathnames beginning with "/". This directory is inherited across exec calls and by all subprocesses forked after the chroot call. There is no way to undo a chroot. For security reasons, only the superuser can use this function. Here's some code that approximates what many FTP servers do:
chroot((getpwnam('ftp'))[7])
    or die "Can't do anonymous ftp: $!\n";
This function is unlikely to work on non-Unix systems. See chroot(2).

29.2.16. close    

close FILEHANDLE
close
This function closes the file, socket, or pipe associated with FILEHANDLE. (It closes the currently selected filehandle if the argument is omitted.) It returns true if the close is successful, false otherwise. You don't have to close FILEHANDLE if you are immediately going to do another open on it, since the next open will close it for you. (See open.) However, an explicit close on an input file resets the line counter ($.), while the implicit close done by open does not.

FILEHANDLE may be an expression whose value can be used as an indirect filehandle (either the real filehandle name or a reference to anything that can be interpreted as a filehandle object).
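The difference between an explicit close and the implicit close done by open can be seen in the line counter. This sketch assumes a scratch file created with the standard File::Temp module; the handle name IN is arbitrary:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a three-line scratch file.
my ($tmp, $path) = tempfile(UNLINK => 1);
print $tmp "one\ntwo\nthree\n";
close($tmp) or die "Can't close $path: $!";

open(IN, "<", $path) or die "Can't open $path: $!";
1 while <IN>;
my $first = $.;                 # 3

open(IN, "<", $path) or die "Can't reopen $path: $!";   # implicit close
1 while <IN>;
my $implicit = $.;              # 6: the counter kept going

close(IN) or die "Can't close $path: $!";               # explicit close
open(IN, "<", $path) or die "Can't reopen $path: $!";
1 while <IN>;
my $explicit = $.;              # 3: the counter was reset
close(IN);

print "$first $implicit $explicit\n";
```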

If the filehandle came from a piped open, close will return false if any underlying syscall fails or if the program at the other end of the pipe exited with nonzero status. In the latter case, the close forces $! ($OS_ERROR) to zero. So if a close on a pipe returns false, check $! to determine whether the problem was with the pipe itself (nonzero value) or with the program at the other end (zero value). In either event, $? ($CHILD_ERROR) contains the wait status value (see its interpretation under system) of the command associated with the other end of the pipe. For example:

open(OUTPUT, '| sort -rn | lpr -p') # pipe to sort and lpr
    or die "Can't start sortlpr pipe: $!";
print OUTPUT @lines;                # print stuff to output
close OUTPUT                        # wait for sort to finish
    or warn $! ? "Syserr closing sortlpr pipe: $!"
               : "Wait status $? from sortlpr pipe";
A filehandle produced by dup(2)ing a pipe is treated as an ordinary filehandle, so close will not wait for the child on that filehandle. You have to wait for the child by closing the original filehandle. For example:
open(NETSTAT, "netstat -rn |")
    or die "can't run netstat: $!";
open(STDIN, "<&NETSTAT")
    or die "can't dup to stdin: $!";
If you close STDIN above, there is no wait, but if you close NETSTAT, there is.

If you somehow manage to reap an exited pipe child on your own, the close will fail. This could happen if you had a $SIG{CHLD} handler of your own that got triggered when the pipe child exited, or if you intentionally called waitpid on the process ID returned from the open call.

29.2.17. closedir    

closedir DIRHANDLE
This function closes a directory opened by opendir and returns the success of that operation. See the examples under readdir. DIRHANDLE may be an expression whose value can be used as an indirect dirhandle, usually the real dirhandle name.
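For orientation, a minimal sketch of the usual opendir, readdir, closedir sequence might look like this. The directory "." is just a placeholder, and a lexical variable serves as the indirect dirhandle:

```perl
use strict;
use warnings;

my $dir = ".";                                  # illustrative directory
opendir(my $dh, $dir) or die "Can't opendir $dir: $!\n";
my @entries = grep { $_ ne "." && $_ ne ".." } readdir($dh);
my $closed = closedir($dh);                     # true on success
warn "Can't closedir $dir: $!\n" unless $closed;

print scalar(@entries), " entries in $dir\n";
```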

29.2.18. connect    

connect SOCKET, NAME
This function initiates a connection with another process that is waiting at an accept. The function returns true if it succeeded, false otherwise. NAME should be a packed network address of the proper type for the socket. For example, assuming SOCK is a previously created socket:
use Socket;

my ($remote, $port) = ("www.perl.com", 80);
my $destaddr = sockaddr_in($port, inet_aton($remote));
connect SOCK, $destaddr
    or die "Can't connect to $remote at port $port: $!";
To disconnect a socket, use either close or shutdown. See also the examples in the section "Sockets" in Chapter 16, "Interprocess Communication". See connect(2).

29.2.19. cos     

cos EXPR
cos
This function returns the cosine of EXPR (expressed in radians). For example, the following script will print a cosine table of angles measured in degrees:
# Here's the lazy way of getting degrees-to-radians.

$pi = atan2(1,1) * 4;
$piover180 = $pi/180;

# Print table.
for ($deg = 0; $deg <= 90; $deg++) {
    printf "%3d %7.5f\n", $deg, cos($deg * $piover180);
}
For the inverse cosine operation, you may use the acos() function from the Math::Trig or POSIX modules, or use this relation:
sub acos { atan2( sqrt(1 - $_[0] * $_[0]), $_[0] ) }

29.2.20. crypt    

crypt PLAINTEXT, SALT
This function computes a one-way hash of a string exactly in the manner of crypt(3). This is somewhat useful for checking the password file for lousy passwords,[3] although what you really want to do is prevent people from adding the bad passwords in the first place.

[3] Only people with honorable intentions are allowed to do this.

crypt is intended to be a one-way function, much like breaking eggs to make an omelette. There is no (known) way to decrypt an encrypted password apart from exhaustive, brute-force guessing.

When verifying an existing encrypted string, you should use the encrypted text as the SALT (like crypt($plain, $crypted) eq $crypted). This allows your code to work with the standard crypt, and with more exotic implementations, too.

When choosing a new SALT, you minimally need to create a random two character string whose characters come from the set [./0-9A-Za-z] (like join '', ('.', '/', 0..9, 'A'..'Z', 'a'..'z')[rand 64, rand 64]). Older implementations of crypt only needed the first two characters of the SALT, but code that only gives the first two characters is now considered nonportable. See your local crypt(3) manpage for interesting details.

Here's an example that makes sure that whoever runs this program knows their own password:

$pwd = (getpwuid ($<))[1];    # Assumes we're on Unix.

system "stty -echo";   # or look into Term::ReadKey on CPAN
print "Password: ";
chomp($word = <STDIN>);
print "\n";
system "stty echo";

if (crypt($word, $pwd) ne $pwd) {
    die "Sorry...\n";
} else {
    print "ok\n";
}
Of course, typing in your own password to whoever asks for it is unwise.

Shadow password files are slightly more secure than traditional password files, and you might have to be a superuser to access them. Because few programs should run under such powerful privileges, you might have the program maintain its own independent authentication system by storing the crypt strings in a different file than /etc/passwd or /etc/shadow.

The crypt function is unsuitable for encrypting large quantities of data, not least of all because you can't get the information back. Look at the by-module/Crypt and by-module/PGP directories on your favorite CPAN mirror for a slew of potentially useful modules.

29.2.21. dbmclose    

dbmclose HASH
This function breaks the binding between a DBM (database management) file and a hash. dbmclose is really just a call to untie with the proper arguments, but is provided for backward compatibility with ancient versions of Perl.

29.2.22. dbmopen    

dbmopen HASH, DBNAME, MODE
This binds a DBM file to a hash (that is, an associative array). (DBM stands for database management, and consists of a set of C library routines that allow random access to records via a hashing algorithm.) HASH is the name of the hash (including the %). DBNAME is the name of the database (without any .dir or .pag extension). If the database does not exist and a valid MODE is specified, the database is created with the protection specified by MODE, as modified by the umask. To prevent creation of the database if it doesn't exist, you may specify a MODE of undef, and the function will return false if it can't find an existing database. Values assigned to the hash prior to the dbmopen are not accessible.

The dbmopen function is really just a call to tie with the proper arguments, but is provided for backward compatibility with ancient versions of Perl. You can control which DBM library you use by using the tie interface directly or by loading the appropriate module before you call dbmopen. Here's an example that works on some systems for versions of DB_File similar to the version in your Netscape browser:

use DB_File;
dbmopen(%NS_Hist, "$ENV{HOME}/.netscape/history.dat", undef)
    or die "Can't open netscape history file: $!";

while (($url, $when) = each %NS_Hist) {
    next unless defined($when);
    chop ($url, $when);        # kill trailing null bytes
    printf "Visited %s at %s.\n", $url,
        scalar(localtime(unpack("V",$when)));
}
If you don't have write access to the DBM file, you can only read the hash variables, not set them. If you want to test whether you can write, either use a file test like -w $file, or try setting a dummy hash entry inside an eval {}, which will trap the exception.

Functions such as keys and values may return huge list values when used on large DBM files. You may prefer to use the each function to iterate over large DBM files so that you don't load the whole thing in memory at once.

Hashes bound to DBM files have the same limitations as the type of DBM package you're using, including restrictions on how much data you can put into a bucket. If you stick to short keys and values, it's rarely a problem. See also the DB_File module in Chapter 32, "Standard Modules".

Another thing you should bear in mind is that many existing DBM databases contain null-terminated keys and values because they were set up with C programs in mind. The Netscape history file and the old sendmail aliases file are examples. Just use "$key\0" when pulling out a value, and remove the null from the value.

$alias = $aliases{"postmaster\0"};
chop $alias;   # kill the null
There is currently no built-in way to lock a generic DBM file. Some would consider this a bug. The GDBM_File module does attempt to provide locking at the granularity of the entire file. When in doubt, your best bet is to use a separate lock file.

29.2.23. defined    

defined EXPR
defined
This function returns a Boolean value saying whether EXPR has a defined value or not. Most of the data you deal with is defined, but a scalar that contains no valid string, numeric, or reference value is said to contain the undefined value, or undef for short. Initializing a scalar variable to a particular value will define it, and it will stay defined until you assign an undefined value to it or explicitly call the undef function on that variable.

Many operations return undef under exceptional conditions, such as at end-of-file, when using an uninitialized variable's value, an operating system error, etc. Since undef is just one kind of false value, a simple Boolean test does not distinguish between undef, numeric zero, the null string, and the one-character string, "0"--all of which are equally false. The defined function allows you to distinguish between an undefined null string and a defined null string when you're using operators that might return a real null string.

Here is a fragment that tests a scalar value from a hash:

print if defined $switch{D};
When used on a hash element like this, defined only tells you whether the value is defined, not whether the key has an entry in the hash. It's possible to have a key whose value is undefined; the key itself however still exists. Use exists to determine whether the hash key exists.
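The distinction among a missing key, an undefined value, and a defined but false value can be sketched like this. The hash contents are made up for the example:

```perl
use strict;
use warnings;

my %switch = (D => undef, v => 0, x => 1);
my %state;

for my $key (qw(D v x q)) {
    $state{$key} = !exists  $switch{$key} ? "no such key"
                 : !defined $switch{$key} ? "key exists, value undefined"
                 : !$switch{$key}         ? "defined but false"
                 :                          "true";
    print "$key: $state{$key}\n";
}
```

Note that testing with exists or defined does not create the missing key q in %switch.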

In the next example we exploit the convention that some operations return the undefined value when you run out of data:

print "$val\n" while defined($val = pop(@ary));
And in this one, we do the same thing with the getpwent function for retrieving information about the system's users.
setpwent();
while (defined($name = getpwent())) {
    print "<<$name>>\n";
}
endpwent();
The same thing goes for error returns from syscalls that could validly return a false value:
die "Can't readlink $sym: $!"
    unless defined($value = readlink $sym);
You may also use defined to see whether a subroutine has been defined yet. This makes it possible to avoid blowing up on nonexistent subroutines (or subroutines that have been declared but never given a definition):
indir("funcname", @arglist);
sub indir {
    my $subname = shift;
    no strict 'refs';  # so we can use subname indirectly
    if (defined &$subname) {
        &$subname(@_);    # or $subname->(@_);
    }
    else {
        warn "Ignoring call to invalid function $subname";
    }
}
Use of defined on aggregates (hashes and arrays) is deprecated. (It used to report whether memory for that aggregate had ever been allocated.) Instead, use a simple Boolean test to see whether the array or hash has any elements:
if (@an_array) { print "has array elements\n" }
if (%a_hash)   { print "has hash members\n"   }
See also undef and exists.

29.2.24. delete    

delete EXPR
This function deletes an element (or a slice of elements) from the specified hash or array. (See unlink if you want to delete a file.) The deleted elements are returned in the order specified, though this behavior is not guaranteed for tied variables such as DBM files. After the delete operation, the exists function will return false on any deleted key or index. (In contrast, after the undef function, the exists function continues to return true, because the undef function only undefines the value of the element, but doesn't delete the element itself.)
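As a small illustration of the return value and of the difference between delete and undef, consider the following sketch. The %color hash and its contents are assumptions chosen for the example.

```perl
use strict;
use warnings;

my %color = (red => 1, green => 2, blue => 3);   # illustrative data

# delete returns the removed values, in the order the keys were given
my @gone = delete @color{qw(blue red)};          # (3, 1)

# undef on an element keeps the key; delete removes it
undef $color{green};
my $still_there  = exists $color{green} ? 1 : 0; # 1
delete $color{green};
my $after_delete = exists $color{green} ? 1 : 0; # 0
```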

Deleting from the %ENV hash modifies the environment. Deleting from a hash that is bound to a (writable) DBM file deletes the entry from that DBM file.

Historically, you could only delete from a hash, but with Perl version 5.6 you may also delete from an array. Deleting from an array causes the element at the specified position to revert to a completely uninitialized state, but it doesn't close up the gap, since that would change the positions of all the subsequent entries. Use a splice for that. (However, if you delete the final element in an array, the array size will shrink by one (or more, depending on the position of the next largest existing element (if any))).
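The following sketch contrasts the two behaviors on a throwaway array named @slot (an assumed name): delete leaves a hole and keeps the size, splice closes the gap, and deleting the final element shrinks the array.

```perl
use strict;
use warnings;

my @slot = qw(a b c d);

delete $slot[1];                              # leaves a hole at index 1
my $size_after_hole = @slot;                  # still 4
my $hole_exists = exists $slot[1] ? 1 : 0;    # 0

splice(@slot, 1, 1);                          # closes the gap: (a, c, d)
my $size_after_splice = @slot;                # 3

delete $slot[$#slot];                         # deleting the final element shrinks the array
my $size_after_tail = @slot;                  # 2
```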

EXPR can be arbitrarily complicated, provided that the final operation is a hash or array lookup:

# set up array of array of hash
$dungeon[$x][$y] = \%properties;

# delete one property from hash
delete $dungeon[$x][$y]{"OCCUPIED"};

# delete three properties all at once from hash
delete @{ $dungeon[$x][$y] }{ "OCCUPIED", "DAMP", "LIGHTED" };

# delete reference to %properties from array
delete $dungeon[$x][$y];
The following naïve example inefficiently deletes all the values of a %hash:
foreach $key (keys %hash) {
    delete $hash{$key};
}
And so does this:
delete @hash{keys %hash};
But both of these are slower than just assigning the empty list or undefining it:
%hash = ();         # completely empty %hash
undef %hash;        # forget %hash ever existed
Likewise for arrays:
foreach $index (0 .. $#array) {
    delete $array[$index];
}
and:
delete @array[0 .. $#array];
are less efficient than either of:
@array = ();         # completely empty @array
undef @array;        # forget @array ever existed

29.2.25. die    

die LIST
die
Outside an eval, this function prints the concatenated value of LIST to STDERR and exits with the current value of $! (the C-library errno variable). If $! is 0, it exits with the value of $? >> 8 (which is the status of the last reaped child from a system, wait, close on a pipe, or `command`). If $? >> 8 is 0, it exits with 255.

Within an eval, the function sets the $@ variable to the error message that would have otherwise been produced, then aborts the eval, which returns undef. The die function can thus be used to raise named exceptions that can be caught at a higher level in the program. See eval later in this chapter.

If LIST is a single object reference, that object is assumed to be an exception object and is returned unmodified as the exception in $@.
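Here is one way that might look in practice. The My::Error class, its accessors, and the risky subroutine are all hypothetical names invented for this sketch; any blessed reference would be passed through to $@ in the same way.

```perl
use strict;
use warnings;

# My::Error is a hypothetical exception class for illustration.
package My::Error;
sub new     { my ($class, %args) = @_; return bless { %args }, $class }
sub message { $_[0]{message} }
sub code    { $_[0]{code} }

package main;

sub risky {
    die My::Error->new(message => "disk full", code => 28);
}

my $result = eval { risky(); 1 };     # undef if risky() died
my $err    = $@;                      # the object itself, unmodified

if (!$result && ref($err) && $err->isa('My::Error')) {
    printf "caught %s (code %d)\n", $err->message, $err->code;
}
```

Because the object arrives unmodified, the handler can dispatch on its class or fields instead of pattern-matching an error string.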

If LIST is empty and $@ already contains a string value (typically from a previous eval) that value is reused after appending "\t...propagated". This is useful for propagating (reraising) exceptions:

eval { ... };
die unless $@ =~ /Expected exception/;
If LIST is empty and $@ already contains an exception object, the $@->PROPAGATE method is called to determine how the exception should be propagated.

If LIST is empty and $@ is empty, then the string "Died" is used.

If the final value of LIST does not end in a newline (and you're not passing an exception object), the current script filename, line number, and input line number (if any) are appended to the message, as well as a newline. Hint: sometimes appending ", stopped" to your message will cause it to make better sense when the string "at scriptname line 123" is appended. Suppose you are running script canasta; consider the difference between the following two ways of dying:

die "/usr/games is no good";
die "/usr/games is no good, stopped";
which produce, respectively:
/usr/games is no good at canasta line 123.
/usr/games is no good, stopped at canasta line 123.
If you want your own error messages reporting the filename and line number, use the __FILE__ and __LINE__ special tokens:
die '"', __FILE__, '", line ', __LINE__, ", phooey on you!\n";
This produces output like:
"canasta", line 38, phooey on you!
One other style issue--consider the following equivalent examples:
die "Can't cd to spool: $!\n"   unless chdir '/usr/spool/news';

chdir '/usr/spool/news'         or die "Can't cd to spool: $!\n";
Because the important part is the chdir, the second form is generally preferred.

See also exit, warn, %SIG, and the Carp module.

29.2.26. do (block)    

do BLOCK
The do BLOCK form executes the sequence of statements in the BLOCK and returns the value of the last expression evaluated in the block. When modified by a while or until statement modifier, Perl executes the BLOCK once before testing the loop condition. (On other statements the loop modifiers test the conditional first.) The do BLOCK itself does not count as a loop, so the loop control statements next, last, or redo cannot be used to leave or restart the block. See the section "Bare Blocks" in Chapter 4, "Statements and Declarations", for workarounds.
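A short sketch of both properties follows: the block's value is its last expression, and a while modifier on a do block runs the body once before testing. The variable names are assumptions for the example.

```perl
use strict;
use warnings;

# The value of a do BLOCK is its last evaluated expression.
my $max = do { my ($x, $y) = (3, 7); $x > $y ? $x : $y };   # 7

# With a while modifier the body runs once before the test,
# even though the condition is false from the start.
my $runs = 0;
do { $runs++ } while ($runs < 0);                            # $runs is 1

# An ordinary statement modifier tests first, so this never runs.
my $skipped = 0;
$skipped++ while ($skipped < 0);                             # $skipped stays 0
```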

29.2.27. do (file)    

do FILE
The do FILE form uses the value of FILE as a filename and executes the contents of the file as a Perl script. Its primary use is (or rather was) to include subroutines from a Perl subroutine library, so that:
do 'stat.pl';
is rather like:
scalar eval `cat stat.pl`;   # `type stat.pl` on Windows
except that do is more efficient, more concise, keeps track of the current filename for error messages, searches all the directories listed in the @INC array, and updates %INC if the file is found. (See Chapter 28, "Special Names".) It also differs in that code evaluated with do FILE cannot see lexicals in the enclosing scope, whereas code in eval EXPR can. It's the same, however, in that it reparses the file every time you call it--so you might not want to do this inside a loop unless the filename itself changes at each loop iteration.

If do can't read the file, it returns undef and sets $! to the error. If do can read the file but can't compile it, it returns undef and sets an error message in $@. If the file is successfully compiled, do returns the value of the last expression evaluated.

Inclusion of library modules (which have a mandatory .pm suffix) is better done with the use and require operators, which also do error checking and raise an exception if there's a problem. They also offer other benefits: they avoid duplicate loading, help with object-oriented programming, and provide hints to the compiler on function prototypes.

But do FILE is still useful for such things as reading program configuration files. Manual error checking can be done this way:

# read in config files: system first, then user
for $file ("/usr/share/proggie/defaults.rc",
           "$ENV{HOME}/.someprogrc")
{
    unless ($return = do $file) {
        warn "couldn't parse $file: $@" if $@;
        warn "couldn't do $file: $!"    unless defined $return;
        warn "couldn't run $file"       unless $return;
    }
}
A long-running daemon could periodically examine the timestamp on its configuration file, and if the file has changed since it was last read in, the daemon could use do to reload that file. This is more tidily accomplished with do than with require or use.
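One possible shape for such a reload check is sketched below. The helper name reload_if_changed and the %last_read cache are hypothetical, and the sketch assumes the configuration file is given by an absolute path (so that do does not go searching @INC) and that it evaluates to a defined value such as a hash reference.

```perl
use strict;
use warnings;

my %last_read;    # filename => mtime when we last loaded it

# reload_if_changed is a hypothetical helper; it returns the config
# file's value if it was (re)loaded, or an empty list if nothing changed.
sub reload_if_changed {
    my ($file) = @_;
    my $mtime = (stat $file)[9];
    die "can't stat $file: $!" unless defined $mtime;
    return if defined $last_read{$file} && $last_read{$file} == $mtime;

    my $config = do $file;
    die "couldn't parse $file: $@" if $@;
    die "couldn't do $file: $!"    unless defined $config;
    $last_read{$file} = $mtime;
    return $config;
}
```

A daemon's main loop could call reload_if_changed($path) on each pass and replace its settings only when something comes back.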

29.2.28. do (subroutine)    

do SUBROUTINE(LIST)
The do SUBROUTINE(LIST) form is a deprecated way to call a subroutine. An exception is raised if the SUBROUTINE is undefined. See Chapter 6, "Subroutines".

29.2.29. dump    

dump LABEL
dump
This function causes an immediate core dump. Primarily this is so that you can use the undump program (not supplied) to turn your core dump into an executable binary after having initialized all your variables at the beginning of the program. When the new binary is executed it will begin by executing a goto LABEL (with all the restrictions that goto suffers). Think of it as a goto with an intervening core dump and reincarnation. If LABEL is omitted, the program is restarted from the top. Warning: any files opened at the time of the dump will not be open any more when the program is reincarnated, with possible resulting confusion on the part of Perl. See also the -u command-line option in Chapter 19, "The Command-Line Interface".

This function is now largely obsolete, partly because it's difficult in the extreme to convert a core file into an executable in the general case, and because various compiler backends for generating portable bytecode and compilable C code have superseded it.

If you're looking to use dump to speed up your program, check out the discussion of efficiency matters in Chapter 24, "Common Practices", as well as the Perl native-code generator in Chapter 18, "Compiling". You might also consider autoloading or selfloading, which at least make your program appear to run faster.

29.2.30. each    

each HASH
This function steps through a hash one key/value pair at a time. When called in list context, each returns a two-element list consisting of the key and value for the next element of a hash, so that you can iterate over it. When called in scalar context, each returns just the key for the next element in the hash. When the hash is entirely read, the empty list is returned, which when assigned produces a false value in scalar context, such as a loop test. The next call to each after that will start iterating again. The typical use is as follows, using the predefined %ENV hash:
while (($key,$value) = each %ENV) {
    print "$key=$value\n";
}
Internally, a hash maintains its own entries in an apparently random order. The each function iterates through this sequence because every hash remembers which entry was last returned. The actual ordering of this sequence is subject to change in future versions of Perl, but is guaranteed to be in the same order as the keys (or values) function would produce on the same (unmodified) hash.

There is a single iterator for each hash, shared by all each, keys, and values function calls in the program; it can be reset by reading all the elements from the hash, or by evaluating keys %hash or values %hash. If you add or delete elements of a hash while you're iterating over it, the resulting behavior is not well-defined: entries might get skipped or duplicated.
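The shared iterator is easiest to see in a small experiment. In this sketch (with an assumed hash %h), a partial traversal is abandoned, keys is called in void context to reset the iterator, and the next each starts over from the first entry.

```perl
use strict;
use warnings;

my %h = (a => 1, b => 2, c => 3);

my ($first) = each %h;      # advance the iterator by one entry
keys %h;                    # void-context keys resets the iterator
my ($again) = each %h;      # starts over, so same key as $first

keys %h;                    # reset again before a full traversal
my @each_order;
while (my ($k, $v) = each %h) {
    push @each_order, $k;
}
my @keys_order = keys %h;   # same order as each produced
```

Resetting with keys before a loop is a cheap safeguard when some earlier code may have left the iterator partway through the hash.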

See also keys, values, and sort.

29.2.31. eof    

eof FILEHANDLE
eof()
eof
This function returns true if the next read on FILEHANDLE would return end-of-file, or if FILEHANDLE is not open. FILEHANDLE may be an expression whose value gives the real filehandle, or a reference to a filehandle object of some sort. An eof without an argument returns the end-of-file status for the last file read. An eof() with empty parentheses () tests the ARGV filehandle (most commonly seen as the null filehandle in <>). Therefore, inside a while (<>) loop, an eof() with parentheses will detect the end of only the last of a group of files. Use eof (without the parentheses) to test each file in a while (<>) loop. For example, the following code inserts dashes just before the last line of the last file:
while (<>) {
    if (eof()) {
        print "-" x 30, "\n";
    }
    print;
}
On the other hand, this script resets line numbering on each input file:
# reset line numbering on each input file
while (<>) {
    next if /^\s*#/;        # skip comments
    print "$.\t$_";
} continue {
    close ARGV if eof;      # Not eof()!
}
Like "$" in a sed program, eof tends to show up in line number ranges. Here's a script that prints lines from /pattern/ to end of each input file:
while (<>) {
    print if /pattern/ .. eof;
}
Here, the flip-flop operator (..) evaluates the pattern match for each line. Until the pattern matches, the operator returns false. When it finally matches, the operator starts returning true, causing the lines to be printed. When the eof operator finally returns true (at the end of the file being examined), the flip-flop operator resets, and starts returning false again for the next file in @ARGV.

Warning: The eof function reads a byte and then pushes it back on the input stream with ungetc(3), so it is not useful in an interactive context. In fact, experienced Perl programmers rarely use eof, since the various input operators already behave politely in while-loop conditionals. See the example in the description of foreach in Chapter 4, "Statements and Declarations".

29.2.32. eval    

eval BLOCK
eval EXPR
eval
The eval keyword serves two distinct but related purposes in Perl. These purposes are represented by two forms of syntax, eval BLOCK and eval EXPR. The first form traps run-time exceptions (errors) that would otherwise prove fatal, similar to the "try block" construct in C++ or Java. The second form compiles and executes little bits of code on the fly at run time, and also (conveniently) traps any exceptions just like the first form. But the second form runs much slower than the first form, since it must parse the string every time. On the other hand, it is also more general. Whichever form you use, eval is the preferred way to do all exception handling in Perl.

For either form of eval, the value returned from an eval is the value of the last expression evaluated, just as with subroutines. Similarly, you may use the return operator to return a value from the middle of the eval. The expression providing the return value is evaluated in void, scalar, or list context, depending on the context of the eval itself. See wantarray for more on how the evaluation context can be determined.
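A brief sketch of these return-value rules, using made-up variable names: an explicit return leaves the eval early, and the eval's own context decides whether the final expression yields a scalar or a list.

```perl
use strict;
use warnings;

my $n = 5;

# An explicit return supplies the eval's value, just as in a subroutine.
my $size  = eval { return "big" if $n > 3; "small" };    # "big"

# List context propagates to the last expression of the block.
my @list  = eval { (1, 2, 3) };                          # three elements

# In scalar context the same kind of expression yields a single value.
my $count = eval { my @items = (1, 2, 3); scalar @items };   # 3
```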

If there is a trappable error (including any produced by the die operator), eval returns undef and puts the error message (or object) in $@. If there is no error, $@ is guaranteed to be set to the null string, so you can test it reliably afterward for errors. A simple Boolean test suffices:

eval { ... };    # trap run-time errors
if ($@) { ... }    # handle error
The eval BLOCK form is syntax-checked at compile time, so it is quite efficient. (People familiar with the slow eval EXPR form are occasionally confused on this issue.) Since the code in the BLOCK is compiled at the same time as the surrounding code, this form of eval cannot trap syntax errors.

The eval EXPR form can trap syntax errors because it parses the code at run time. (If the parse is unsuccessful, it places the parse error in $@, as usual.) Otherwise, it executes the value of EXPR as though it were a little Perl program. The code is executed in the context of the current Perl program, which means that it can see any enclosing lexicals from a surrounding scope, and that any non-local variable settings remain in effect after the eval is complete, as do any subroutine or format definitions. The code of the eval is treated as a block, so any locally scoped variables declared within the eval last only until the eval is done. (See my and local.) As with any code in a block, a final semicolon is not required.
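The scoping rules can be checked with a few lines. In this sketch the variables $x and $tmp are assumptions for illustration; it relies on use strict being in effect so that a reference to the vanished $tmp fails to compile.

```perl
use strict;
use warnings;

my $x = 10;

# The string is compiled at run time but still sees the lexical $x.
my $double = eval '$x * 2';                 # 20

# Assignments to outer variables persist; $tmp lives only inside the eval.
eval 'my $tmp = 5; $x += $tmp';             # $x is now 15

# Under strict, $tmp no longer exists, so this fails to compile at run time.
my $leak = eval '$tmp';
my $compile_error = $@;                     # mentions $tmp
```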

Here is a simple Perl shell. It prompts the user to enter a string of arbitrary Perl code, compiles and executes that string, and prints whatever error occurred:

print "\nEnter some Perl code: ";

while (<STDIN>) {
    eval;
    print $@;
    print "\nEnter some more Perl code: ";
}
Here is a rename program to do a mass renaming of files using a Perl expression:
#!/usr/bin/perl
# rename - change filenames
$op = shift;
for (@ARGV) {
    $was = $_;
    eval $op;
    die if $@;
    # next line calls the built-in function, not the script by the same name
    rename($was,$_) unless $was eq $_;
}
You'd use that program like this:
$ rename 's/\.orig$//'                *.orig
$ rename 'y/A-Z/a-z/ unless /^Make/'  *
$ rename '$_ .= ".bad"'               *.f
Since eval traps errors that would otherwise prove fatal, it is useful for determining whether particular features (such as fork or symlink) are implemented.
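A feature probe along those lines might look like the following sketch. It assumes that calling symlink with two empty strings fails harmlessly on systems that implement it, and raises an exception on systems that do not; the variable name is made up.

```perl
use strict;
use warnings;

# symlink("", "") is expected to fail harmlessly where symlink exists and
# to raise an exception where it is unimplemented; the eval traps the latter.
my $symlink_exists = eval { symlink("", ""); 1 } ? 1 : 0;
print "symlink is ", ($symlink_exists ? "" : "not "), "implemented here\n";
```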

Because eval BLOCK is syntax-checked at compile time, any syntax error is reported earlier. Therefore, if your code is invariant and both eval EXPR and eval BLOCK will suit your purposes equally well, the BLOCK form is preferred. For example:

# make divide-by-zero nonfatal
eval { $answer = $a / $b; };    warn $@ if $@;

# same thing, but less efficient if run multiple times
eval '$answer = $a / $b';       warn $@ if $@;

# a compile-time syntax error (not trapped)
eval { $answer = };             # WRONG

# a run-time syntax error
eval '$answer =';               # sets $@
Here, the code in the BLOCK has to be valid Perl code to make it past the compile phase. The code in the EXPR doesn't get examined until run time, so it doesn't cause an error until run time.

The block of eval BLOCK does not count as a loop, so the loop control statements next, last, or redo cannot be used to leave or restart the block.

29.2.33. exec    

exec PATHNAME LIST
exec LIST
The exec function terminates the current program and executes an external command and never returns!!! Use system instead of exec if you want to recover control after the command exits. The exec function fails and returns false only if the command does not exist and it is executed directly instead of via your system's command shell (discussed below).

If there is only one scalar argument, the argument is checked for shell metacharacters. If metacharacters are found, the entire argument is passed to the system's standard command interpreter (/bin/sh under Unix). If there are no metacharacters, the argument is split into words and executed directly, since in the interests of efficiency this bypasses all the overhead of shell processing. It also gives you more control of error recovery should the program not exist.

If there is more than one argument in LIST, or if LIST is an array with more than one value, the system shell will never be used. This also bypasses any shell processing of the command. The presence or absence of metacharacters in the arguments doesn't affect this list-triggered behavior, which makes it the preferred form in security-conscious programs that do not wish to expose themselves to potential shell escapes.

This example causes the currently running Perl program to replace itself with the echo program, which then prints out the current argument list:

exec 'echo', 'Your arguments are: ', @ARGV;
This example shows that you can exec a pipeline, not just a single program.
exec "sort $outfile | uniq"
    or die "Can't do sort/uniq: $!\n";
Ordinarily, exec never returns--if it does return, it always returns false, and you should check $! to find out what went wrong. Be aware that in older releases of Perl, exec (and system) did not flush your output buffer, so you needed to enable command buffering by setting $| on one or more filehandles to avoid lost output in the case of exec, or misordered output in the case of system. This situation was largely remedied in the 5.6 release of Perl.

When you ask the operating system to execute a new program within an existing process (as Perl's exec function does), you tell the system the location of the program to execute, but you also tell the new program (through its first argument) the name under which the program was invoked. Customarily, the name you tell it is just a copy of the location of the program, but it doesn't necessarily have to be, since there are two separate arguments at the level of the C language. When it is not a copy, you have the odd result that the new program thinks it's running under a name that may be totally different from the actual pathname where the program resides. Often this doesn't matter to the program in question, but some programs do care and adopt a different persona depending on what they think their name is. For example, the vi editor looks to see whether it was called as "vi" or as "view". If invoked as "view", it automatically enables read-only mode, just as though it was called with the -R command-line option.

This is where exec's optional PATHNAME parameter comes into play. Syntactically, it goes in the indirect-object slot like the filehandle for print or printf. Therefore, it doesn't take a comma after it, because it's not exactly part of the argument list. (In a sense, Perl takes the opposite approach from the operating system in that it assumes the first argument is the important one, and lets you modify the pathname if it differs.) For example:

$editor = "/usr/bin/vi";
exec $editor "view", @files      # trigger read-only mode
    or die "Couldn't execute $editor: $!\n";
As with any other indirect object, you can also replace the simple scalar holding the program name with a block containing arbitrary code, which simplifies the previous example to:
exec { "/usr/bin/vi" } "view", @files      # trigger read-only mode
    or die "Couldn't execute /usr/bin/vi: $!\n";
As we mentioned earlier, exec treats a discrete list of arguments as an indication that it should bypass shell processing. However, there is one place where you might still get tripped up. The exec call (and system, too) will not distinguish between a single scalar argument and an array containing only one element.
@args = ("echo surprise");  # just one element in list
exec @args                  # still subject to shell escapes
     or die "exec: $!";     #   because @args == 1
To avoid this, you can use the PATHNAME syntax, explicitly duplicating the first argument as the pathname, which forces the rest of the arguments to be interpreted as a list, even if there is only one of them:
exec { $args[0] } @args   # safe even with one-argument list
    or die "can't exec @args: $!";
The first version, the one without the curlies, runs the echo program, passing it "surprise" as an argument. The second version doesn't--it tries to run a program literally called echo surprise, doesn't find it (we hope), and sets $! to a nonzero value indicating failure.

Because the exec function is most often used shortly after a fork, it is assumed that anything that normally happens when a Perl process terminates should be skipped. Upon an exec, Perl will not call your END blocks, nor will it call any DESTROY methods associated with any objects. Otherwise, your child process would end up doing the cleanup you expected the parent process to do. (We wish that were the case in real life.)
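For reference, the usual fork-then-exec arrangement can be sketched as follows. To avoid depending on any particular external program, the child here runs the current perl binary ($^X) with a one-line script that exits with status 3; that command is an assumption of the sketch, and it presumes a platform where fork is available.

```perl
use strict;
use warnings;

# Run this same perl binary ($^X) as the child so the sketch does not
# depend on any particular external program being installed.
my @cmd = ($^X, '-e', 'exit 3');

my $pid = fork();
die "can't fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: the block form bypasses the shell even for one-element lists.
    exec { $cmd[0] } @cmd
        or die "can't exec @cmd: $!";
}

# Parent: wait for the child and decode its exit status from $?.
waitpid($pid, 0);
my $exit_status = $? >> 8;      # 3 if the exec succeeded
print "child exited with status $exit_status\n";
```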

Because it's such a common mistake to use exec instead of system, Perl warns you if there is a following statement that isn't die, warn, or exit when run with the popular -w command-line option, or if you've used the use warnings qw(exec syntax) pragma. If you really want to follow an exec with some other statement, you can use either of these styles to avoid the warning:

exec ('foo')   or print STDERR "couldn't exec foo: $!";
{ exec ('foo') }; print STDERR "couldn't exec foo: $!";
As the second line above shows, a call to exec that is the last statement in a block is exempt from this warning.

See also system.

29.2.34. exists    

exists EXPR
This function returns true if the specified hash key or array index exists in its hash or array. It doesn't matter whether the corresponding value is true or false, or whether the value is even defined.
print "True\n"      if         $hash{$key};
print "Defined\n"   if defined $hash{$key};
print "Exists\n"    if exists  $hash{$key};

print "True\n"      if         $array[$index];
print "Defined\n"   if defined $array[$index];
print "Exists\n"    if exists  $array[$index];
An element can be true only if it's defined, and can be defined only if it exists, but the reverse doesn't necessarily hold.

EXPR can be arbitrarily complicated, provided that the final operation is a hash key or array index lookup:

if (exists $hash{A}{B}{$key}) { ... }
Although the last element will not spring into existence just because its existence was tested, intervening ones will. Thus $hash{"A"} and $hash{"A"}->{"B"} will both spring into existence. This is not a function of exists, per se; it happens anywhere the arrow operator is used (explicitly or implicitly):
undef $ref;
if (exists $ref->{"Some key"}) { }
print $ref;   # prints HASH(0x80d3d5c)
Even though the "Some key" element didn't spring into existence, the previously undefined $ref variable did suddenly come to hold an anonymous hash. This is a surprising instance of autovivification in what does not at first--or even second--glance appear to be an lvalue context. This behavior is likely to be fixed in a future release. As a workaround, you can nest your calls:
if ($ref                        and
    exists $ref->[$x]           and
    exists $ref->[$x][$y]       and
    exists $ref->[$x][$y]{$key} and
    exists $ref->[$x][$y]{$key}[2] ) { ... }
If EXPR is the name of a subroutine, the exists function will return true if that subroutine has been declared, even if it has not yet been defined. The following will just print "Exists":
sub flub;
print "Exists\n"     if exists  &flub;
print "Defined\n"    if defined &flub;
Using exists on a subroutine name can be useful for an AUTOLOAD subroutine that needs to know whether a particular package wants a particular subroutine to be defined. The package can indicate this by declaring a stub sub like flub.
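A minimal sketch of that arrangement follows. The package name Lazy, the stub flub, and the generated subroutine body are all hypothetical; the point is only that AUTOLOAD consults exists to decide whether the package asked for the subroutine.

```perl
use strict;
use warnings;

package Lazy;

sub flub;                        # declared stub: "please generate me"

our $AUTOLOAD;
sub AUTOLOAD {
    my $name = $AUTOLOAD;        # fully qualified, e.g. "Lazy::flub"
    return if $name =~ /::DESTROY$/;

    no strict 'refs';
    die "No such subroutine $name\n" unless exists &$name;

    # Hypothetical generated body; a real package would build something useful.
    *$name = sub { "generated $name" };
    goto &$name;
}

package main;

my $made    = Lazy::flub();                          # "generated Lazy::flub"
my $refused = eval { Lazy::nope(); 1 } ? 0 : 1;      # 1: no stub was declared
```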

29.2.35. exit    

exit EXPR
exit
This function evaluates EXPR as an integer and exits immediately with that value as the final error status of the program. If EXPR is omitted, the function exits with 0 status (meaning "no error"). Here's a fragment that lets a user exit the program by typing x or X:
$ans = <STDIN>;
exit if $ans =~ /^[Xx]/;
You shouldn't use exit to abort a subroutine if there's any chance that someone might want to trap whatever error happened. Use die instead, which can be trapped by an eval. Or use one of die's wrappers from the Carp module, like croak or confess.

We said that the exit function exits immediately, but that was a bald-faced lie. It exits as soon as possible, but first it calls any defined END routines for at-exit handling. These routines cannot abort the exit, although they can change the eventual exit value by setting the $? variable. Likewise, any class that defines a DESTROY method will invoke that method on behalf of all its objects before the real program exits. If you really need to bypass exit processing, you can call the POSIX module's _exit function to avoid all END and destructor processing. And if POSIX isn't available, you can exec "/bin/false" or some such.
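The effect of an END routine on the exit value can be observed from a parent process. This sketch runs a one-line child program under the current perl binary ($^X); the child asks for exit status 0, but its END block overrides that by assigning to $?. The child program text is an assumption made up for the demonstration.

```perl
use strict;
use warnings;

# Child program text: an END block that overrides a successful exit.
my $child = 'END { $? = 7 } exit 0;';

system($^X, '-e', $child);          # run it under the current perl binary
my $status = $? >> 8;               # 7, not 0
print "child's final exit status: $status\n";
```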

29.2.36. exp    

exp EXPR
exp
This function returns e to the power of EXPR. To get the value of e, just use exp(1). For general exponentiation of different bases, use the ** operator we stole from FORTRAN:
use Math::Complex;
print -exp(1) ** (i * pi);   # prints 1
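For ordinary real numbers, the relationship between exp, log, and ** can be seen in a plainer sketch like this one (the variable names are assumptions):

```perl
use strict;
use warnings;

my $e      = exp(1);                   # Euler's number, about 2.71828
my $direct = 2 ** 10;                  # 1024
my $via_e  = exp(10 * log(2));         # also about 1024, up to rounding

printf "e = %.5f, 2**10 = %d, exp(10*log 2) = %.3f\n", $e, $direct, $via_e;
```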

29.2.37. fcntl    

fcntl FILEHANDLE, FUNCTION, SCALAR
This function calls your operating system's file control functions, as documented in the fcntl(2) manpage. Before you call fcntl, you'll probably first have to say:
use Fcntl;
to load the correct constant definitions.

SCALAR will be read or written (or both) depending on the FUNCTION. A pointer to the string value of SCALAR will be passed as the third argument of the actual fcntl call. (If SCALAR has no string value but does have a numeric value, that value will be passed directly rather than passing a pointer to the string value.) See the Fcntl module for a description of the more common permissible values for FUNCTION.

The fcntl function will raise an exception if used on a system that doesn't implement fcntl(2). On systems that do implement it, you can do such things as modify the close-on-exec flags (if you don't want to play with the $^F ($SYSTEM_FD_MAX) variable), modify the nonblocking I/O flags, emulate the lockf(3) function, and arrange to receive the SIGIO signal when I/O is pending.
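As one illustration of the close-on-exec case, the sketch below clears the FD_CLOEXEC bit on a freshly opened handle so that the descriptor would survive an exec. It assumes a system that implements fcntl(2) and opens the null device merely to have some descriptor to work on; the variable names are made up.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);
use File::Spec;

open(my $fh, '<', File::Spec->devnull)
    or die "Can't open null device: $!\n";

my $flags = fcntl($fh, F_GETFD, 0)
    or die "Can't get descriptor flags: $!\n";

# Clear the close-on-exec bit so the descriptor survives an exec.
fcntl($fh, F_SETFD, $flags & ~FD_CLOEXEC)
    or die "Can't set descriptor flags: $!\n";

my $now      = fcntl($fh, F_GETFD, 0);
my $inherits = ($now & FD_CLOEXEC) ? 0 : 1;     # 1 once the bit is clear
```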

Here's an example of setting a filehandle named REMOTE to be nonblocking at the system level. This makes any input operation return immediately if nothing is available when reading from a pipe, socket, or serial line that would otherwise block. It also works to cause output operations that normally would block to return a failure status instead. (For those, you'll likely have to negotiate $| as well.)

use Fcntl qw(F_GETFL F_SETFL O_NONBLOCK);

$flags = fcntl(REMOTE, F_GETFL, 0)
            or die "Can't get flags for the socket: $!\n";

$flags = fcntl(REMOTE, F_SETFL, $flags | O_NONBLOCK)
            or die "Can't set flags for the socket: $!\n";
The return value of fcntl (and ioctl) is as follows:

Syscall Returns    Perl Returns
-1                 undef
0                  String "0 but true"
anything else      That number

Thus Perl returns true on success and false on failure, yet you can still easily determine the actual value returned by the operating system:

$retval = fcntl(...) || -1;
printf "fcntl actually returned %d\n", $retval;
Here, even the string "0 but true" prints as 0, thanks to the %d format. This string is true in Boolean context and 0 in numeric context. (It is also happily exempt from the normal warnings on improper numeric conversions.)
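The dual nature of that string is easy to verify in isolation. This sketch uses a literal "0 but true" in place of a real fcntl return value, which is an assumption made so the example runs anywhere.

```perl
use strict;
use warnings;

my $retval = "0 but true";        # what Perl hands back for a syscall result of 0

my $as_bool   = $retval ? "true" : "false";   # "true"
my $as_number = $retval + 0;                  # 0, and no "isn't numeric" warning

# A genuine failure is undef, so defined-ness separates the two cases.
my $failed = undef;
my $report = defined $failed ? "ok" : "failed";   # "failed"
```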

29.2.38. fileno    

fileno FILEHANDLE
This function returns the file descriptor underlying a filehandle. If the filehandle is not open, fileno returns undef. A file descriptor is a small, non-negative integer like 0 or 1, in contrast to filehandles like STDIN and STDOUT, which are symbols. Unfortunately, the operating system doesn't know about your cool symbols. It only thinks of open files in terms of these small file numbers, and although Perl will usually do the translations for you automatically, occasionally you have to know the actual file descriptor.

So, for example, the fileno function is useful for constructing bitmaps for select and for passing to certain obscure system calls if syscall(2) is implemented. It's also useful for double-checking that the open function gave you the file descriptor you wanted and for determining whether two filehandles use the same system file descriptor.

if (fileno(THIS) == fileno(THAT)) {
    print "THIS and THAT are dups\n";
}
If FILEHANDLE is an expression, the value is taken as an indirect filehandle, generally its name or a reference to something resembling a filehandle object.
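To see when two filehandles do and do not share a descriptor, compare the two duplication modes of open. In this sketch, which opens the null device only to have something to duplicate, the "<&=" mode attaches a second Perl filehandle to the same descriptor, while "<&" obtains a new one. The handle names are assumptions.

```perl
use strict;
use warnings;
use File::Spec;

open(my $orig, '<', File::Spec->devnull)
    or die "Can't open null device: $!\n";

# "<&=" makes a new Perl filehandle on the SAME descriptor ...
open(my $alias, '<&=', fileno($orig))
    or die "Can't alias descriptor: $!\n";

# ... whereas "<&" asks the system for a duplicate, i.e. a NEW descriptor.
open(my $dup, '<&', $orig)
    or die "Can't dup descriptor: $!\n";

my $same_as_alias = fileno($orig) == fileno($alias) ? 1 : 0;   # 1
my $same_as_dup   = fileno($orig) == fileno($dup)   ? 1 : 0;   # 0
```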

One caution: don't count on the association of a Perl filehandle and a numeric file descriptor throughout the life of the program. If a file has been closed and reopened, the file descriptor may change. Perl takes a bit of trouble to try to ensure that certain file descriptors won't be lost if an open on them fails, but it only does this for file descriptors that don't exceed the current value of the special $^F ($SYSTEM_FD_MAX) variable (by default, 2). Although filehandles STDIN, STDOUT, and STDERR start out with file descriptors of 0, 1, and 2 (the Unix standard convention), even they can change if you start closing and opening them with wild abandon. You can't get into trouble with 0, 1, and 2 as long as you always reopen immediately after closing. The basic rule on Unix systems is to pick the lowest available descriptor, and that'll be the one you just closed.

29.2.39. flock    

flock FILEHANDLE, OPERATION
The flock function is Perl's portable file-locking interface, although it locks only entire files, not records. The function manages locks on the file associated with FILEHANDLE, returning true for success and false otherwise. To avoid the possibility of lost data, Perl flushes your FILEHANDLE before locking or unlocking it. Perl might implement its flock in terms of flock(2), fcntl(2), lockf(3), or some other platform-specific lock mechanism, but if none of these is available, calling flock raises an exception. See the section "File Locking" in Chapter 16, "Interprocess Communication".

OPERATION is one of LOCK_SH, LOCK_EX, or LOCK_UN, possibly ORed with LOCK_NB. These constants are traditionally valued 1, 2, 8, and 4, but you can use the symbolic names if you import them from the Fcntl module, either individually or as a group using the :flock tag.

LOCK_SH requests a shared lock, so it's typically used for reading. LOCK_EX requests an exclusive lock, so it's typically used for writing. LOCK_UN releases a previously requested lock; closing the file also releases any locks. If the LOCK_NB bit is used with LOCK_SH or LOCK_EX, flock returns immediately rather than waiting for an unavailable lock. Check the return status to see whether you got the lock you asked for. If you don't use LOCK_NB, you might wait indefinitely for the lock to be granted.
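
A minimal sketch of a nonblocking lock request might look like the following. The helper name try_exclusive and the scratch file from File::Temp are hypothetical, chosen only for illustration; a real program would lock whatever file it actually shares with other processes.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# Hypothetical helper: ask for an exclusive lock without blocking.
# Returns true if the lock was granted and false otherwise, typically
# because another process holds a conflicting lock.
sub try_exclusive {
    my ($handle) = @_;
    return flock($handle, LOCK_EX | LOCK_NB);
}

my ($fh, $path) = tempfile(UNLINK => 1);

if (try_exclusive($fh)) {
    print $fh "exclusive access while we hold the lock\n";
    flock($fh, LOCK_UN) or die "can't unlock $path: $!";
}
else {
    warn "$path is busy, try again later: $!\n";
}
```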

Another nonobvious but traditional aspect of flock is that its locks are merely advisory. Discretionary locks are more flexible but offer fewer guarantees than mandatory ones. This means that files locked with flock may be modified by programs that do not also use flock. Cars that stop for red lights get on well with each other, but not with cars that don't stop for red lights. Drive defensively.

Some implementations of flock cannot lock things over the network. While you could in theory use the more system-specific fcntl for that, the jury (having sequestered itself on the case for a decade or so) is still out on whether this is (or even can be) reliable.

Here's a mailbox appender for Unix systems that use flock(2) to lock mailboxes:

use Fcntl qw/:flock/;       # import LOCK_* constants
sub mylock {
    flock(MBOX, LOCK_EX)
        or die "can't lock mailbox: $!";
    # in case someone appended while we were waiting
    # and our stdio buffer is out of sync
    seek(MBOX, 0, 2)
        or die "can't seek to the end of mailbox: $!";
}

open(MBOX, ">>/usr/spool/mail/$ENV{'USER'}")
    or die "can't open mailbox: $!";

mylock();
print MBOX $msg, "\n\n";
close MBOX
    or die "can't close mailbox: $!";
On systems that support a real flock(2) syscall, locks are inherited across fork calls. Other implementations are not so lucky, and are likely to lose the locks across forks. See also the DB_File module in Chapter 32, "Standard Modules" for other flock examples.

29.2.40. fork    

fork
This function creates two processes out of one by invoking the fork(2) syscall. If it succeeds, the function returns the new child process's ID to the parent process and 0 to the child process. If the system doesn't have sufficient resources to allocate a new process, the call fails and returns undef. File descriptors (and sometimes locks on those descriptors) are shared, while everything else is copied--or at least made to look that way.

In versions of Perl prior to 5.6, unflushed buffers remain unflushed in both processes, which means you may need to set $| on one or more filehandles earlier in the program to avoid duplicate output.

A nearly bulletproof way to launch a child process while checking for "cannot fork" errors would be:

use Errno qw(EAGAIN);
FORK: {
    if ($pid = fork) {
        # parent here
        # child process pid is available in $pid
    }
    elsif (defined $pid) { # $pid is zero here if defined
        # child here
        # parent process pid is available with getppid
    }
    elsif ($! == EAGAIN) {
        # EAGAIN is the supposedly recoverable fork error
        sleep 5;
        redo FORK;
    }
    else {
        # weird fork error
        die "Can't fork: $!\n";
    }
}
These precautions are not necessary on operations that do an implicit fork(2), such as system, backticks, or opening a process as a filehandle, because Perl automatically retries a fork on a temporary failure when it's doing the fork for you. Be careful to end the child code with an exit, or else your child will inadvertently leave the conditional block and start executing code intended only for the parent process.

If you fork without ever waiting on your children, you will accumulate zombies (exited processes whose parents haven't waited on them yet). On some systems, you can avoid this by setting $SIG{CHLD} to "IGNORE"; on most, you must wait for your moribund children. See the wait function for examples of doing this, or see the "Signals" section of Chapter 16, "Interprocess Communication" for more on SIGCHLD.
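
As a small sketch of waiting on a child, the following forks once, has the child exit with a status chosen arbitrarily for the example, and reaps it with waitpid so that no zombie is left behind. The variable names are assumptions, not taken from the text.

```perl
use strict;
use warnings;

my $pid = fork;
die "can't fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: do the work, then exit so we never fall into parent code.
    exit 3;
}

# Parent: wait for this particular child and pick up its exit status.
my $reaped = waitpid($pid, 0);
my $status = $? >> 8;       # the high byte of $? holds the exit value

print "child $reaped exited with status $status\n";
```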

If a forked child inherits system file descriptors like STDIN and STDOUT that are connected to a remote pipe or socket, you may have to reopen these in the child to /dev/null. That's because even when the parent process exits, the child will live on with its copies of those filehandles. The remote server (such as, say, a CGI script or a background job launched from a remote shell) will appear to hang because it's still waiting for all copies to be closed. Reopening the system filehandles to something else fixes this.
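
One way this reopening might be done is sketched below, assuming a Unix-like system where /dev/null exists. The helper name detach_std_handles is hypothetical. The parent waits here only so that the sketch leaves no zombie; a real parent might simply exit.

```perl
use strict;
use warnings;

# Hypothetical helper: point the three standard handles at /dev/null
# so the child holds no copy of the parent's pipe or socket.
sub detach_std_handles {
    open(STDIN,  "<",  "/dev/null") or die "can't reopen STDIN: $!";
    open(STDOUT, ">",  "/dev/null") or die "can't reopen STDOUT: $!";
    open(STDERR, ">&", \*STDOUT)    or die "can't dup STDOUT to STDERR: $!";
}

my $pid = fork;
die "can't fork: $!" unless defined $pid;

if ($pid == 0) {
    detach_std_handles();
    # ... long-running background work would go here ...
    exit 0;
}

# Parent: the child no longer holds our standard handles open.
waitpid($pid, 0);
```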

On most systems supporting fork(2), great care has gone into making it extremely efficient (for example, using copy-on-write technology on data pages), making it the dominant paradigm for multitasking over the last few decades. The fork function is unlikely to be implemented efficiently, or perhaps at all, on systems that don't resemble Unix. For example, Perl 5.6 emulates a proper fork even on Microsoft systems, but no assurances can be made on performance at this point. You might have more luck there with the Win32::Process module.

29.2.41. format    

format NAME =
    picture line
    value list
    ...
.
This function declares a named sequence of picture lines (with associated values) for use by the write function. If NAME is omitted, the name defaults to STDOUT, which happens to be the default format name for the STDOUT filehandle. Since, like a sub declaration, this is a package-global declaration that happens at compile time, any variables used in the value list need to be visible at the point of the format's declaration. That is, lexically scoped variables must be declared earlier in the file, while dynamically scoped variables merely need to be set at the time write is called. Here's an example (which assumes we've already calculated $cost and $quantity):
my $str = "widget";               # Lexically scoped variable.

format Nice_Output =
Test: @<<<<<<<< @||||| @>>>>>
      $str,     $%,    '$' . int($num)
.

local $~ = "Nice_Output";         # Select our format.
local $num = $cost * $quantity;   # Dynamically scoped variable.

write;
Like filehandles, format names are identifiers that exist in a symbol table (package) and may be fully qualified by package name. Within the typeglobs of a symbol table's entries, formats reside in their own namespace, which is distinct from filehandles, directory handles, scalars, arrays, hashes, and subroutines. Like those other six types, however, a format named Whatever would also be affected by a local on the *Whatever typeglob. In other words, a format is just another gadget contained in a typeglob, independent of the other gadgets.

The "Format Variables" section in Chapter 7, "Formats" contains numerous details and examples of their use. Chapter 28, "Special Names" describes the internal format-specific variables, and the English and IO::Handle modules provide easier access to them.

29.2.42. formline    

formline PICTURE, LIST
This is an internal function used by formats, although you may also call it yourself. It always returns true. It formats a list of values according to the contents of PICTURE, placing the output into the format output accumulator, $^A (or $ACCUMULATOR if you use the English module). Eventually, when a write is done, the contents of $^A are written to some filehandle, but you could also read $^A yourself and then set $^A back to "". A format typically does one formline per line of form, but the formline function itself doesn't care how many newlines are embedded in the PICTURE. This means that the ~ and ~~ tokens will treat the entire PICTURE as a single line. You may therefore need to use multiple formlines to implement a single record-format, just as the format compiler does internally.

Be careful if you put double quotes around the picture, since an @ character may be taken to mean the beginning of an array name. See Chapter 7, "Formats" for example uses.
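
As a hedged sketch of calling formline directly, the following formats one line into the accumulator, copies it out, and resets $^A. The picture and the values are invented for the example; the picture is single-quoted so that the @ fields are not mistaken for arrays.

```perl
use strict;
use warnings;

$^A = "";                               # start with an empty accumulator

# One left-justified field of 8 columns and one right-justified field of 6.
my $picture = '@<<<<<<< @>>>>>' . "\n";
formline($picture, "widget", 42);

my $line = $^A;                         # grab the formatted text
$^A = "";                               # and reset the accumulator

print $line;
```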

29.2.43. getc    

getc FILEHANDLE
getc
This function returns the next byte from the input file attached to FILEHANDLE. It returns undef at end-of-file, or if an I/O error was encountered. If FILEHANDLE is omitted, the function reads from STDIN.
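
The following sketch reads a handle one character at a time until getc returns undef. It assumes a Perl built with PerlIO, so that an in-memory string can stand in for a real input file; the names $data, $in, and @chars are illustrative. The defined test matters because a "0" character is false but is still valid input.

```perl
use strict;
use warnings;

# An in-memory file stands in for a real input file.
my $data = "hi\n";
open(my $in, "<", \$data) or die "can't open in-memory file: $!";

my @chars;
while (defined(my $c = getc($in))) {    # undef signals end-of-file or error
    push @chars, $c;
}
close($in);

printf "read %d characters\n", scalar @chars;
```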

This function is somewhat slow, but occasionally useful for single-character (byte, really) input from the keyboard--provided you manage to get your keyboard input unbuffered. This function requests unbuffered input from the standard I/O library. Unfortunately, the standard I/O library is not so standard as to provide a portable way to tell the underlying operating system to supply unbuffered keyboard input to the standard I/O system. To do that, you have to be slightly more clever, and in an operating-system-dependent fashion. Under Unix you might say this:

if ($BSD_STYLE) {
  system "stty cbreak </dev/tty >/dev/tty 2>&1";
} else {
  system "stty", "-icanon", "eol", "\001";
}

$key = getc;

if ($BSD_STYLE) {
  system "stty -cbreak </dev/tty >/dev/tty 2>&1";
} else {
  system "stty", "icanon", "eol", "^@"; # ASCII NUL
}
print "\n";
This code puts the next character (byte) typed on the terminal in the string $key. If your stty program has options like cbreak, you'll need to use the code where $BSD_STYLE is true. Otherwise, you'll need to use the code where it is false. Determining the options for stty(1) is left as an exercise to the reader.

The POSIX module provides a more portable version of this using the getattr method of its POSIX::Termios class. See also the Term::ReadKey module from your nearest CPAN site for a more portable and flexible approach.

29.2.44. getgrent    

getgrent
setgrent
endgrent
These routines iterate through your /etc/group file (or maybe someone else's /etc/group file, if it's coming from a server somewhere). The return value from getgrent in list context is:
($name, $passwd, $gid, $members)
where $members contains a space-separated list of the login names of the members of the group. To set up a hash for translating group names to GIDs, say this:
while (($name, $passwd, $gid) = getgrent) {
    $gid{$name} = $gid;
}
In scalar context, getgrent returns only the group name. The standard User::grent module supports a by-name interface to this function. See getgrent(3).

29.2.45. getgrgid    

getgrgid GID
This function looks up a group file entry by group number. The return value in list context is:
($name, $passwd, $gid, $members)
where $members contains a space-separated list of the login names of the members of the group. If you want to do this repeatedly, consider caching the data in a hash using getgrent.

In scalar context, getgrgid returns only the group name. The User::grent module supports a by-name interface to this function. See getgrgid(3).

29.2.46. getgrnam    

getgrnam NAME
This function looks up a group file entry by group name. The return value in list context is:
($name, $passwd, $gid, $members)
where $members contains a space-separated list of the login names of the members of the group. If you want to do this repeatedly, consider caching the data in a hash using getgrent.

In scalar context, getgrnam returns only the numeric group ID. The User::grent module supports a by-name interface to this function. See getgrnam(3).
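
To illustrate the scalar-context return values of getgrgid and getgrnam, here is a rough sketch that looks up the current process's group by number and then maps the name back to a number. It assumes a system where these functions are implemented and uses getgid from the POSIX module to find the real GID.

```perl
use strict;
use warnings;
use POSIX qw(getgid);

my $gid  = getgid();            # real group ID of this process
my $name = getgrgid($gid);      # scalar context: group name, or undef

if (defined $name) {
    my $back = getgrnam($name); # scalar context: numeric group ID
    print "group $name has GID $back\n";

    # List context: the fourth field holds the space-separated member logins.
    my @members = split ' ', (getgrgid($gid))[3];
    print "members: @members\n";
}
else {
    print "no group entry found for GID $gid\n";
}
```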

29.2.47. gethostbyaddr    

gethostbyaddr ADDR, ADDRTYPE
This function translates addresses into names (and alternate addresses). ADDR should be a packed binary network address, and ADDRTYPE should in practice usually be AF_INET (from the Socket module). The return value in list context is:
($name, $aliases, $addrtype, $length, @addrs) =
    gethostbyaddr($packed_binary_address, $addrtype);
where @addrs is a list of packed binary addresses. In the Internet domain, each address is (historically) four bytes long, and can be unpacked by saying something like:
($a, $b, $c, $d) = unpack('C4', $addrs[0]);
Alternatively, you can convert directly to dot vector notation with the v modifier to sprintf:
$dots = sprintf "%vd", $addrs[0];
The inet_ntoa function from the Socket module is useful for producing a printable version. This approach will become important if and when we all ever manage to switch over to IPv6.
use Socket;
$printable_address = inet_ntoa($addrs[0]);
In scalar context, gethostbyaddr returns only the host name.

To produce an ADDR from a dot vector, say this:

use Socket;
$ipaddr = inet_aton("127.0.0.1");       # localhost
$claimed_hostname = gethostbyaddr($ipaddr, AF_INET);
Interestingly, with version 5.6 of Perl you can skip the inet_aton() and use the new v-string notation that was invented for version numbers but happens to work for IP addresses as well:
$ipaddr = v127.0.0.1;
See the section "Sockets" in Chapter 16, "Interprocess Communication" for more examples. The Net::hostent module supports a by-name interface to this function. See gethostbyaddr(3).
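
The packed-address conversions described in this entry can be tried without consulting any name service. The sketch below is illustrative only; it packs the loopback address and then unpacks it three different ways.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

my $packed = inet_aton("127.0.0.1");    # four packed bytes

my @octets = unpack('C4', $packed);     # one number per byte
my $dots   = sprintf "%vd", $packed;    # dotted form via the v modifier
my $text   = inet_ntoa($packed);        # dotted form via Socket

print "octets: @octets\n";
print "sprintf: $dots, inet_ntoa: $text\n";
print "v-string matches\n" if $packed eq v127.0.0.1;
```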

29.2.48. gethostbyname    

gethostbyname NAME
This function translates a network hostname to its corresponding addresses (and other names). The return value in list context is:
($name, $aliases, $addrtype, $length, @addrs) =
    gethostbyname($remote_hostname);
where @addrs is a list of raw addresses. In the Internet domain, each address is (historically) four bytes long, and can be unpacked by saying something like:
($a, $b, $c, $d) = unpack('C4', $addrs[0]);
You can convert directly to vector notation with the v modifier to sprintf:
$dots = sprintf "%vd", $addrs[0];
In scalar context, gethostbyname returns only the host address:
use Socket;
$ipaddr = gethostbyname($remote_host);
printf "%s has address %s\n",
    $remote_host, inet_ntoa($ipaddr);
See "Sockets" in Chapter 16, "Interprocess Communication" for another approach. The Net::hostent module supports a by-name interface to this function. See also gethostbyname(3).
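
Since the list-context return value can carry several addresses, a small wrapper along the following lines may be convenient. The helper name addresses_for is hypothetical, and whether "localhost" resolves at all depends on the local resolver configuration.

```perl
use strict;
use warnings;

# Hypothetical helper: return every four-byte address for a hostname
# as a dotted string, or an empty list if the lookup fails.
sub addresses_for {
    my ($host) = @_;
    my ($name, $aliases, $addrtype, $length, @addrs) = gethostbyname($host);
    return () unless defined $name;     # lookup failed
    return () unless $length == 4;      # only handle four-byte addresses
    return map { sprintf "%vd", $_ } @addrs;
}

my @found = addresses_for("localhost");
print @found ? "localhost: @found\n" : "localhost did not resolve\n";
```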

29.2.49. gethostent    

gethostent
sethostent STAYOPEN
endhostent
These functions iterate through your /etc/hosts file and return each entry one at a time. The return value from gethostent is:
($name, $aliases, $addrtype, $length, @addrs)
where @addrs is a list of raw addresses. In the Internet domain, each address is four bytes long, and can be unpacked by saying something like:
($a, $b, $c, $d) = unpack('C4', $addrs[0]);
Scripts that use gethostent should not be considered portable. If a machine uses a name server, it would have to interrogate most of the Internet to try to satisfy a request for all the addresses of every machine on the planet. So gethostent is unimplemented on such machines. See gethostent(3) for other details.

The Net::hostent module supports a by-name interface to this function.

29.2.50. getlogin    

getlogin
This function returns the current login name if found. On Unix systems, this is read from the utmp(5) file. If it returns false, use getpwuid instead. For example:
$login = getlogin() || (getpwuid($<))[0] || "Intruder!!";

29.2.51. getnetbyaddr    

getnetbyaddr ADDR, ADDRTYPE
This function translates a network address to the corresponding network name or names. The return value in list context is:
use Socket;
($name, $aliases, $addrtype, $net) = getnetbyaddr(127, AF_INET);
In scalar context, getnetbyaddr returns only the network name. The Net::netent module supports a by-name interface to this function. See getnetbyaddr(3).

29.2.52. getnetbyname    

getnetbyname NAME
This function translates a network name to its corresponding network address. The return value in list context is:
($name, $aliases, $addrtype, $net) = getnetbyname("loopback");
In scalar context, getnetbyname returns only the network address. The Net::netent module supports a by-name interface to this function. See getnetbyname(3).

29.2.53. getnetent    

getnetent
setnetent STAYOPEN
endnetent
These functions iterate through your /etc/networks file. The return value in list context is:
($name, $aliases, $addrtype, $net) = getnetent();
In scalar context, getnetent returns only the network name. The Net::netent module supports a by-name interface to this function. See getnetent(3).

The concept of network names seems rather quaint these days; most IP addresses are on unnamed (and unnameable) subnets.

29.2.54. getpeername