
eucset(1)

HP-UX 11i Version 3: February 2007

NAME

eucset — set and get code widths for ldterm

SYNOPSIS

eucset [-p]

eucset [ [-c HP15-codeset] or [-c UTF8] or [-c ASIAN_UTF8] or [-c GB18030] or [cswidth] ]

DESCRIPTION

The eucset command sets or gets (reports) the encoding and display widths of the Extended UNIX Code (EUC), UCS Transformation Format (UTF8), or GB18030 characters processed by the current input terminal. EUC is an encoding method for codesets composed of single or multiple bytes. EUC permits applications and the terminal hardware to use the 7-bit US ASCII code and up to three single byte or multibyte codesets simultaneously.

ldterm is a STREAMS terminal line discipline module which obtains codeset information from eucset. See ldterm(7).

The cswidth value defines the character widths for codesets. If no argument is passed to the eucset command, so that cswidth is defined neither implicitly nor explicitly, the cswidth value is determined by the following criteria, in descending priority:

1. Use the cswidth value stored in the current locale, if defined.

2. Use predefined cswidth values if the codeset name defined in the locale is GB18030, UTF8, or one of the four HP15 codesets.

3. Use the CSWIDTH environment variable if defined and in the correct format.

4. Use 7-bit US ASCII as the default codeset and its cswidth value.
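
The priority order above can be modeled with a short Python sketch. It is an illustration only, not the eucset implementation: the function name resolve_cswidth, its parameters, and the regular expression used to judge a "correct format" are assumptions. The predefined widths are supplied by the caller because this manpage does not list them, and the fallback string is the default given later in this manpage.

```python
import os
import re

# Default string given later in this manpage for the 7-bit US ASCII case.
DEFAULT_CSWIDTH = "1:1,0:0,0:0"

# Assumed shape of a well-formed value: one to three comma-separated X[:Y] fields.
_CSWIDTH_RE = re.compile(r"\d+(:\d+)?(,\d+(:\d+)?){0,2}")


def resolve_cswidth(locale_cswidth, locale_codeset, predefined, env=None):
    """Pick a cswidth string using the four criteria, highest priority first.

    All parameters are hypothetical stand-ins: locale_cswidth is the value
    stored in the locale (or None), locale_codeset is the locale's codeset
    name, and predefined maps codeset names to cswidth strings.
    """
    env = os.environ if env is None else env
    # 1. The cswidth value stored in the current locale, if defined.
    if locale_cswidth:
        return locale_cswidth
    # 2. A predefined value for GB18030, UTF8, or one of the HP15 codesets.
    if locale_codeset in predefined:
        return predefined[locale_codeset]
    # 3. The CSWIDTH environment variable, if defined and well formed.
    value = env.get("CSWIDTH")
    if value and _CSWIDTH_RE.fullmatch(value):
        return value
    # 4. Fall back to the 7-bit US ASCII default.
    return DEFAULT_CSWIDTH
```

In this model, a malformed CSWIDTH value is skipped rather than reported, which is one plausible reading of "if defined and in the correct format".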

This command must be used to specify EUC or non-EUC codesets, whether they are single byte or multibyte. However, the eucset command can correctly set the cswidth parameter without using any options in most cases except for ASIAN_UTF8. See the WARNINGS section for special warnings on the values of the cswidth argument.

For the GB18030, ASIAN_UTF8, or UTF8 setting, use the -c option.

Options

The eucset command recognizes the following options and arguments:

-p

Displays the current settings of the EUC character widths for the terminal.

-c

Sets the width to that of one of the four HP15 codesets, UTF8, ASIAN_UTF8, or GB18030. The HP15 codesets supported are SJIS, CCDC, GB, and BIG5.

cswidth

Defines the character widths for codesets 1 through 3. See the EUC Code Set Classes section in this manpage for more information.

EUC Code Set Classes

EUC divides codesets into four classes. Each codeset has two characteristics: the number of bytes for encoding the characters in the codeset, and the number of display columns to display the characters in the codeset. All characters within a codeset possess the same characteristics. ASIAN_UTF8 is used for setting double width display, and UTF8 is used for single width.

  • Codeset 0 consists of all 7-bit, single byte ASCII characters. The most significant bit of each of these characters is 0 (zero). Characters in codeset 0 require one byte for encoding, and occupy one display column. These values are fixed for codeset 0 (zero). The 7-bit US ASCII code is the primary EUC codeset, which is available to users without direct specification.

  • Codeset 1 is a supplementary EUC codeset. Codeset 1 characters have an initial byte whose most significant bit is 1. Characters in codeset 1 may require more than one byte for encoding, and may require more than one display column. The eucset command must be used to set the characteristics for codeset 1.

  • Codesets 2 and 3 are supplementary EUC codesets. Characters in these codesets have an initial byte of SS2 or SS3, respectively. They require more than one byte for encoding, and may require more than one display column. The eucset command must be used to set the characteristics for codesets 2 and 3.
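
As a rough illustration of how the first byte selects a codeset class, the following Python sketch classifies a lead byte and totals the display columns of a byte string. It is not taken from ldterm. The function names are hypothetical, and the SS2 and SS3 values (0x8E and 0x8F) are the usual EUC single-shift codes, which this manpage does not state.

```python
SS2 = 0x8E  # single-shift 2, the usual EUC value (not stated in this manpage)
SS3 = 0x8F  # single-shift 3, the usual EUC value (not stated in this manpage)


def codeset_of(lead_byte):
    """Return the EUC codeset class (0-3) selected by a character's first byte."""
    if lead_byte < 0x80:  # most significant bit is 0
        return 0
    if lead_byte == SS2:
        return 2
    if lead_byte == SS3:
        return 3
    return 1  # any other byte with the most significant bit set


def display_columns(data, widths):
    """Sum the display columns of EUC-encoded bytes.

    widths maps a codeset number (1-3) to (bytes, columns), where bytes does
    not count the SS2 or SS3 byte. Codeset 0 is fixed at one byte, one column.
    """
    columns = 0
    i = 0
    while i < len(data):
        cs = codeset_of(data[i])
        if cs == 0:
            nbytes, ncols = 1, 1
        else:
            nbytes, ncols = widths[cs]
            if nbytes == 0:
                raise ValueError("codeset %d does not exist in these settings" % cs)
            if cs in (2, 3):
                nbytes += 1  # step over the single-shift byte as well
        i += nbytes
        columns += ncols
    return columns


# Widths matching the ja_JP.eucJP setting 2:2,1:1,2:2 from the EXAMPLES section.
EUCJP = {1: (2, 2), 2: (1, 1), 3: (2, 2)}
# One ASCII letter, one codeset 1 character, one codeset 2 character.
print(display_columns(b"A\xa4\xa2\x8e\xb1", EUCJP))  # prints 4
```

The sketch trusts the configured widths and does not validate trailing bytes, so it is suitable only for reasoning about the width settings, not for decoding real input.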

The cswidth argument in the eucset command line is a character string that describes the character widths for codesets 1 through 3. This command does not allow the user to modify the settings for codeset 0. The character string is of the following format:

X1[:Y1],X2[:Y2],X3[:Y3]

X1

The number of bytes required to encode a character in codeset class 1.

Y1

The number of display columns needed to display characters in this class.

X2

The number of bytes required to encode a character in codeset 2, not counting the SS2 byte.

Y2

The number of display columns for codeset 2 characters.

X3

The number of bytes needed to encode characters in codeset 3, not counting the SS3 byte.

Y3

The number of display columns required for these characters.

The values for the column widths may be omitted if they are equal to the number of encoding bytes. If the encoding value of any of the EUC codesets is set to 0 (zero), then the codeset does not exist. See the WARNINGS section for special warnings on the values of the cswidth argument.
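
The string format can be made concrete with a small parser. This Python sketch is an illustration only, and parse_cswidth is a hypothetical name. It applies the rule that an omitted column width equals the number of encoding bytes. Treating fields left off the end of the string as 0:0 is an assumption inferred from the shortened forms in the EXAMPLES section, not a documented rule.

```python
def parse_cswidth(text):
    """Parse 'X1[:Y1],X2[:Y2],X3[:Y3]' into three (bytes, columns) pairs."""
    fields = text.split(",")
    if not 1 <= len(fields) <= 3:
        raise ValueError("expected one to three comma-separated fields")
    pairs = []
    for field in fields:
        byte_part, sep, col_part = field.partition(":")
        nbytes = int(byte_part)
        # An omitted column width equals the number of encoding bytes.
        ncols = int(col_part) if sep else nbytes
        if nbytes < 0 or ncols < 0:
            raise ValueError("widths must not be negative")
        pairs.append((nbytes, ncols))
    # Assumption: codesets left off the end of the string do not exist (0:0).
    while len(pairs) < 3:
        pairs.append((0, 0))
    return pairs


print(parse_cswidth("2,2,0"))    # prints [(2, 2), (2, 2), (0, 0)]
print(parse_cswidth("2:2,3:2"))  # prints [(2, 2), (3, 2), (0, 0)]
```

A pair whose byte count is 0 corresponds to a codeset that does not exist, as described above.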

If no cswidth argument is supplied, the eucset command uses the value of the CSWIDTH environment variable. If this variable is not present, the following default string is substituted:

1:1,0:0,0:0

This default string designates that the environment uses a single byte EUC codeset that has characters in the EUC codeset 1 format. If the environment uses a multibyte EUC codeset in the codeset 1 format, single byte or multibyte EUC codesets in the codeset 2 or 3 format, or both, the default setting cannot be used.

EXTERNAL INFLUENCES

Environment Variables

LANG

Provides a default value for the internationalization variables that are unset or null. If LANG is not specified or is set to the empty string, a default of C (see lang(5)) is used instead of LANG. If any of the internationalization variables contain an invalid setting, eucset behaves as if all internationalization variables are set to C. See environ(5).

LC_ALL

If set to a nonempty string value, overrides the values of all other internationalization variables.

LC_MESSAGES

Determines the locale that should be used to affect the format and contents of diagnostic messages written to standard error and informative messages written to standard output.

NLSPATH

Determines the location of message catalogs for the processing of LC_MESSAGES.

EXAMPLES

To display the encoding and display widths for the EUC codesets 1 to 3 in your environment, enter:

eucset -p

Assuming eucset was previously used to set the widths for ja_JP.eucJP, this command produces the following output:

cswidth 2:2,1:1,2:2

To change the current settings of the encoding and display widths for the EUC characters in codesets 1 and 2 to two bytes each, enter one of the following:

eucset 2:2,2:2,0:0

eucset 2,2,0

To set the encoding and display widths for the EUC characters in the locale ja_JP.eucJP, enter:

eucset 2:2,1:1,2:2

For zh_TW.eucTW, enter:

eucset 2:2,3:2

For ko_KR.eucKR, enter:

eucset 2:2

To set the code width to that of UTF8, enter:

eucset -c UTF8

To set the code width to that of ASIAN_UTF8, enter:

eucset -c ASIAN_UTF8

To set the code width to that of GB18030, enter:

eucset -c GB18030

WARNINGS

The cswidth argument does not include the SS2 or SS3 bytes in the byte width values.
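
To show what this warning means in practice, the following Python sketch converts the byte widths given in cswidth into the total number of bytes each character occupies in the data stream. The function name encoded_lengths is hypothetical, and the sketch assumes that SS2 and SS3 each occupy exactly one byte.

```python
def encoded_lengths(x1, x2, x3):
    """Total bytes per character for codesets 1-3, including SS2 or SS3."""

    def with_shift(width):
        # A width of 0 means the codeset does not exist, so no bytes are used.
        return width + 1 if width else 0

    return {1: x1, 2: with_shift(x2), 3: with_shift(x3)}


# For the ja_JP.eucJP setting 2:2,1:1,2:2 from the EXAMPLES section:
print(encoded_lengths(2, 1, 2))  # prints {1: 2, 2: 2, 3: 3}
```

Under this reading, a codeset 2 width of 1 describes a character that is two bytes long in the stream: the SS2 byte followed by one data byte.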

This command is not specified by standards, may not be available on other vendors' systems, and may be subject to change or obsolescence in a future release.

AUTHOR

eucset was developed by OSF and HP.

SEE ALSO

dtterm(1), ldterm(7).

© 1983-2007 Hewlett-Packard Development Company, L.P.