String Manipulation

A word, a name or a message can be stored in the computer as a string. Any sequence of characters may be used in a string. Quotation marks are used to delimit the beginning and ending of the string. For example:

 
LET A$="COMPUTER"
Fail$="The test has failed."
File_name$="INVENTORY"
Test$=Fail$[5,8]
 

Each assignment stores the value on the right-hand side (a literal or, in the last line, a substring) in the string variable named on the left-hand side.

String variable names are identical to numeric variable names with the exception of a dollar sign ($) appended to the end of the name. (Refer to "Naming Variables" in chapter 3 for more information.)

Details About Strings

If you are using a localized version of BASIC that supports two-byte characters, such as Japanese localized BASIC, note that BASIC handles two-byte characters in special ways. This chapter describes the character handling features of BASIC as they apply to one-byte extended ASCII characters.

In general, BASIC handles one- and two-byte characters in a very similar manner; however, two-byte characters are handled differently in some cases. The general principles of two-byte character handling are explained in HP BASIC Porting and Globalization.

The length of a string is the number of characters in the string. In the previous example, the length of A$ is 8 since there are eight characters in the literal "COMPUTER". A string with length 0 (i.e., that contains no characters) is a null string. BASIC allows the dimensioned length of a string to range from 1 to 32 767 characters and the current length (number of characters in the string) to range from zero to the dimensioned length.

The default dimensioned length of a string is 18 characters. The DIM, COM, and ALLOCATE statements are used to define string lengths up to the maximum length of 32 767 characters. An error results whenever a string variable is assigned more characters than its dimensioned length.

A string may contain any character. The only special case is when a quotation mark needs to be in a string. Two quotes, in succession, will embed a quote within a string.

 

10 Quote$="The time is ""NOW""."
20 PRINT Quote$
30 END
 

Result: The time is "NOW".

String Storage

Strings whose length exceeds the default length of 18 characters must have space reserved before assignment. The following statements may be used:

 
DIM Long$[400]             ! Reserve space for a 400-character string

COM Line$[80]              ! Reserve an 80-character common variable

ALLOCATE Search$[Length]   ! Dynamic length allocation
 

Strings that have been dimensioned but not assigned return the null string.
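
As a minimal sketch of this behavior, the following program checks the current length of a dimensioned string before and after an assignment. It relies on the LEN and RPT$ functions described later in this chapter; the 60-character fill is only an illustration.

10 DIM Long$[400]
20 PRINT LEN(Long$)      ! Current length is 0 before any assignment
30 Long$=RPT$("-",60)    ! Assign 60 characters; 400 remain reserved
40 PRINT LEN(Long$)      ! Current length is now 60
50 END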

String Arrays

Large amounts of text are easily handled in arrays. For example:

 
DIM File$(1000)[80]
 

This statement reserves storage for 1000 lines of 80 characters per line (with the default OPTION BASE 0, element 0 is also reserved, for 1001 strings in all). Do not confuse the brackets, which define the length of the string, with the parentheses, which define the number of strings in the array. Each string in the array can be accessed by an index. For example:

 
PRINT File$(27)
 

This prints element 27 of the array. Since each character in a string uses one byte of memory and each string in the array requires as many bytes as its dimensioned length, string arrays can quickly use a lot of memory.
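
The following sketch shows parentheses and brackets used together on a small array. The array bounds and the text assigned are illustrative only.

10 DIM File$(1:3)[80]
20 File$(1)="First line of text"
30 File$(2)="Second line of text"
40 File$(3)="Third line of text"
50 FOR I=1 TO 3
60   PRINT I;File$(I)     ! Parentheses select the array element
70 NEXT I
80 PRINT File$(2)[1,6]    ! Brackets select a substring: Second
90 END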

A program saved on a disk as an ASCII type file can be entered into a string array, manipulated, and written back out to disk.

Evaluating Expressions Containing Strings

Evaluation Hierarchy

The three allowed string operations are extracting a substring, concatenation, and parenthesization. The evaluation hierarchy is presented in the following table.

Order    Operation
High     Parentheses
--       Substrings and Functions
Low      Concatenation
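
A small sketch of the hierarchy in action follows. The variable names and values are illustrative.

10 A$="ABC"
20 B$="DEF"
30 PRINT A$&B$[2]      ! B$[2] is extracted first, then joined: ABCEF
40 PRINT LEN(A$&B$)    ! The argument is concatenated, then measured: 6
50 END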

String Concatenation

Two separate strings are joined together by using the concatenation operator "&". The following program combines two strings into one.

 
10 One$="WRIST"
20 Two$="WATCH"
30 Concat$=One$&Two$
40 PRINT One$,Two$,Concat$
50 END
 

Result:

 
WRIST     WATCH     WRISTWATCH
 

Relational Operations

Most of the relational operators used with numeric expressions can also be used with strings. Any of these relational operators may be used: <, >, <=, >=, =, <>.

The following examples show some of the possible tests.

 

"ABC" = "ABC"             True
"ABC" = " ABC"            False

"ABC" < "AbC"             True
"6" > "7"                 False
"2" < "12"                False

"long" <= "longer"        True
"RE-SAVE" >= "RESAVE"     False
 

Testing begins with the first character in the string and proceeds, character by character, until the relationship has been determined.

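The outcome of a relational test is based on the characters in the strings, not on the length of the strings. Length matters only when one string matches the beginning of the other, in which case the shorter string is the lesser (as with "long" and "longer" above). For example: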

 
"BRONTOSAURUS" < "CAT"
 

This relationship is true since the letter "C" is higher in ASCII value than the letter "B".
NOTE
When the LEX binary is loaded, the outcome of a string comparison is based on the character's lexical value rather than the character's ASCII value. See the LEXICAL ORDER IS statement later in this chapter for more details.
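
The following sketch uses such a comparison in a program. It assumes the default ASCII ordering (no other lexical order in effect); the variable names are illustrative.

10 A$="BRONTOSAURUS"
20 B$="CAT"
30 IF A$<B$ THEN PRINT A$;" sorts before ";B$
40 IF LEN(A$)>LEN(B$) THEN PRINT A$;" is longer than ";B$
50 END

Both messages are printed: the comparison in line 30 is decided by the first characters, while line 40 compares the lengths explicitly.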

Substrings

A subscript can be appended to a string variable name to define a substring. A substring may comprise all or just part of the original string. Brackets enclose the subscript, which can be a constant, variable, or numeric expression. For instance:

String$[4]   Specifies a substring starting with the fourth character of the original string.

The subscript must be in the range: 1 to the current length of the string plus 1. Note that the brackets now indicate the substring's starting position instead of the total length of the string as when reserving storage for a string.

Subscripted strings may appear on either side of the assignment.

Single-Subscript Substrings

When a substring is specified with only one numerical expression, enclosed with brackets, the expression is evaluated and rounded to an integer indicating the position of the first character of the substring within the string.

The following examples use the variable A$ which has been assigned the literal "DICTIONARY".

Statement        Output
PRINT A$         DICTIONARY
PRINT A$[0]      (error)
PRINT A$[1]      DICTIONARY
PRINT A$[5]      IONARY
PRINT A$[5,2]
PRINT A$[10]     Y
PRINT A$[11]     (null string)
PRINT A$[12]     (error)

When a single subscript is used it specifies the starting character position, within the string, of the substring. An error results when the subscript evaluates to zero or greater than the current length of the string plus 1. A subscript that evaluates to 1 plus the length of the string returns the null string ("") but does not produce an error.

Double-Subscript Substrings

A substring may have two subscripts, within brackets, to specify a range of characters. When a comma is used to separate the items within brackets, the first subscript marks the beginning position of the substring, while the second subscript is the ending position of the substring. The form is: A$[Start,End].

When a semicolon is used in place of a comma, the first subscript again marks the beginning position of the substring, while the second subscript is now the length of the substring. The form is: A$[Start;Length].

In the following examples the variable B$ has been assigned the literal "ENLIGHTENMENT":

Statement          Output
PRINT B$           ENLIGHTENMENT
PRINT B$[1,13]     ENLIGHTENMENT
PRINT B$[1;13]     ENLIGHTENMENT
PRINT B$[1,9]      ENLIGHTEN
PRINT B$[1;9]      ENLIGHTEN
PRINT B$[3,7]      LIGHT
PRINT B$[3;7]      LIGHTEN
PRINT B$[13,13]    T
PRINT B$[13;1]     T
PRINT B$[13,26]    (error)
PRINT B$[13;13]    (error)
PRINT B$[14;1]     (null string)

An error results if the second subscript in a comma separated pair is greater than the current string length plus 1 or if the sum of the subscripts in a semicolon separated pair is greater than the current string length plus 1.

Specifying the position just past the end of a string returns the null string.
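
The two forms are interchangeable: a comma-separated pair [Start,Finish] selects the same characters as the semicolon-separated pair [Start;Finish-Start+1]. A minimal sketch, using the illustrative variable names Start and Finish:

10 B$="ENLIGHTENMENT"
20 Start=3
30 Finish=7
40 PRINT B$[Start,Finish]            ! Start and end positions: LIGHT
50 PRINT B$[Start;Finish-Start+1]    ! Start and length (5): LIGHT
60 END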

Special Considerations

All substring operations allow a subscript to specify the first position past the end of a string. This allows strings to be concatenated without the concatenation operator. For instance:

 
10 A$="CONCAT"
20 A$[7]="ENATION"
30 PRINT A$
40 END
 

Result: CONCATENATION

The substring assignment is only valid if the string already contains characters up to the specified position. Access beyond the first position past the end of a string results in the error:

 
ERROR 18  String ovfl. or substring err
 

A good practice is to dimension all strings, including those shorter than the default length of eighteen characters. When the substring named on the left-hand side holds fewer characters than the replacement string supplies, the extra trailing characters of the replacement are truncated. For example:

 
10 Big$="Too big to fit"
20 Small$="Little string"
30 !
40 Small$[1,3]=Big$
50 !
60 PRINT Small$
70 END
 

Result: Tootle string

The alternate assignment is shown in the next example. Here a 4-character string is assigned to an 8-character substring.

 
10 Big$="A large string"
20 Small$="tiny"
30 !
40 Big$[3,10]=Small$
50 !
60 DISP Big$
70 END
 

Displays: A tiny    ring

Since the subscripted length of the substring is greater than the length of the replacement string, enough blanks (ASCII spaces) are added to the end of the replacement string to fill the entire specified substring.
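
If the padding is not wanted, one possible approach (a sketch, not the only way) is to make the substring exactly as long as the replacement by using the semicolon form with LEN:

10 Big$="A large string"
20 Small$="tiny"
30 Big$[3;LEN(Small$)]=Small$    ! Replace exactly 4 characters
40 DISP Big$
50 END

Displays: A tinye string

Only positions 3 through 6 are overwritten, so the rest of the original string is left in place.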

String-Related Functions

Several built-in functions are available in BASIC for manipulating strings. These functions include conversions between string and numeric values.

Special Note for Localized BASIC

If you are using a localized version of BASIC that supports two-byte characters, such as Japanese localized BASIC, note that BASIC handles two-byte characters in special ways. The general principles of two-byte character handling are explained in HP BASIC Porting and Globalization.

Current String Length

The "length" of a string is the number of characters in the string. The LEN function returns an integer equal to the string length. For example:

 
PRINT LEN("HELP ME")
 

Result: 7

The following example program prints the length of a string that is typed on the keyboard.

 

10 DIM In$[160]
20 INPUT In$
30 Length=LEN(In$)
40 DISP Length;"characters in """;In$;""""
50 END
 

Try finding the length of a string containing only spaces. When the INPUT statement is used, any leading or trailing spaces are removed from items typed on the keyboard. Change INPUT to LINPUT in line 20 to allow leading and trailing spaces to be entered.

Maximum String Length

The MAXLEN function returns an integer equal to the dimensioned length of a string variable. For example:

 
100    DIM First_string$[37],Second_string$(2)[15]
110    PRINT "Maximum length of the first string is";
120    PRINT MAXLEN(First_string$)
130    PRINT
140    PRINT "Maximum length of the second string is";
150    PRINT MAXLEN(Second_string$(1))
160    Test("A TEST STRING")
170    END
180    SUB Test(A$)
190      PRINT
200      PRINT "Maximum length of the test string is";
210      PRINT MAXLEN(A$)
220    SUBEND
 

Result:

 
Maximum length of the first string is 37

Maximum length of the second string is 15

Maximum length of the test string is 13
 

Substring Position

The "position" of a substring within a string is determined by the POS function. The function returns the value of the starting position of the substring or zero if the entire substring was not found. For example:

 
PRINT POS("DISAPPEARANCE","APPEAR")
 

Result: 4

If POS returns a non-zero value, the entire substring occurs in the first string and the value specifies the starting position of the substring.

Note that POS returns the first occurrence of a substring within a string. By adding a subscript, and indexing through the string, the POS function can be used to find all occurrences of a substring. The following program uses this technique to extract each word from a sentence.

 
10    DIM A$[80]
20    A$="I know you think you understand what I said, but you don't."
30    INTEGER Scan,Found
40    Scan=1 ! Current substring position
50    PRINT A$
60    REPEAT
70      Found=POS(A$[Scan]," ") ! Find the next ASCII space
80      IF Found THEN
90        PRINT A$[Scan,Scan+Found-1] ! Print the word
100       Scan=Scan+Found ! Adjust "Scan" past last match
110     ELSE
120       PRINT A$[Scan] ! Print last word in string
130     END IF
140   UNTIL NOT Found
150   END
 

As each occurrence is found, the new subscript specifies the remaining portion of the string to be searched.

String-to-Numeric Conversion

The VAL function converts a string expression into a numeric value. The string must evaluate to a valid number or an error will result. The number returned by the VAL function will be converted to and from scientific notation when necessary. For example:

 
PRINT VAL("123.4E3")
 

Result: 123400

The following program converts a fraction into its equivalent decimal value.

 
10    INPUT "Enter a fraction (e.g. 3/4)",Fraction$
20    !
30    ON ERROR GOTO Err
40      Numerator=VAL(Fraction$)
50      !
60      IF POS(Fraction$,"/") THEN
70        Delimiter=POS(Fraction$,"/")
80        Denominator=VAL(Fraction$[Delimiter+1])
90      ELSE
100       PRINT "Invalid fraction"
110       GOTO Quit
120     END IF
130     !
140     PRINT Fraction$;" = ";Numerator/Denominator
150     GOTO Quit
160 Err: PRINT "ERROR Invalid fraction"
170      OFF ERROR
180 Quit: END
 

Similar techniques can be used for converting feet and inches to decimal feet, or hours and minutes to decimal hours.
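
As a sketch of the hours-and-minutes case, the following program splits the input at a colon in the same way the fraction program splits at a slash. The variable names (Clock$, Colon, Hours, Minutes) and the colon delimiter are assumptions made for this illustration.

10 INPUT "Enter a time as hours:minutes (e.g. 2:30)",Clock$
20 Colon=POS(Clock$,":")            ! Position of the delimiter
30 IF Colon THEN
40   Hours=VAL(Clock$)              ! VAL stops at the colon
50   Minutes=VAL(Clock$[Colon+1])   ! Everything after the colon
60   PRINT Clock$;" = ";Hours+Minutes/60;"hours"
70 ELSE
80   PRINT "Invalid time"
90 END IF
100 END

Entering 2:30 gives a result of 2.5 hours. As in the fraction program, an ON ERROR branch could be added to trap input that VAL cannot convert.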

The NUM function converts a single character into its equivalent numeric value. The number returned is in the range: 0 to 255. For example:

 
PRINT NUM("A")
 

Result: 65

The next program prints the value of each character in a name.

 

10    INPUT "Enter your first name",Name$
20    !
30    PRINT Name$
40    PRINT
50    FOR I=1 TO LEN(Name$)
60      PRINT NUM(Name$[I]); ! Print value of each character
70    NEXT I
80    PRINT
90    END
 

Entering the name: JOHN will produce the following.

 
74  79  72  78
 

Numeric-to-String Conversion

The VAL$ function converts the value of a numeric expression into a character string. The string contains the same characters (digits) that appear when the numeric variable is printed. For example:

 
PRINT 1000000,VAL$(1000000)
 

Prints: 1.E+6 1.E+6

The next program converts a number into a string so the POS function can be used to separate the mantissa from the exponent.

 

10    CONTROL 2,0;1 ! CAPS LOCK ON
20    INPUT "Enter a number with an exponent",Number
30    !
40    Number$=VAL$(Number)
50    !
60    PRINT Number$
70    E=POS(Number$,"E")
80    IF E THEN
90      PRINT "Mantissa is",Number$[1;E-1]
100     PRINT "Exponent is",Number$[E+1]
110   ELSE
120     PRINT "No exponent"
130   END IF
140   END
 

The CHR$ function converts a number into an ASCII character. The number can be of type INTEGER or REAL since the value is rounded, and a modulo 256 is performed. For example:

 
PRINT CHR$(12)   ! Clear screen
PRINT CHR$(7)    ! Ring the bell
PRINT CHR$(97);CHR$(98);CHR$(99)
 

Prints: abc
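
CHR$ and NUM are inverses of each other for values in the range 0 to 255. The following sketch uses both; the loop limits are illustrative.

10 FOR I=NUM("A") TO NUM("F")    ! Character values 65 through 70
20   PRINT CHR$(I);              ! Prints ABCDEF on one line
30 NEXT I
40 PRINT
50 PRINT NUM(CHR$(65))           ! Converting back gives 65
60 END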

CRT Character Set

The following program prints the character set on the screen of the CRT to show the order that strings will be sorted.

 
10    ! Program: CRT Character Set.
20    !
30    PRINT CHR$(12);"CRT Character Set"
40    STATUS 1,9;Line_length ! 50, 80, or 128 Columns
50    Left=Line_length/2-16
60    !
70    FOR I=0 TO 255
80      Col=I MOD 16*2+Left
90      Row=I DIV 16+3
100     IF Col=Left THEN
110       PRINT TABXY(Left-5,Row);
120       PRINT USING "3D";I
130     END IF
140     PRINT TABXY(Col,Row);
150     CONTROL 1,4;1       ! Display Functions on
160     PRINT USING "B,B,#";128,I ! Print the Character
170     CONTROL 1,4;0       ! Display Functions off
180   NEXT I
190   PRINT
200   I=127
210   ON KNOB .08 GOSUB Change
220   DISP USING "5A,5D,X,2A,B,B";"ASCII",I,"=",128,I
230   GOTO 220
240 Change:   I=I-KNOBX/10
250           IF I<0 THEN I=0
260           IF I>255 THEN I=255
270           RETURN
280 END
 

ASCII character values from 128 to 159 are treated differently by different systems. Refer to the section "The Extended Character Set" found in this chapter.

String Functions

String Reverse

The REV$ function returns a string created by reversing the sequence of characters in the given string.

 
PRINT REV$("Snack cans")
 

Prints: snac kcanS

A common use for the REV$ function is to find the last occurrence of an item in a string.

 
10    DIM List$[30]
20    List$="3.22 4.33 1.10 8.55 12.20 1.77"
30    Length=LEN(List$)
40    Last_space=POS(REV$(List$)," ") ! "SPACE" is delimiter
50    DISP "The last item is:";List$[1+Length-Last_space]
60    END
 

Displays: The last item is: 1.77

String Repeat

The RPT$ function returns a string created by repeating the specified string, a given number of times.

 
PRINT RPT$("* *",10)
 

Prints: * ** ** ** ** ** ** ** ** ** *
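
One typical use, sketched below with an illustrative title, is to underline a heading with a rule of exactly the same length:

10 Title$="Monthly Report"
20 PRINT Title$
30 PRINT RPT$("-",LEN(Title$))    ! 14 hyphens, one per character
40 END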

Trimming a String

The TRIM$ function returns a string with all leading and trailing blanks (ASCII spaces) removed.

 
PRINT "*";TRIM$("    1.23     ");"*"
 

Prints: *1.23*

TRIM$ is often used to extract fields from data statements or keyboard input.

 

10    INPUT "Enter your full name",Name$
20    First$=TRIM$(Name$[1,POS(Name$," ")])
30    Last$=TRIM$(Name$[1+LEN(Name$)-POS(REV$(Name$)," ")])
40    PRINT Name$,LEN(Name$)
50    PRINT Last$,LEN(Last$)
60    PRINT First$,LEN(First$)
70    END
 

Note that the INPUT statement trims leading and trailing blanks from whatever is typed. If you need to enter leading or trailing spaces, use the LINPUT statement.

Case Conversion

The case conversion functions, UPC$ and LWC$, return strings with all characters converted to the proper case. UPC$ converts all lower-case characters to their corresponding upper-case characters and LWC$ converts any upper-case characters to their corresponding lower-case characters. Roman Extension characters will be converted according to the current lexical order. See the LEXICAL ORDER IS statement later in this chapter for the case conversion listings.

 

10    DIM Word$[160]
20    LINPUT "Enter a few characters",Word$
30    PRINT
40    PRINT "You typed: ";Word$
50    PRINT "Uppercase: ";UPC$(Word$)
60    PRINT "Lowercase: ";LWC$(Word$)
70    END
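
Case conversion is also a simple way to compare strings without regard to case. A minimal sketch, with illustrative values:

10 A$="Inventory"
20 B$="INVENTORY"
30 IF A$=B$ THEN PRINT "Exact match"
40 IF UPC$(A$)=UPC$(B$) THEN PRINT "Match when case is ignored"
50 END

Only the second message is printed, since the two strings differ in case but not in spelling.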
 

Copying String Arrays and Subarrays

MAT functions (available with the MAT binary) are commonly used to manipulate data in numeric arrays. However, several of these functions can be used with string arrays. For example, a string array is copied into another string array by the following.

 
MAT Copy$ = Original$
 

Note that only the variable name is necessary. The array specifier "(*)" is not included when using the MAT statement.

Every element in a string array will be initialized to a constant value by the following statement.

 
MAT Array$ = (Null$)
 

The constant value can be a literal or a string expression and is enclosed in parentheses to distinguish it from an array name.

A subarray can be copied into another subarray of the same size and shape. For example, to copy the elements in rows 1 through 3 and columns 5 and 6 of a two-dimensional string array called Sub_array$ into the array called New$, you would execute the following statement:

 
MAT New$= Sub_array$(1:3,5:6)
 

where the above statement assumes an OPTION BASE of 1 and that New$ is dimensioned to be a 3×2 string array.

For more information on copying numeric and string arrays see the MAT statement in the HP BASIC Language Reference.
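
The copy and initialization statements can be tried together in a short program. This is a sketch that assumes the MAT binary is loaded; the array names and data are illustrative.

10 OPTION BASE 1
20 DIM Original$(3)[10],Copy$(3)[10]
30 DATA RED,GREEN,BLUE
40 READ Original$(*)
50 MAT Copy$=Original$          ! Copy every element
60 PRINT Copy$(*)               ! RED, GREEN, and BLUE
70 MAT Original$=("")           ! Set every element to the null string
80 PRINT LEN(Original$(1))      ! 0
90 END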

Searching and Sorting

Information stored in a string array often requires sorting. There are over a dozen common algorithms that may be used. Each algorithm has certain advantages depending on the number of items to be sorted, the current order of the items, the time allowed to sort the items, and the complexity of the algorithm.

A list of items can be sorted very quickly by the MAT SORT statement.

 
10    ! Program: SORT_LIST
20    DIM List$(1:5)[6]
30    DATA Bread,Milk,Eggs,Bacon,Coffee
40    READ List$(*)
50    !
60    PRINT "original order"
70    PRINT List$(*)
80    !
90    PRINT "ascending order"
100   MAT SORT List$
110   PRINT List$(*)
120   !
130   PRINT "descending order"
140   MAT SORT List$ DES
150   PRINT List$(*)
160   END
 

Running this program produces:

 
original order
Bread    Milk    Eggs    Bacon    Coffee

ascending order
Bacon    Bread   Coffee  Eggs     Milk

descending order
Milk     Eggs    Coffee  Bread    Bacon
 

Sorting by Substrings

A substring range can be appended to the end of a MAT SORT key specifier. For example, to sort the entire first column of a two-dimensional string array called Str_ary$ using the 3rd and 4th characters of each string, you would use this key specifier: (*,1)[3,4]. The MAT SORT statement would be as follows:

 
MAT SORT Str_ary$(*,1)[3,4]
 

Items will then be sorted by the characters within the substring specified. No error results from specifying a substring position beyond the current length of the string.

 
10    PRINT CHR$(12)    ! Program: SUBSORT
20    DATA 1 OLD ORANGE,2 TINY TOADS,3 TALL TREES,4 FAT FOWLS,5 FRIED FISH
30    DATA 6 SLOW SNAILS,7 SLIMY SLUGS,8 AWFUL HOURS,9 NASTY KNIVES
40    DIM Things$(1:9)[38]
50    READ Things$(*)
60    First=1
70    Length=1
80    DISP "Use KNOB and SHIFT-KNOB to change sort field."
90    ON KNOB .2 GOTO Slide
100 Go:MAT SORT Things$(*)[First;Length]
110   FOR I=1 TO 9
120     PRINT TABXY(10,I);Things$(I);RPT$(" ",3)
130   NEXT I
140 W:GOTO W
150   !
160 Slide:STATUS 2,10;Shift      ! Check for SHIFT OR CTRL
170   S=SGN(KNOBX)
180   IF Shift THEN
190     Length=Length+S*(S>0 AND Length<16)+S*(S<0 AND Length>1)
200   ELSE
210     First=First+S*(S>0 AND First<18)+S*(S<0 AND First>1)
220   END IF
230   DISP "MAT SORT Things$(*)[";First;";";Length;"]"
240   PRINT TABXY(9,10);RPT$(" ",First);RPT$("^",Length);RPT$(" ",10)
250   GOTO Go
260   END
 

Adding Items to a Sorted List

Lists of strings can be maintained in sorted order. Every time a new item is added to the list, the list is sorted by the MAT SORT statement. To prevent overwriting any of the items already in the list, items should be added to the top (first array element) of a list sorted in ascending order and to the bottom (last array element) of a list sorted in descending order.

 
10    PRINT CHR$(12)
20    ! Since arrays are in COM, they "remember" old values.
30    ! After running, execute SCRATCH C to clear the arrays.
40    !
50    COM Ascend$(1:18)[18],Descend$(1:18)[18]
60 Again:I=I+1
70    INPUT "Enter a word",Word$
80    Ascend$(1)=Word$             ! Fill array at top
90    Descend$(18)=Word$           ! Fill array at bottom
100   CALL See
110   IF I<18 THEN Again
120   BEEP
130   END
140   !---------------------------------------------------
150   SUB See                      ! DISPLAY THE ARRAYS
160     COM Ascend$(*),Descend$(*)
170     MAT SORT Ascend$           ! <- ascending sort
180     MAT SORT Descend$ DES      ! <- descending sort
190     FOR J=1 TO 18
200       PRINT TABXY(1,J);RPT$(" ",49)
210       PRINT TABXY(1,J);J;TABXY(11,J);Ascend$(J);TABXY(31,J);Descend$(J)
220     NEXT J
230   SUBEND
 

Sorting by Multiple Keys

When sorting a multi-dimensional numeric or string array, it is possible to specify more than one key. The array will be sorted by the first key then the second key and so on until the key specifiers are exhausted. Once the first key sorts items into similar groups, the items within a group can be arranged in any order you choose.

 
10    COM Tool$(1:8,1:3)[10]
20    DATA PENCIL,RED,35,PENCIL,BLUE,12,PENCIL,GREEN,0,PENCIL,BLACK,17
30    DATA PEN,BLACK,17,PEN,BLUE,127,PEN,RED,55,PEN,GREEN,43
40    READ Tool$(*)
50    PRINT
60    PRINT "*** UNSORTED LIST ***"
70    Display
80    PRINT "*** SORT BY COLOR ***"
90    MAT SORT Tool$(*,2)[1,3]     ! Sort color by first three letters.
100   Display
110   PRINT "*** SORT BY COLOR THEN BY NAME ***"
120   MAT SORT Tool$(*,2),(*,1)    ! Two key sort.
130   Display
140   PRINT "*** SORT BY NAME THEN BY COLOR ***"
150   MAT SORT Tool$(*,1),(*,2)[1;3] DES
160   Display
170   END
180   !----------------------
190   SUB Display
200     COM Tool$(*)
210     K=K+1
220     FOR I=1 TO 8
230       FOR J=1 TO 3
240         PRINT Tool$(I,J),
250       NEXT J
260       PRINT
270     NEXT I
280   SUBEND
 

Sorting to a Vector

It is possible to determine the sorting order of items in an array without disturbing the array. This is accomplished by "sorting" to a single-dimensioned numeric array (vector). The vector will then contain the subscripts of the items in the order that the items would have been arranged.

 
10    DIM Month$(1:12)[3],Fix(1:12)
20    DATA JAN,FEB,MAR,APR,MAY,JUN,JUL,AUG,SEP,OCT,NOV,DEC
30    READ Month$(*)
40    MAT SORT Month$ TO Fix    ! Sort to vector
50    PRINT Month$(*)
60    PRINT Fix(*)
70    FOR I=1 TO 12
80      PRINT Month$(Fix(I)),   ! Print months alphabetically
90    NEXT I
100   END
 

Running this program produces:

 
JAN  FEB  MAR  APR  MAY  JUN  JUL  AUG  SEP  OCT  NOV  DEC

4  8  12  2  1  7  6  3  5  11  10  9

APR  AUG  DEC  FEB  JAN  JUL  JUN  MAR  MAY  NOV  OCT  SEP
 

The first element of the vector contains a four (4), indicating the fourth element in the array would be the first element if the array were actually sorted.

Reordering an Array

The rows and columns of multiple dimension arrays can be reordered. Reordering is made according to a reorder vector (single dimension array). The vector contains the values of the subscripts of the array. When the array is reordered, the columns (or rows) are arranged according to the order of the subscripts in the reorder vector.
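
As a hedged sketch, a vector produced by MAT SORT ... TO can serve as the reorder vector. The example assumes the statement takes the form MAT REORDER array BY vector and that it accepts a one-dimensional array; confirm the exact syntax in the HP BASIC Language Reference before relying on it.

10 DIM Month$(1:12)[3],Fix(1:12)
20 DATA JAN,FEB,MAR,APR,MAY,JUN,JUL,AUG,SEP,OCT,NOV,DEC
30 READ Month$(*)
40 MAT SORT Month$ TO Fix       ! Fix receives the sorted subscripts
50 MAT REORDER Month$ BY Fix    ! Rearrange Month$ in that order
60 PRINT Month$(*)
70 END

Under those assumptions, the array itself ends up in the alphabetical order that the vector described.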

Searching for Strings

The following program outlines a method for replacing a word in a string.

 
100    ! Program: Word_Replace
110    !
120    DIM Text$[80]
130    !
140    Search$="bad"
150    Replace$="good"
160    Text$="I am a bad string."
170    !
180    PRINT Text$
190    S_length=LEN(Search$)
200    Position=POS(Text$,Search$)
210    IF NOT Position THEN Quit
220    !
230    Text$=Text$[1,Position-1]&Replace$&Text$[Position+S_length]
240    !
250    PRINT Text$
260    Quit: END
 

Prints:

I am a bad string.
I am a good string.

Searching String Arrays

Searching string arrays is similar to searching numeric arrays. For example, assume array List$ contains a list of names and dollar amounts. The program shown next puts the data into the source array (List$). It then searches for a particular name and outputs the corresponding dollar amount.

 
100    OPTION BASE 1                ! Select option base.
110    DIM List$(4)[20]             ! Dimension source array.
120    DATA BLACK BILL $100.00,BROWN JEFF $150.00
130    DATA GREEN JIM $200.00,WHITE WILL $125.00
140    READ List$(*)                ! Read data into List$.
150    PRINT USING "20A,/";List$(*) ! Output the original list.
160    MAT SEARCH List$(*)[1,5],LOC("BROWN");Person ! Search proper
170    ! portion of each string in List$ for a
180    ! particular person.
190    PRINT
200    IF Person<=4 THEN
210      PRINT List$(Person)[1,5];": ";List$(Person)[POS(List$(Person),"$")] ! Output
220      !                          specified name and dollar amount.
230    END IF
240    END
 

In this program a MAT SEARCH is used to find the string which contains the required name. Once that string is found, the portion of it containing the dollar amount (everything from the dollar sign onward) is displayed. Note that the substring specifier is used in the search and display statements. If you run this program, the following results are obtained.

 

BLACK BILL $100.00
BROWN JEFF $150.00
GREEN JIM  $200.00
WHITE WILL $125.00

BROWN: $150.00
 

Number-Base Conversion

The two functions IVAL and DVAL convert a binary, octal, decimal, or hexadecimal string value into a decimal number. The IVAL$ and DVAL$ functions convert a decimal number into a binary, octal, decimal, or hexadecimal string value. The IVAL and IVAL$ functions are restricted to the range of INTEGER variables (-32 768 thru 32 767). The DVAL and DVAL$ functions allow "double length" integers and thus allow larger numbers to be converted (-2 147 483 648 thru 2 147 483 647).

If you are familiar with binary notation, you will probably recognize the fact that IVAL and IVAL$ operate on 16-bit values while DVAL and DVAL$ operate on 32-bit values.
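
The following sketch converts the largest positive value of each range in both directions. It assumes that the second argument of each function is the radix (2, 8, 10, or 16); see the HP BASIC Language Reference for the full syntax.

10 PRINT IVAL("7FFF",16)         ! 32767
20 PRINT IVAL$(32767,16)         ! 7FFF
30 PRINT DVAL("7FFFFFFF",16)     ! 2147483647
40 PRINT DVAL$(2147483647,16)    ! 7FFFFFFF
50 END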

Introduction to Lexical Order

The LEXICAL ORDER IS statement (available with LEX) lets you change the collating sequence (sorting order) of the character set. Changing the lexical order will affect the results of all string relational operators and operations, including the MAT SORT, MAT SEARCH, and CASE statements. In addition to redefining the collating sequence, the case conversion functions, UPC$ and LWC$, are adjusted to reflect the current lexical order.

Predefined lexical orders include: ASCII, FRENCH, GERMAN, SPANISH, SWEDISH, and STANDARD. You can create lexical orders for special applications. The STANDARD lexical order is determined by an internal keyboard jumper, set at the factory to correspond to the keyboard supplied with the computer. The setting can be determined by examining the proper keyboard status register (STATUS 2,8;Language). Thus, the STANDARD lexical order on a computer equipped with a French keyboard will actually invoke the FRENCH lexical order.

Why Lexical Order?

A common task for computers is to arrange (sort) a group of items in alphabetical order. However, "alphabetical order" for a computer is normally based on the character sequence of the American Standard Code for Information Interchange (ASCII) character set. While the ASCII character sequence is adequate for many English Language applications, most foreign language alphabets include accented characters which are not part of the standard ASCII character set but must be included in the sequence to correctly sort the characters used in the language.

How It Works

The LEXICAL ORDER IS statement modifies the collating sequence by assigning a new value to each character. The new value, called a sequence number, is used in place of the character's ASCII value whenever characters are compared.

The ASCII Character Set

The ASCII set consists of 128 distinct characters including upper-case and lower-case alpha, numeric, punctuation, and control characters.

The complete ASCII character set is shown in the "Useful Tables" section of the HP BASIC Language Reference manual. Each character is given along with its ASCII value. The character's value (given in the table) is actually the decimal representation of the binary value (bit pattern) used internally, by the computer, to represent the character.

The characters are arranged in ascending value, which is to say, in ascending lexical order. A character is "less than" another character if its ASCII value is smaller. From the table it can be seen that "A" is less than "B" since the value of the letter "A" (65) is less than the value of the letter "B" (66).

If you have experimented with string comparisons based on the ASCII collating sequence, you may have noticed a few shortcomings. Consider the following words.

RESTORE, RE-STORE, and RE_STORE

Sorting these items according to the ASCII collating sequence will arrange them in the following order.

RE-STORE < RESTORE < RE_STORE

This points out a limitation of string comparisons based on ASCII sequence. Since the hyphen's value (45) is less than any alpha-numeric character, and the underbar's value (95) is greater than all upper-case alpha characters, a word containing a hyphen will be less than the same word without the hyphen, and a word containing an underbar will be greater than the same word without the underbar. The LEXICAL ORDER IS statement lets you overcome these limitations by changing the sorting order of the character set.
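
The ASCII ordering described above can be confirmed with a short program. This sketch assumes the LEX binary is loaded so that the lexical order can be selected explicitly.

10 LEXICAL ORDER IS ASCII
20 IF "RE-STORE"<"RESTORE" THEN PRINT "RE-STORE sorts before RESTORE"
30 IF "RESTORE"<"RE_STORE" THEN PRINT "RESTORE sorts before RE_STORE"
40 END

Both messages are printed under the ASCII collating sequence, because the hyphen (45) is less than "S" (83) and the underbar (95) is greater.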

Displaying Control Characters

Several special display features are available through the use of STATUS and CONTROL registers. Normally, ASCII characters 0 through 31 (control characters) are not displayed on the CRT. To enable the display of control characters, execute the following statement.

 
CONTROL 1,4;1 or DISPLAY FUNCTIONS ON
 

Printing a line of text to the CRT will now show the trailing carriage-return and linefeed. Although this mode is useful for some applications, control characters are usually not displayed on the CRT.

 
CONTROL 1,4;0 or DISPLAY FUNCTIONS OFF
 

Either of these statements turns off the special display functions mode.
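The effect of the mode is easiest to see by printing the same text with the mode on and then off. The following lines are a sketch that uses only the CONTROL statements shown above; the text printed is arbitrary.

```basic
CONTROL 1,4;1                 ! Display functions on
PRINT "Control characters"    ! Trailing CR and LF are now shown on the CRT
CONTROL 1,4;0                 ! Display functions off (the usual setting)
PRINT "Control characters"    ! Same text, control characters not shown
```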

The Extended Character Set

Only 128 characters are defined in the ASCII character set. An additional 128 characters are available in the extended character set. The extended set includes CRT highlighting characters, special symbols, and Roman Extension characters (accented vowels and other characters used in many European languages).
NOTE
Some printers produce different extended characters than those displayed on the CRT. Check the printer manual for details on alternate character sets.

Display Enhancement Characters

Certain combinations of characters sent to the display using PRINT or DISP affect the way characters are displayed. The characters which control underlining, inverse video, and blinking are display enhancement characters. These characters do not actually appear on the display themselves; they affect the appearance of subsequent printable characters. Whether or not any of these display enhancements are available depends on the capabilities of your display hardware.

For a complete list of all BASIC display enhancement characters, refer to the tables in the back of the HP BASIC Language Reference and the HP BASIC Condensed Reference. SYSTEM$ can be used to determine what CRT highlights are present. The expression

 
SYSTEM$("CRT ID")
 

returns a string containing information such as the CRT width and available highlights. For Series 300 medium-resolution monochrome monitors, the string returned by this expression is:

 
6: 80H GB1
 

The 80 is the width of the CRT in characters and the H indicates that monochrome highlights are available. If a space appears in place of the H, the CRT does not have highlights.
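A program can make the same check by searching the returned string for the H. The lines below are only a sketch: they assume that "H" appears nowhere else in the CRT ID string, which should be confirmed against the string your display returns, and the variable name Hilite is illustrative.

```basic
PRINT SYSTEM$("CRT ID")               ! For example: 6: 80H GB1
Hilite=POS(SYSTEM$("CRT ID"),"H")     ! Position of "H", or 0 if it is absent
PRINT Hilite                          ! Non-zero suggests highlights are available
```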

You can also determine if you have CRT highlights by sending highlight characters to the CRT and seeing if anything happens. For example, CHR$(132) turns on underlining and CHR$(128) turns off enhancements. Thus, on a display with highlights, the following:

 
PRINT CHR$(132);"This is important.";CHR$(128)
 

produces this (the text is underlined on a display with highlights):

 
This is important.
 

On a display without highlights, the display enhancement characters are ignored and the line is displayed as normal text. Note that these display enhancement characters produce an action only in PRINT and DISP statements. When viewed in EDIT mode or on the system message line, these display enhancement characters appear as "hp".
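Display enhancement characters can be kept in string variables so that they are easier to reuse in the middle of a line. This is a sketch using only the two codes given above (132 and 128); the names Under$ and Plain$ are illustrative.

```basic
Under$=CHR$(132)                              ! Turns underlining on
Plain$=CHR$(128)                              ! Turns enhancements off
PRINT "Press ";Under$;"STOP";Plain$;" to halt."
DISP Under$;"Waiting";Plain$                  ! The same characters work with DISP
```

On a display with highlights, only the words between the two characters are underlined.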

Alternate CRT Characters

A control register governs the CRT mapping of character codes. Changing the contents of this register may cause different characters to be displayed.

Try the following.

 

PRINT CHR$(247)
CONTROL 1,11;1
PRINT CHR$(247)
CONTROL 1,11;0
 

The first PRINT statement produces the character expected from the character tables. The second PRINT statement may show a character (a double arrow) from an alternate character set. Note that the alternate character set is available only on some displays.

Finding "Missing" Characters

By now, you may have noticed that there are more possible CRT characters than keys on the keyboard. If your particular keyboard does not have a key for the character you need, locate the [ANY CHAR] key (every keyboard has this key).

When you press the [ANY CHAR] key, the message "Enter 3 digits, 000 to 255" appears in the lower left corner of the CRT. Enter the three digits 065, and the character whose value is 65 (the letter "A") is placed on the screen. Any character can be entered by this method. Pressing a non-digit key or entering a value outside the range cancels the function.

Predefined Lexical Order

When the LEX Binary is first loaded or after a SCRATCH A, the computer executes a LEXICAL ORDER IS STANDARD statement. This selects the correct lexical order for the language on the keyboard. The language can be checked by examining the keyboard status register (STATUS 2,8;Language) or by evaluating either of the following expressions.

 

SYSTEM$("LEXICAL ORDER IS")
SYSTEM$("KEYBOARD LANGUAGE")
 

The table below shows the language indicated by the value returned by the STATUS statement. Thus, if the value returned indicates a French keyboard, the STANDARD lexical order will be the same as the FRENCH lexical order. The STANDARD lexical order for the Katakana keyboard is ASCII.

Value  Keyboard Language                    Lexical Order
0      ASCII                                ASCII
1      FRENCH                               FRENCH
2      GERMAN                               GERMAN
3      SWEDISH                              SWEDISH
4      SPANISH (European Spanish keyboard)  SPANISH
5      KATAKANA                             KATAKANA
6      CANADIAN ENGLISH                     ASCII
7      UNITED KINGDOM                       ASCII
8      CANADIAN FRENCH                      FRENCH
9      SWISS FRENCH                         FRENCH
10     ITALIAN                              FRENCH
11     BELGIAN                              GERMAN
12     DUTCH                                GERMAN
13     SWISS GERMAN                         GERMAN
14     LATIN (Latin Spanish keyboard)       SPANISH
15     DANISH                               SWEDISH
16     FINNISH                              SWEDISH
17     NORWEGIAN                            SWEDISH
18     SWISS FRENCH*                        FRENCH
19     SWISS GERMAN*                        GERMAN
20     KANJI (see footnote 1)

1 Refer to the HP BASIC Porting and Globalization manual for information about the KANJI character set.
NOTE
The predefined lexical-order tables are found in the "Useful Tables" section in the HP BASIC Language Reference manual. For information about user-defined lexical orders, refer to the HP BASIC Advanced Programming Techniques manual.
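The checks described in this section can be combined into a few lines. This is a sketch that uses only the statements and expressions shown above, and it assumes the LEX Binary is loaded.

```basic
STATUS 2,8;Language                   ! Keyboard language value (see the table)
PRINT Language
PRINT SYSTEM$("KEYBOARD LANGUAGE")    ! Keyboard language reported as a string
LEXICAL ORDER IS STANDARD             ! Select the order that matches the keyboard
PRINT SYSTEM$("LEXICAL ORDER IS")     ! Lexical order currently in effect
```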

Either the CHR$ function or [ANY CHAR] may be used to produce characters not readily available on the keyboard.
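For example, the character entered with [ANY CHAR] and the digits 065 can also be produced with CHR$. The lines below are a brief sketch; the variable name Word$ is illustrative.

```basic
PRINT CHR$(65)                  ! Same character as [ANY CHAR] 065: A
PRINT CHR$(65);CHR$(66)         ! AB
Word$="B"&CHR$(65)&"SIC"        ! CHR$ can also be used when building a string
PRINT Word$                     ! BASIC
```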