GEMPACK Text Files


GEMPACK uses two file formats for storing data:

Header Array Files, and

GEMPACK Text Data Files

One way to turn data into Header Array format is to first prepare a special text file, called a GEMPACK text file, which ViewHAR or MODHAR can read and translate into the HAR format. ViewHAR can also write to GEMPACK Text files. Note that:

GEMPACK Text files created by ViewHAR do not store the set element information contained in some Header Array files.

ViewHAR saves real numbers to GEMPACK Text files with full (single) precision, regardless of how many decimal places are used at the time to display data on the screen.

When writing text files, ViewHAR always uses a CSV, spreadsheet format (each row is a single line, numbers separated by commas).

GEMPACK Text files can also be opened by Excel or other spreadsheet programs.

The following description of the GEMPACK Text file format has been adapted from Chapter 20 of GEMPACK document GPD-4, which should be consulted for further information. In some respects ViewHAR is more permissive than MODHAR in interpreting text data: these differences are noted below.

Arrays of data for GEMPACK text files may be prepared with a text editor or a spreadsheet. GEMPACK Text files can contain several arrays. For each array, the 'how much data' information must come first, followed by the actual data values of the array.

Numerical Data Items

The 'how much data' information can continue onto second and subsequent lines, can contain comments, and must finish with a semicolon (;). The expected format is:

<sizes> <type> <order> <header> <longname> ;

where

<sizes> is a list of 1 to 7 positive integers, giving the sizes of the array of data values,

<type> is either real or integer (if omitted, real is assumed)

<order> is either row_order, col_order or spreadsheet (if omitted, row_order is assumed)

<header> and <longname> are suggested but not compulsory (see below).

The input of text strings such as 'real' and 'header' is not case sensitive so either 'real' or 'REAL' is correct.

Examples

A 2-dimensional real array of size 2 x 3 in row order, could appear as:

2 3 real row_order header "RR03"

longname "Income by industry and region";

3.5 6.2 4.1

7.6 1.2 33.2

The header and longname are advised but optional. The actual long name (up to 70 characters long) must all be on one line of the file; neither it (nor the actual header) can be split over two lines. If no header is specified, ViewHAR will make one up (NH01, NH02, etc). Don't forget the semicolon at the end of the how much data line(s).
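
The format described above can be illustrated with a small Python sketch that splits a 'how much data' specification into its parts. This is not how ViewHAR or MODHAR read the file: the function name parse_how_much_data and the dictionary it returns are hypothetical, comments are assumed to have been removed already, and the 'strings length' form used for character data is not handled.

```python
import re

def parse_how_much_data(text):
    """Parse one 'how much data' specification (hypothetical helper).

    `text` is everything up to and including the terminating semicolon,
    with comments already removed. It may span several lines.
    """
    body = text.strip()
    if not body.endswith(";"):
        raise ValueError("specification must end with a semicolon")
    body = body[:-1]

    # Pull out the quoted header and long name first, so that their
    # contents are not mistaken for keywords or sizes.
    header = re.search(r'header\s+"([^"]*)"', body, re.IGNORECASE)
    longname = re.search(r'longname\s+"([^"]*)"', body, re.IGNORECASE)
    rest = re.sub(r'(header|longname)\s+"[^"]*"', " ", body, flags=re.IGNORECASE)

    spec = {
        "sizes": [],
        "type": "real",        # assumed when <type> is omitted
        "order": "row_order",  # assumed when <order> is omitted
        "header": header.group(1) if header else None,
        "longname": longname.group(1) if longname else None,
    }
    for word in rest.split():
        lower = word.lower()  # keywords are not case sensitive
        if lower.isdecimal():
            spec["sizes"].append(int(lower))
        elif lower in ("real", "integer"):
            spec["type"] = lower
        elif lower in ("row_order", "col_order", "spreadsheet"):
            spec["order"] = lower
        else:
            raise ValueError(f"unexpected word: {word!r}")

    sizes = spec["sizes"]
    if not 1 <= len(sizes) <= 7 or any(n < 1 for n in sizes):
        raise ValueError("expected 1 to 7 positive integer sizes")
    return spec
```

Applied to the two-line specification in the example above, the sketch returns sizes [2, 3], type real, order row_order, header RR03 and the long name as written.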

If we wanted to represent the same data in col order we would type in the transpose of the matrix:

2 3 real col_order header "RR03" longname "Income by industry and region";

3.5 7.6

6.2 1.2

4.1 33.2

In each case ViewHAR just looks for 6 numbers following the how much data information, and ignores line breaks. So the second example could also be written (if you wanted to be really confusing):

2 3 real col_order header "RR03" longname "Income by industry and region";

3.5 7.6 6.2

1.2 4.1 33.2

When reading text files, ViewHAR treats row_order and spreadsheet as synonymous. When writing text files, it always uses spreadsheet format (each row on a single line, numbers separated by commas).
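
The difference between row_order and col_order can be sketched in Python as two ways of arranging the same flat sequence of numbers. The function to_matrix is a hypothetical illustration, not part of GEMPACK; it assumes the numbers have already been read into a list and that line breaks carry no meaning, as described above for ViewHAR.

```python
def to_matrix(values, rows, cols, order="row_order"):
    """Arrange a flat list of values into a rows x cols matrix (a list of rows)."""
    if len(values) != rows * cols:
        raise ValueError("wrong number of values for the declared sizes")
    order = order.lower()
    if order in ("row_order", "spreadsheet"):
        # Values arrive one complete row after another.
        return [values[r * cols:(r + 1) * cols] for r in range(rows)]
    if order == "col_order":
        # Values arrive one complete column after another.
        return [[values[c * rows + r] for c in range(cols)] for r in range(rows)]
    raise ValueError(f"unknown order: {order!r}")


# The row_order and col_order examples above describe the same 2 x 3 matrix.
by_rows = to_matrix([3.5, 6.2, 4.1, 7.6, 1.2, 33.2], 2, 3, "row_order")
by_cols = to_matrix([3.5, 7.6, 6.2, 1.2, 4.1, 33.2], 2, 3, "col_order")
```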

For vectors, row_order and col_order have no effect and may be omitted. You may write either:

3 real header "RR06" longname "Income by region";

11.1

7.4

37.3

or

3 real header "RR06" longname "Income by region";

11.1,7.4,37.3

The latter also shows how commas may be used instead of spaces between numbers. Real numbers may be written with or without a decimal point or may be in scientific format:

18 1.22 -7 +1000 9.72E-5

Repeated values in arrays can be given in the form

<positive integer>*<real number>

For example, 20*1.35 gives 20 values each of 1.35. Don't leave spaces beside the '*' symbol, and don't include brackets around negative real numbers. For negative values the appropriate form is:

20*-1.35
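
As a rough illustration of the rules above (space or comma separators, optional decimal point, scientific format, and repeated values), the following Python sketch turns one line of data into a list of numbers. The function expand_values is hypothetical, and treating a run of several separators as a single separator is an assumption of this sketch rather than documented GEMPACK behaviour.

```python
import re

def expand_values(line):
    """Turn one data line into a list of floats, expanding n*value repeats."""
    values = []
    # Numbers may be separated by spaces, commas, or both.
    for token in re.split(r"[,\s]+", line.strip()):
        if not token:
            continue
        if "*" in token:
            # Repeated value: <positive integer>*<real number>, no spaces.
            count, value = token.split("*", 1)
            values.extend([float(value)] * int(count))
        else:
            values.append(float(token))
    return values
```

For example, expand_values("20*-1.35") gives a list of twenty values, each equal to -1.35.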

There is a limit of 16000 numbers per line, even when the repeated values scheme is used. Thus, the following is illegal:

150 150 real spreadsheet header "BIG1" longname "Many Sevens";

22500*7.0
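
One way to stay within the limit is to write each row on its own line. The Python sketch below generates such lines for a matrix filled with a single value; the function name constant_matrix_lines is hypothetical, and the limit of 16000 is simply the figure quoted above.

```python
MAX_NUMBERS_PER_LINE = 16000  # limit quoted in the text above

def constant_matrix_lines(rows, cols, value, limit=MAX_NUMBERS_PER_LINE):
    """Return one data line per row, each using the repeated values scheme."""
    if cols > limit:
        raise ValueError("a single row would exceed the per-line limit")
    return [f"{cols}*{value}" for _ in range(rows)]


# The 150 x 150 array of sevens becomes 150 lines of 150 numbers each.
lines = constant_matrix_lines(150, 150, 7.0)
```

Each line then reads 150*7.0, which is well within the limit.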

Comments

Any part of an input line following a single exclamation mark ! will be ignored. If the exclamation mark is at the start of the line, the whole line will be ignored. Each comment finishes at the end of the line it starts on but can be continued by putting ! at the start of the next line. Blank lines are also ignored (except in character data, see below). Comments can be included anywhere amongst the data values for real or integer data (ViewHAR is a bit more permissive than MODHAR here). For example:

103 real header "ARM1" longname "Armington Elasticities"; ! added 14/11/97

22*1.2 ! ag and mining

50.0 ! crude oil

40*2.9 ! manufactures

40*0.0 ! non-traded
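
The comment rules for numerical data can be sketched in Python as follows. The helpers strip_comment and data_lines are hypothetical names; the sketch assumes that no exclamation mark appears inside a quoted long name, and it must not be applied to character data, where blank lines are significant.

```python
def strip_comment(line):
    """Drop everything from the first '!' to the end of the line."""
    return line.split("!", 1)[0].rstrip()

def data_lines(text):
    """Yield the non-blank lines of numerical data with comments removed."""
    for line in text.splitlines():
        cleaned = strip_comment(line)
        if cleaned.strip():
            yield cleaned


sample = """103 real header "ARM1" longname "Armington Elasticities"; ! added 14/11/97

22*1.2 ! ag and mining

50.0 ! crude oil

40*2.9 ! manufactures

40*0.0 ! non-traded
"""
cleaned = list(data_lines(sample))
```

After cleaning, the repeat counts add up to 22 + 1 + 40 + 40 = 103 values, matching the declared size.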

Multi-dimensional Matrices

When reading real matrices of 3 or more dimensions from text files, ViewHAR follows the same rules as MODHAR. You can read about these rules in Chapter 20 of GEMPACK document GPD-4.

Suppose you had a matrix dimensioned 5x4x3. Your how much data line would read:

5 4 3 real header "IM3D" longname "special data";

followed by 3 matrices each of 5 rows and 4 columns.
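
The layout just described can be sketched in Python. The function write_3d is hypothetical, and it assumes that the third index selects the matrix, the first index the row and the second index the column, which is one reading of the sentence above; compare the result against GPD-4 or a file written by ViewHAR before relying on it.

```python
def write_3d(array, header, longname):
    """Format array[i][j][k] as text: one 2-D block of rows per value of k."""
    n1, n2, n3 = len(array), len(array[0]), len(array[0][0])
    lines = [f'{n1} {n2} {n3} real header "{header}" longname "{longname}";']
    for k in range(n3):          # third index changes slowest: one matrix each
        for i in range(n1):      # one line per row of the matrix
            lines.append(" ".join(str(array[i][j][k]) for j in range(n2)))
    return "\n".join(lines)


# A 5 x 4 x 3 array whose entries encode their own position (100*i + 10*j + k).
demo = [[[100 * i + 10 * j + k for k in range(3)] for j in range(4)] for i in range(5)]
text = write_3d(demo, "IM3D", "special data")
```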

If this seems confusing, consider the following alternative suggestions:

Read in the data as two dimensional slices into separate headers and combine them into one using a TABLO program.

Use ViewHAR to write out a 3-D header as a GEMPACK text file, and then use this text file as a pattern to follow.

In ViewHAR, create a blank 3-D array of appropriate dimensions. In Excel, prepare the various 2-D slices as you find convenient. Then, using the ViewHAR combo boxes to present a view of your blank matrix which corresponds to the Excel data, paste in the actual numbers from Excel.

Integer Data

An integer vector might be written:

7 integer header "AGOC" longname "occ-to-aggocc mapping";

1,1,2,2,2,3,4

GEMPACK only allows integer arrays to have at most two dimensions. Otherwise, ViewHAR treats integer text data the same as real data. Even the following is legal:

4 integer header "pqrt" ;

22, 3, 4.2, 10

ViewHAR will round the 4.2 to 4. However, it is poor practice to rely on this feature!
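
Rather than relying on rounding, integer data can be checked before it is written to a text file. The Python sketch below is one hypothetical way to do this; the function parse_integers is not part of GEMPACK and rejects values such as 4.2 instead of rounding them.

```python
def parse_integers(line):
    """Read integer data from one line, rejecting values with a fractional part."""
    values = []
    # Commas and spaces are both accepted as separators.
    for token in line.replace(",", " ").split():
        number = float(token)
        if not number.is_integer():
            raise ValueError(f"non-integer value in integer data: {token}")
        values.append(int(number))
    return values
```

With this check, the line 1,1,2,2,2,3,4 is accepted, while 22, 3, 4.2, 10 raises an error.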

Character Data

Only vectors of character data are allowed, not matrices. For example:

3 strings length 12 header "YUMM" longname "Ice cream flavours";

Vanilla

Chocolate

Strawberry

The how much data format is:

<n1> strings length <n2> <header> <longname> ;

where <n1> and <n2> are positive integers specifying the number of strings and their maximum length. There should be one character string per line: they cannot be broken and continued on the next line. Don't have blank lines after the how much data information: they will be interpreted as character strings filled with blanks. Officially, comments are not allowed in character data. However, if the strings are declared in the 'how much data' section to have length 40 (for example), any characters in columns 41 and onwards of each data line are ignored.
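
The handling of character data described above can be sketched in Python. The function read_strings is hypothetical; padding each string with blanks to the declared length is an assumption of this sketch, chosen so that a blank line becomes a string filled with blanks as the text describes.

```python
def read_strings(lines, count, length):
    """Take the first `count` lines after the specification as character strings."""
    if len(lines) < count:
        raise ValueError("fewer lines than declared strings")
    # Blank lines are kept, not skipped: each line is one string.
    # Characters beyond the declared length are ignored.
    return [line[:length].ljust(length) for line in lines[:count]]


flavours = read_strings(["Vanilla", "Chocolate", "Strawberry"], 3, 12)
```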

Other methods of saving data to file are:

Excel (XLS) format

GDX (GAMS) files

Database text files



URL of this topic: www.copsmodels.com/webhelp/viewhar/hc_gemtext.htm
