You can load data directly into GenStat from ASCII files that use the
following standard layout:
- The data must be arranged in columns
- There must be the same number of columns in each row
- The data may be preceded by a line giving the names of the columns
- The data can be numbers or strings. Strings containing spaces or other separators must be enclosed in single quotes (').
- Numerical data will be stored as variates, and strings will be stored in texts. Data can also be grouped automatically to form factors.
You can start the file with comments describing the data. Comments should be enclosed in double quotes ("), and the opening quote must
be at the very beginning of the line. An example ASCII data file illustrates how a
data file can be arranged.
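The layout rules above can be illustrated outside GenStat with a short Python sketch. It is not GenStat's own reader: the function names (strip_comments, tokens, read_table) and the sample data are hypothetical, and it assumes that anything after a comment's closing quote on the same line can be ignored and that every token that parses as a number is numeric.

```python
import re

# A token is either a single-quoted string (which may contain spaces)
# or a run of non-whitespace characters.
TOKEN = re.compile(r"'([^']*)'|(\S+)")

def strip_comments(lines):
    """Drop comments that open with a double quote at the start of a line."""
    in_comment = False
    for line in lines:
        if in_comment:
            # Assumption: the comment ends on the line holding the closing quote.
            if '"' in line:
                in_comment = False
            continue
        if line.startswith('"'):
            in_comment = '"' not in line[1:]
            continue
        yield line

def tokens(line):
    """Split one line into values, honouring single-quoted strings."""
    return [quoted if bare == "" else bare for quoted, bare in TOKEN.findall(line)]

def to_value(token):
    """Numbers become floats; everything else stays a string."""
    try:
        return float(token)
    except ValueError:
        return token

def read_table(text, names_in_file=True):
    """Return (column names or None, list of rows) for text in the layout above."""
    lines = [ln for ln in strip_comments(text.splitlines()) if ln.strip()]
    rows = [tokens(ln) for ln in lines]
    names = rows.pop(0) if names_in_file and rows else None
    width = len(names) if names else (len(rows[0]) if rows else 0)
    for row in rows:
        if len(row) != width:
            raise ValueError("every row must have the same number of columns")
    return names, [[to_value(t) for t in row] for row in rows]

# Hypothetical sample file: a comment, a line of names, two data rows.
sample = "\n".join([
    '"Yields from a hypothetical trial"',
    "site yield variety",
    "'North Farm' 4.2 A",
    "Hill 3.9 B",
])
names, rows = read_table(sample)
```

Here names holds the three column names, and the first row keeps 'North Farm' as a single string value because it is enclosed in single quotes.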
ASCII Data Filename
Specifies the name of the data file. You can browse for a filename by clicking on the Browse
button.
View of Data File
Provides a view of the contents of the file that can be used to help complete the options for this
menu.
Display Data in a Spreadsheet
When selected, the data will first be loaded into GenStat, then a new
spreadsheet will be opened containing the data read from the file.
Read Column Names From File
This option should be checked if the column names are included at the head of
the data file.
Names for Data Columns
If the data file does not contain the names of the columns, you should specify
them in this input field. Type in a list of names separated by spaces or commas.
The number of names should match the number of columns; otherwise the data
may not be read in correctly (the data are read in parallel, as described
in the FILEREAD procedure).
Automatically Group Data
When selected, columns that contain fewer distinct values than
specified in the Maximum Number of Categories field will
be automatically converted into factors, with levels or labels created as appropriate.
Maximum Number of Categories
Specifies the maximum number of categories to use when automatically forming
factors from the data.
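The grouping rule can be sketched in Python as follows. This is an illustration only, not GenStat's implementation: the function names are hypothetical, the comparison is taken literally from the description above ("fewer distinct values than" the specified maximum), and missing values (represented here as None) are assumed not to count as a category.

```python
def should_be_factor(column, max_categories):
    """True if the column has fewer distinct values than max_categories."""
    distinct = {v for v in column if v is not None}
    return len(distinct) < max_categories

def to_factor(column):
    """Return (levels or labels, 1-based group codes) for a column."""
    levels = sorted({v for v in column if v is not None})
    codes = [levels.index(v) + 1 if v is not None else None for v in column]
    return levels, codes

# Hypothetical columns: one categorical, one continuous.
variety = ["A", "B", "A", "C"]
yields = [1.5, 2.7, 3.1, 4.4]

variety_is_factor = should_be_factor(variety, 10)  # 3 distinct values
yields_is_factor = should_be_factor(yields, 3)     # 4 distinct values
levels, codes = to_factor(variety)
```

With these inputs the variety column qualifies as a factor with labels A, B and C, while the yields column stays a variate because it has too many distinct values.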
Missing Value Indicator
Specifies the character used to indicate missing data values. Values that begin
with this character will be read in as 'missing'. For example, if '-' is the
missing value indicator, any negative number will be stored as a missing value.
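The prefix rule described here is easy to overlook, so a minimal Python sketch may help. The function name parse_value is hypothetical, and None stands in for GenStat's missing value; the sketch assumes the check is made on the raw text of each value before any numeric conversion.

```python
def parse_value(token, missing_indicator):
    """Return None for a missing value, a float for a number, else the string."""
    # Any value that begins with the indicator is treated as missing.
    if missing_indicator and token.startswith(missing_indicator):
        return None
    try:
        return float(token)
    except ValueError:
        return token

# With '-' as the indicator, negative numbers are lost as missing values.
with_minus = [parse_value(t, "-") for t in ["3.5", "-2.0", "-", "north"]]
# With '*' as the indicator, the negative number survives.
with_star = [parse_value(t, "*") for t in ["3.5", "-2.0", "*", "north"]]
```

Comparing the two results shows why the indicator should be a character that never starts a genuine data value.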
Data Separator
Specifies the character used to separate data values in the file, for example
a comma (,). White space (spaces, tabs, etc.) can always be used in addition
to the specified Data Separator.
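As a rough illustration of how a separator and white space can both delimit values, the following Python sketch splits a line on either. It is an assumption-laden simplification: split_values is a hypothetical name, quoted strings are not handled, and two adjacent separators are assumed to produce an empty value, which may not match GenStat's behaviour.

```python
import re

def split_values(line, separator=","):
    """Split a line on the separator (with optional surrounding white space)
    or on runs of white space alone."""
    pattern = r"\s*" + re.escape(separator) + r"\s*|\s+"
    return re.split(pattern, line.strip())

mixed = split_values("1, 2 3,4")             # commas and spaces mixed
semicolons = split_values("a;b c", ";")      # a different separator
```

Both calls yield one value per item, whether the items were separated by the chosen character, by white space, or by both.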
Display
Specifies information to be displayed when loading data from the file. Items
that are checked will be printed in the Output Window and should be examined
carefully to ensure that the data have been read correctly.
See Also
- FILEREAD procedure and READ
directive for reading ASCII data using commands