To aid in the conversion of environments from the old PDP software to
the format used in PDP++, and for generally importing training and
testing data represented in plain text files, we have provided
functions on the Environment that read and write `.pat' files in
the old PDP format. These functions are called ReadOldPDP and
WriteOldPDP.
The format that these functions read and write is very simple, consisting of a sequence of numbers, with an (optional) event name at the beginning of the line. When reading in a file, ReadOldPDP simply reads in numbers sequentially for each pattern in each event, so the layout of the numbers is not critical. If the optional name is to be used, it must appear at the beginning of the line that starts a new event.
For example, in the old PDP software, the "xor.pat" file for the XOR example looks like this:
p00 0 0 0
p01 0 1 1
p10 1 0 1
p11 1 1 0
It is critical that the EventSpec and its constituent PatternSpecs (see section 12.2 Events, Patterns and their Specs) are configured in advance for the correct number of values in the pattern file. The event spec for the above example would contain two PatternSpecs. The PatternSpecs would look like:
PatternSpec[0] {
  type = INPUT;
  to_layer = FIRST;
  n_vals = 2;
};
PatternSpec[1] {
  type = TARGET;
  to_layer = LAST;
  n_vals = 1;
};
With this configuration, the first two values (n_vals = 2) are read into the first (input) pattern, and the third value (n_vals = 1) is read into the last (output) pattern.
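The way n_vals divides up an event's values can be illustrated with a minimal Python sketch. This is not PDP++ code; the function name split_patterns is hypothetical, and the list [2, 1] simply mirrors the n_vals settings of the two PatternSpecs above.

```python
def split_patterns(values, n_vals):
    """Split one event's flat value list into per-pattern lists.

    n_vals gives the number of values each pattern takes, in order,
    mirroring the n_vals field of each PatternSpec (here [2, 1]).
    """
    if len(values) != sum(n_vals):
        raise ValueError(
            "expected %d values, got %d" % (sum(n_vals), len(values)))
    patterns = []
    start = 0
    for n in n_vals:
        patterns.append(values[start:start + n])
        start += n
    return patterns


# One line of the XOR file: an event name followed by three values.
fields = "p01 0 1 1".split()
name = fields[0]
values = [float(f) for f in fields[1:]]
input_pattern, target_pattern = split_patterns(values, [2, 1])
```

If the specs and the file disagree (for instance, four values per event against specs that total three), the sketch raises an error, which is the kind of mismatch that configuring the EventSpec in advance is meant to avoid.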
The ReadOldPDP function also allows comments in the .pat files, as it
skips over lines beginning with # or //. Further, ReadOldPDP allows
input to be split on different lines, since it will read numbers until
it gets the right number for each pattern.
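The following Python sketch shows one way such a reader could behave: comment lines are dropped, and values are pulled from a stream of tokens until enough have been read, regardless of line breaks. The names data_tokens and read_values are hypothetical, and tolerating leading whitespace before a comment marker is an assumption of this sketch, not a documented property of ReadOldPDP.

```python
from itertools import islice


def data_tokens(text):
    """Yield whitespace-separated tokens, skipping comment lines."""
    for line in text.splitlines():
        stripped = line.lstrip()  # assumption: leading blanks are ignored
        if stripped.startswith("#") or stripped.startswith("//"):
            continue
        for token in stripped.split():
            yield token


def read_values(tokens, count):
    """Take the next `count` tokens as numbers, wherever the lines break."""
    return [float(t) for t in islice(tokens, count)]


sample = "# XOR patterns\n// name, then three values\np00 0\n0 0\n"
tokens = data_tokens(sample)
name = next(tokens)               # event name at the start of the event
values = read_values(tokens, 3)   # spans two physical lines
```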
There is a special set of comments you can use to control the creation
and organization of subgroups of events. To start a new subgroup, put
the comment # startgroup before the pattern lines for the events in
your subgroup. When you are done with a subgroup, put the comment
# endgroup after the patterns of the last event in that subgroup. For
example, if you wanted 2 groups of 3 events you might have a file that
looked like this:
# startgroup
p01 0 0 0
p02 0 1 1
p03 0 1 0
# endgroup
# startgroup
p11 1 0 1
p12 1 1 0
p13 1 1 1
# endgroup
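As a rough illustration of how the group comments partition a file, here is a Python sketch that collects events into subgroups. It is not the ReadOldPDP implementation: the function name parse_grouped_events is hypothetical, and the sketch assumes, for simplicity, that each event occupies exactly one line of the form "name v1 v2 ...".

```python
def parse_grouped_events(text):
    """Return (ungrouped, groups) from pattern text with group comments.

    Simplifying assumption: each event sits on one line as "name v1 v2 ...".
    """
    ungrouped = []   # events seen outside any startgroup/endgroup pair
    groups = []      # one list of events per subgroup
    current = None   # the subgroup being filled, or None outside a group
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("//"):
            continue
        if stripped.startswith("#"):
            directive = stripped[1:].strip()
            if directive == "startgroup":
                current = []
                groups.append(current)
            elif directive == "endgroup":
                current = None
            continue  # any other '#' line is an ordinary comment
        fields = stripped.split()
        event = (fields[0], [float(f) for f in fields[1:]])
        if current is None:
            ungrouped.append(event)
        else:
            current.append(event)
    return ungrouped, groups


sample = """# startgroup
p01 0 0 0
p02 0 1 1
p03 0 1 0
# endgroup
# startgroup
p11 1 0 1
p12 1 1 0
p13 1 1 1
# endgroup
"""
ungrouped, groups = parse_grouped_events(sample)
group_names = [[name for name, _ in group] for group in groups]
```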
WriteOldPDP simply produces a file in the above format for all of the
events in the environment on which it is called. This can be useful
for exporting to other programs, or for converting patterns into a
different type of environment. For example, if events were created
originally in a regular Environment, but you now want to associate a
frequency with them, then you can use WriteOldPDP to save the regular
events to a file, and then use ReadOldPDP to read them into a FreqEnv,
which will enable a frequency to be attached to them.
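For readers exporting to or from other programs, the output side of the format can be sketched in Python as follows. The function name format_events and the in-memory representation of an event as a (name, patterns) pair are assumptions made for illustration; the exact number formatting that WriteOldPDP uses is not specified here, so the sketch simply prints whole values without a decimal point.

```python
def format_events(events):
    """Render (name, patterns) events as pattern text, one event per line."""
    lines = []
    for name, patterns in events:
        # Flatten the patterns in order: input values first, then target.
        values = [v for pattern in patterns for v in pattern]
        fields = [name] + [format(v, "g") for v in values]
        lines.append(" ".join(fields))
    return "\n".join(lines) + "\n"


xor_events = [
    ("p00", [[0.0, 0.0], [0.0]]),
    ("p01", [[0.0, 1.0], [1.0]]),
    ("p10", [[1.0, 0.0], [1.0]]),
    ("p11", [[1.0, 1.0], [0.0]]),
]
text = format_events(xor_events)
```

The resulting text has the same layout as the xor.pat example, so a file written this way could be read back by any reader of the format that is configured with matching pattern sizes.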
For Environments that are more complicated than a simple list of events
(e.g., if groups of events are used), it is possible to use CSS to
import text files of these events. Example code for reading events
structured into subgroups is included in the distribution as
`css/include/read_event_gps.css', and can be used as a starting point
for reading other formats. The key function that makes writing these
kinds of functions in CSS easy is ReadLine, which reads one line of
data from a file and puts it into an array of strings, which can then
be manipulated, converted into numbers, etc. This is much like the
`awk' utility.
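The line-at-a-time, split-into-fields style of processing described above can be shown with a short Python analogue. This is not CSS and does not reproduce the actual signature of ReadLine; the helper name read_line and its convention of returning None at end of file are assumptions of the sketch.

```python
import io


def read_line(stream):
    """Read one line and return its whitespace-separated fields.

    Returns None at end of file (a convention chosen for this sketch).
    """
    line = stream.readline()
    if line == "":
        return None
    return line.split()


stream = io.StringIO("p01 0 1 1\np10 1 0 1\n")
events = []
fields = read_line(stream)
while fields is not None:
    if fields:  # skip blank lines
        # fields[0] is the event name; the rest are converted to numbers.
        events.append((fields[0], [float(f) for f in fields[1:]]))
    fields = read_line(stream)
```

As with awk, each line arrives already split into fields, so the import logic reduces to indexing into the array and converting the strings that should be numbers.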
The read_event_gps.css example assumes that it will be read into a
Script object in a project, with three s_args values that control the
parameters of the expected format. Note that these parameters could
instead be put at the top of the data file, and read in from there at
the start.