Chapter 4. The Sound subsystem

Table of Contents
The SCI0 Sound Resource Format
Mapping instruments in FreeSCI

The SCI0 Sound Resource Format

by Ravi I.

Revision 8, Dec. 21, 2000

Preface

Sierra's SCI0 sound resources contain the music and sound effects played in the game. With the introduction of SCI, the company took advantage of new sound hardware which allowed for far better music than the traditional PC speaker could ever create. Sierra chose two devices to specifically target: the MT-32, and the Adlib. The MT-32 is a MIDI synth while the Adlib is a less expensive card based around the OPL2, a non-MIDI chip. Anyone interested in Sierra music and its history can find information at the Sierra Soundtrack Series (http://www.queststudios.com).

Music is stored as a series of MIDI events, and the sound resource is basically just a MIDI file. Much of what I write here comes from outside documents, and I would direct anyone seeking more information to a MIDI specification. The MIDI resource page (http://www.midi.org/resource.htm) on midi.org has some good links as well as tables with status and control information.

Some people prefer the one-based numbering system for channel and program numbers. I personally prefer the zero-based system, and use it here. If you're familiar with channels 1-16, be aware that I will call them 0-15. My intention is not to deviate from other programs but to represent more accurately the way information gets stored. The same is true for programs 0-127 as opposed to 1-128. For whatever reason, convention already holds that controls be numbered 0-127, so nothing in my treatment of them should be abnormal.

Sierra changed its sound file format in the switch to SCI1. I refer only to SCI0 sound files in this specification. Hybrid interpreters such as the one used for Quest for Glory II are also excluded. Finally, SCI games written for non-DOS systems may have different formats. This document applies to Sierra's IBM games.

Please post comments or questions to the SCI webboard: http://www.InsideTheWeb.com/mbs.cgi/mb173941

You can contact me personally at ravi.i@softhome.net, but I would prefer that SCI messages be posted on the webboard so everyone can see them.

Sound Devices

A gamer's sound hardware greatly affects how the music will sound. Devices used by SCI0 can be broken into 3 categories:

MIDI Synths

These will generally give the best sound quality. MIDI synths are polyphonic with definable instruments through patch files and full support for MIDI controls. The General MIDI standard had not been written when Sierra began writing SCI games, and as far as I know no SCI0 game uses a GM driver or includes a GM track. This means that synths had to be individually supported.

Non-MIDI Synths

Generally not as good as MIDI synths, but also less expensive. The OPLx family of chips is still very common among home PC users thanks to the Adlib and SoundBlaster cards. These synths are polyphonic with definable instruments through patch files, but drivers must be written to interpret MIDI events and turn them into commands the hardware will recognize. Support for most sound controls gets lost in the process. Furthermore, drivers must map logical, polyphonic MIDI channels to physical, monophonic hardware channels. A specific control was introduced for this purpose and will be discussed later. There is no common way of accessing these devices, so they must be individually supported. These days, however, most people have an Adlib-compatible card.

Beepers

Beepers produce very poor music and don't support instrument definitions, but all PC users have one so supporting them covers people without special sound hardware. The most common device is the PC speaker, which is monophonic. Another is the Tandy speaker with 3 channels. Drivers must interpret MIDI events, but need only concern themselves with basic functionality. Interpreting the MIDI events is also made easier because each channel is monophonic. To play a chord on the Tandy, for example, each voice must be put in a separate MIDI channel.

With such a diverse group of devices to support, Sierra put a lot of the work on the shoulders of the drivers. Functions for loading patch files, handling events, pausing, etc. are all in the drivers. The interpreter calls them as needed but does not concern itself at all with how they get implemented.

Listed here are devices supported by the SCI0 interpreter with a little information about each. There could very well be other hardware not listed here, so please send in any missing information. Also, since the interpreter is made to be device independent, this list could be easily expanded. The process would require that a new driver be written and a patch file created if appropriate. Since the driver is responsible for choosing which patch file to use and then entering it into the device, the second step would be quite easy.

Device Name             Driver   Patch   Poly   Flag
Roland MT-32            mt32     001     32     01h
Adlib                   adl      003     9      04h
PC Speaker              std      *       1      20h
Tandy 1000 or PCJr      jr       *       3      +
Tandy 1000 SL, TL       tandy    *       3      +
IBM Music Feature       imf      002     8
Yamaha FB-01            fb01     002     8
CMS or Game Blaster     cms      101     12
Casio MT540 or CT460    mt540    004     10
Casio CSM-1                      007
Roland D110,D10,D20              000

(thanks to Shane T. for providing some of this). Blank fields are unknown, not unused.

* when asked which patch to load, the PC and Tandy speaker drivers return FFFFh, which is a signal that they do not use patches
+ the Tandy drivers almost certainly use 10h for their play flag, but this is unconfirmed so I'll leave it out for now

The driver field is the file name of the driver without the drv extension. Patch specifies which patch file the driver requests. Poly is the maximum number of voices which can be playing at once as reported by the driver. The play flag specifies which channels the device will play and gets explained in the header section.

File Format

Sound files follow the same format as all extracted SCI0 resources: the first two bytes of the file contain a magic number identifying the resource type and the rest of the file contains a dump of the uncompressed data. The identifier is the resource type (04h for sound) OR-ed with 80h and stored as a word. The result will be 84h 00h in extracted sound files.
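As a minimal sketch of the identifier check described above, the following Python function tests whether a file begins with the sound resource identifier. The function and constant names are hypothetical, and the little-endian byte order is inferred from the "84h 00h" example rather than taken from any interpreter source.

```python
SOUND_RESOURCE_TYPE = 0x04  # resource type for sound, as stated in the text


def is_extracted_sound(data: bytes) -> bool:
    """Return True if data begins with the extracted sound identifier."""
    if len(data) < 2:
        return False
    # The identifier is a 16-bit word; 84h 00h implies little-endian storage.
    ident = int.from_bytes(data[:2], "little")
    return ident == (SOUND_RESOURCE_TYPE | 0x80)
```

The uncompressed sound data would then start at offset 2.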

The sound resource data itself is a header with channel initialization followed by a series of MIDI events.

Header

The header provides the sound driver with 2 pieces of information about each channel. The first is a byte which specifies how many voices each logical MIDI channel will be playing. For MIDI synths, this information is not really necessary and is probably ignored. The same goes for beepers. This byte is only useful for non-MIDI synths which must know how many hardware channels each logical MIDI channel will need. This value is only an initial setting. Sound files can request changes to the mapping later with control changes. Requesting more hardware channels than are actually available can cause errors on some drivers.

The second byte describes how the user's sound hardware should treat the channel. It is a combination of bit flags which may be OR-ed together. If the appropriate bit is set for the currently selected sound device, the channel will be played. If it is not, the channel will be silent. The driver decides which bit it will use as the play flag, and the table under Sound Devices lists the flag used by each driver. Drivers ignore the first byte (used to request hardware channels) on MIDI channels they won't play.

There is an exception to the above. Channel 9 is the MIDI percussion channel. The MT-32 (and possibly other MIDI synths) always plays this channel, regardless of whether or not it is specifically flagged.
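The play flag test, including the channel 9 exception, might be sketched as follows. The function name and the always_plays_percussion parameter are assumptions made for illustration; the flag values are the confirmed ones from the table under Sound Devices.

```python
# Play flags confirmed in the device table.
MT32_FLAG = 0x01
ADLIB_FLAG = 0x04
PC_SPEAKER_FLAG = 0x20

PERCUSSION_CHANNEL = 9  # zero-based MIDI percussion channel


def channel_is_played(channel, flags, device_flag,
                      always_plays_percussion=False):
    """Decide whether a device plays a channel, given its header flag byte.

    always_plays_percussion models the MT-32 behaviour described above and
    is a hypothetical parameter, not part of any real driver interface.
    """
    if always_plays_percussion and channel == PERCUSSION_CHANNEL:
        return True
    return bool(flags & device_flag)
```

For example, a channel whose flag byte is 05h would be played by both the MT-32 (01h) and the Adlib (04h), but not by the PC speaker (20h).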

Before the channel initialization is a byte which specifies whether the header holds information for 15 or 16 channels. A value of 0 means that there will be 16 channels and a value of 2 means that there will be 15 channels. All other values are undefined and will render sound files unplayable.

The header format:

1 byte - number of channels code (usually 0, can also be 2)
2 bytes - initialization for channel 0
2 bytes - initialization for channel 1
.
.
.
2 bytes - initialization for channel 15

Notice that the space for channel 15's initialization will always be present. If the number of channels code is 2, the last two bytes of the header will be ignored, but they are still in the resource. The header is always 33 bytes in length.
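The header layout above can be read with a few lines of Python. This is only a sketch: parse_header is a hypothetical name, the input is assumed to be the resource data with the two-byte identifier already removed, and raising an error for undefined codes is one possible reaction to an unplayable file.

```python
HEADER_SIZE = 33  # 1 code byte + 16 channels * 2 bytes, always present


def parse_header(data: bytes):
    """Return (channels, events).

    channels is a list of (voices, flags) tuples, one per channel covered
    by the header; events is everything after the 33-byte header.
    """
    if len(data) < HEADER_SIZE:
        raise ValueError("sound data is shorter than the 33-byte header")
    code = data[0]
    if code == 0:
        channel_count = 16
    elif code == 2:
        channel_count = 15  # channel 15's two bytes exist but are ignored
    else:
        raise ValueError(f"undefined number of channels code: {code}")
    channels = []
    for ch in range(channel_count):
        voices = data[1 + 2 * ch]  # first byte: voices requested
        flags = data[2 + 2 * ch]   # second byte: play flags
        channels.append((voices, flags))
    return channels, data[HEADER_SIZE:]
```

Slicing at a fixed offset of 33 reflects the point made above: the space for channel 15 is skipped even when its contents are ignored.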

Events

The actual music is stored in a series of events. The generic form for an event is:

<byte - delta time> [byte - status] [byte - p1 [p2]]

Delta time is the number of ticks to wait after executing the previous event before executing this event. Standard MIDI stores this as a variable length value. In sound resources, it is generally one byte, and the most significant bit is in fact used as part of the value. In cases where a single byte cannot hold a high enough delta time, sound resources use F8h one or more times in a row. Each F8h byte causes a delay of F0h (240 decimal) ticks before continuing playback. The sequence F8 F8 78 FC waits 600 ticks (240 + 240 + 120) then stops the sequence because of the FCh status. The fact that F8h waits F0h ticks makes me think that EFh is the largest technically accepted delta time, even though larger values will work. The only exception is the FCh status, which can theoretically be given without a preceding delta time value: wherever FCh appears in the delta time position, the sound should stop playing. Ticks occur at 60 Hz, which is about 16667 microseconds between ticks.
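A delta time reader following the rules above could look like the sketch below. The function name and the three-part return value are assumptions made for this example; real drivers are written in assembly and are organized differently.

```python
def read_delta(data: bytes, pos: int):
    """Read a delta time starting at data[pos].

    Returns (ticks, new_pos, stop). stop is True when FCh was found in the
    delta time position, in which case new_pos still points at the FCh.
    """
    ticks = 0
    # Each F8h byte adds a delay of F0h (240) ticks.
    while pos < len(data) and data[pos] == 0xF8:
        ticks += 0xF0
        pos += 1
    if pos >= len(data):
        raise ValueError("sound data ended inside a delta time")
    if data[pos] == 0xFC:
        return ticks, pos, True
    # Any other byte is taken as a plain value, top bit included.
    return ticks + data[pos], pos + 1, False
```

For the sequence F8 F8 78 FC this yields 600 ticks and leaves the position on the FCh status byte. At 60 ticks per second, 600 ticks is a ten second wait.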

The status byte is basically a command. The most significant bit is always set. This feature is important because the status byte will not always be present. If you read a byte expecting it to be a status byte but the most significant bit is not set, that byte is actually a parameter and you should repeat the last status used. This is known as running status mode and appears to get used relatively often.
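Running status can be handled by peeking at the next byte before consuming it. The sketch below assumes a hypothetical read_status helper that is given the last status seen (or None at the start of the sound).

```python
def read_status(data: bytes, pos: int, last_status):
    """Return (status, new_pos), applying running status when needed."""
    byte = data[pos]
    if byte & 0x80:
        # Most significant bit set: this really is a status byte.
        return byte, pos + 1
    if last_status is None:
        raise ValueError("parameter byte found before any status byte")
    # Otherwise the byte is a parameter: reuse the previous status and
    # leave the position untouched so the parameter is read next.
    return last_status, pos
```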

The generic form for a status byte is (in bits) 1xxxcccc. The lower nibble designates which channel the message affects, except when the upper nibble is 15 (Fh). The upper nibble is the command, but as stated earlier, the most significant bit must be 1. That leaves space for 8 messages, most of which require at least one parameter. Parameters will never have their most significant bit set.
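Splitting a status byte into its two nibbles is a matter of a shift and a mask. In this sketch the names are hypothetical, and the parameter counts are simply those shown in the Status Reference that follows.

```python
# Number of parameter bytes per command nibble, per the Status Reference.
PARAM_COUNT = {0x8: 2, 0x9: 2, 0xA: 2, 0xB: 2, 0xC: 1, 0xD: 1, 0xE: 2}


def split_status(status: int):
    """Return (command, channel); channel is None for Fx messages."""
    if not 0x80 <= status <= 0xFF:
        raise ValueError("a status byte must have its top bit set")
    command = status >> 4
    # For Fx messages the lower nibble selects the message, not a channel.
    channel = None if command == 0xF else status & 0x0F
    return command, channel
```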

Status Reference

8x n v

Note off: Stop playing note n on channel x, releasing the key with velocity v. If a hold pedal is pressed, the note will continue to play after this status is received and end when the pedal is released. A zero velocity note on can also be used to stop playing a note.

9x n v

Note on: Play note n with velocity v on channel x. The velocity is the speed with which the key gets pressed, which basically means how loud the note should be played. Playing a note with velocity 0 is a way of turning the note off.

Ax n p

Key pressure (after-touch): Set key pressure to p for note n on channel x. This is to modify key pressure for a note that is already playing.

Bx c s

Control: Set control c to s on channel x. This can be confusing because there isn't just one meaning. Changing the settings on different controls will, of course, have different outcomes.

Controls which handle any value are continuous controllers. They have a continuous range. Controls which are only on/off are switches. Their defined range is only 00h (OFF) and 7Fh (ON). However, in order to respond to all values, 00h-3Fh is treated as OFF and 40h-7Fh is treated as ON. While in practice they may only use bit 6 as a flag, my personal opinion is that values strictly between 00h and 7Fh should be avoided for the sake of clarity.
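For a standard MIDI switch, the split at 40h amounts to testing bit 6 of the control value. A short sketch, with a hypothetical function name:

```python
def switch_is_on(value: int) -> bool:
    """Interpret a standard MIDI switch value: below 40h off, 40h-7Fh on."""
    if not 0 <= value <= 0x7F:
        raise ValueError("control values are 7-bit")
    return bool(value & 0x40)  # bit 6 acts as the on/off flag
```

This applies to standard switches only; Sierra's own control 4Ch treats any non-zero value as on.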

Listed in this reference are the non-standard MIDI controls I've found in Sierra SCI0 sound files. Not all drivers support all controls.

Control Reference

4Bh

Channel mapping: When a channel sets this control, it tells the driver how many notes it will be playing at once, and therefore how many hardware channels it occupies.

4Ch

Reset on SUSPEND: An on/off switch where a value of zero is off and a non-zero value is on. Note that this is not the same as for standard MIDI control switches. When this control is on, calling the sound driver's SUSPEND subfunction will reset the sound position to the beginning. The initial value is set to off when a sound gets loaded.

4Eh

Unknown: Experiments in setting and clearing it show that a value of 0 will cause notes to be played without regard for the velocity parameter while a value of 1 will enable velocities.

50h

Reverb: I know little about this myself. Rickard Lind reports that it exists in the MT-32 driver and supports parameter values 0-10 (possibly 0-16?).

60h

Cumulative cue: The interpreter can get cues from the sound file, which sets the Sound object's signal property. When a sound gets loaded, the initial cue is set to 127. When a CC60 occurs, the new control value is added to the current cue. If the cue were 130, for example, a CC60 5 on any channel would make the new cumulative cue equal 135.
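The cumulative cue arithmetic can be illustrated with a tiny class. CueTracker is a made-up name, and the sketch makes no assumption about an upper limit or wraparound because the text above does not describe one.

```python
class CueTracker:
    """Tracks the cumulative cue changed by control 60h (illustrative)."""

    INITIAL_CUE = 127  # value set when a sound gets loaded

    def __init__(self):
        self.cue = self.INITIAL_CUE

    def control_60(self, value: int) -> int:
        """Add the control value to the cue and return the new cue."""
        self.cue += value
        return self.cue
```

Starting from 127, a CC60 3 gives 130, and a following CC60 5 gives 135, matching the example above.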

Cx p

Program change: Set program (patch / instrument / etc.) to p for channel x. This is a simple instrument change.

Channel 15, however, includes two special cases of this status. If the new program is less than 127 the Sound object's signal property is set to the new program, making a non-cumulative cue. If the new program is equal to 127, then that exact position (not the start of the current tick) is set as the loop point. Normally, the driver loops to the beginning of the sound. If an explicit loop point is set, the sound will be replayed from the marked time instead.

The actual time of the loop point is better explained with a short diagram:

0x10 0x91 0x20 0x20 play a note on channel 1
0x05 0x91 0x20 0x00 stop the previous note
0x00 0x92 0x30 0x10 play a note on channel 2
[restart here]
0x00 0xCF 0x7F set loop point
0x00 0xC8 0x05 set to program 5 on channel 8
0x00 0xCF 0x13 set signal to 19
0x20 0xFC end of file, loop to marked location

In both situations (p < 127 and p = 127), no actual program change takes place. Channel 15 is used for control, not playing music.
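The three outcomes of a Cx message can be sketched as one dispatch function. The state dictionary and its field names are hypothetical, and position stands for whatever marker of the current location in the sound data the caller keeps.

```python
def new_state():
    """Hypothetical playback state; field names are illustrative only."""
    return {"programs": [0] * 16, "signal": None, "loop_point": 0}


def handle_program_change(state, channel, program, position):
    """Apply a Cx message, including the channel 15 special cases."""
    if channel != 15:
        state["programs"][channel] = program  # ordinary instrument change
    elif program == 127:
        state["loop_point"] = position        # mark this exact position
    else:
        state["signal"] = program             # non-cumulative cue
```

Run against the diagram above, the CF 7F event records the loop point, C8 05 selects program 5 on channel 8, and CF 13 sets the signal to 19 without touching any program.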

Dx p

Pressure (after-touch): Set key pressure to p on channel x. This is similar to Ax but differs in its scope. Message Ax is applied on a per-note basis while message Dx is applied to an entire channel.

Ex t b

Pitch wheel: Set the pitch wheel to tb. The setting is actually a 14 bit number with the least significant 7 bits stored in b and the most significant 7 bits stored in t. (Remember the top bit can't be used for either byte.) The range of settings is 0000h to 3FFFh. A setting of 2000h means the pitch wheel is centered. Larger values raise pitch and smaller values lower it.
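Combining the two 7-bit parameters into the 14-bit setting, exactly as described above (t holding the most significant bits), might be written as follows. The function name is hypothetical.

```python
PITCH_CENTER = 0x2000  # pitch wheel centered


def pitch_wheel_value(t: int, b: int) -> int:
    """Combine parameters t (upper 7 bits) and b (lower 7 bits)."""
    if not (0 <= t <= 0x7F and 0 <= b <= 0x7F):
        raise ValueError("parameters must be 7-bit values")
    return (t << 7) | b
```

With t = 40h and b = 00h the result is 2000h, the centered position.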

F0

Begin SysEx: Starts a system exclusive data block. The block must terminate with F7h.

F7

End SysEx: Ends a system exclusive data block. Normal sound data resumes at this point.
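A parser that does not interpret system exclusive data still has to step over it. The sketch below assumes a hypothetical read_sysex helper called with the position just past the F0h byte.

```python
def read_sysex(data: bytes, pos: int):
    """Return (payload, new_pos) for a SysEx block.

    pos is the index just after the F0h byte; new_pos is the index just
    after the terminating F7h byte.
    """
    end = data.find(b"\xf7", pos)
    if end < 0:
        raise ValueError("SysEx block is not terminated by F7h")
    return data[pos:end], end + 1
```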

FC

Stop Sequence: This is a system real-time message which tells the sound driver to stop the current sound. The sound object's signal property gets set to FFFFh and the position moves to the loop point, which defaults to the beginning. Drivers allow this message to occur without a delta time, but I haven't seen any examples.
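The effect of FCh on playback state can be summarized in a few lines. The state dictionary and its keys are assumptions made for this sketch, not names from any driver.

```python
SIGNAL_STOPPED = 0xFFFF  # value given to the signal property on FCh


def handle_stop(state: dict) -> None:
    """Apply the FCh message to a hypothetical playback state."""
    state["signal"] = SIGNAL_STOPPED
    # The loop point defaults to the beginning of the sound (position 0).
    state["position"] = state.get("loop_point", 0)
```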

Revision history

Revision 8 - Dec. 21, 2000

  • Added suggested limit on delta time values

  • Fixed hex notation (sometimes listed NNh, sometimes 0xNN)

  • Removed notice about early revisions' mistake describing the header's channel mapping byte

  • Added note about control 50h (thanks to Rickard Lind)

  • Listed MT-32 play flag

  • Added notice about the special case of channel 9 to the header section

Revision 7 - Jan. 7, 2000

  • Added information about F8h delta times (thanks to Rickard Lind for bringing these to my attention)

  • Reorganized Fx status information

  • Fixed major error in description of loop points (sorry)

  • Fixed typos

Revision 6 - Sep. 17, 1999

  • Added information about cues

  • Updated control 60h information

  • Added information about loop points

  • Updated control 4Ch information

  • Cleaned up control reference introduction

Revision 5 - Jul. 5, 1999

  • Rewrote much of the specification, trying to focus less on explaining MIDI and more on explaining sound resources

  • Removed information about standard MIDI controls

  • Added driver table

  • Expanded sound device section

  • Completed header information

Revision 4 - Jun. 19, 1999

  • Fixed the list of changes in Revision 3 (was incomplete)

  • Expanded the introductory blurb about controls

  • I began working with a disassembly of ADL.DRV, and am hoping to use it to complete this specification. The next revision should be more interesting than this one.

Revision 3 - May 4, 1999

  • Removed the "compatible games" list. I haven't found a non-compatible SCI0 game yet, which makes the list quite useless.

  • Verified that SCI1 sound resources are different.

  • Tidied the "About the output medium" section. Does that term "output medium" sound wordy or unclear? I don't really like it, but I didn't want to beat "sound device" to death.

  • More information about the header

  • Modified the explanation for message FCh.

  • Changed most references to status bytes from "commands" to "messages" to stay more consistent with MIDI terminology.

  • Added midi.org as a source for more MIDI information

  • Removed labels like "tentative" and "incomplete" as things become more concrete -- not complete yet, but getting there.

  • More information about controls

Revision 2 - Jan. 16, 1999

  • Got rid of the HTML. I originally intended to post this as a message on the webboard, but ended up distributing the file. If I'm going to distribute it as a file, there's no need to bother with the HTML since I can do all my formatting as plain text.

  • I found references to command 8x in the 1988 Christmas Card, so my comment about not seeing one got removed. To date, I haven't seen any examples of commands Ax or Dx.

  • Expanded the header section.

  • Added information about controls.

  • Added information about the output mediums.

  • Tried to be more consistent with terminology

Revision 1 - Dec. 29, 1998

  • First release of the specification