A-Law is a companded compression algorithm for voice signals defined by the Geneva Recommendations (G.711). The G.711 recommendation defines A-Law as a method of encoding 16-bit PCM signals into a nonlinear 8-bit format. The algorithm is commonly used in European telecommunications. A-Law is very similar to µ-Law; however, each uses a slightly different coder and decoder.
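The companding curve itself can be sketched in a few lines. This is a hedged illustration of the continuous A-law transfer function (with the standard compression parameter A = 87.6), not the quantized 8-bit G.711 codec; the function name is illustrative:

```python
import math

def alaw_compress(x, A=87.6):
    """Map a normalized sample x in [-1, 1] through the A-law companding curve."""
    sign = -1.0 if x < 0 else 1.0
    x = abs(x)
    if x < 1.0 / A:
        y = A * x / (1.0 + math.log(A))          # linear segment near zero
    else:
        y = (1.0 + math.log(A * x)) / (1.0 + math.log(A))  # logarithmic segment
    return sign * y

# Quiet signals are boosted (gaining resolution after 8-bit quantization),
# while full-scale input still maps to full scale.
assert abs(alaw_compress(1.0) - 1.0) < 1e-9
assert alaw_compress(0.01) > 0.01
```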
The acoustic signature of a system is data containing all of the sound characteristics of a system. This includes such things as reverb time, frequency response and other timbral qualities. Impulse files used by Acoustic Mirror can be thought of as acoustic signatures.
This number is based on the Computer ID number of the computer on which Sound Forge software is installed. Each computer has a unique number, similar to a license plate. An activation number is created based on that number. When you register the software, Sony will generate an activation number for you. Once the activation number is entered, the software will not time out. Since the activation number is based on the Computer ID, it is important that you have Sound Forge software installed on the computer where you will be using it.
A Microsoft technology that enables different programs to share information. ActiveX extends Microsoft Windows-based architecture to include Internet and corporate intranet features and capabilities. Developers use it to build user interactivity into programs and World Wide Web pages.
Adaptive Delta Pulse Code Modulation (ADPCM)
A method of compressing audio data. Although the theory for compression using ADPCM is standard, there are many different algorithms employed. For example, Microsoft’s ADPCM algorithm is not compatible with the ADPCM approved by the Interactive Multimedia Association (IMA).
Advanced Streaming Format (ASF)
See Windows Media Format.
A type of distortion that occurs when digitally recording high frequencies with a low sample rate. For example, in a motion picture, when a car’s wheels appear to slowly spin backward while the car is quickly moving forward, you are seeing the effects of aliasing. Similarly, when you try to record a frequency greater than one half of the sampling rate (the Nyquist Frequency), instead of hearing a high pitch, you may hear a low-frequency rumble.
To prevent aliasing, an anti-aliasing filter is used to remove high frequencies before recording. Once the sound has been recorded, aliasing distortion is impossible to remove without also removing other frequencies from the sound. This same anti-aliasing filter must be applied when resampling to a lower sample rate.
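The effect is easy to demonstrate numerically. In this sketch (the rates are hypothetical, chosen for round numbers), a 900 Hz tone sampled at 1,000 Hz produces exactly the same samples as an inverted 100 Hz tone, so the two are indistinguishable after recording:

```python
import math

sr = 1000                     # sample rate in Hz (illustrative)
f_high, f_alias = 900, 100    # 900 Hz is above the Nyquist frequency (500 Hz)

high = [math.sin(2 * math.pi * f_high * n / sr) for n in range(64)]
alias = [math.sin(2 * math.pi * f_alias * n / sr) for n in range(64)]

# Sample by sample, the 900 Hz tone equals an inverted 100 Hz tone:
# the recorder cannot tell them apart, so 900 Hz "folds down" to 100 Hz.
assert all(abs(h + a) < 1e-9 for h, a in zip(high, alias))
```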
Amplitude Modulation (AM) is a process whereby the amplitude (loudness) of a sound is varied over time. When varied slowly, a tremolo effect occurs. If the frequency of modulation is high, many side frequencies are created that can strongly alter the timbre of a sound.
When discussing audio, this term refers to a method of reproducing a sound wave with voltage fluctuations that are analogous to the pressure fluctuations of the sound wave. This is different from digital recording in that these fluctuations are infinitely varying rather than discrete changes at sample time. See Quantization.
The attack of a sound is the initial portion of the sound. Percussive sounds (drums, piano, guitar plucks) are said to have a fast attack. This means that the sound reaches its maximum amplitude in a very short time. Sounds that slowly swell up in volume (soft strings and wind sounds) are said to have a slow attack.
Audio Compression Manager (ACM)
The Audio Compression Manager, from Microsoft, is a standard interface for audio compression and signal processing for Windows. The ACM can be used by Windows programs to compress and decompress .wav files.
Audio Event Locator
The Audio Event Locator is similar to a scrub function. However, rather than playing the sound file at a slow speed, it loops playback around the cursor position. This position can be selected by dragging the cursor around in the Sound Forge Overview window.
Audio Interchange File Format (AIFF)
An audio file format developed by Apple Computer.
ASF Stream Redirector file. See Redirector file.
A decrease in the level of a signal.
When discussing audio equalization, each frequency band has a width associated with it that determines the range of frequencies that are affected by the EQ. An EQ band with a wide bandwidth will affect a wider range of frequencies than one with a narrow bandwidth.
When discussing network connections, bandwidth refers to the rate of signals transmitted or the amount of data that can be transmitted in a fixed amount of time (stated in bits/second): a 56 Kbps network connection is capable of receiving 56,000 bits of data per second.
The baseline of a waveform is also referred to as the zero-amplitude axis or negative infinity. In the following image, the red line represents the baseline.
Beats Per Minute (BPM)
The tempo of a piece of music can be written as a number of beats in one minute. If the tempo is 60 BPM, a single beat will occur once every second.
The most elementary unit in digital systems. Its value can only be 1 or 0, corresponding to a voltage in an electronic circuit. Bits are used to represent values in the binary numbering system. As an example, the 8-bit binary number 10011010 represents the unsigned value of 154 in the decimal system. In digital sampling, a binary number is used to store individual sound levels, called samples.
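The binary-to-decimal conversion in the example above can be checked directly:

```python
# Interpret the 8-bit pattern from the example above as an unsigned integer.
bits = "10011010"
value = int(bits, 2)          # 128 + 16 + 8 + 2
assert value == 154           # matches the decimal value given in the text
```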
The number of bits used to represent a single sample. For example, 8- or 16-bit are common sample sizes. While 8-bit samples take up less memory (and hard disk space), they are inherently noisier than 16- or 24-bit samples.
Memory used as an intermediate repository in which data is temporarily held while waiting to be transferred between two locations. A buffer ensures that there is an uninterrupted flow of data between computers. Media players may need to rebuffer when there is network congestion.
A virtual pathway where signals from tracks and effects are mixed. A bus’s output is a physical audio device in the computer from which the signal will be heard.
Refers to a set of 8 bits. An 8-bit sample requires one byte of memory to store, while a 16-bit sample takes two bytes of memory to store.
The Channel Converter is a function that converts files from mono to stereo and stereo to mono with independent level control of the new channels. This function can also create interesting effects by converting stereo files to stereo with various levels and inversion of channels.
The Channel Meters in Sound Forge software display the peak output levels of the sound file currently playing. These meters have selectable resolution and options to hold peaks and valleys.
Chorusing is an effect created by combining a signal with a modulating, delayed copy of itself. This effect creates the illusion of multiple sources creating the same sound.
The clipboard is where sample data is saved when you cut or copy it from a data window. You can then paste, mix, or crossfade the sample data stored on the clipboard with another data window. This sample data can also be used by other Windows applications that support Sound data on the clipboard, such as Sound Recorder.
Occurs when the amplitude of a sound is above the maximum allowed recording level. In digital systems, clipping is seen as a clamping of the data to a maximum value, such as 32,767 in 16-bit data. Clipping causes sound to distort.
Coder/Decoder: refers to any technology for compressing and decompressing data. The term codec can refer to software, hardware, or a combination of both technologies.
A compression ratio controls the ratio of input to output levels above a specific threshold. This ratio determines how much a signal has to rise above the threshold for every 1 dB of increase in the output. For example, with a ratio of 3:1, the input level must increase by three decibels to produce a one-decibel output-level increase:
Threshold = -10 dB
Compression Ratio = 3:1
Input = -7 dB
Output = -9 dB
Because the input is 3 dB louder than the threshold and the compression ratio is 3:1, the resulting signal is 1 dB louder than the threshold.
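The worked example can be expressed as a small function operating on levels in dB. This is a sketch of the level arithmetic only; the function name is illustrative, not a Sound Forge API:

```python
def compressed_output(input_db, threshold_db=-10.0, ratio=3.0):
    """Return the output level for a given input level (all values in dB).

    Defaults match the worked example: threshold -10 dB, ratio 3:1.
    """
    if input_db <= threshold_db:
        return input_db                      # below threshold: unaffected
    # Above threshold, each `ratio` dB of input yields 1 dB of output.
    return threshold_db + (input_db - threshold_db) / ratio

# Input 3 dB over the -10 dB threshold comes out 1 dB over it:
assert compressed_output(-7.0) == -9.0
```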
Compression Ratio (file size)
The ratio of the size of the original uncompressed file to the compressed contents. For example, a 3:1 compression ratio means that the compressed file is one-third the size of the original.
Each computer has a unique number, similar to a license plate. An activation number is created based on that number. Since the activation number is based on the Computer ID, it is important that you have Sound Forge software installed on the computer where you will be using it. The Computer ID is automatically detected and provided to you when you install the software.
The Computer ID is used for registration purposes only. It doesn’t give Sony access to any personal information and can’t be used for any purpose other than for generating a unique activation number for you to use the software.
Mixing two pieces of audio by fading one out as the other fades in:
Sometimes a sample loop cannot be easily created from the given source material. In these instances, a crossfade can be applied to the beginning and end of the loop to aid in the smooth transition between the two. The Crossfade Loop function provides a method of creating sampling loops in material that is otherwise difficult to loop.
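A linear crossfade can be sketched as follows. This is illustrative only; real implementations (including loop crossfades) often use equal-power curves instead of linear ramps:

```python
def crossfade(tail, head):
    """Fade out `tail` while fading in `head` over the same number of samples.

    A minimal linear sketch; assumes both inputs are the same length.
    """
    n = len(tail)
    assert len(head) == n and n > 1
    return [tail[i] * (1 - i / (n - 1)) + head[i] * (i / (n - 1))
            for i in range(n)]

# At the start only the outgoing audio is heard; at the end, only the incoming.
out = crossfade([1.0, 1.0, 1.0], [0.5, 0.5, 0.5])
assert out[0] == 1.0 and out[-1] == 0.5
```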
The cutoff frequency of a filter is the frequency at which the filter changes its response. For example, in a low-pass filter, frequencies greater than the cutoff frequency are attenuated, while frequencies less than the cutoff frequency are not affected.
Each opened sound file has its own data window. At the top of each data window is a title bar displaying either the title of the sample or the name of the file. Also in each data window are the waveform display, time and level rulers, playbar and other tools that give you information and allow you to navigate throughout the entire sound file.
DC offset occurs when hardware, such as a sound card, adds DC current to a recorded audio signal. This current results in a recorded waveform that is not centered around the baseline (-infinity). Glitches and other unexpected results can occur when sound effects are applied to files that contain DC offsets. Sound Forge software can compensate for this DC offset by adding a constant value to the samples in the sound file.
In the following example, the red line represents the baseline. The lower waveform exhibits DC offset; note that the waveform is centered approximately 2 dB above the baseline.
A unit used to represent a ratio between two numbers using a logarithmic scale. For example, when comparing the numbers 14 and 7, you could say 14 is two times greater than the number 7; or you could say 14 is 6 dB greater than the number 7. Where did we pull that 6 dB from? Engineers use the equation dB = 20 x log (V1/V2) when comparing two instantaneous values. Decibels are commonly used when dealing with sound because the ear perceives loudness in a logarithmic scale.
In Sound Forge software, most measurements are given in decibels. For example, if you want to double the amplitude of a sound, you apply a 6 dB gain. A sample value of 32,767 (maximum positive sample value for 16-bit sound) can be referred to as having a value of 0 dB. Likewise, a sample value of 16,384 can be referred to having a value of -6 dB.
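Both figures can be checked with the formula quoted above; the function name here is illustrative:

```python
import math

def db(v1, v2):
    """Ratio of two amplitude values expressed in decibels: 20 * log10(v1/v2)."""
    return 20.0 * math.log10(v1 / v2)

assert round(db(14, 7), 1) == 6.0          # doubling an amplitude is ~6 dB
assert round(db(16384, 32767), 1) == -6.0  # half of 16-bit full scale is ~-6 dB
```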
A program that enables Windows to connect different hardware and software. For example, a sound card device driver is used by Windows software to control sound card recording and playback.
Destructive editing is the type of editing whereby all cuts, deletes, mixes and other processes are actually processed to the sound file. Any time you delete a section of a sound file in Sound Forge software, the sound file on disk is actually rewritten without the deleted section. This is different than nondestructive editing.
Digital Rights Management (DRM)
A system for delivering songs, videos, and other media over the Internet in a file format that protects copyrighted material. Current proposals include some form of certificates that validate copyright ownership and restrict unauthorized redistribution.
Digital Signal Processing (DSP)
A general term describing anything that alters digital data. Signal processors have existed for a very long time (tone controls, distortion boxes, wah-wah pedals) in the analog (electrical) domain. Digital Signal Processors alter the data after it has been digitized by using a combination of programming and mathematical techniques. DSP techniques are used to perform many effects such as equalization and reverb simulation.
Since most DSP is performed with simple arithmetic operations (additions and multiplications), both your computer’s processor and specialized DSP chips can be used to perform any DSP operation. The difference is that DSP chips are optimized specifically for mathematical functions while your computer’s microprocessor is not. This results in a difference in processing speed.
A set of Application Program Interfaces designed by Microsoft for multimedia development. A DirectX plug-in, such as the Sony Noise Reduction DirectX Plug-In, uses the DirectX Media Streaming Services (DMSS) API. Because DMSS is a standard API, a DirectX plug-in can be used in any application that supports DMSS.
Dithering is the practice of adding noise to a signal to mask quantization noise.
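A minimal sketch of the idea, assuming triangular (TPDF) dither of one quantization step; that is a common choice, though the entry above does not specify a noise distribution:

```python
import random

def quantize(x, step):
    """Round x to the nearest multiple of step (the bare quantizer)."""
    return step * round(x / step)

def dithered_quantize(x, step, rng=random.random):
    """Add triangular-PDF dither before quantizing.

    The added noise decorrelates the rounding error from the signal,
    trading signal-correlated distortion for a steady, less noticeable hiss.
    """
    dither = (rng() - rng()) * step   # triangular PDF, range (-step, +step)
    return quantize(x + dither, step)
```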
Drag and Drop
A quick way to perform certain operations using the mouse. To drag and drop, you click and hold a highlighted selection, drag it (hold the left mouse button down and move the mouse) and drop it (let go of the mouse button) at another position on the screen.
The difference between the maximum and minimum signal levels. It can refer to a musical performance (high-volume vs. low-volume signals) or to electrical equipment (peak level before distortion vs. noise floor).
Endian (Little and Big)
Little Endian and Big Endian describe the ordering of multi-byte data used by a computer’s microprocessor. Little Endian specifies that data is stored in a low-to-high byte format; this ordering is used by Intel microprocessors. Big Endian specifies that data is stored in a high-to-low byte format; this ordering is used by Motorola microprocessors.
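Python’s `int.to_bytes` makes the two orderings easy to compare:

```python
# The 16-bit value 0x0102 stored in each byte order.
value = 0x0102

little = value.to_bytes(2, "little")   # low byte first (Intel style)
big = value.to_bytes(2, "big")         # high byte first (Motorola style)

assert little == b"\x02\x01"
assert big == b"\x01\x02"
# Reading the bytes back with the matching order recovers the same value.
assert int.from_bytes(little, "little") == int.from_bytes(big, "big") == value
```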
Equalizing a sound file is a process by which certain frequency bands are raised or lowered in level.
A Fourier Transform is the mathematical method used to convert a waveform from the Time Domain to the Frequency Domain.
Since the Fourier Transform is computationally intensive, it is common to use a technique called a Fast Fourier Transform (FFT) to perform spectral analysis. The FFT uses mathematical shortcuts to lower the processing time at the expense of putting limitations on the analysis size.
The analysis size, also referred to as the FFT size, indicates the number of samples from the sound signal used in the analysis and also determines the number of discrete frequency bands. When a high number of frequency bands are used, the bands have a smaller bandwidth, which allows for more accurate frequency readings.
This dialog allows you to associate sound file extensions (such as .wav, .au, .snd, etc.) with Sound Forge software. This dialog is opened from the File tab of the Preferences dialog.
A file format specifies the way in which data is stored. In Windows, the most common audio file format is the Microsoft .wav format. For information on the different file formats supported by Sound Forge software, click here.
Audio uses frame rates only for the purposes of synchronizing to video or other audio. To synchronize with audio, a rate of 30 non-drop is typically used. To synchronize with video, 30 drop is usually used.
Frequency Modulation (FM)
Frequency Modulation (FM) is a process by which the frequency (pitch) of a sound is varied over time. Subaudio frequency modulation results in pitch-bending effects (vibrato). Frequency modulation within audio band frequencies (20 Hz – 20,000 Hz) creates many different side-band frequencies that drastically alter the timbre of the sound.
Frequency Modulation (FM) Synthesis
This type of synthesis relies on the principles of Frequency Modulation. The FM Synthesis tool allows you to use frequency modulation (FM) and additive synthesis to create complex sounds from simple waveforms.
In frequency modulation, the frequency of a waveform (the carrier) is modulated by the output of another waveform (the modulator) to create a new waveform. If the frequency of the modulator is low, the carrier will be slowly detuned over time. However, if the frequency of the modulator is high, the carrier will be modulated so quickly that many additional frequencies, or sidebands, are created.
In Sound Forge software, up to four waveforms (operators) can be used in a variety of configurations. Depending on the configuration, an operator can be a carrier, a modulator, or a simple, unmodulated waveform.
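A minimal two-operator patch (one modulator bending one carrier) can be sketched as below. The frequencies and modulation index are hypothetical, and this is a bare illustration of the principle, not the Sound Forge FM Synthesis tool itself:

```python
import math

def fm_sample(n, sr=44100, carrier=440.0, modulator=110.0, index=2.0):
    """One sample of a two-operator FM voice.

    The modulator's output is added to the carrier's phase; raising
    `index` deepens the modulation and adds more sidebands.
    """
    t = n / sr
    return math.sin(2 * math.pi * carrier * t
                    + index * math.sin(2 * math.pi * modulator * t))

tone = [fm_sample(n) for n in range(1024)]
assert all(-1.0 <= s <= 1.0 for s in tone)   # output of sin() stays bounded
```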
The frequency spectrum of a signal refers to its range of frequencies. In audio, the audible frequency range is between 20 Hz and 20,000 Hz. The frequency spectrum sometimes refers to the distribution of these frequencies. For example, bass-heavy sounds have a large frequency content in the low end (20 Hz – 200 Hz) of the spectrum.
Head-Related Transfer Function (HRTF)
Sounds are perceived differently depending on the direction the sound comes from. This occurs because of the echoes bouncing from your shoulders and nose and the shape of your ears. A head-related transfer function contains the frequency and phase response information required to make a sound appear to originate from a certain direction in 3-dimensional space.
The unit of measurement for frequency or cycles per second (CPS).
A high-pass filter attenuates all frequencies below a cutoff frequency. It is usually used to remove low-frequency rumble from audio files.
The insertion point (also referred to as the cursor position) is analogous to the cursor in a word processor. It is where markers or commands may be inserted depending on the operation. The insertion point appears as a vertical flashing black line and can be moved by clicking the left mouse button anywhere in the data window.
Telecine is the process of converting 24 fps (cinema) source to 30 fps video (television) by adding pulldown fields. Inverse telecine, then, is the process of converting 30 fps (television) video to 24 fps (cinema) by removing pulldown.
InterVoice Sound File Support
The InterVoice sound file format (.IVC), commonly used in telephony applications, is now supported and includes G.711 µ-Law and A-Law, G.721 ADPCM (32 kb/s) and G.723 ADPCM (24 kb/s) data formats.
Inverting sound data reverses the polarity of a waveform around its baseline. Inverting a waveform does not change the sound of a file; however, when you mix different sound files, phase cancellation can occur, producing a “hollow” sound. Inverting one of the files can prevent phase cancellation.
In the following example, the red line represents the baseline, and the lower waveform is the inverted image of the upper waveform.
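Polarity inversion and the resulting phase cancellation are easy to demonstrate on a few hypothetical sample values:

```python
# A hypothetical snippet of 16-bit sample data.
samples = [0, 8000, 12000, -4000, -9000]

inverted = [-s for s in samples]          # flip polarity around the baseline

# On its own, the inverted file sounds identical to the original.
# Mixed 1:1 with the original, however, the two cancel completely:
mixed = [a + b for a, b in zip(samples, inverted)]
assert all(m == 0 for m in mixed)
```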
Limiting is essentially hard compression. It is often used to keep signals from going above a certain level, but it can also be applied to create heavily compressed effects. Limiting should only be performed on peaks; if the Threshold level is set too low, heavy distortion will occur.
Loops are small audio clips that are designed to create a repeating beat or pattern. Loops are usually one to four measures long.
A low-pass filter attenuates all frequencies above a cutoff frequency. Low-pass filters can be used as anti-alias filters or for general tonal shaping.
A marker is an anchored, accessible reference point in a file. Markers are stored in the Regions List and can be used for quick navigation.
Media Control Interface (MCI)
A standard way for Windows programs to communicate with multimedia devices such as sound cards and CD players. If a device has an MCI device driver, it can easily be controlled by most multimedia Windows software.
Microsoft Sound Mapper
The Sound Mapper is a special device that attempts to select the most appropriate sound card (map) on which to play a sound, or it will translate the sound into a format that can be played on your sound card.
Mid-side (MS) recording is a microphone technique in which one mic is pointed directly towards the source to record the center (mid) channel, and the other mic is pointed 90 degrees away from the source to record the stereo image. For proper playback on most systems, MS recordings must be converted to a standard left/right (also called AB) track.
MIDI (Musical Instrument Digital Interface)
MIDI is a standard language of control messages that provides for communication between any MIDI-compliant devices. Anything from synthesizers to lights to factory equipment can be controlled via MIDI. Sound Forge software uses MIDI to trigger sound file playback, transfer audio data to samplers and synchronize to external software or gear.
MIDI allows for 16 discrete channels for sending data. When dealing with MIDI triggers, Sound Forge software needs to know what MIDI channel to look at for receiving the trigger. The channel this information is sent to in Sound Forge software depends on the device sending the MIDI messages.
A MIDI device-specific timing reference. It is not absolute time like MIDI Time Code (MTC); instead it is a tempo-dependent number of “ticks” per quarter note. MIDI clock is convenient for synchronizing devices that need to perform tempo changes mid-song.
MIDI controllers are a specific type of MIDI message. Sound Forge software can use MIDI controllers to trigger events and playback of sound files. Consult your MIDI sending device to see what controller messages it sends.
MIDI notes are a specific type of MIDI message. Sound Forge software can use MIDI notes to trigger events and playback of sound files. Any MIDI sequencer or controller will send MIDI notes.
A MIDI port is the physical MIDI connection on a piece of MIDI hardware. This port can be a MIDI in, out or through. Your computer must have a MIDI-capable card to output MIDI time code to an external device or to receive MIDI time code from an external device.
MIDI Time Code (MTC)
MTC is an addendum to the MIDI 1.0 specification and provides a way to specify absolute time for synchronizing MIDI-capable applications. MTC is essentially a MIDI representation of SMPTE time code.
Mixing allows multiple sound files to be blended into one file at user-defined relative levels.
Multiple-bit-rate encoding (also known as Intelligent Streaming for the Windows Media platform and SureStream for the RealMedia G2 platform) allows you to create a single file that contains streams for several bit rates. A multiple-bit-rate file can accommodate users with different Internet connection speeds, or these files can automatically change to a different bit rate to compensate for network congestion without interrupting playback.
To take advantage of multiple-bit-rate encoding, you must publish your media files to a Windows Media server or a RealServer G2.
Musical Instrument Digital Interface (MIDI)
A standard language of control messages that provides for communication between any MIDI-compliant devices. Anything from synthesizers to lights to factory equipment can be controlled via MIDI. Sound Forge software uses MIDI for synchronization purposes.
Noise-shaping is a technique that can minimize the audibility of quantization noise by shifting its frequency spectrum. For example, in 44,100 Hz audio, quantization noise is shifted towards the Nyquist Frequency of 22,050 Hz.
This type of editing involves a pointer-based system of keeping track of edits. When you delete a section of audio in a nondestructive system, the audio on disk is not actually deleted. Instead, a set of pointers is established to tell the program to skip the deleted section during playback.
Refers to raising the volume so that the highest level sample in the file reaches a user-defined level. Use normalization to make sure you are using all of the dynamic range available to you.
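A sketch of peak normalization for 16-bit data; the function name and target peak are illustrative:

```python
def normalize(samples, peak=32767):
    """Scale samples so the loudest one reaches `peak` (16-bit full scale here)."""
    loudest = max(abs(s) for s in samples)
    gain = peak / loudest
    return [round(s * gain) for s in samples]

quiet = [100, -8000, 4000]
loud = normalize(quiet)
assert max(abs(s) for s in loud) == 32767   # the full dynamic range is now used
```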
The Nyquist Frequency (or Nyquist Rate) is one half of the sample rate and represents the highest frequency that can be recorded using the sample rate without aliasing. For example, the Nyquist Frequency of 44,100 Hz is 22,050 Hz. Any frequencies higher than 22,050 Hz will produce aliasing distortion in the sample if no anti-aliasing filter is used while recording.
Object Linking and Embedding (OLE)
OLE is a technology developed by Microsoft to allow independent applications to behave as though they are tightly integrated. This allows objects such as Sound Forge audio files to be integrated into other applications such as a Microsoft Word document.
The Overview is the area on the data window directly under the title bar. The entire length of the overview represents the entire sound file. Cursor, selection, and position information is shown relative to the entire length of the sound file.
One-shots are RAM-based audio clips that are not designed to loop. Things such as cymbal crashes and sound bites could be considered one-shots. Longer files can be treated as one-shots if your computer has sufficient memory.
To place a mono or stereo sound source perceptually between two or more speakers.
Pause time is the space between CD tracks. This space may contain silence — as in a standard commercially produced CD — or can contain audio — as in a live performance captured on CD.
The Red Book standard calls for two seconds of pause time, but you can edit the default pause time on the CD Settings tab of the Preferences dialog.
The file created by Sound Forge software when a file is opened for the first time. This file stores the information regarding the graphic display of the waveform so that opening a file is almost instantaneous. This file is stored in the directory where the audio file resides and has an .sfk extension. If this file is not in the same directory as the audio file or is deleted, it will be recalculated the next time you open the file.
The pixel aspect ratio determines whether pixels are square (a value of 1.0), as on computer displays, or rectangular (values other than 1.0), as is typical of televisions. The pixel aspect ratio is unrelated to the frame’s aspect ratio.
The Playlist is a list of regions set to play in a specific order. The Playlist allows for nondestructive editing and rearranging of a sound file quickly and easily. Multiple versions of the playlist can be saved in an external playlist file for easy comparison.
Pre-roll is the amount of time elapsed before an event occurs. Post-roll is the amount of time after the event. Pre and post-roll have various uses in Sound Forge software. Pre-roll can be added to a crossfade preview to listen to the sound before the crossfade begins to give context to it. Pre-roll can also be used in the Playlist to hear previous regions when playback is initiated from the middle of the Playlist.
A preset calls up a bulk setting of a function in Sound Forge software. If you like the way you adjusted the EQ but do not want to have to spend the time getting it back for later use, save it as a preset. All presets show up in the drop-down list on the top of most function dialogs in Sound Forge software.
Punching-in during recording means automatically starting and stopping recording at user-specified times.
In telecine conversion, fields are added to convert 24 fps film to 30 fps video.
In 2-3 pulldown, for example, the first frame is scanned into two fields, the second frame is scanned into three fields, and so on for the duration of the film. 2-3 pulldown is the standard for NTSC broadcasts of 24p material. Use 2-3 pulldown when printing to tape, but not when you intend to use the rendered video as source media. Removing 2-3 pulldown is inefficient because the pulldown fields that are created for frame 3 span two frames:
24 fps film (top) and resulting NTSC video with 2-3 pulldown fields (bottom)
Use 2-3-3-2 pulldown when you plan to use your rendered video as source media. When removing 2-3-3-2 pulldown, Sound Forge software simply discards frame three and merges the pulldown fields in the remaining frames:
24 fps film (top) and resulting NTSC video with 2-3-3-2 pulldown fields (bottom)
Pulse Code Modulation (PCM)
PCM is the most common representation of uncompressed audio signals. This method of coding yields the highest fidelity possible when using digital storage. PCM is the standard format for .wav and .aif files.
Compact disc players use the Q channel to display the music playing time. The Q channel is broken down into three modes:
Mode 1: Contains the running times from both the beginning of the disc (total disc time) and the beginning of the track (track relative time).
Mode 2: Identifies the track number, who recorded the track, where it was recorded and in what year.
Mode 3: Identifies the UPC/media catalog number for the disc.
A special mode of Q data is stored within the lead-in area. This Q data contains information on two- or four-channel format, copy protection, and pre-emphasis.
Quantization is the process by which measurements are rounded to discrete values. Specifically with respect to audio, quantization is a function of the analog-to-digital conversion process. The continuous variation of the voltages of an analog audio signal is quantized to discrete amplitude values represented by digital, binary numbers. The number of bits available to describe these values determines the resolution or accuracy of quantization. For example, if you have 8-bit analog-to-digital converters, the varying analog voltage must be quantized to 1 of 256 discrete values; a 16-bit converter has 65,536 values.
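An 8-bit quantizer can be sketched as a rounding step. This is a simplified model of what an analog-to-digital converter does, not any particular hardware:

```python
def quantize_8bit(voltage):
    """Map a normalized analog value in [-1.0, 1.0] onto one of 256 levels."""
    level = round((voltage + 1.0) / 2.0 * 255)   # 0..255
    return max(0, min(255, level))               # clamp out-of-range input

assert quantize_8bit(-1.0) == 0
assert quantize_8bit(1.0) == 255
# Nearby voltages collapse onto the same discrete level -- the rounding
# error they share is the source of quantization noise.
assert quantize_8bit(0.500) == quantize_8bit(0.501)
```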
Quantization noise is a result of describing an analog signal in discrete digital terms (see quantization). It is most easily heard in low-resolution digital sounds with low bit depths, where it comes through as a hiss-like sound while the audio is playing. It becomes more apparent when the signal is at low levels, such as during a fade out.
Reactive previews allow for the adjustment of parameters in a function dialog while the preview is playing. When a parameter is changed, the preview will automatically rebuild and continue playback.
Real-Time Streaming Protocol (RTSP)
A proposed standard for controlling broadcast of streaming media. RTSP was submitted by a body of companies including RealNetworks and Netscape.
A metafile that provides information to a media player about streaming media files. To start a streaming media presentation, a Web page will include a link to a redirector file. Linking to a redirector file allows a file to stream; if you link to the media file, it will be downloaded before playback begins.
Windows Media redirector files use the .asx or .wax extension; RealMedia redirector files use the .ram, .rpm, or .smi extension.
A region in Sound Forge software is a subsection of a sound file. You can define any number of regions in a sound file; they are stored in the Regions List.
The Regions List is simply the list containing all of the regions and markers defined within the sound file. From this list you can preview and edit the regions as well as drag them to the Playlist or to the desktop to create new files from them.
The act of recalculating samples in a sound file at a different rate than the file was originally recorded. If a sample is resampled at a lower rate, sample points are removed from the sound file, decreasing its size, but also decreasing its available frequency range. When resampling to a higher sample rate, Sound Forge software interpolates extra sample points in the sound file. This increases the size of the sound file, but does not increase the quality. When downsampling, be aware of aliasing.
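Resampling by linear interpolation can be sketched as below. Note that this toy version omits the anti-alias (low-pass) filter a real resampler applies before downsampling:

```python
def resample(samples, src_rate, dst_rate):
    """Resample by linear interpolation between neighboring source samples.

    A minimal sketch only: no anti-alias filtering is performed, so
    downsampling with it would alias in real use.
    """
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate            # fractional source position
        j = int(pos)
        frac = pos - j
        nxt = samples[min(j + 1, len(samples) - 1)]
        out.append(samples[j] * (1 - frac) + nxt * frac)
    return out

# Downsampling 2:1 keeps every other point of a ramp.
assert resample([0, 10, 20, 30], 4, 2) == [0.0, 20.0]
```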
The Root Mean Square (RMS) of a sound is a measurement of the intensity of the sound over a period of time. The RMS level of a sound corresponds to the loudness perceived by a listener when measured over small intervals of time.
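The measurement itself is just "square, average, square root":

```python
import math

def rms(samples):
    """Root mean square of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

# A full-scale square wave has RMS equal to its peak...
assert rms([1.0, -1.0, 1.0, -1.0]) == 1.0
# ...while a sine of the same peak measures peak / sqrt(2), about 0.707.
one_cycle = [math.sin(2 * math.pi * n / 1000) for n in range(1000)]
assert abs(rms(one_cycle) - 1 / math.sqrt(2)) < 1e-3
```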
The level ruler is the area on a data window to the left of the waveform display. It shows the vertical axis units as a percentage or in decibels.
The time ruler is the area on a data window above the waveform display. It shows the horizontal axis units as well as marker, region, and loop tags.
Ruler tags are the small tab-shaped controls on the time ruler that represent the location of markers, regions, and loop points in the waveform display.
The word sample is used in many different (and often confusing) ways when talking about digital sound. Here are some of the different meanings:
- One of the discrete points in time into which a sound signal is divided when digitizing. For example, an audio CD contains 44,100 samples per second. Each sample is really only a number representing the amplitude of the waveform at a single point in time.
- A sound that has been recorded in a digital format; used by musicians who make short recordings of musical instruments to be used for composition and performance of music or sound effects. These recordings are called samples. In this Help system, we try to use sound file instead of sample whenever referring to a digital recording.
- The act of recording sound digitally, i.e. to sample an instrument means to digitize and store it.
A sample dump is the process of transferring sample data between music equipment. Because of the large amounts of data required to store digital sound, sample dumps may take a very long time when using the MIDI Sample Dump Standard (SDS). However, when using the faster SCSI MIDI Device Interface (SMDI) protocol, sample dumps can be performed many times faster.
The MIDI Sample Dump Standard is a way to transfer samples between music equipment. Samples transferred with SDS are sent across MIDI cables at the MIDI data rate of 31,250 baud. SMDI is a much faster sample transfer method for musicians.
The Sample Rate (also referred to as the Sampling Rate or Sampling Frequency) is the number of samples per second used to store a sound. High sample rates, such as 44,100 Hz provide higher fidelity than lower sample rates, such as 11,025 Hz. However, more storage space is required when using higher sample rates.
In the following example, each red dot represents one sample. Because the lower waveform is represented by twice as many samples as the top waveform, the samples are able to better approximate the original waveform.
See Bit Depth.
The Sample Value (also referred to as sample amplitude) is the number stored by a single sample:
- In 32-bit audio, these values range from -2147483648 to 2147483647.
- In 24-bit audio, they range from -8388608 to 8388607.
- In 16-bit audio, they range from -32768 to 32767.
- In 8-bit audio, they range from -128 to 127.
The maximum allowed sample value is often referred to as 100% or 0 dB.
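The ranges above follow directly from two's-complement arithmetic: an n-bit signed sample spans -2^(n-1) through 2^(n-1) - 1. A sketch using a hypothetical helper:

```python
def sample_range(bit_depth):
    """Signed two's-complement range for a given bit depth:
    -2^(n-1) through 2^(n-1) - 1."""
    return -(2 ** (bit_depth - 1)), 2 ** (bit_depth - 1) - 1

for bits in (8, 16, 24, 32):
    low, high = sample_range(bits)
    print(f"{bits}-bit: {low} to {high}")
```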
A sampler is a device that records sounds digitally. Although, in theory, your sound card is a sampler, the term usually refers to a device used to trigger and play back samples while changing the sample pitch.
Secure Digital Music Initiative (SDMI)
The Secure Digital Music Initiative (SDMI) is a consortium of recording industry and technology companies organized to develop standards for the secure distribution of digital music. The SDMI specification will answer consumer demand for convenient accessibility to quality digital music, enable copyright protection for artists’ work, and enable technology and music companies to build successful businesses.
SCSI MIDI Device Interface (SMDI)
SMDI is a standardized protocol for music equipment communication. Instead of using the slower standard MIDI serial protocol, it uses a SCSI bus for transferring information. Because of its speed, SMDI is often used for sample dumps.
A context-sensitive menu that appears when you click on certain areas of the screen. The functions available in the shortcut menu depend on the object being clicked on as well as the state of the program. As with any menu, you can select an item from the shortcut menu to perform an operation. Shortcut menus are used frequently in Sound Forge software for quick access to many commands.
Data that has positive and negative values and uses zero to represent silence. Unlike the signed format, two's complement is not used. Instead, negative values are represented by setting the highest bit of the binary number to one without complementing all other bits. This is a format option when opening and saving raw sound files.
Data that has positive and negative two's complement values and uses zero to represent silence. This is a format option when opening and saving raw sound files.
The signal-to-noise ratio (SNR) is a measurement of the difference between a recorded signal and noise levels. A high SNR is always the goal.
The maximum signal-to-noise ratio of digital audio is determined by the number of bits per sample. In 16-bit audio, the signal-to-noise ratio is 96 dB, while in 8-bit audio it is 48 dB. However, in practice this SNR is never achieved, especially when using low-end electronics.
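The 96 dB and 48 dB figures come from a simple rule: each bit doubles the number of quantization levels, adding 20 x log10(2), roughly 6.02 dB. A sketch with a hypothetical helper:

```python
import math

def max_snr_db(bit_depth):
    """Theoretical maximum SNR: each bit doubles the number of
    quantization levels, adding 20 * log10(2) ~= 6.02 dB."""
    return 20 * math.log10(2) * bit_depth

print(round(max_snr_db(16)))  # 96
print(round(max_snr_db(8)))   # 48
```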
Small Computer Systems Interface (SCSI)
SCSI is a standard interface protocol for connecting devices to your computer. The SCSI bus can accept up to seven devices at a time, including CD-ROM drives, hard drives, and samplers.
Society of Motion Picture and Television Engineers (SMPTE)
SMPTE time code is used to synchronize time between devices. The time code is calculated in hours:minutes:seconds:frames, where frames are fractions of a second based on the frame rate. Frame rates for SMPTE time code are 24, 25, 29.97, and 30 frames per second.
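The hours:minutes:seconds:frames layout can be sketched as follows. The `to_smpte` helper is hypothetical and handles only integer (non-drop) frame rates; 29.97 fps drop-frame time code requires additional frame-skipping logic not shown here.

```python
def to_smpte(total_seconds, frame_rate=30):
    """Format a time in seconds as hours:minutes:seconds:frames.
    Only integer (non-drop) frame rates are handled."""
    frames = int(round(total_seconds * frame_rate))
    ff = frames % frame_rate          # frames within the current second
    s = frames // frame_rate          # whole seconds
    return f"{s // 3600:02d}:{s % 3600 // 60:02d}:{s % 60:02d}:{ff:02d}"

print(to_smpte(3725.5, 30))  # 01:02:05:15
```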
The sound card is the audio interface between your computer and the outside world. It is responsible for converting analog signals to digital and vice-versa. There are many sound cards available on the market today, covering the spectrum of quality and price. Sound Forge software will work with any Windows-compatible sound card.
The status format is the format by which Sound Forge software displays the time ruler and selection times. These include: Time, Seconds, Frames, Audio CD Time, and all Standard SMPTE frame rates. The status format is set for each sound file individually.
A method of data transfer in which a file is played while it is downloading. Streaming technologies allow Internet users to receive data as a steady, continuous stream after a brief buffering period. Without streaming, users would have to download files completely before playback.
Tempo is the rhythmic rate of a musical composition, usually specified in beats per minute (BPM).
A threshold determines the level at which the signal processor begins acting on the signal. During normalization, levels above this threshold are attenuated.
The format by which Sound Forge software displays the time ruler and selection times. These can include: Time, Seconds, Frames and all standard SMPTE frame rates.
Trim/Crop is a function that deletes all data in a sound file outside of the current selection. It is a necessary function when preparing samples for playback by a sampler, since it removes blank time at the beginning and end of the sample.
µ-Law (mu-Law) is a companded compression algorithm for voice signals defined by the Geneva Recommendations (G.711). The G.711 recommendation defines µ-Law as a method of encoding 16-bit PCM signals into a nonlinear 8-bit format. The algorithm is commonly used in North American and Japanese telecommunications. µ-Law is very similar to A-Law; however, each uses a slightly different coder and decoder.
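The nonlinearity can be sketched with the continuous µ-law curve (µ = 255, the value used by G.711). This is a simplified illustration: the actual codec quantizes the result to a segmented 8-bit code rather than keeping it as a floating-point value.

```python
import math

MU = 255  # compression parameter defined by G.711

def mu_law_encode(x):
    """Compress a normalized sample (-1.0..1.0) with the
    continuous mu-law curve: sgn(x) * ln(1 + MU*|x|) / ln(1 + MU)."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

# Small signals are boosted, large signals are compressed:
print(round(mu_law_encode(0.01), 3))  # 0.228
print(round(mu_law_encode(0.5), 3))   # 0.876
```

Boosting small values before quantizing is what lets 8 stored bits cover the dynamic range of a 16-bit voice signal.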
This is the temporary file created before you do any processing to a sound file. The undo buffer allows you to restore previous versions of the sound file if you decide you don’t like the changes you’ve made. The undo buffer is erased when the file is closed or the Clear Undo History command is selected.
These commands allow you to change a project back to a previous state, when you don’t like the changes you have made, or reapply the changes after you have undone them.
This is a list of all of the functions that have been done to a file that are available to be undone or redone. Undo/Redo History gives you the ability to undo or redo multiple functions as well as preview the functions for quick review of the processed and unprocessed material. This list can be displayed from within the View menu.
Data that has only positive values and uses half the maximum value to represent silence. This is a format option when opening and saving raw sound files.
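The three raw data formats described above (signed two's complement, sign-bit, and unsigned) store the same sample differently. A sketch for 8-bit values, using hypothetical conversion helpers:

```python
def to_unsigned8(sample):
    """Unsigned: only positive values; silence (0) becomes
    half the maximum value (128)."""
    return sample + 128

def to_sign_bit8(sample):
    """Sign-bit: set the high bit for negative values without
    complementing the magnitude bits (not two's complement)."""
    return abs(sample) | (0x80 if sample < 0 else 0)

print(to_unsigned8(0))    # 128 -- silence
print(to_unsigned8(127))  # 255 -- full positive scale
print(to_sign_bit8(-5))   # 133 -- 0x80 | 5
```

Choosing the wrong format when opening a raw file produces loud distortion, since every sample is misinterpreted.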
Virtual MIDI Router (VMR)
A software-only router for MIDI data between programs. Sound Forge software uses the VMR to receive MIDI time code and send MIDI clock. No MIDI hardware or cables are required for a VMR, so routing can only be performed between programs running on the same PC. Sony supplies a VMR with Sound Forge software called the Sony Virtual MIDI Router.
A digital audio file format developed by Microsoft and IBM. One minute of uncompressed CD-quality audio requires about 10 MB of storage.
A waveform is the visual representation of wave-like phenomena, such as sound or light. For example, when the amplitude of sound pressure is graphed over time, pressure variations usually form a smooth waveform.
Each event shows a graph of the sound data waveform. The vertical axis corresponds to the amplitude of the wave. For 24-bit audio, the amplitude range is -8,388,608 to +8,388,607. For 16-bit sounds, the amplitude range is -32,768 to +32,767. For 8-bit sounds, the range is -128 to +127. The horizontal axis corresponds to time, with the leftmost point being the start of the waveform. In memory, the horizontal axis corresponds to the number of samples from the start of the sound file.
Microsoft’s Windows Media file format that can handle audio and video presentations and other data such as scripts, URL flips, images and HTML tags.
A zero-crossing is the point where a fluctuating signal crosses the baseline.
By making edits at zero-crossings with the same slope, the chance of creating glitches is minimized.
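Finding zero-crossings amounts to looking for a sign change between neighboring samples. A sketch with a hypothetical helper:

```python
def zero_crossings(samples):
    """Return the indices where adjacent samples sit on opposite
    sides of the baseline (a sign change between neighbors)."""
    return [i for i in range(1, len(samples))
            if (samples[i - 1] < 0) != (samples[i] < 0)]

print(zero_crossings([3, 1, -2, -4, -1, 2, 5]))  # [2, 5]
```

Cutting or splicing at these points avoids the sudden amplitude jump that causes an audible click.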
Zipper noise occurs when you apply a changing gain to a signal, such as when fading out. If the gain does not change in small enough increments, zipper noise can become very noticeable. Fades are accomplished using 64-bit arithmetic, thereby creating no audible zipper noise.
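The effect of gain step size can be sketched as follows. The `fade_out` helper is hypothetical: it applies a linear fade using only a fixed number of distinct gain values, so a coarse setting makes the gain jump audibly while a fine setting keeps each step small.

```python
def fade_out(samples, gain_steps):
    """Linear fade-out using only gain_steps distinct gain values.
    Coarse steps produce audible jumps (zipper noise); fine,
    per-sample gain changes keep each step inaudibly small."""
    n = len(samples)
    out = []
    for i, s in enumerate(samples):
        # Quantize the fade position to the available number of steps.
        step = min(i * gain_steps // n, gain_steps - 1)
        gain = 1.0 - step / (gain_steps - 1)
        out.append(s * gain)
    return out

loud = [1.0] * 8
print(fade_out(loud, 2))  # gain jumps straight from 1.0 to 0.0
print(fade_out(loud, 8))  # gain falls smoothly in small increments
```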