Audio Formats

The Formats

Wave (or WAV)

The standard audio file format for the PC is the wave file. There are different quality settings of wave files, from CD quality stereo, down to telephone quality in mono, and many combinations in between. These sounds can be played using any of the following programs: Windows Media Player, Sound Recorder, or the QuickTime Player. The file extension is .wav

Example - hello.wav, size:310kb

AIF/AIFF

The equivalent sound file on the Macintosh is AIF/AIFF and it has similar quality settings to the wave files. The QuickTime Player and Windows Media Player 7.1 for Macintosh will play these files. The file extension is .aif or .aiff

Example - RwWhistle.aiff, size:56kb (a Red-winged Blackbird)

MP3

A widely popular format today is MP3. The MPEG-1 standard was designed to cover both video and audio, and MP3 is its Layer 3 audio format (MPEG-1 Audio Layer 3). These files are popular because they are compressed (smaller) in comparison to wave files, yet the sound quality can be virtually indistinguishable. Their relatively small size makes them good for transferring over the internet. These files can be played using Windows Media Player, Musicmatch, WinAmp, or QuickTime. The file extension is .mp3

Example1 - hello.mp3, size:30kb (the same sound as the wave file from above, but 1/10 the size- that's MP3 compression!)

Example2 - speed_128.mp3, size:697kb (a 45 second "song" at the 128kbps bit rate)
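The sizes quoted in the examples above can be roughly checked with a little arithmetic. This is only a sketch: it assumes a constant bit rate and ignores file headers and tags, so real files will differ slightly, and the function names are made up for illustration.

```python
def pcm_size_bytes(sample_rate_hz, channels, bits, seconds):
    """Size of uncompressed (wave-style) audio data."""
    return sample_rate_hz * channels * (bits // 8) * seconds


def compressed_size_bytes(bit_rate_bps, seconds):
    """Approximate size of audio stored at a constant bit rate."""
    return bit_rate_bps * seconds / 8  # 8 bits per byte


wav_bytes = pcm_size_bytes(44100, 2, 16, 45)     # 45 seconds, CD quality stereo
mp3_bytes = compressed_size_bytes(128_000, 45)   # 45 seconds at 128 kbps

print(f"{wav_bytes / 1000:.0f} kB")      # 7938 kB
print(f"{mp3_bytes / 1000:.0f} kB")      # 720 kB
print(f"{wav_bytes / mp3_bytes:.1f}")    # 11.0
```

The 720 kB estimate is close to the 697kb listed for the 45 second song, and the ratio of about 11 matches the "1/10 the size" rule of thumb.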

Real

There are other compressed audio formats. The first one is known as Real Audio from Real.com. You need the Real Audio player to play these files. The company pushes their RealOne player, but it can be so annoying in its pop-up ads and windows that it takes away from the music experience. If you can, get the Real Player version 8. The file extensions are .ra, .rm, or .ram

Example - hello.rm, size:23kb (the same sound as above)

Windows Media

Another type of compressed audio comes from Microsoft. The Windows Media Audio format is another direct competitor to MP3. Microsoft claims that this format is CD quality at half the size of MP3. Well, not quite. When audio files are compared at the same bit rate (the number of bits used to represent each second of sound), the files are essentially the same in terms of quality and file size. Windows Media Player plays Windows Media Audio files. Microsoft has just released their Media Player 9 Series player. The file extension is .wma

Example - hello.wma, size: 43kb (the same sound as above)

MIDI

MIDI is a file format unlike any of the others we have mentioned. It isn't really a sound file, but rather a description of how to play the music: which notes to play, when, and with which instrument. It relies on your sound card to interpret the data and play the song. The higher the quality of the MIDI interpreter on your sound card, the more realistic the sound will be. Since it is only a description and not the full sound, the file is relatively small. Windows Media Player, QuickTime, and countless other programs will play MIDI files. The file extension is .mid

Example - getting_to_know.mid, size:13kb
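To make the "description, not sound" idea concrete, here is a small sketch of what one MIDI note looks like as data. The byte layout (a status byte of 0x90 plus the channel, then a note number and a velocity) and the note-to-pitch rule follow the usual MIDI conventions; the function names are invented for this example.

```python
def note_on(channel, note, velocity):
    """Build a 3-byte MIDI note-on message (channel 0-15, note and velocity 0-127)."""
    return bytes([0x90 | channel, note, velocity])


def note_frequency(note):
    """Equal-tempered pitch of a MIDI note number, taking A4 (note 69) as 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)


message = note_on(0, 60, 100)           # middle C on the first channel
print(message.hex())                    # 903c64
print(round(note_frequency(60), 2))     # 261.63
```

Three bytes are enough to start a note, whereas one second of CD quality sound needs 44,100 samples per channel. That is why MIDI files are so small.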

BIT

What is a BIT ?

A bit is a 0 or a 1. A computer is pretty much just millions of switches, so it works with ON and OFF states. This is called BINARY, since the numbers are in a Base 2 format.

Binary (2 bits)    Base 10 numeral
00                 0
01                 1
10                 2
11                 3

In this table you see some 2-bit words and their equivalent numbers in the base that we are used to seeing numbers in (base 10). They are two-bit words since each one has two ON or OFF states. I will stray from this for a little while to explain the audio side...
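If you want to see the same table for longer words, a few lines of Python will print it. This is just a sketch; the function name is my own.

```python
def binary_words(bits):
    """List every word of the given length, from all OFF to all ON."""
    return [format(value, f"0{bits}b") for value in range(2 ** bits)]


for word in binary_words(2):
    print(word, int(word, 2))   # the binary word, then its base 10 value
```

Calling binary_words(3) gives the eight 3-bit words from 000 to 111.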

Dynamic range of 1 BIT ?

[Figure: digital theory - what a bit, a sample and the dynamic range of digital audio are]

In this picture you can see I'm not a very good artist, LOL. Besides this, I have tried to show that BITS are the Y axis on a graph and that each BIT encodes about 6 dB of DYNAMIC RANGE. The X axis is of course TIME, which is set by the sample rate.

Most people know how a television works. A dot is painted onto the screen and moves very fast across it, painting the picture. Remember how, after a bright light has been shone into your eyes and turned off, you can still see a blurry dot? This is called persistence of vision, and it allows the brain to see the fast moving dot on a TV screen as a full picture! Dogs are said to need a faster refresh rate than we do, which is why they have trouble seeing pictures on older TV sets. The refresh rate of a TV must be more than 30-40 hertz for a steady picture to be shown (mains power at 50 Hz is likewise fast enough that light bulbs don't visibly flicker). In other words, more than 30-40 pictures/snapshots must be painted every second for our eyes to see a constant picture.

Getting back to audio now. Samples are like snapshots of sound and, just like early black and white projection movies, if they are not played back at a fast enough speed your ears will hear the gaps. I'm not sure what the speed of the ear is, but if someone yells into your ear you hear a ringing, so you do have persistence of hearing. More on sample rates later on... Back to BITS...

An 8-bit word looks like this: "01001000". Each time we add an extra BIT to the word length we DOUBLE the number of possible combinations. In terms of audio, we double the number of quantisation values, that is, the number of Y values we can round the level of the audio off to. In the digital realm you can't have 1 and a HALF; it has to be either 1 or 2. Hence the term quantisation, or rounding. The level of the audio has to be rounded to the nearest value that the word length allows.
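Here is a rough sketch of that rounding. It assumes the audio level has been scaled to lie between 0.0 and 1.0, which is my own simplification, and the function name is made up.

```python
import math


def quantise(level, bits):
    """Round a level between 0.0 and 1.0 to the nearest step a word of this length allows."""
    top_step = 2 ** bits - 1               # steps are numbered 0 .. top_step
    return math.floor(level * top_step + 0.5)


print(quantise(0.4, 2))    # 1   (only 4 steps to choose from)
print(quantise(0.4, 8))    # 102 (256 steps, so a much closer fit)
```

With 2 bits the level 0.4 is stored as step 1 of 3 (about 0.33), while with 8 bits it is stored as step 102 of 255 (exactly 0.4).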

Quality of 8 bit VS 16 bit and 24 bit ?

Number of bits    Example word          Number of quantisation levels
2                 01                    4
3                 110                   8
...               ...                   ...
8                 00101011              256
16                0000110001010010      65,536
20                ...                   1,048,576
24                ...                   16,777,216

As you can see from this table, by going from 16-bit audio to 24-bit audio we gain 256 TIMES the number of levels each word (sample) can take, and so 256 times the accuracy. That's why professionals will record in 20 bits or more. On a side note, most professionals will record at the same sample rate that the final product will be mastered to, e.g. 44.1 kHz, even when they have the capability of recording at 96 kHz or beyond. There are many reasons why, which I won't go into here. This is now changing, as DVD allows for higher sampling rates, and capturing the original sound at a higher sampling rate leaves the engineer more flexibility later on.
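The numbers in the table are simply powers of two, which a short sketch can confirm (the function name is my own):

```python
def quantisation_levels(bits):
    """Each extra bit doubles the number of levels, so n bits give 2**n levels."""
    return 2 ** bits


for bits in (2, 3, 8, 16, 20, 24):
    print(bits, quantisation_levels(bits))

# How many times more levels 24-bit audio has than 16-bit audio.
print(quantisation_levels(24) // quantisation_levels(16))   # 256
```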

How large is a BIT ?

8 BITS make up 1 byte of storage, so a 16-bit sample takes up two bytes and a 24-bit sample takes up three bytes. This is generally true, but there are some exceptions. I won't make it more complicated by going into this in more depth.

What is a dB (DeciBel) ?

dB stands for deciBel: 10 deciBels make up one Bel. One decibel is approximately the smallest change in the volume of a sound that the normal ear can detect. The scale is LOGARITHMIC: every increase of 10 dB represents a tenfold increase in sound power, which the ear perceives as roughly a doubling of loudness. Because of this you cannot treat decibels like normal values when adding and subtracting two values.

Dynamic Range of Digital Audio

What is Dynamic range

Dynamic range represents the difference between the maximum signal that can be recorded (0 dBFS, i.e. digital full scale) and the noise floor of your system. The noise floor is the noise present in your system without any signal present. A system with a high dynamic range will be quieter than one with a lower dynamic range. Dynamic range is measured in decibels (dB).

What's the dynamic range of 16 bit and 24 bit audio

Each bit can encode about 6 dB of dynamic range. Therefore a 24-bit system theoretically has a dynamic range of 144 dB (24 * 6 = 144) and a 16-bit system has a theoretical dynamic range of 96 dB (16 * 6 = 96).
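The "6 dB per bit" figure comes from the fact that each bit doubles the number of levels, and a doubling of amplitude is about 6.02 dB. A minimal sketch, with a function name of my own choosing:

```python
import math


def dynamic_range_db(bits):
    """Theoretical dynamic range: the ratio of the 2**bits levels, in decibels."""
    return 20 * math.log10(2 ** bits)


for bits in (1, 16, 24):
    print(bits, round(dynamic_range_db(bits), 2))
# 1 6.02
# 16 96.33
# 24 144.49
```

The rounded figures of 96 dB and 144 dB quoted above are these values with 6 dB used per bit.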

Why don't converters have 96dB or 144dB range ?

Current analog-to-digital converters typically reach full scale with an input of +7 dBu (about 1.7 volts RMS). If they were to have 144 dB of dynamic range, they would have to be capable of resolving signals as small as about 100 nanovolts. That’s 100 billionths of a volt! Transistors and resistors produce noise in this range just by having electrons moving around due to heat. Even if the converters could be perfectly designed to read these levels, the low noise requirements of the surrounding circuitry, such as power supplies and amplifiers, would be so stringent that it would be either impossible or too expensive to build.

A measured (RMS) dynamic range of around 120 dB is about as good as it gets to this date for mass-produced 24-bit converters.

Nyquist Theory

To sample or graph a SINE wave you must have at least two points (co-ordinates) in each cycle in order to work out what the frequency is. For example, you need the ORIGIN (normally y = 0) and either a MAXIMUM or a MINIMUM value. Because of this simple fact, to record a frequency the sampling rate must be at least double that frequency. E.g. to record a sine wave of 50 Hz you need a MINIMUM of 100 samples per second. This basic fact, which governs the minimum sampling rate, is called the Nyquist theorem. The Nyquist frequency is the highest frequency that you can record with a given sample rate. In the case of a recording with 44,100 samples per second (the sampling rate of CDs) the Nyquist frequency is 22,050 Hz. I could go into drawing pictures to prove this, but I won't because you have seen the quality of my drawings above. LOL. The Nyquist theorem is not something that you will hear much about, but it is good to know what it is and how it affects things in real-life situations.
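The two numbers in this section are one line of arithmetic each. A small sketch (function names are my own):

```python
def nyquist_frequency(sample_rate_hz):
    """Highest frequency that a given sample rate can record."""
    return sample_rate_hz / 2


def minimum_sample_rate(highest_frequency_hz):
    """Lowest sample rate needed to record a given frequency (two samples per cycle)."""
    return 2 * highest_frequency_hz


print(nyquist_frequency(44100))    # 22050.0
print(minimum_sample_rate(50))     # 100
```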

What is a BEAT frequency ?

Guitarists use the beat frequency to tune guitars with harmonics. If a guitarist picks a harmonic, the string can only vibrate with waves that fit the length of the string; by placing a finger lightly on the string at a particular fret you create what physics calls Nodes and Anti-Nodes. When you play two harmonics that are very close together in pitch, you hear a rise and fall in the perceived level of the notes due to the BEAT frequency. I will now explain what the BEAT frequency is: the difference between any two frequencies creates a new frequency. The formula is F1 - F2 = BEAT.

Back to relating this to digital audio. Something similar happens when you record a frequency above the Nyquist frequency: the sampler cannot tell it apart from a lower frequency, so you get an artifact (an alias) at the difference between the sample rate and the recorded frequency. In other words, for every Hz that you go over the Nyquist frequency, the artifact lands one Hz below it. For example, if you record a 22,051 Hz sine wave at a 44.1 kHz sample rate, it comes out as a 22,049 Hz tone. Go higher still and the artifact keeps moving down: a 44,070 Hz tone would show up as a 30 Hz rumble, and a 44,099 Hz tone as a 1 Hz rumble. Once again I will spare you the drawings I could do to prove this with graphs.
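For anyone who would rather check the numbers than the drawings, here is a sketch of both calculations. The alias function uses the standard folding rule for an ideal sampler with no filter in front of it; that assumption, and the function names, are mine.

```python
def beat_frequency(f1, f2):
    """Rate at which two close tones rise and fall in level when played together."""
    return abs(f1 - f2)


def alias_frequency(frequency_hz, sample_rate_hz):
    """Frequency at which an unfiltered tone shows up after being sampled."""
    folded = frequency_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


print(beat_frequency(440, 442))          # 2
print(alias_frequency(1000, 44100))      # 1000 (below Nyquist, so unchanged)
print(alias_frequency(22051, 44100))     # 22049
print(alias_frequency(44070, 44100))     # 30
```

A tone below the Nyquist frequency comes back unchanged, while anything above it is folded back down into the recordable range.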

Brick Wall filters in DA.

When mastering tracks to go onto CDs, the material is EQ'ed so that little or nothing over 18-19 kHz reaches the DA converters. Early CDs and CD players were said to be HARSH on the ears, since they used what's called a BRICK WALL filter to cut off all frequencies above a set point, sometimes as low as 15 kHz. I'm sure you have heard people complain that CDs are un-warm and harsh; this is true of cheap CD players and older models. This abrupt cut-off wasn't natural, and the ears picked it up even though very few adults can hear past 16 kHz.

I should expand on this, or newbies to digital audio will argue the point: whilst a human cannot hear above, say, 16 kHz, we can sense whether those frequencies are present in a recording or not. High-end CD players actually recreate the harmonics in the DA conversion right up to 30 kHz. Pioneer call this "Legato Link" technology, if you wish to look it up. A newborn baby can hear up to 20 kHz; as the baby gets older the ears slowly lose this range. The more loud rock concerts you attend, the less you will be able to hear high frequencies. Every time your ears ring afterwards and everything seems quiet, you have done damage to your ears.

If the sound is filtered too much and too steeply it is very harsh, and if it is not filtered enough you will get rumbles in the recordings. Some AD and DA converters use 180 kHz brick wall filters to help block RFI and EMI interference; this is one way in which some audio cards are made much quieter than cheap sound cards.

DITHERING

I wont go into too much depth about Dithering since many sites explain it already. Here's a link.

Advanced Dithering explanation

When dithering from 24 bits to 16 bits, the aim is to let the information stored in the last 8 bits influence the top 16 bits, which are the ones we want to keep. Truncating throws away the last 8 bits. If you truncate without dithering first, you lose some of the audio information you have recorded, i.e. you throw away some of your quality! If you dither before truncating, you're adding small amounts of random noise to the audio, which pushes some of that low-level information up into the top 16 bits. When the digital audio is then truncated, most of the noise is thrown away, although some of it will be kept. Dithering gives you much smoother and more pleasant audio to listen to after you have reduced the word length. Read the Noise Shaping section to learn about advanced dithering techniques.
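A tiny experiment shows the effect. This is only a sketch under simplifying assumptions of my own: the "audio" is a constant level smaller than one 16-bit step, the dither is plain uniform random noise of up to half a step (real tools often use other noise shapes), and the names are invented.

```python
import math
import random


def requantise(sample, dither, rng):
    """Round a sample, measured in 16-bit steps, to a whole step."""
    if dither:
        sample += rng.uniform(-0.5, 0.5)    # up to half a step of random noise
    return math.floor(sample + 0.5)


rng = random.Random(0)                      # fixed seed so the run is repeatable
quiet = 0.3                                 # a detail smaller than one 16-bit step

plain = [requantise(quiet, False, rng) for _ in range(20000)]
dithered = [requantise(quiet, True, rng) for _ in range(20000)]

print(sum(plain) / len(plain))              # 0.0, the detail is lost completely
print(sum(dithered) / len(dithered))        # close to 0.3, kept on average
```

Without dither every sample rounds to 0 and the quiet detail vanishes. With dither the samples flip between 0 and 1 so that their average stays near the original level, at the price of a little added noise.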

Noise Shaping

Noise shaping is dithering that takes the Fletcher-Munson curves into account. These curves show the frequencies where the human ear is most sensitive and where it is least sensitive. By only adding noise in the areas where our ears cannot hear as well, or cannot hear at all, the noise that is added in dithering becomes pretty much inaudible. This is made even truer as dither is normally around 90 dB below the maximum level of the material in 16-bit audio. There are many different noise shaping techniques in use, and depending on the recorded material one may be better than another. That's one reason why mastering should be left to professional studios, who can dither properly and know which noise shaping technique will work best for your material.

How much dither noise is added ?

Very small amounts are added.

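When calculating signal levels and comparing them as dB values you must use this formula, because the decibel is a logarithmic scale. Here A and B are the two signal levels (voltages or amplitudes) being compared, and LOG is the base 10 logarithm.

N(dB) = 20 * (LOG A - LOG B) = 20 * LOG(A / B)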
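As a quick sketch of the formula in use, taking A and B to be signal amplitudes such as voltages and LOG to be the base 10 logarithm (the function name is my own):

```python
import math


def level_difference_db(a, b):
    """N(dB) = 20 * (log10(A) - log10(B)) for two signal amplitudes."""
    return 20 * (math.log10(a) - math.log10(b))


print(round(level_difference_db(2, 1), 2))     # 6.02  (double the amplitude)
print(round(level_difference_db(10, 1), 2))    # 20.0  (ten times the amplitude)
print(round(level_difference_db(1, 2), 2))     # -6.02 (half the amplitude)
```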

This is as far as I am going to go in this tute for now... Hope you learnt heaps and understood most if not all of it. If you have any comments about this, feel free to email me.

Digital Audio

What is sound?

Sounds are pressure waves of air. If there wasn't any air, we wouldn't be able to hear sounds. There's no sound in space.

We hear sounds because our ears are sensitive to these pressure waves. Perhaps the easiest type of sound wave to understand is a short, sudden event like a clap. When you clap your hands, the air that was between your hands is pushed aside. This increases the air pressure in the space near your hands, because more air molecules are temporarily compressed into less space. The high pressure pushes the air molecules outwards in all directions at the speed of sound, which is about 340 meters per second. When the pressure wave reaches your ear, it pushes on your eardrum slightly, causing you to hear the clap.

A hand clap is a short event that causes a single pressure wave that quickly dies out. The image above shows the waveform for a typical hand clap. In the waveform, the horizontal axis represents time, and the vertical axis is for pressure. The initial high pressure is followed by low pressure, but the oscillation quickly dies out.

The other common type of sound wave is a periodic wave. When you ring a bell, after the initial strike (which is a little like a hand clap), the sound comes from the vibration of the bell. While the bell is still ringing, it vibrates at a particular frequency, depending on the size and shape of the bell, and this causes the nearby air to vibrate with the same frequency. This causes pressure waves of air to travel outwards from the bell, again at the speed of sound. Pressure waves from continuous vibration look more like this:

How is sound recorded?

A microphone consists of a small membrane that is free to vibrate, along with a mechanism that translates movements of the membrane into electrical signals. (The exact electrical mechanism varies depending on the type of microphone.) So acoustical waves are translated into electrical waves by the microphone. Typically, higher pressure corresponds to higher voltage, and vice versa.

A tape recorder translates the waveform yet again - this time from an electrical signal on a wire, to a magnetic signal on a tape. When you play a tape, the process gets performed in reverse, with the magnetic signal transforming into an electrical signal, and the electrical signal causing a speaker to vibrate, usually using an electromagnet.

How is sound recorded digitally ?

Recording onto a tape is an example of analog recording. Audacity deals with digital recordings - recordings that have been sampled so that they can be used by a digital computer, like the one you're using now. Digital recording has a lot of benefits over analog recording. Digital files can be copied as many times as you want, with no loss in quality, and they can be burned to an audio CD or shared via the Internet. Digital audio files can also be edited much more easily than analog tapes.

The main device used in digital recording is an Analog-to-Digital Converter (ADC). The ADC captures a snapshot of the electric voltage on an audio line and represents it as a digital number that can be sent to a computer. By capturing the voltage thousands of times per second, you can get a very good approximation to the original audio signal:

Each dot in the figure above represents one audio sample. There are two factors that determine the quality of a digital recording:

  • Sample rate: The rate at which the samples are captured or played back, measured in Hertz (Hz), or samples per second. An audio CD has a sample rate of 44,100 Hz, often written as 44 KHz for short. This is also the default sample rate that Audacity uses, because audio CDs are so prevalent.

  • Sample format or sample size: Essentially this is the number of digits in the digital representation of each sample. Think of the sample rate as the horizontal precision of the digital waveform, and the sample format as the vertical precision. An audio CD has a precision of 16 bits, which corresponds to about 5 decimal digits.
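The two factors above can be seen in a short sketch that "samples" a pure tone in software, the way an ADC would sample a voltage. The sample rate sets how many snapshots are taken per second and the bit depth sets the range of whole numbers each snapshot is rounded to. The function name and the choice of a sine wave as input are assumptions for illustration only.

```python
import math


def sample_sine(frequency_hz, sample_rate_hz, count, bits=16):
    """Take `count` snapshots of a full-scale sine wave as signed integers."""
    peak = 2 ** (bits - 1) - 1              # 32767 for 16-bit samples
    return [
        round(peak * math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz))
        for n in range(count)
    ]


# The first few samples of a 440 Hz tone at the CD sample rate.
print(sample_sine(440, 44100, 5))

# A tone at one quarter of the sample rate gets exactly four samples per cycle.
print(sample_sine(11025, 44100, 4))         # [0, 32767, 0, -32767]
```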

Higher sampling rates allow a digital recording to accurately record higher frequencies of sound. The sampling rate should be at least twice the highest frequency you want to represent. Humans can't hear frequencies above about 20,000 Hz, so 44,100 Hz was chosen as the rate for audio CDs to just include all human frequencies. Sample rates of 96 and 192 KHz are starting to become more common, particularly in DVD-Audio, but many people honestly can't hear the difference.

Higher sample sizes allow for more dynamic range - louder louds and softer softs. If you are familiar with the decibel (dB) scale, the dynamic range on an audio CD is theoretically about 90 dB, but realistically signals quieter than about -24 dB are greatly reduced in quality. Audacity supports two additional sample sizes: 24-bit, which is commonly used in digital recording, and 32-bit float, which has almost infinite dynamic range, and only takes up twice as much storage as 16-bit samples.

Playback of digital audio uses a Digital-to-Analog Converter (DAC). This takes each sample and sets a certain voltage on the analog outputs to recreate the signal that the Analog-to-Digital Converter originally measured to create the sample. The DAC does this as faithfully as possible. The first CD players did only that, which didn't sound good at all. Nowadays DACs use oversampling to smooth out the audio signal. The quality of the filters in the DAC also contributes to the quality of the recreated analog audio signal. The filter is one of a multitude of stages that make up a DAC.

How does audio get digitized on your computer?

Your computer has a soundcard - it could be a separate card, like a SoundBlaster, or it could be built-in to your computer. Either way, your soundcard comes with an Analog-to-Digital Converter (ADC) for recording, and a Digital-to-Analog Converter (DAC) for playing audio. Your operating system (Windows, Mac OS X, Linux, etc.) talks to the sound card to actually handle the recording and playback, and Audacity talks to your operating system so that you can capture sounds to a file, edit them, and mix multiple tracks while playing.

Standard file formats for PCM audio

There are two main types of audio files on a computer:

  • PCM stands for Pulse Code Modulation. This is just a fancy name for the technique described above, where each number in the digital audio file represents exactly one sample in the waveform. Common examples of PCM files are WAV files, AIFF files, and Sound Designer II files. Audacity supports WAV, AIFF, and many other PCM files.

  • The other type is compressed files. Earlier formats used logarithmic encodings to squeeze more dynamic range out of fewer bits for each sample, like the u-law or a-law encoding in the Sun AU format. Modern compressed audio files use sophisticated psychoacoustics algorithms to represent the essential frequencies of the audio signal in far less space. Examples include MP3 (MPEG I, layer 3), Ogg Vorbis, and WMA (Windows Media Audio). Audacity supports MP3 and Ogg Vorbis, but not the proprietary WMA format or the MPEG4 format (AAC) used by Apple's iTunes.
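As a sketch of what "each number represents exactly one sample" means in practice, the following writes one second of a 440 Hz tone as 16-bit mono PCM using the `wave` module from Python's standard library. The helper name and the use of an in-memory buffer instead of a file on disk are choices made for this example.

```python
import io
import math
import struct
import wave


def write_pcm_wav(target, samples, sample_rate_hz=44100):
    """Write 16-bit mono PCM samples to a WAV file name or file-like object."""
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)                 # mono
        wav.setsampwidth(2)                 # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate_hz)
        # Pack every sample as a little-endian signed 16-bit integer.
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))


tone = [round(32767 * math.sin(2 * math.pi * 440 * n / 44100)) for n in range(44100)]
buffer = io.BytesIO()
write_pcm_wav(buffer, tone)

# Read it back: one frame per sample, nothing thrown away.
buffer.seek(0)
with wave.open(buffer, "rb") as wav:
    print(wav.getnframes(), wav.getsampwidth(), wav.getframerate())   # 44100 2 44100
```

Passing a file name such as "tone.wav" instead of the buffer would write the same data to disk.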

For details on the audio formats Audacity can import from and export to, please check out the Fileformats page of this documentation. Please remember that MP3 does not store uncompressed PCM audio data. When you create an MP3 file, you are deliberately losing some quality in order to use less disk space.

basic audio

Introduction to Audio

This beginner-level tutorial covers the basics of audio production. It is suitable for anyone wanting to learn more about working with sound, in either amateur or professional situations. The tutorial is five pages and takes about 20 minutes to complete.

What is "Audio"?

Audio means "of sound" or "of the reproduction of sound". Specifically, it refers to the range of frequencies detectable by the human ear — approximately 20Hz to 20kHz. It's not a bad idea to memorise those numbers — 20Hz is the lowest-pitched (bassiest) sound we can hear, 20kHz is the highest pitch we can hear.

Audio work involves the production, recording, manipulation and reproduction of sound waves. To understand audio you must have a grasp of two things:

  1. Sound Waves: What they are, how they are produced and how we hear them.
  2. Sound Equipment: What the different components are, what they do, how to choose the correct equipment and use it properly.

Fortunately it's not particularly difficult. Audio theory is simpler than video theory and once you understand the basic path from the sound source through the sound equipment to the ear, it all starts to make sense.

Technical note: In physics, sound is a form of energy known as acoustical energy.

The Field of Audio Work

The field of audio is vast, with many areas of specialty. Hobbyists use audio for all sorts of things, and audio professionals can be found in a huge range of vocations. Some common areas of audio work include:

  • Studio Sound Engineer
  • Live Sound Engineer
  • Musician
  • Music Producer
  • DJ
  • Radio technician
  • Film/Television Sound Recordist
  • Field Sound Engineer
  • Audio Editor
  • Post-Production Audio Creator

In addition, many other professions require a level of audio proficiency. For example, video camera operators should know enough about audio to be able to record good quality sound with their pictures.

Speaking of video-making, it's important to recognise the importance of audio in film and video. A common mistake amongst amateurs is to concentrate only on the vision and assume that as long as the microphone is working the audio will be fine. However, satisfactory audio requires skill and effort. Sound is critical to the flow of the programme — indeed in many situations high quality sound is more important than high quality video.

Most jobs in audio production require some sort of specialist skill set, whether it be micing up a drum kit or creating synthetic sound effects. Before you get too carried away with learning specific tasks, you should make sure you have a general grounding in the principles of sound. Once you have done this homework you will be well placed to begin specialising.

The first thing to tackle is basic sound wave theory...

Introduction to Audio

The sounds we hear every day are analog waves that come from the pressure of the air around us and can be heard with the help of the eardrum. The eardrum vibrates, and the vibrations are passed on and translated into sound information. When we speak, we produce sound in the form of air pressure generated by the vocal cords. The vocal cords vibrate and cause changes in air pressure, and so we emit sound.

Computers can only recognize signals in digital form. The digital form has only two values, 0 and 1, each known as a bit. A voltage close to 5 volts is given the value 1, and a voltage close to 0 volts is given the value 0. The number 1 stands for ON, while the number 0 stands for OFF. So basically a computer works on the principle of ON and OFF switches, but at a very high speed, so that it can read a sequence of 0s and 1s as a group of bits and translate it into meaningful information.

Digital audio uses what are called samples to represent a sound wave. The sound wave is called a waveform. A waveform shows the loudness, or amplitude, of the vibration over time.

Basically, the sound in digital audio is a representation of sound as we hear it, but in the form of a series of 1s and 0s arranged in such a way that the computer can understand them. Through the computer, the arrangement of 1s and 0s is turned back into a sequence of sounds that we can hear.


Digital Recording

Digital recording is the process of recording an original sound, in the form of an analog signal, as a digital signal. The analog signal can come from:

  1. The output of a musical instrument
  2. A microphone
  3. The output of an analog cassette deck
  4. The output of an analog mixer, etc.

Let us take the output of a mic as an example:

The process is that we sing, speak or make a sound in front of the microphone (a transducer), which picks up the vibrations of the air and sends out an electrical signal. The computer records the signal and converts it into digital form so that it can be processed in digital audio software. Once processing or editing is finished, the digital data is converted back into analog form so that we can hear it again through headphones or speakers. Remember that the human senses, including the ears, can only receive analog signals.

Proses perubahan sinyal analog menjadi sinyal audio digital disebut Analog to Digital Conversion atau A-to-D Convertion. Sebaliknya, proses mengubah sinyal digital kembali menjadi sinyal analog disebut Digital to Analog Convertion atau D-to-A Convertion.Proses mengubah sinyal disebut Sampling.


Representasi sinyal analog berbentuk sine wave

Alat untuk mengubah sinyal analog ke digital atau digital ke analag disebut ADDA Converter. Pada PC multimedia yang biasa kita pakai, Soundcard mempunyai kemampuan sebagai ADDA Converter. Untuk perekaman audio yang lebih bagus lagi atau yang biasa digunakan pada profesional digital recording kemampuan ADDA Converter ini ada pada DAW (Digital Audio Workstation). Alat ini mempunyai keunggulan lebih daripada soundcard yang biasa terdapat pada PC, baik dari segi fasilitas maupun kualitas Digital recording yang dihasilkan.

Untuk mengenali kualitas Soundcard atau DAW kita, cara yang paling mudah adalh dengan merekam suara menggunakan pogram perekam suara pada Windows, misalnya program Sound Recorder. Kemudian dengarkan hasilnya, apabila suara kita terdengar bersih berarti proses sampling cukup baik. Namun, bila hasilnya ada tambahan suara mengganggu berupa desis (noise), berarti soundcard atau DAW kita kurang baik untuk digunakan dalam Digital Recording.

Pada sistem digital yang baik, kualitas suaranya sangat bersih dan mempunyai wilyah frekuensi yang lebar, selain itu data digital audio dapat diperbanyak tanpa mengalami penurunan kualitas suara, Sebaliknya pada analog, kualitas suara akan semakin menurun bila diperbanyak, misalnya rekaman dari suat kaset. Apabila merekam kembali kaset ke kaset secara langsung berarti kita merekam secara analog sehingga hasilnya akan mengalami penurunan kualitas suara, karena ada tambahan noise. Noise timbul karena gelombang suara asli mengalami penyimpangan dan perubahan sudah tidak sesuai lagi dengan aslinya.

Lain halnya dengan data digital, penggandaan berulang ulang akan mengahasilkan data yang sama, karena datanya hanya berupa susunan angka 0 dan 1.

Saat merekam data audio, kita dapat memilih sampling rate. Sampling rate adalah banyaknya pengambilan sample pada sinyal analog saat dirubah menjadi sinyal analog. Pada sampling rate 44100 Hz berarti terdapat pengambilan sample sinyal analog sebanyak 44100 kali dalam satu detik, semakin besar sampling rate berati semakin banyak dan teliti pengambilan sample sehingga hasil yang didapatkan mendekati atau menyamai sinyal aslinya. Seperti terlihat pada gambar dibawah pengambilan satu sample diwakili oleh satu kotak, jadi semakin banyak sample yang diambil maka kotak yang ada semakain banyak pada waktu yang sama sehingga ketelitian dalam mengubah sinyal analog ke digital semakin baik atau mendekati/menyamai aslinya.

Sampling an Analog Signal to Digital

So the sampling rate can be described as the level, or quality, at which the computer records audio data.
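As a rough illustration of what the sampling rate changes, the Python sketch below samples the same tone at two different rates. The function name sample_sine and the 440 Hz test tone are assumptions chosen for the example; they do not come from any particular recording program.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s):
    """Return amplitude samples of a sine wave, taken sample_rate_hz times per second."""
    count = round(sample_rate_hz * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate_hz) for i in range(count)]

# The same 10 ms of a 440 Hz tone, sampled at two rates.
low = sample_sine(440, 11025, 0.01)
high = sample_sine(440, 44100, 0.01)
print(len(low), len(high))  # 110 441
```

The higher rate describes the same stretch of sound with four times as many samples, which corresponds to the denser boxes described above.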

Sampling Rate Table

Sampling rate    Mono 16-bit    Stereo 16-bit    Equivalent to
11025 Hz         1.3 MB         2.5 MB           Telephone audio
22050 Hz         2.5 MB         5.0 MB           AM radio audio
44100 Hz         5.0 MB         10.1 MB          CD audio
48000 Hz         5.5 MB         11.0 MB          DAT player audio
96000 Hz         11.0 MB        22.0 MB          DVD-Audio

The table gives only a rough estimate of the hard-disk space that audio data requires for a duration of about 1 minute (60 seconds).
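Estimates of this kind can be approximated with a short calculation. The Python sketch below assumes uncompressed 16-bit PCM audio and counts one megabyte as 2^20 bytes. Both are assumptions, since the table does not state which definitions it uses, and pcm_size_bytes is a hypothetical helper name.

```python
def pcm_size_bytes(sample_rate_hz, channels, bits_per_sample=16, seconds=60):
    """Size in bytes of uncompressed PCM audio, ignoring any file header."""
    return sample_rate_hz * channels * (bits_per_sample // 8) * seconds

MEGABYTE = 2 ** 20  # assumption: binary megabytes

for rate in (11025, 22050, 44100, 48000, 96000):
    mono = pcm_size_bytes(rate, 1) / MEGABYTE
    stereo = pcm_size_bytes(rate, 2) / MEGABYTE
    print(f"{rate:>6} Hz   mono {mono:5.1f} MB   stereo {stereo:5.1f} MB")
```

For one minute of 44100 Hz stereo this gives 10,584,000 bytes, or about 10.1 MB.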

Analog-to-Digital and Digital-to-Analog Conversion (AD-DA Conversion)

The analog-to-digital process:

Converting an analog voltage into digital data involves several stages:

Limiting the frequency range of the signal to be processed with a low-pass filter (LPF) or a high-pass filter (HPF)

Sampling the signal

Assigning each sample a digital value

The digital-to-analog process:

Converting the digital data into analog amplitudes

Joining the analog amplitudes into an analog signal

Filtering the output with a low-pass filter to improve the output.
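The sampling and value-assignment stages, and the reverse conversion back to amplitudes, can be sketched in Python as follows. This is a simplified model under stated assumptions: the input and output filters are left out entirely, amplitudes are assumed to lie between -1.0 and 1.0, and the names quantize and reconstruct are hypothetical.

```python
import math

def quantize(amplitude, bits=16):
    """A/D step: map an amplitude in [-1.0, 1.0] to a signed integer code."""
    max_code = 2 ** (bits - 1) - 1
    clipped = max(-1.0, min(1.0, amplitude))
    return round(clipped * max_code)

def reconstruct(code, bits=16):
    """D/A step: map an integer code back to an amplitude."""
    max_code = 2 ** (bits - 1) - 1
    return code / max_code

# Sample 10 ms of a 440 Hz tone at 8000 Hz, digitize it, then convert it back.
rate = 8000
analog = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate // 100)]
codes = [quantize(x) for x in analog]
restored = [reconstruct(c) for c in codes]

# The round trip is not perfect: each sample is off by at most half a step.
max_error = max(abs(a - r) for a, r in zip(analog, restored))
print("largest round-trip error:", max_error)
```

With 16 bits the largest round-trip error stays below half of one quantization step, which is why the restored signal is so close to the original.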

ADDA converters come in many kinds, from on-board sound cards to Digital Audio Workstations costing tens of millions of rupiah.

Choosing a Sound Card

The first thing to remember when choosing a sound card is to match it to your needs. Do you need only stereo channels, or many channels on a single card? How many tracks must you record at the same time? Many sound cards today offer resolutions above 16 bits, and some reach 20 or 24 bits.

Basic connections on a sound card

For good recordings, choose a sound card with a resolution of up to 24 bits. The sound cards we usually encounter have at least three connections: mic, line in, and line out. Good or high-end sound cards also have digital connections, which make it easy to send audio data from the computer to another digital recorder such as a DAT or MiniDisc unit.

The most popular type of digital connection is S/PDIF (Sony/Philips Digital Interface). It comes in two types: coaxial, which uses a phono socket, and TOSLINK, which uses fiber-optic cable.

Professional sound cards use AES/EBU, a balanced system that uses XLR connectors. This connection, however, is mostly found on high-end sound cards. Another connector that can be used is TRS. Some sound cards also offer multi-input sockets, meaning that either a TRS or an XLR connector can be plugged into the same connection.

When choosing a sound card, also check whether it provides a microphone preamp. This is useful when recording with a microphone, because it gives a better input signal.

Digital Audio Terminology

Channel

Channel here means the number of audio channels in use. For example, mono audio requires one channel and stereo audio requires 2 channels. The number of channels can also be greater than 2, as needed.

Sampling rate

When the ADDAC converts analog data to digital, it breaks the sound into pieces of signal, each with a specific value, within one unit of time. These pieces are called samples, and the number of samples taken per second is the sampling rate. The higher the sampling rate, the better the resulting sound quality.

Bits per Sample

Each sample has an amplitude value, which is stored in digital bits. The number of bits available to represent the amplitude is called bits per sample. The more bits used to represent the amplitude, the finer the amplitude steps that result. For example, 8-bit audio has 2 to the power of 8, or 256, possible amplitude values, while 16-bit audio has 2 to the power of 16, or 65536, possible amplitude values.
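The relationship between bits per sample and the number of possible amplitude values can be checked with a one-line calculation. The Python sketch below uses a hypothetical helper name, amplitude_levels, and adds 24 bits as a further example.

```python
def amplitude_levels(bits_per_sample):
    """Number of distinct amplitude values that fit in the given number of bits."""
    return 2 ** bits_per_sample

for bits in (8, 16, 24):
    print(f"{bits}-bit audio: {amplitude_levels(bits)} possible amplitude values")
```

Each extra bit doubles the number of values, so a 24-bit sample has 256 times as many amplitude steps as a 16-bit sample.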

Bit rate

Bit rate is the product of the number of channels, the sampling frequency, and the bits per sample. The bit rate needed to store a CD-quality stereo song (44100 Hz, 16 bit) is therefore: 2 x 44100 x 16 = 1,411,200 bits per second (bps).
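The same multiplication can be written as a small Python function. The name bit_rate_bps is a hypothetical choice for this sketch, and the formula applies to uncompressed audio only.

```python
def bit_rate_bps(channels, sample_rate_hz, bits_per_sample):
    """Bit rate of uncompressed audio: channels x sampling frequency x bits per sample."""
    return channels * sample_rate_hz * bits_per_sample

cd_quality = bit_rate_bps(2, 44100, 16)
print(cd_quality)        # 1411200 bits per second
print(cd_quality // 8)   # 176400 bytes per second
```

Dividing by 8 converts the result to bytes per second, which is the figure to use when estimating file sizes.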

Decibel

A logarithmic unit that expresses the level of an audio signal in relation to a reference electrical voltage.

DAT

A digital-format cassette on tape media, with a sample rate of up to 48,000 Hz. It has become a worldwide mastering standard.

Bit

One of the two digits 0 and 1, used to define a digital system. Bits are used in the BNS (Binary Numbering System).

Bandwidth

The range of frequencies that a device allows to pass. Put simply, bandwidth can be thought of as the width of the frequency range.

Amplifier

A device that strengthens or weakens the amplitude of an audio signal. Most amplifiers available on the market are designed to strengthen it.

Amplitude

The strength of an audio signal, expressed in decibels (dB). Technically, the amplitude of a signal can be found by measuring the distance from the signal's center line to its strongest point (peak).
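Measuring the peak and expressing it in decibels can be sketched in Python as follows. The sketch assumes the common convention for voltage-like quantities, 20 times the base-10 logarithm of the ratio to a reference level, and assumes samples scaled so that 1.0 is full scale. The names peak_amplitude and to_db are hypothetical.

```python
import math

def peak_amplitude(samples):
    """Distance from the center line (zero) to the strongest point of the signal."""
    return max(abs(s) for s in samples)

def to_db(amplitude, reference=1.0):
    """Level in decibels relative to the reference amplitude (assumed full scale = 1.0)."""
    if amplitude <= 0:
        return float("-inf")  # silence has no finite level in dB
    return 20 * math.log10(amplitude / reference)

signal = [0.1, -0.5, 0.25, -0.05]
peak = peak_amplitude(signal)
print(peak, round(to_db(peak), 2))
```

A peak at half of full scale works out to roughly -6 dB, and a peak at full scale to 0 dB.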

Analog

A system for processing (audio) signals that uses electrical and magnetic devices without the aid of binary chipsets.

Digital

An electronic mode of operation that uses the binary numbering system as its basic formula. In digital recording, the analog electrical signal is first converted by an Analog to Digital Converter into a digital signal so that it can be received, read, and processed by the digital system.

CHAPTER 2

FUNDAMENTALS OF DIGITAL AUDIO SOFTWARE

Several digital audio programs are in wide use, among them Cool Edit Pro, Acid Pro, Cakewalk Pro Audio through Cakewalk Sonar, Adobe Audition, and Cubase.

In principle, these programs all work the same way. What distinguishes one from another is the terminology it uses and some of its supporting accessories.

To make them easier to use, we need to know the basic terms and working principles found in almost all digital audio software, including:

  1. Data types
  2. The track concept
  3. Clips
  4. Song position

Data Types

Digital audio software usually supports two types of data: MIDI data and audio data.

MIDI data is data recorded into the computer using a MIDI instrument (a keyboard, guitar, wind instrument, or electronic percussion equipped with a MIDI interface). This data consists only of commands for sounding and controlling the instrument, in the form of bits of 1 and 0, each of which has its own meaning in sounding the instrument. Files of this type have the extension .MID.
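To make the idea of "commands rather than sound" concrete, the Python sketch below builds the bytes of two common MIDI channel messages, Note On and Note Off. The status values 0x90 and 0x80 follow the MIDI 1.0 message layout; the function names and the chosen note are assumptions for illustration.

```python
def note_on(channel, note, velocity):
    """Build the 3 bytes of a Note On message (channel 0-15, note and velocity 0-127)."""
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("value out of MIDI range")
    return bytes([0x90 | channel, note, velocity])

def note_off(channel, note, velocity=0):
    """Build the 3 bytes of a Note Off message for the same note."""
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("value out of MIDI range")
    return bytes([0x80 | channel, note, velocity])

# Note number 60 on the first channel, struck fairly hard, then released.
start = note_on(0, 60, 100)
stop = note_off(0, 60)
print(start.hex(), stop.hex())  # 903c64 803c00
```

Each command is only three bytes long, whatever the instrument sounds like, which is why MIDI files stay so small.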

Audio data is sound recorded into the computer through a sound card or DAW from a transducer, an audio player, or a musical instrument over an audio cable, either digital (fiber optic and coaxial) or analog (XLR and TRS). The result is a WAVE file, which appears on the computer as a file with the .wav extension.
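As a minimal sketch of what ends up inside such a file, the Python code below generates a short tone and stores it in WAVE format using Python's standard wave module. The tone, its level, and the function name sine_wav_bytes are assumptions for the example; a real recording would take its samples from the sound card instead.

```python
import io
import math
import struct
import wave

def sine_wav_bytes(freq_hz=440, sample_rate=44100, seconds=0.1):
    """Return the bytes of a mono 16-bit WAVE file containing a sine tone."""
    n_frames = round(sample_rate * seconds)
    frames = b"".join(
        # Each sample is packed as a little-endian signed 16-bit integer.
        struct.pack("<h", round(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * i / sample_rate)))
        for i in range(n_frames)
    )
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)         # mono
        wav_file.setsampwidth(2)         # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(frames)
    return buffer.getvalue()

data = sine_wav_bytes()
print(len(data), data[:4], data[8:12])
```

To save the result to disk, write the returned bytes to a file whose name ends in .wav, or pass a file name to wave.open instead of the in-memory buffer.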

The quality of audio data recorded by the computer is determined by the sampling rate. The sampling rate is the number of sound samples the computer takes per second in order to reproduce the recorded sound.

The higher the chosen sampling rate, the better the resulting sound, but the larger the resulting file. One advantage of digital audio data is that the sound produced matches its source, or is said to be more faithful. Digital audio files are also larger than MIDI data. For a five-minute song at a sampling rate of 44.1 kHz, the audio file can reach 30-50 MB, whereas a MIDI file is only in the range of hundreds of kilobytes.

Differences Between Digital Audio Data and MIDI Data

With digital audio data, besides the sound of musical instruments we can record the human voice or other sounds, using a microphone as the transducer. With MIDI data, we can record only the musical instrument, because MIDI data fundamentally does not store the actual sound but only a sequence of commands (MIDI messages) that sound the instrument in use. The sound quality of MIDI data therefore depends heavily on the instrument or sound generator used. A sound card can play MIDI data because a MIDI sound generator is built into it.

The MIDI standard in common use is GM (General MIDI). Most instruments and sound cards implement this standard. It is used to map instrument sounds across the various brands, so that each bank number corresponds to the same instrument sound on every device. For example, bank 01 on Brand A and on Brand B is the same sound, a piano. Without this standard, a MIDI command could produce a different sound on each brand of instrument.
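The mapping can be sketched in Python as a lookup table plus the message that selects a sound. General MIDI itself calls these numbers program numbers, so what the text calls bank 01 is treated here as GM program 1. The table lists only a handful of the 128 GM sounds, and the function name program_change is hypothetical.

```python
# A small excerpt of the General MIDI sound set (program number: instrument name).
GM_PROGRAMS = {
    1: "Acoustic Grand Piano",
    25: "Acoustic Guitar (nylon)",
    33: "Acoustic Bass",
    41: "Violin",
    57: "Trumpet",
    74: "Flute",
}

def program_change(channel, gm_program):
    """Build a Program Change message; GM numbers run 1-128, the data byte 0-127."""
    if not (0 <= channel <= 15 and 1 <= gm_program <= 128):
        raise ValueError("value out of MIDI range")
    return bytes([0xC0 | channel, gm_program - 1])

message = program_change(0, 1)
print(message.hex(), "selects", GM_PROGRAMS[1])
```

Because every GM device interprets the same number as the same instrument, this two-byte message selects a piano sound regardless of the brand receiving it.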

Another advantage of digital audio data is that, even when copied repeatedly, the data does not lose sound quality. Compare this with analog audio, usually on cassette or tape, whose quality tends to degrade with repeated copying, typically marked by increasing noise.

THE TRACK CONCEPT

A track in digital audio software can be likened to a layer in graphics software. It is like stereo cassette tape, which is divided into two paths, left and right. Each path is called a track, and each track stores sound.

In principle, then, a track is a place to store a sound source. There are several ways to store sound sources in a track:

One track holds one sound source

One track holds more than one sound source.

With multitrack software we can record as many sound sources as there are tracks. For example, track 1 for vocals, track 2 for guitar, track 3 for bass, track 4 for piano, track 5 for drums, and so on.
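One way to picture multitrack playback is as separate lists of samples that are added together. The Python sketch below is a simplified model, not how any particular program works internally: the track names, the sample values, and the function mix_tracks are assumptions, and samples are assumed to lie between -1.0 and 1.0.

```python
def mix_tracks(tracks):
    """Add the tracks sample by sample and clip the sum to the range [-1.0, 1.0]."""
    length = max(len(samples) for samples in tracks.values())
    mixed = []
    for i in range(length):
        # Tracks shorter than the longest one simply contribute nothing past their end.
        total = sum(samples[i] for samples in tracks.values() if i < len(samples))
        mixed.append(max(-1.0, min(1.0, total)))
    return mixed

session = {
    "track 1 (vocals)": [0.5, 0.5],
    "track 2 (guitar)": [0.25, 0.75, 0.1],
}
print(mix_tracks(session))  # [0.75, 1.0, 0.1]
```

Because each source stays on its own track until the final mix, its level can still be changed, or the part re-recorded, without touching the others.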

Multitrack view in Cool Edit Pro 2.0