Thursday, September 23, 2010

Combining Two Passions: Programming Music

As we progress, let us consolidate what we have learned so far in Applied Physics 186. Why not combine two different things to produce something amazing? Let's start.

Problem. How do we use programming to produce music?

Solution. Use Scilab to automatically detect the notes in a piece of sheet music and produce the sound.

So how do we do it?

We start by acquiring the sheet music of a simple song, London Bridge for example. Then we apply different techniques to gather the information we need, namely the frequency and the duration of each note. This information is encoded in the vertical position of the note on the staff and in the shape of the note, respectively. So let's start.

Line 1 of London Bridge

Line 2 of London Bridge

First, isolate one line of the song. This can be cropped using Paint. Next, we get an image of a half note from the same sheet music. We need to make sure that the canvas of the half note is the same size as the whole line of the song and that the note is placed in the center. Since half notes and quarter notes are similar in that they both have stems, we can choose to remove the stem, which allows us to center the image better. Now we correlate the line with the image of the half note. This gives us a high correlation with all the notes, including the quarter notes, since they are very similar.

The correlation is in grayscale, so let us binarize it. This isolates the points of correlation to a few pixels each. Then, using an algorithm I used in my Applied Physics 171 project, Modeling X-ray Diffraction Pattern, we reduce those few pixels to one pixel per note. Given these single points, we can use Activity 1 of AP 186 to find a correspondence between the row coordinate of each point and the frequency of the note. It is noteworthy that notes of the same frequency ended up at the same row coordinate, give or take 1 pixel at most.
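To make the last two steps concrete, here is a minimal sketch in Python (standard library only) rather than Scilab. It is not the algorithm from my AP 171 project: it swaps in a simple connected-blob centroid to reduce each cluster of pixels to one point. The names `blob_centroids`, `row_to_frequency` and `ROW_TO_FREQ`, and the row values in the table, are hypothetical. Real row values depend on how the image was cropped.

```python
from collections import deque

def blob_centroids(binary):
    """Reduce each 4-connected blob of 1s to a single (row, col) pixel."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    centroids = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects every pixel of this blob.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # The rounded mean position stands in for the whole blob.
            cy = round(sum(p[0] for p in pixels) / len(pixels))
            cx = round(sum(p[1] for p in pixels) / len(pixels))
            centroids.append((cy, cx))
    # Sort by column so the notes come out left to right.
    return sorted(centroids, key=lambda p: p[1])

# Hypothetical calibration: row coordinate -> frequency in Hz (G4 and E4).
ROW_TO_FREQ = {2: 392.00, 6: 329.63}

def row_to_frequency(row, table=ROW_TO_FREQ, tolerance=1):
    """Look up a note frequency, allowing the 1-pixel jitter seen in practice."""
    for ref_row, freq in table.items():
        if abs(row - ref_row) <= tolerance:
            return freq
    return None  # row does not match any calibrated note
```

The tolerance argument reflects the observation above that the same note can land one pixel higher or lower.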

Half Note

Now we have the frequency of all the notes, but not the timing. So how do we gather that information? Well, although using a half note as the pattern for correlation produces a high correlation with all notes, using a quarter note gives a high correlation only with quarter notes. By comparing the column coordinates from the half note correlation with those from the quarter note correlation, we can tell the two apart: a note that shows up in both is a quarter note, and a note that shows up only in the half note correlation is a half note.
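The column comparison can be sketched as follows, again in plain Python. The function name `classify_durations` and the 2-pixel tolerance are assumptions for illustration, not values taken from my Scilab code.

```python
def classify_durations(all_note_cols, quarter_cols, tolerance=2):
    """Label each detected note as 'quarter' or 'half'.

    all_note_cols: columns found with the half note template (every note).
    quarter_cols:  columns found with the quarter note template.
    """
    labels = []
    for col in sorted(all_note_cols):
        # A note also seen by the quarter note template is a quarter note.
        is_quarter = any(abs(col - q) <= tolerance for q in quarter_cols)
        labels.append((col, "quarter" if is_quarter else "half"))
    return labels
```

For example, `classify_durations([10, 30, 50, 70], [11, 29, 50])` labels the first three notes as quarter notes and the note at column 70 as a half note.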

Quarter Note

Now we have a London Bridge tune. But wait! Scilab produces sounds similar to those of old cellphones: just a series of sine waves with no distinction between one note and the next. So how do we adjust that? Well, instruments have what is called an ADSR envelope, or attack-decay-sustain-release. It models how a person plays an instrument and how the instrument responds. These responses change the amplitude of the sound wave, not its frequency. So we create a function that produces a time-dependent amplitude factor: attack is a linear increase, decay is a linear decrease, sustain is a constant, and release is a 1/t decay. By multiplying this factor element-wise with the sinusoidal signal, we get a more realistic sound in which each note is distinct.
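Here is a minimal sketch of such an envelope in Python, not the Scilab function I actually used. The segment lengths (as fractions of the note duration), the sustain level of 0.6, and the 8000 Hz sample rate are all assumed values. The release is written as `sustain_level * t_r / t`, which is one way to get a 1/t decay that stays continuous where the release starts.

```python
import math

def adsr(t, duration, attack=0.1, decay=0.1, release=0.3, sustain_level=0.6):
    """Envelope value at time t (seconds) for a note lasting `duration` seconds.

    attack, decay and release are fractions of the note duration (assumed values).
    """
    t_a = attack * duration            # end of attack
    t_d = t_a + decay * duration       # end of decay
    t_r = duration * (1.0 - release)   # start of release
    if t < t_a:
        return t / t_a                                  # linear rise 0 -> 1
    if t < t_d:
        return 1.0 - (1.0 - sustain_level) * (t - t_a) / (t_d - t_a)  # linear fall
    if t < t_r:
        return sustain_level                            # constant
    return sustain_level * t_r / t                      # 1/t decay

def note(freq, duration, rate=8000):
    """Sine wave at `freq` Hz, multiplied sample by sample by the envelope."""
    n = int(duration * rate)
    return [adsr(i / rate, duration) * math.sin(2 * math.pi * freq * i / rate)
            for i in range(n)]
```

Calling `note(440.0, 0.5)` gives the samples of a half-second A4 that fades in and out instead of starting and stopping abruptly.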

Attack-Decay-Sustain-Release Graph

Finally, just for the fun of it, how do we simulate the chords of a guitar or a piano? Well, the concept is simple. Sound waves just superimpose on each other. This means that if we want to produce two frequencies at the same time, all we have to do is add the two signals and normalize. How about three? Why not? We can now produce the three notes that make up a chord.
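As a sketch of the add-and-normalize idea in Python (the function name `chord` and the 8000 Hz sample rate are assumptions):

```python
import math

def chord(freqs, duration, rate=8000):
    """Superimpose one sine wave per frequency and normalize to a peak of 1."""
    n = int(duration * rate)
    # Add the sine waves sample by sample.
    mixed = [sum(math.sin(2 * math.pi * f * i / rate) for f in freqs)
             for i in range(n)]
    # Divide by the largest magnitude so the result stays within [-1, 1].
    peak = max(abs(s) for s in mixed) or 1.0
    return [s / peak for s in mixed]

# C major triad: C4, E4 and G4 in Hz.
c_major = chord([261.63, 329.63, 392.00], 0.5)
```

Without the normalization step, the sum of three unit sine waves could reach an amplitude of 3 and clip when played back.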








