Multi-Resolution Sound Texture Synthesis

1. First Result Set (Dubnov et al. Comparison)

2. Further Result Set

3. Sources




1. First Result Set (Dubnov et al. Comparison)

Training Examples [1]

Ye | training example | Fs[Hz] | t1[s] | Comment
1 | drum loop | 22k | 3 | Near-periodic; Percussion with occasional cymbal
2 | baby crying | 11k | 13 | Quasi-periodic; Baby crying
3 | traffic jam | 11k | 22 | Event-on-noise; Traffic noise with "yelling" and "honking" events, Recorded silence at extremities
4 | shore, splashing | 11k | 18 | Event-on-noise; Seawater ebbing with "splashing" event, Recorded silence
5 | formula 1 race | 11k | 16 | Event-on-noise; F1 cars accelerating with "gear-shifting" event, Recorded silence

Our Best Sound Textures

Ye | sound texture, Ys | t2_o[s] | K | w | ε | Seed | Comment
1 | drum loop | 63 | 8 | 41 | 0.3 | Beginning, samples=1-125 (1.5secs) | Varied, Random emergence of cymbal, Barely audible clicks
2 | baby crying | 73 | 6 | 23 | 0.1 | Beginning, samples=100-170 (0.4secs) | Varied, Annoying ambience well reproduced
3 | traffic jam | 70 | 8 | 5 | 0.01 | Centered, samples=300-311 (0.1secs) | Good variation, Honking loops at the end, Silence on training example was clipped
4 | shore, splashing | 74 | 8 | 51 | 0.1 | Beginning, samples=400-550 (3.5secs) | Good variation, Silence on training example was clipped
5 | formula 1 race | 75 | 8 | 21 | 0.1 | Beginning, samples=200-300 (2.3secs) | Event-on-background ambience reproduced well, Silence on training example was clipped

Other Experimental Sound Textures

Ye | sound texture, Ys | t2_o[s] | K | w | ε | Seed | Comment
1 | drum loop | 19 | 8 | 51 | 0.1 | Centered, samples=1-87 (0.18secs) | Repetitive but plausible, Evolves and gets locked into a different beat structure
1 | drum loop | 19 | 8 | 201 | 0.1 | Centered, samples=1-201 (2.3secs) | Heavier quality but plausible, Quicker cymbals, Evolves and gets locked again
1 | drum loop | 63 | 8 | 61 | 0.1 | Beginning, samples=1-125 (1.5secs) | Repetitive, Evolves and gets locked again
1 | drum loop | 63 | 8 | 201 | 0.1 | Beginning, samples=1-201 (2.3secs) | Perfectly repetitive but with no cymbals
2 | baby crying | 29 | 8 | 101 | 0.1 | Centered, samples=380-580 (4.6secs) | Evolves and gets stuck in a short repetitive loop but plausible
2 | baby crying | 53 | 6 | 751 | 0.1 | Centered, samples=10-380 (2.1secs) | Training example is smoothly tiled
2 | baby crying | 53 | 6 | 51 | 0.1 | Centered, samples=100-250 (0.9secs) | Good variation overall
2 | baby crying | 73 | 6 | 25 | 0.1 | Beginning, samples=100-170 (0.4secs) | Good variation overall
3 | traffic jam | 38 | 8 | 31 | 0.1 | Centered, samples=200-400 (2.3secs) | Training example is smoothly tiled
3 | traffic jam | 70 | 8 | 63 | 0.01 | Centered, samples=500-650 (1.7secs) | Some variation and some tiling, Too much honking, Silence on training example was clipped
3 | traffic jam | 82 | 7 | 3 | 0.1 | Beginning, samples=450-457 (0.04secs) | Training example is tiled, Silence on training example was clipped
4 | shore, splashing | 78 | 7 | 11 | 0.1 | Beginning, samples=400-450 (0.6secs) | Varied, but silences appear abruptly
4 | shore, splashing | 78 | 8 | 5 | 0.1 | Beginning, samples=400-411 (0.3secs) | Repetitive, Smooth and varied silence periods
4 | shore, splashing | 74 | 8 | 101 | 0.1 | Beginning, samples=100-300 (4.6secs) | Good variation, Silence on training example was clipped
5 | formula 1 race | 32 | 8 | 51 | 0.1 | Centered, samples=300-500 (4.6secs) | Training example is tiled, Some abrupt silences
5 | formula 1 race | 75 | 8 | 21 | 0.1 | Beginning, samples=400-500 (2.3secs) | Good variation overall, One click at end, Silence on training example was clipped




2. Further Result Set

Further Training Examples

Ye | training example | Fs[Hz] | t1[s] | Comment
6 | crowd chatter | 22k | 13 | Event-on-noise; Baseball crowd chatter, Man speaking and laughing and vendor shouts "nuts" events [2]
7 | piano phrases | 22k | 26 | Monophonic; Two phrases of piano music separated by a period of silence [3]
8 | german speech | 22k | 12 | Speech; Male speaking the German poem "Schiller" [2]
9 | english speech + music | 22k | 10 | Speech-on-music; Male preaching in English over background music [4]

Our Best Sound Textures

Ye | sound texture, Ys | t2_o[s] | K | w | ε | Seed | Comment
6 | crowd chatter | 71 | 7 | 11 | 0.1 | Beginning, samples=100-130 (0.2secs) | Good variation of "nuts" shouting event, Looping of "nuts" results in slight "whirring"
7 | piano phrases | 116 | 8 | 51 | 0.3 | Beginning, samples=100-200 (1.2secs) | The two piano phrases occur again and again but vary in order, No clicks
8 | german speech | 28 | 6 | 51 | 0.001 | Centered, samples=200-400 (0.6secs) | Good long-term variation overall, One noticeable click
9 | english speech + music | 26 | 8 | 201 | 0.1 | Centered, samples=400-550 (1.7secs) | Good long-term variation overall, No clicks

Other Experimental Sound Textures

Ye | sound texture, Ys | t2_o[s] | K | w | ε | Seed | Comment
6 | crowd chatter | 27 | 7 | 51 | 0.01 | Centered, samples=100-200 (1.2secs) | Variation between right and left of seed looping, Acceptable as background noise
6 | crowd chatter | 41 | 7 | 71 | 0.01 | Centered, samples=100-300 (2.3secs) | Good variation overall, Slight whirring heard after "nuts"
6 | crowd chatter | 71 | 8 | 11 | 0.1 | Beginning, samples=100-130 (0.3secs) | Smoother, No whirring but looping of vendor then man talking
7 | piano phrases | 146 | 8 | 51 | 0.3 | Beginning, samples=100-200 (1.2secs) | Different ordering of the two piano phrases
7 | piano phrases | 146 | 15 | 3 | 0.001 | Beginning, samples=10-14 (5.9secs) | Decomposition of structure to almost note-level, Fairly smooth random variation of notes, Segments of the original phrases still emerge
7 | piano phrases | 86 | 15 | 3 | 0.3 | Beginning, samples=10-14 (5.9secs) | Structure more erratic, Slightly more short-term looping, Abruptness
8 | german speech | 28 | 8 | 251 | 0.001 | Centered, samples=400-700 (3.5secs) | Variation between right and left of seed, Some looping, Plausible
8 | german speech | 60 | 15 | 3 | 0.3 | Beginning, samples=1-4 (6secs) | Structural decomposition, Good variation, Slight noise, Some annoying looping but acceptable
8 | german speech | 60 | 16 | 3 | 0.3 | Beginning, samples=1-4 (12secs) | Further structural decomposition, No noise, Strange sounds and repetition
9 | english speech + music | 70 | 15 | 3 | 0.1 | Beginning, samples=1-4 (12secs) | Good variation with fast tempo, Periods of major distortion but interesting
9 | english speech + music | 70 | 14 | 3 | 0.1 | Beginning, samples=1-4 (6secs) | Good variation with faster tempo, Periods of major distortion but interesting
9 | english speech + music | 70 | 13 | 3 | 0.1 | Beginning, samples=1-4 (1.7secs) | Structural decomposition, Even faster tempo, Interesting and varied, Occasional major distortion still
9 | english speech + music | 70 | 12 | 3 | 0 | Beginning, samples=1-4 (0.9secs) | Very fast tempo, Perhaps almost phoneme-level variation, Occasional appearance of novel words sounding like "moses", "harvest" and "robot", Occasional major distortion
9 | english speech + music | 70 | 15 | 3 | 10 | Beginning, samples=1-4 (12secs) | Even further decomposition, Bizarre sounding!




3. Sources

[1] S. Dubnov, Z. Bar-Joseph, R. El-Yaniv, D. Lischinski and M. Werman, Synthesizing Sound Textures through Wavelet Tree Learning, IEEE Computer Graphics and Applications, pp. 38-48, 2002. (Dubnov et al. training samples)

[2] The Freesound Project (Large, tagged database of sound samples)

[3] Sound Ideas (Royalty free sound samples for purchase)

[4] Source unknown.