1 Introduction

When a sound is heard, a sensation of color is often aroused at the same time, a phenomenon described as synesthesia [22, 23]. Synesthesia refers to the psychological experience in which music can be seen and color can be heard [30]. The interaction between music and color has been widely explored, since audio and images are important components of human interaction [21]. Although some relationship between music and color is expected to exist, its complexity means that convincing support is still lacking on both the theoretical and the practical side. If music is interwoven in a suitable way with flashing colored light, the visual and auditory experience can be strongly enhanced. The key is that the music should match the color in some way. To address this problem, many matching patterns for music and color have been developed [5]. Poast [20] used color as an expressive tool and sound as music notation to develop an alternative notation system that conveys the meaning of color and music to performers. Palmer et al. [19] established a relation between musical tempo and color for analyzing varying human emotion: for both US and Mexican participants, the faster the music, the more saturated the chosen color. Cross-modal studies [18, 19] used color and music to express human emotion according to tempo (slow/fast), note density (sparse/dense), mode (major/minor) and pitch height (low/high). In terms of applications, [41] used different senses to analyze the relation between color and music and found that green was the color most strongly related to campus songs, which was associated with a larger P300 amplitude and with youth. Combining music and color can help explore the nature of perception and its relations to both language and technology [4]. In shopping, the rhythm of music has a certain effect on preferences for goods of different colors [31]: when the rhythm is faster, people prefer dark-colored goods; otherwise, they tend to purchase light-colored goods. Incorporating perceptual ideas into a music player is a good way of matching music and color [32]. Whiteford et al. [32] reported an association between perceptual features (e.g., loud, punchy) and chosen colors (e.g., darker, redder and more saturated) in explaining classical music. Understanding the relation between color and music is also very useful in treating synesthesia patients [2, 13].

Although these methods performed well in matching color and music, they simply mapped music to certain colors. Some electronic music players with colorful lights have appeared on the market, but they offer no real matching pattern between music and color: in most of them the colored lights shine randomly, and in others the flashing merely follows a fixed, prearranged color pattern. It is not easy to make the light color match the music being played, because music changes all the time and comes in a variety of styles. This paper concentrates on resolving this mismatch between music and color. Based on a study of the fundamentals of music and color, the correspondence between music and color, which is the precondition for recognizing the music being played, is established.

To make the light colors change along with the music, several tasks must be completed: studying music acoustics and the principles of color, recognizing the audio signal, and driving the RGB-LED mixed lights. The relation between music and color is complex because the choice of a color is affected by many factors, such as pitch, rhythm, loudness, chords and tonality. The relationship between music and color is therefore analyzed theoretically. Even though music and color are formally different, rules are set up by which a color can be determined in real time for the music being played. The relationship between music and color has attracted wide interest, which motivates the proposed method based on the intrinsic qualities of music and color. Developments in computer and multimedia technology provide a feasible way to store, transfer, process and play music accompanied by flashing lights, and make the visualization of music possible, which in turn promotes research in psychology and cognitive science [16].

Based on earlier work on musical chords and color, this study presents the correspondence between music and color and a method of identifying the music spectrum. The purpose is to provide a promising way to develop devices that play music with matching colors flashing in real time, which can advance the development of color music. Music and color together produce synesthesia, acting on human consciousness, the nervous system and psychology, and giving people vivid illusions and a variety of feelings. People have long tried to find the connection between audio and light. The simplest approach is to link the seven tones of the scale with seven colors. Some link the range of audible frequencies proportionally to the spectrum of light, so that the lowest bass is red and the highest tone is violet; this is only a mechanical link. Others try to find links between music and color in rhythm, tune, harmony and other aspects. Most previous accounts of music and color are limited to theoretical analysis.

Besides the theoretical study, this paper focuses on the technology needed to implement a color music player whose function is to match music and color. On the basis of the relationship between music and color, the work explores how various chords form music, carries out real-time recognition of the music being played, and studies light mixing with RGB LEDs. The color corresponding to the audio signal being played is determined by an audio spectrum identification method, and the light color is then controlled so that music and color match in real time. An experimental platform with an embedded microprocessor is developed, on which tests such as reading and playing music files, sampling and spectrum analysis of the audio signal, and PWM driving of the RGB-LED mixed lights can be performed. Among all the factors of music, pitch and loudness probably carry the most significant frequency features, from which the color corresponding to a sound can be determined. The audio signal is sampled while the music is playing, and its frequency spectrum, which reflects pitch, is obtained by the Fast Fourier Transform (FFT) [12, 39]. Once the color corresponding to an acoustic signal is determined, it is produced by the RGB LED lights along with the music. This work can advance the development of color music, which has promising practical applications.

The contributions of this paper are as follows.

  1. This paper establishes a relation table between color and music (frequency) based on prior knowledge.

  2. Sampling followed by FFT achieves the match between color and music in a timely manner while the audio spectrum system is running.

  3. RGB-LED light mixing with PWM is embedded into the microprocessor so that the audio spectrum system works well.

2 Related work

2.1 Relation of music and color

Although some scholars and musicians believe that music and color correspond, music is made up of many factors, so it is difficult to define which color should correspond to a piece of music. The first task is to describe the components of music, such as loudness, pitch, tone, chord and rhythm, in order to set up the mapping between music and color. The second is to apply frequency analysis, such as the FFT algorithm [29], to audio signals in order to establish rules for recognizing audio signals from music. The third is to perform experiments on the control of RGB-LED mixed lights, on the sampling and analysis of music audio signals, and on the effect of matching music and light.

Sound can be divided into two kinds: musical tone and noise. The vibration of a musical sound is periodic. Music, which is described by tonality, loudness and tone, is formed by notes played in time sequence. A sounding note can be analyzed by three parameters: amplitude, fundamental frequency and the frequency multiples that cover an octave. With regard to the frequency of a sounding note, pitch is the critical factor. The loudness of a note reflects its intensity, which is determined by the vibration amplitude. Tone refers to the frequency components of a sound, whose proportions constitute its characteristic features. The frequency spectrum obtained by frequency analysis shows the most significant characteristics of musical audio signals. From the point of view of physics, both music and color are wave phenomena; they simply fall into different frequency ranges, about 20 Hz to 20 kHz for sound and \(3.9 \times 10^{14} \sim 8.6 \times 10^{14}\) Hz for visible light. A famous experiment shows that a triangular prism resolves light into the seven colors of the spectrum. From that, Newton proposed that the seven colors red, orange, yellow, green, cyan, blue and violet correspond proportionally in frequency to the seven notes CDEFGAbB. It is remarkable that listening to music can arouse a sensation of color in some people. The famous Russian pianist Scriabin said that colors emerged in front of his eyes when he played the piano. Although the feeling of color and music seems subjective, the mapping between music and color also reflects some objective regularity [12]. Some explorations have been made of the relation between rhythm and color; rhythm represents the movement of notes in time, including note length and intensity, and relates mainly to light brightness. Another important feature of music is the chord, whose relation with color is discussed in [35]. As to tonality, Scriabin held that A major is green and C major is red, and that colors change from red to violet with increasing pitch, but this view lacks theoretical support. The idea of establishing a mapping from prior knowledge has also been applied in other fields. For example, Silvia et al. [15] derive a best path from the walking preferences of pedestrians in order to save time. Inspired by that, we map the relationship between color and music using prior knowledge.

2.2 Musical note and color

Relations between music and color are deeply entangled. Both audio signals and light have the characteristics of vibration and waves. Scriabin considered that colors change from red to purple as pitch increases. The musician Rimsky-Korsakov thought that C major is white, D major is yellow, A major is rosy, and so on. The variation of music and color shows that people's emotional response also reflects some regularity [8]. Different notes of a scale have different pitches and therefore different frequencies. The frequency ratio of middle C (c1) to the note one octave above it (c2) is 1:2; c1 corresponds to 262 vibrations per second and c2 to 524. These two sounds do not conflict in vibration and sound similar to the ear, so their synesthetic colors may be the same. The definition of the melodic color scale is presented in [8]. Thus, if two notes are an octave apart, they are treated as having the same color. If two notes are a perfect fifth apart, their frequency ratio is 2:3; the perfect fifth above c1 is g1 (393 vibrations per second), which is 7 half-steps higher. The notes of C major, (f), c, g, d, a, e, (b), are generated by successive perfect fifths. Rearranging these notes in pitch order and folding them into one octave gives the natural scale of C major, with notes c, d, e, f, g, a, b whose frequencies are 528, 596.6667, 660, 704, 770, 880 and 990 Hz. These fit human perception very well and naturally invite association with the seven colors.

Each color perceived by the eye corresponds to an interval of wavelengths. For the primary colors of light, the wavelength of red light is about 680\(\sim \)750 nm, green 520\(\sim \)570 nm and blue 445\(\sim \)470 nm. The seven colors corresponding to the seven notes of the natural scale, deduced through the perfect fifth relation, are red, orange, yellow, green, cyan, blue and violet, as shown in Table 1.

Table 1 Seven colors deduced in perfect fifth relations
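The construction described above, stacking perfect fifths and folding the results into one octave, can be sketched in a few lines. This is only an illustration, assuming pure fifths with ratio 3:2 measured from c; the function name `stack_fifths` is hypothetical, and the ratios it produces are those of pure-fifth tuning rather than a recomputation of the frequencies quoted in the text.

```python
from fractions import Fraction


def stack_fifths(names=("f", "c", "g", "d", "a", "e", "b"), tonic="c"):
    """Return {note: ratio to the tonic}, sorted by pitch within one octave.

    `names` lists the notes in the order of successive perfect fifths.
    Each step multiplies the frequency by 3/2 (assumption: pure fifths).
    """
    fifth = Fraction(3, 2)
    start = names.index(tonic)
    scale = {}
    for i, name in enumerate(names):
        ratio = fifth ** (i - start)   # fifths above (+) or below (-) the tonic
        while ratio >= 2:              # fold down into the octave [1, 2)
            ratio /= 2
        while ratio < 1:               # fold up into the octave [1, 2)
            ratio *= 2
        scale[name] = ratio
    # Rearranging by pitch gives the natural scale order.
    return dict(sorted(scale.items(), key=lambda item: item[1]))


if __name__ == "__main__":
    for note, ratio in stack_fifths().items():
        print(note, ratio)
```

Sorting by ratio recovers the order c, d, e, f, g, a, b, which is the ordering the seven colors are then attached to.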

The first to propose twelve-tone equal temperament was the Chinese Ming Dynasty musician Zaiyu Zhu, in the Wanli era (1584). He proposed the “new way close calculating” method, deduced the ratio \(\sqrt[12]{2}\) by which the octave is divided into twelve equal portions, and created the world's first equal-tempered instrument. It is very likely that this theory was brought to Europe by the missionary Matteo Ricci. However, twelve-tone equal temperament was not accepted at the time, when Baroque polyphonic music used pure intonation with its highly consonant resonance. In 1722, the composer Bach published “The Well-Tempered Clavier”, a title meaning “perfectly tuned keyboard instrument”. Twelve-tone equal temperament soon revealed its significance and value: modulation becomes extremely convenient, and the key can be changed without any hindrance, which greatly satisfies the needs of musicians. The twelve notes c, #c, d, #d, e, f, #f, g, #g, a, #a and b are equivalent to inserting 11 notes at equal ratios within an octave. The frequency ratio of two adjacent notes is \(\sqrt[12]{2}\).

Taking the frequency of note c as 1, the corresponding frequency values of the other notes are shown in Table 2. Similarly, the colors corresponding to the twelve equal-tempered notes are deduced as shown in Table 3. It should be pointed out that the color assigned to each note is subjective rather than objective. The aim is to obtain a mapping between notes and colors, which gives the 12 colors corresponding to the twelve equal-tempered notes between c and c1. For other ranges, such as the notes between c1 and c2, the colors are the same because the mapping repeats cyclically. The color hue is the same for the same pitch name in different octaves; for example, for e, whether e1 or e2, the corresponding color is yellow. In this way the mapping between musical notes and colors is set up. Besides the musical note, which represents pitch, loudness also affects color, but the hue stays the same for the same note and loudness only affects the luminous intensity. Once the frequency of a tone is obtained, its color is uniquely determined.

Table 2 Frequency ratios among notes of twelve-tone equal temperament
Table 3 Colors corresponding to notes of twelve-tone equal temperament
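The equal-temperament ratios and the octave-cycling rule described above can be illustrated with a short sketch. It assumes a reference frequency of 261.63 Hz for middle C (standard A = 440 Hz tuning, which is not stated in the text), and the function names are hypothetical; the color table itself is not reproduced here, only the pitch name that would be used to look it up.

```python
import math

# Pitch names of the twelve-tone equal-tempered scale, as listed in the text.
NOTE_NAMES = ["c", "#c", "d", "#d", "e", "f", "#f", "g", "#g", "a", "#a", "b"]


def equal_temperament_ratios():
    """Frequency ratios of the 12 notes relative to c (c = 1)."""
    return [2 ** (k / 12) for k in range(12)]


def pitch_class(freq_hz, ref_c_hz=261.63):
    """Return the pitch name of a frequency, ignoring its octave.

    ref_c_hz is an assumed reference for note c; notes an octave apart
    (frequency ratio 1:2) map to the same name, hence the same hue.
    """
    semitones = round(12 * math.log2(freq_hz / ref_c_hz))
    return NOTE_NAMES[semitones % 12]


if __name__ == "__main__":
    for name, ratio in zip(NOTE_NAMES, equal_temperament_ratios()):
        print(f"{name:>2}  {ratio:.4f}")
    print(pitch_class(440.0), pitch_class(880.0))  # same pitch name in two octaves
```

Because the semitone index is reduced modulo 12, any lookup keyed on the returned name automatically gives the same hue for e1 and e2, as the text requires.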

2.3 FFT of audio signals

Signal processing, and FFT [1] in particular, is very effective for video [11], image [27, 28] and speech [26]. Audio signal analysis methods are widely used in digital audio engineering [36], and FFT is a useful algorithm in real-time test and control systems [34, 37]. From the analysis above, once the frequency of a note is determined, its color is obtained. The frequency spectrum of a musical audio signal can be obtained by FFT. To match the color to the music being played, spectral analysis of the audio signal alone is not sufficient; the spectrum of the audio being played must also be compared in real time with the spectra of the standard musical notes or chords discussed earlier. When the frequency components of the audio signal are obtained, the peaks of the spectrum correspond to the notes or chords, and the colors are then determined, which in turn sets the PWM driving of the RGB LEDs. For a piece of music being played, the spectrum is obtained through A/D sampling of its audio signal followed by FFT. The FFT algorithm makes use of the periodicity, symmetry and other properties of the rotation factor to decompose the Discrete Fourier Transform (DFT) successively into transforms with fewer points. FFT is thus an improved algorithm that reduces the amount of computation. In the embedded experimental system, the FFT algorithm is adopted to analyze audio signals, and the system can sample the audio signal, perform FFT and determine the spectrum in real time during playback. The spectrum of the audio signal being played is compared with the spectra of musical chords; that is, the chord whose spectrum is the same as or similar to that of the audio signal is identified by frequency comparison. The color corresponding to the music is then determined.

The frequency range of audio signals is limited to 20 Hz\(\sim \)20 kHz. To preserve conversion precision and avoid distortion, the sampling frequency fs should be at least 40 kHz according to the Nyquist sampling theorem. The total sampling time Tts can be estimated as follows: if the A/D conversion time is 20 μs and the number of sampling points N is 2,048, then Tts is about 41 ms. In this case fs can reach 50 kHz, which satisfies the Nyquist sampling theorem. The industry-standard sampling frequency for audio signals is 44 kHz; fs and N can be lower in practical applications [12].
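The sampling estimate above is simple arithmetic and can be checked with a small sketch. It takes the A/D conversion time as 20 μs, which is the value consistent with the quoted total of about 41 ms for 2,048 points and with a maximum rate of 50 kHz; the function names are hypothetical.

```python
def sampling_budget(n_points, conversion_time_s):
    """Return (maximum sampling rate in Hz, total sampling time in s).

    One sample is taken per A/D conversion, so the rate is the reciprocal
    of the conversion time and the total time is N conversions.
    """
    fs_max_hz = 1.0 / conversion_time_s
    total_time_s = n_points * conversion_time_s
    return fs_max_hz, total_time_s


def satisfies_nyquist(fs_hz, highest_signal_hz=20_000.0):
    """Nyquist criterion: sample at least twice the highest signal frequency."""
    return fs_hz >= 2.0 * highest_signal_hz


if __name__ == "__main__":
    fs_max, t_total = sampling_budget(n_points=2048, conversion_time_s=20e-6)
    print(f"fs_max = {fs_max / 1e3:.1f} kHz, Tts = {t_total * 1e3:.2f} ms")
    print("Nyquist satisfied:", satisfies_nyquist(fs_max))
```

The same two functions can be reused for other choices of N and conversion time when a shorter sampling window is needed.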

The Discrete Fourier Transform requires computing the following sum:

$$ X(n) = \sum\limits_{k = 0}^{N - 1} {x(k){e^{- j2\pi nk/N}}}, \quad n = 0,1,2,...,N - 1 $$
(1)

Let W be the rotation factor, then

$$ {W^{nk}} = {e^{- j2\pi nk/N}} = \cos (2\pi nk/N) - j\sin (2\pi nk/N) $$
(2)
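As a point of reference for the definitions above, the DFT can be evaluated directly, summing over the sample index k for each output index n and using the rotation factor of Eq. (2). The following is a minimal, unoptimized sketch (the function name `dft` is chosen here for illustration); it needs on the order of N² complex multiplications, which is the cost the FFT reduces.

```python
import cmath


def dft(x):
    """Direct evaluation of X(n) = sum_k x(k) * W^(n*k), W = exp(-j*2*pi/N)."""
    n_points = len(x)
    spectrum = []
    for n in range(n_points):
        acc = 0j
        for k in range(n_points):
            # Rotation factor W^(nk) = cos(2*pi*n*k/N) - j*sin(2*pi*n*k/N)
            w_nk = cmath.exp(-2j * cmath.pi * n * k / n_points)
            acc += x[k] * w_nk
        spectrum.append(acc)
    return spectrum


if __name__ == "__main__":
    # A unit impulse has a flat spectrum: every X(n) equals 1.
    print([round(abs(v), 6) for v in dft([1, 0, 0, 0])])
```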

FFT decomposes the DFT into transforms with fewer points and greatly reduces the amount of complex computation by using the periodicity and symmetry of W. FFT and the short-time Fourier transform (STFT) are essential techniques for real-time harmonic analysis [10]. The formulation above can be expressed as the flow chart shown in Fig. 1, and performing FFT means following the signal flow of this chart. The calculation for \(N = 2^{M}\) points is divided into M levels, each of which has N/2 butterfly computations [6, 38]. Computing W involves complex arithmetic, which is converted into cosine and sine computations by Euler's formula, simplifying the process. The outputs X(n) are complex numbers with real and imaginary parts, and they can be converted into amplitudes. Because the order of X(n) differs from that of the original signal x(n), an unscrambling procedure is needed that reverses the bits of the position index. Fig. 1 shows the computing process for N = 8. The outputs \(X(0)\sim X(7)\) are in natural order, but the inputs \(x(0)\sim x(7)\) are not [36], so the unscrambling procedure is needed. The position indices of x are written in binary as 000, 001, 010, 011, 100, 101, 110 and 111, and reversing their bits gives 000, 100, 010, 110, 001, 101, 011 and 111, which corresponds to the sequence x(0), x(4), x(2), x(6), x(1), x(5), x(3) and x(7). For an audio signal, the original sequence of sampled data is reordered in this way into a new sequence that is used as the input to the FFT, from which the frequency spectrum data of the audio signal are obtained [17].

Fig. 1 Signal flow chart of FFT
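The bit-reversal reordering and the M levels of N/2 butterflies can be sketched as an iterative radix-2 FFT. This is an illustrative Python version under the assumption of a decimation-in-time structure with bit-reversed input and naturally ordered output, as in the N = 8 example; the function names are hypothetical and this is not the firmware used on the platform.

```python
import cmath


def bit_reverse_indices(n_points):
    """Index permutation obtained by reversing the bits of each position."""
    bits = n_points.bit_length() - 1          # M, where N = 2**M
    return [int(format(i, f"0{bits}b")[::-1], 2) for i in range(n_points)]


def fft_radix2(x):
    """Iterative radix-2 FFT: reorder the input, then run M butterfly levels."""
    n_points = len(x)
    if n_points == 0 or n_points & (n_points - 1):
        raise ValueError("length must be a power of two")
    data = [complex(x[i]) for i in bit_reverse_indices(n_points)]
    size = 2
    while size <= n_points:                   # one pass per level
        half = size // 2
        w_step = cmath.exp(-2j * cmath.pi / size)
        for start in range(0, n_points, size):
            w = 1 + 0j                        # rotation factor for this butterfly
            for j in range(half):
                top = data[start + j]
                bottom = data[start + j + half] * w
                data[start + j] = top + bottom
                data[start + j + half] = top - bottom
                w *= w_step
        size *= 2
    return data


if __name__ == "__main__":
    print(bit_reverse_indices(8))  # [0, 4, 2, 6, 1, 5, 3, 7]
```

The amplitude spectrum is then obtained by taking `abs()` of each complex output value.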

When short pieces of chord tones are chosen for FFT, the sampling frequency fs is 44 kHz and the number of sampling points N is 2,048 in order to preserve the precision of the conversion. The total sampling period lasts only about 47 ms. According to the Nyquist sampling theorem, the highest resolvable frequency fm is 22 kHz, and all audio components with frequencies below fm can be distinguished. This satisfies the application, since the audible frequency range is about 20 Hz\(\sim \)20 kHz. If the amplitude of an audio signal changes quickly in the time domain, the signal contains high-frequency components; conversely, if it changes slowly, it contains more low-frequency components. Fig. 2 shows the time-domain waveform and the frequency spectrum of the A major triad, and Fig. 3 shows those of the G major triad. The features of the audio signals are clearly visible in the spectrum [8].

Fig. 2 A major triad audio signal and its spectrum

Fig. 3 G major triad audio signal and its spectrum

While music is being played, its audio signal is sampled, the frequency spectrum is obtained by FFT, and the frequency components of the signal are recognized. After the musical note and its corresponding color are determined, the RGB-LED lights are driven to form the mixed color. All these steps are completed almost instantly, so the colored lights flash in real time along with the music. For music, however, an audio signal rarely consists of a single frequency; it usually contains harmonics, chords and so on, and therefore several frequencies at a time.
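The recognition step described above, finding the spectral peaks and naming the notes they are closest to, might look like the following sketch. It assumes an amplitude spectrum is already available as a list, uses a1 = 440 Hz as the reference, and picks peaks with a simple threshold and local-maximum test; the function names, the threshold and the octave numbering below c1 are illustrative assumptions rather than the method used on the platform.

```python
import math

NOTE_NAMES = ["c", "#c", "d", "#d", "e", "f", "#f", "g", "#g", "a", "#a", "b"]


def spectrum_peaks(magnitudes, fs_hz, n_points, threshold):
    """Return (frequency_hz, magnitude) for local maxima above `threshold`.

    Only bins below fs/2 are examined; bin i corresponds to i * fs / N.
    """
    peaks = []
    for i in range(1, n_points // 2 - 1):
        m = magnitudes[i]
        if m >= threshold and m > magnitudes[i - 1] and m >= magnitudes[i + 1]:
            peaks.append((i * fs_hz / n_points, m))
    return peaks


def nearest_note(freq_hz, a1_hz=440.0):
    """Return (name, exact_hz) of the equal-tempered note closest to freq_hz.

    Names follow the text's convention (c1 is middle C, a1 = 440 Hz); the
    numeric suffix for octaves below c1 is only a label in this sketch.
    """
    semitones = round(12 * math.log2(freq_hz / a1_hz))  # offset from a1
    index = semitones + 9                                # a lies 9 semitones above c
    name = NOTE_NAMES[index % 12]
    octave = 1 + index // 12
    return f"{name}{octave}", a1_hz * 2 ** (semitones / 12)


if __name__ == "__main__":
    for f in (430.66, 882.86):
        print(f, nearest_note(f)[0])
```

Because the FFT only resolves frequencies in steps of fs/N, a measured peak generally sits near, not exactly on, the nominal note frequency, which is why the nearest note is taken.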

The frequency spectrum of a musical chord obtained by FFT is shown in Fig. 2. There are four peaks, which represent four frequency components present at that time. The peak frequencies listed in Table 4 are 301.46 Hz, 301.46 Hz, 430.66 Hz and 882.86 Hz, and the corresponding musical notes should be d1 (293.66 Hz), #f1 (369.99 Hz), a1 (440 Hz) and a2 (880.375 Hz) respectively. According to Table 3, these notes correspond to three colors: red-orange, green and blue (a1 and a2 have the same color). What color should be shown while this chord is being played? If the three colors were mixed together, some other color, for example yellow, might be created, but this is hard to implement. To solve the problem, the time for which each color of the RGB LED lights flashes is made proportional to the corresponding amplitude in the frequency spectrum. The ratio of the flashing times of the three colors is about 1.82:1.2:1 (1.2495 : 0.5110 + 0.3199 : 0.6889), corresponding to green, blue and red-orange. If the total flashing time for the chord is about 800 ms, then green should last about 360 ms, blue 240 ms and red-orange 200 ms. Even though each color flashes individually in a time-division pattern, persistence of vision causes the eye to perceive some other color pattern.

Table 4 Data of FFT results
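The time-division rule above, in which each color is shown for a time proportional to its spectral amplitude, can be written as a short sketch. The amplitudes are the ones quoted in the text (the two peaks of the same color are summed); the function name `flash_schedule` is hypothetical.

```python
def flash_schedule(color_amplitudes, total_ms):
    """Split total_ms among colors in proportion to their spectral amplitudes."""
    total_amplitude = sum(color_amplitudes.values())
    return {
        color: total_ms * amplitude / total_amplitude
        for color, amplitude in color_amplitudes.items()
    }


if __name__ == "__main__":
    amplitudes = {
        "green": 1.2495,
        "blue": 0.5110 + 0.3199,   # a1 and a2 share the same color
        "red-orange": 0.6889,
    }
    for color, ms in flash_schedule(amplitudes, total_ms=800).items():
        print(f"{color}: {ms:.0f} ms")
```

With these amplitudes the split comes out at roughly 361, 240 and 199 ms, in line with the rounded values of 360, 240 and 200 ms given in the text.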

3 Development of the system platform

3.1 System overall design

A system with an embedded IAP15W4K61S4 microprocessor is developed to perform experiments on matching music and colors in real time. Music playing is implemented by the embedded system through audio file reading and decoding [25, 42]. The IAP15W4K61S4, which belongs to the STC15 family, is an enhanced high-performance microprocessor with the latest seventh-generation encryption technology. It contains 4K of data RAM, 61K of program memory (ROM), dozens of I/O ports, an 8-channel A/D converter, PWM outputs, an SPI interface, UART interfaces and timers. The system platform is modular in both hardware and software, is rich in resources and scales well, so experiments can be carried out with specific combinations of functional modules.

The system hardware design is shown in Fig. 4. It contains an LCD display, the U disk and SD/TF card interface CH378L, the RGB-LED mixed light driving circuit, the VS1003 audio decoder, an LED display, a keyboard circuit, an audio signal sampling circuit and so on. The LCD display is an LCD12864 module, and the LED display consists of six digit LEDs driven serially via 74HC595 chips. The RGB LEDs are driven by the PWM outputs of the microprocessor. As secondary storage, SD/TF cards and U disks with large capacity can store large data files such as audio files [3, 14]. The interface chip CH378L is selected for its good support of U disk and SD/TF card interfacing. The microprocessor accesses audio files stored on a U disk or SD/TF card through the CH378L and delivers them to the audio decoder VS1003, whose two audio output channels drive the two speakers to play the music. These two audio outputs are also amplified by two amplifiers and sampled by the ADC, and the sampled discrete data are transformed by FFT to obtain the frequency spectrum of the audio signal for color recognition. The VS1003 interfaces with the microprocessor via the SPI bus; its operation is introduced in [35, 38]. A VS1003 chip can decode audio data in MP3 and WAV formats [7].

Fig. 4 System overall diagram

When the system works, an audio file stored on the U disk, SD card or TF card is read through the CH378L chip and transferred over the parallel port bus into the IAP15W4K61S4 microprocessor. The microprocessor then feeds the audio file data to the VS1003 audio decoder through the SPI bus at a steady rate to keep the music playing smoothly. While the music is played, its analogue audio signal is sampled by the ADC in the microprocessor at a sampling frequency fs of up to 44 kHz; the number of sampling points N can be 256 to save sampling and processing time in the embedded system. The interval between two groups of N samples can be set to 500 ms, which means that within this interval the microprocessor must complete all the work: reading and playing the audio file, sampling N points, applying FFT to the N time-domain points, evaluating the spectrum, determining a color, and driving the PWM outputs to present the mixed color.

3.2 Interface design of CH378L

As a novel interface chip for U disks and SD/TF cards, the CH378L offers several buses, including serial, SPI and parallel, which make it convenient to connect to a microprocessor [7]. Its data buffer of up to 20K and its support for the FAT32 file system make the CH378L superior to other interface chips such as the CH375, which can interface only with a U disk and not with an SD/TF card. A microprocessor can connect to an SD/TF card directly by SPI, but it then needs a file system such as FAT32 to manipulate the card, which takes up memory and places a burden on the microprocessor. With its embedded file system, the CH378L can operate on files on a U disk or SD/TF card connected to it. It is necessary neither to know the USB communication protocol nor to understand the file system: operations on files stored on a U disk or SD/TF card are carried out simply through CH378L commands. The parallel port connection is adopted between the microprocessor and the CH378L because it speeds up the transfer of large audio files [7]. INT# is the status port of the CH378L; when it becomes 0, an interrupt is issued. The parallel interface signal lines include the address signal A0, the chip select signal PCS#, the read signal RD#, the write signal WR# and the bidirectional data bus D7 to D0. The parallel-bus interface between the CH378L and the IAP15W4K61S4 microprocessor, together with the interfaces between the CH378L and an SD/TF card and a U disk, is shown in Fig. 5 [7]. The CH378L thus works as the interface between the microprocessor and the storage devices. With its built-in USB transmission control protocol and FAT32 file system firmware, it is convenient for the microprocessor to obtain file data from a U disk or SD/TF card via the CH378L [9].

Fig. 5 Interfaces with SD/TF card and U disk

3.3 RGB-LED mixed light driving

Red, green and blue (RGB) are the primary colors from which other colors can be formed [33]. For light-emitting diode (LED) lights, the color of the mixed light is changed by changing the proportions of the luminance of the red, green and blue LEDs [40]. The color difference of an image can also be analyzed through RGB color measurement [24]. The International Commission on Illumination (CIE) specifies that white light is formed by mixing red, green and blue light in equal ratio [17]. Similarly, yellow light is obtained by mixing red and green light in equal ratio, and magenta by mixing red and blue light in equal ratio. Any color of light can be obtained by changing the proportions of RGB. Three PWM outputs of the microprocessor drive the red, green and blue LED lights respectively. A color and RGB table assigns a set of RGB values to each color; some of the mixed colors and their RGB values are listed in Table 5. When a color is determined, its RGB values are looked up in the table.

Table 5 Mixed light color with RGB value
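The step from a looked-up RGB value to three PWM settings can be illustrated as follows. This sketch assumes 8-bit RGB values (0 to 255) mapped linearly to a duty ratio between 0 and 1, with optional per-channel correction gains standing in for the brightness calibration; the function names and the gain values in the example are hypothetical, not taken from Table 5.

```python
def rgb_to_duty(rgb, gains=(1.0, 1.0, 1.0), full_scale=255):
    """Map an (R, G, B) value to three PWM duty ratios in the range 0..1.

    `gains` are assumed per-channel correction factors that compensate for
    unequal LED brightness; (1, 1, 1) means no correction.
    """
    duties = []
    for value, gain in zip(rgb, gains):
        duty = (value / full_scale) * gain
        duties.append(min(max(duty, 0.0), 1.0))   # clamp to a valid duty ratio
    return tuple(duties)


def duty_to_register(duty, full_scale=255):
    """Convert a duty ratio to an 8-bit PWM register value."""
    return round(duty * full_scale)


if __name__ == "__main__":
    white = rgb_to_duty((255, 255, 255))
    yellow = rgb_to_duty((255, 255, 0))            # equal red and green, no blue
    corrected = rgb_to_duty((255, 255, 255), gains=(1.0, 0.8, 0.9))
    print(white, yellow, [duty_to_register(d) for d in corrected])
```

Keeping the correction in a separate gain tuple means the color table can stay in nominal RGB values while the calibration is adjusted independently.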

The PWM outputs of the IAP15W4K61S4 microprocessor drive the RGB-LED circuit, whose schematic is shown in Fig. 6. The LED lights are driven by NPN transistors. Because the turn-on voltages of the red, green and blue LEDs are different, the values of their current-limiting resistors are also different, and uniform brightness under the same RGB value can be achieved by adjusting these resistors. The mixed light is produced by the three PWM outputs, whose duty ratios determine the mixing effect of the RGB LEDs. Figure 7 shows the RGB-LED component, on which each LED is mounted individually. In this design each group of LED lamps has 12 LED lights, for red (R), green (G) and blue (B) respectively, and the number of LED lights can be increased or decreased as needed. The duty ratio (T1/T) determines the proportions of the RGB lights, which directly control the mixed light effect. It should be noted that the light-mixing principle for the three primary colors discussed above assumes that the R, G and B LEDs have consistent brightness. However, because the three kinds of LEDs have different forward voltage drops when conducting, their brightness also differs under the same conditions. In practice, therefore, software correction is also needed to adjust the mixed light to the expected effect.

Fig. 6 Schematic of RGB LED lights driving

Fig. 7 Component of RGB LEDs

The experiments performed on the system platform cover audio file playing and light mixing. To keep the music playing, the audio decoder VS1003 must be fed at a constant rate so that the audio data stream flows continuously. The VS1003 has a buffer of 512KB; when it is becoming empty, the signal DREQ goes high, and the microprocessor should then transfer at least 32KB to the VS1003 via SPI at a time. The VS1003 decodes data in MP3, WAV and WMA formats and outputs dual-channel audio signals to the speakers to play the music.

The whole system is composed of modules including the main board, the LED display, the LCD display, the CH378L module and the VS1003 module. First, all components and modules are debugged to make sure they function properly. The system software is written in C and compiled in the Keil version 4 environment. Tests of the minimum system of the microprocessor, the LED display circuit, the keyboard circuit and the LCD display are completed. The test of the PWM outputs driving the RGB LEDs to form mixed light involves color calibration, as discussed in Section 3.3 and shown in Table 5. In the color and RGB value table, the RGB value for white is (255, 255, 255) and the color number of white is 25. When white (25) is chosen, the three PWM outputs are all at full duty, meaning their PWM registers are all set to 255, and the mixed color should be pure white. If it is not, the values of the three PWM registers are adjusted while observing the mixed light until it becomes pure white. The mixed color test with music playing is shown in Fig. 8. The figure shows mixed color 117, which is magenta, and the mixed light is not evenly distributed, because all the RGB LEDs are individual tubes, as shown in Fig. 7. Compound surface-mount LEDs (see Fig. 9) are therefore chosen to replace the individual tubes, and they produce a homogeneous mixing effect, as shown in Fig. 10.

Fig. 8 Mixed color calibration

Fig. 9 Compound surface mounting LEDs

Fig. 10 Effect of mixed RGB LED light

4 Experiments

The sampling frequency of the audio signals conforms to the Nyquist sampling theorem. The interval between two groups of samples is 500 to 800 milliseconds. The sampling number N is set to 256, balancing the precision of spectrum analysis against the FFT time. The time taken by reading the audio file, transferring data to the VS1003, sampling, FFT, color recognition and RGB LED driving must fit within this interval, so time scheduling is important for the program. The audio signals are sent to the amplifiers and, at the same time, to the ADC ports of the microprocessor. The digitized audio data are analyzed by FFT to obtain the frequency spectrum, which determines the colors of the RGB LED lights.
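The spectrum analysis step can be sketched as a 256-point FFT followed by a search for the strongest bin. This is an illustration only: the sampling frequency `FS` is an assumed value, since the text does not state it, and `fft` and `dominant_frequency` are hypothetical helper names rather than the routines used on the microprocessor.

```python
import cmath
import math

N = 256      # sampling number stated in the text
FS = 8000.0  # sampling frequency in Hz (assumption, not given in the text)


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def dominant_frequency(samples, fs):
    """Frequency in Hz of the strongest bin below fs/2, ignoring the DC bin."""
    spectrum = fft(samples)
    half = len(samples) // 2
    k = max(range(1, half), key=lambda i: abs(spectrum[i]))
    return k * fs / len(samples)


# A test tone placed exactly on bin 14 of the 256-point spectrum.
tone_hz = 14 * FS / N
tone = [math.sin(2 * math.pi * tone_hz * n / FS) for n in range(N)]
peak_hz = dominant_frequency(tone, FS)
```

The bin spacing is `FS / N`, so under the assumed sampling frequency two frequencies closer than about 31 Hz fall into the same or adjacent bins. This is the precision side of the trade-off between N and FFT time mentioned above.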

The CH378L is operated as follows: it is initialized, the connection of an SD/TF card or a U disk is detected, the audio files on the SD/TF card or U disk are identified, and an audio file table is set up for indexing so that any file in the table can be read at any time. The microprocessor reads file data via the CH378L and outputs them to the audio decoder VS1003, keeping the audio data stream flowing so that the music plays. The debugging information of music playing is shown in Fig. 11. Although colors can in theory be obtained by mixing the RGB LED lights in proportions set by their conducting times, which are controlled by the PWM outputs, the real light colors need to be adjusted by modifying the duty ratios of the PWM outputs. The mixed light of the RGB LEDs is implemented using three PWM outputs that drive the RGB LEDs. The mixed color test is performed, and the driving waveforms of the three PWM outputs are shown in Figs. 12 and 13. The color responds quickly to PWM output control, which is sufficient for RGB light flash patterns in any color under real-time control. The color change controlled by the PWM outputs can therefore meet the requirements of music playing.
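The audio file table mentioned above can be pictured as a filtered, indexed list of file names. The sketch below assumes that files are recognized by their extension and that selection wraps around at the end of the table; both choices, and the names `build_audio_table` and `select`, are hypothetical and do not describe the CH378L interface itself.

```python
# Formats that the text says the decoder handles.
AUDIO_EXTENSIONS = (".mp3", ".wav", ".wma")


def build_audio_table(names):
    """Return the audio file names in directory order, so each has a table index."""
    return [n for n in names if n.lower().endswith(AUDIO_EXTENSIONS)]


def select(table, index):
    """Pick a file by index, wrapping around past the last entry (assumed behavior)."""
    return table[index % len(table)]


# Example directory listing (made-up names for illustration).
listing = ["A.MP3", "notes.txt", "b.wav", "c.WMA", "d.jpg"]
table = build_audio_table(listing)
```

Keeping only an index per file lets the player jump to any entry without rescanning the storage device.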

Fig. 11 Information of music playing

Fig. 12 PWM Driving of Red and Green

Fig. 13 PWM Driving of Red and Blue

The key to playing music is to ensure the continuity of reading audio files from a U disk or an SD/TF card and of playing the audio data via the VS1003. Besides reading and playing audio files, the system samples the audio signals of the two channels, converts them into frequency-domain signals via FFT, and analyzes them. The microprocessor has many other tasks, such as transferring the audio data to the decoder VS1003, driving the LED and LCD displays, and driving the RGB LEDs. If the timing of the system is not arranged properly, conflicts arise that disrupt its operation. The system is debugged through the experiments, and the scheduling problems are solved with multi-interrupt technology.

Although color music has a long history of development, the mapping relation between color and music has not been determined. According to an online report, an Autodesk engineer named Evan Atherton has created a piece of sound equipment, the Objet Connex 500, which not only plays music but also produces colorful flashing that is automatically converted into colors following the music rhythm [8]. Although the Objet Connex 500 is a relatively new music player, it has not reached the effect of RGB LED mixed lights. It belongs to the category of color music and is still in the development stage. Many studies focus on music and color. Previous studies on audio spectrum analysis and spectrum recognition, which try to set up standard frequency spectra of musical triad chords and a corresponding color for each chord, still leave problems unresolved.

Music chords are considered as a standard reference. The chord most similar to the music being played is recognized by comparing the spectra of the standard chords with that of the audio signal, and the corresponding color is then determined. However, music is not composed of chords alone, the spectrum recognition is not precise enough, and the mapping relation between chords and colors is difficult to set up. This paper presents a new method by which music can be matched with colors in real time. The perfect fifth relationship of colors, which corresponds to the musical perfect fifth, is deduced, which means that colors possess a property of music. The colors corresponding to the musical notes of twelve-tone equal temperament are set up, so that each note, or vibration frequency, has a corresponding color. Once the frequency spectrum of the music at a given time is obtained, whether it is complex with many frequencies or simple with just one, the corresponding colors can be shown by PWM-controlled RGB LED driving. Experiments on RGB LED mixed lights, music playing, FFT analysis, and the matching of music and color are performed on the system platform, whose functions are developed to support these ideas. The parameters for sampling the audio signal, recognizing color and driving the RGB LEDs are optimized. A color is determined for the music at each moment, so that matching colors are presented automatically for arbitrary music being played. Although color music has been studied by many people, the match between music and color still does not reach an ideal effect. This work tries to find a new way to resolve the problem. Once the relations between musical notes and colors have been set up, the color corresponding to each note can be determined. 
When music is playing, the spectrum of the audio signal is obtained through FFT, which means the frequency components of the sound at that time are determined. The musical notes are obtained through the recognition of the audio signal, and then the color is determined. This recognition process is limited to an interval of 500 ms, so that the mixed color flashing stays synchronized with the music. If several notes are recognized, say three, the question is what the color should be. Deducing a single color appears very difficult, and time division modulation (TDM) is used to solve this problem. The method is therefore of practical value to industry.
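The chain from recognized frequencies to displayed colors can be sketched as follows. This is an illustration under several assumptions: the reference pitch of 440 Hz, the `PALETTE` table and the equal-length time slots are hypothetical choices, and the palette in particular is not the note-to-color mapping derived in this paper.

```python
import math

A4 = 440.0  # reference pitch in Hz (assumed concert pitch)

# Hypothetical palette: one RGB triple per pitch class, C = 0 ... B = 11.
PALETTE = [
    (255, 0, 0), (255, 128, 0), (255, 255, 0), (128, 255, 0),
    (0, 255, 0), (0, 255, 128), (0, 255, 255), (0, 128, 255),
    (0, 0, 255), (128, 0, 255), (255, 0, 255), (255, 0, 128),
]


def pitch_class(freq):
    """Nearest twelve-tone equal temperament pitch class (C = 0) for a frequency in Hz."""
    semitones_from_a4 = round(12 * math.log2(freq / A4))
    return (semitones_from_a4 + 9) % 12  # A lies 9 semitones above C


def tdm_schedule(freqs, interval_ms):
    """Give each recognized note an equal time slot within one recognition interval."""
    slot = interval_ms / len(freqs)
    return [(PALETTE[pitch_class(f)], slot) for f in freqs]


# Three recognized notes (approximately C4, E4 and G4) shared over a 600 ms interval.
schedule = tdm_schedule([261.63, 329.63, 392.0], 600)
```

Showing the colors one after another within the interval avoids having to deduce a single blended color for several simultaneous notes. Whether equal slots or slots weighted by spectral amplitude are preferable is left open here.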

5 Conclusion

The paper adopts technologies such as digital signal processing, FFT and microprocessors. The work involves an embedded system design that includes the system board, the interface with the SD card and U disk, the interface circuit of the audio decoding chip VS1003 and the driving of RGB LED lights. The spectrum of the audio is analyzed, and the relationship between the visible spectrum and the audio spectrum, which is the essence of the sound, is explored. The color is determined according to the characteristics of the audio spectrum, which can reflect changes in the emotion of the music. Audio spectrum analysis, optimization of the signal processing parameters, color mixing tests and timing scheduling are performed on this audio playing system. In application, the research is based on an audio player with spectrum-driven RGB LED mixed lights, which renders a matching color effect automatically and in real time for any music being played. When the music and the color match, the two work together synergistically and have an audio-visual and psychological impact on people. If a wireless network technology such as Zigbee is adopted, the system can be combined with other musical equipment into a larger system covering a large area, which will enhance the application of color music in landscape effects. This study can reveal the relationship between music and color from the perspective of the spectrum, promote the development of color music, and has wide application prospects in urban landscape, stage lighting and psychological treatment.