In the near future, embedded systems designers will be able to use hardware and software interchangeably depending on which best solves a specific design problem. Up until now, the barriers were so high for software engineers who wanted to learn hardware design that few explored the space. The barrier is crumbling because of the similarity between hardware description languages and programming languages. Also, several reasonably low-cost demonstration boards are available that include a field-programmable gate array (FPGA), a microprocessor, and tools that even a software developer can use to learn hardware design.
This article offers an example of this new design process using an FPGA. We'll look at how to implement pulse width modulation (PWM) in software and then turn the design into a logic block that can run from an FPGA and be controlled via software using a memory-mapped I/O interface. You can do everything in this article with the FPGA development kits that are available today from major FPGA manufacturers.
Several things have changed that make it easier for software engineers to participate in hardware design. Both hardware and software modules are now designed using programming languages. As you know, C is the lingua franca of embedded software design. On the hardware side, Verilog and VHDL are both popular; this article uses Verilog. The syntax and structure of Verilog are similar to those of the C programming language, as the examples in this article will illustrate.
At the same time, hardware is getting easier to update and change. It used to be that software could be changed simply by downloading a new executable image while hardware could not. That's no longer entirely true. Just as a software developer can make a quick edit, recompile, and then download the new code into memory, hardware designers using programmable logic have a similar capability. Programmable logic changes the method for designing embedded systems by enabling you to change the hardware as easily as the software. In other words, it affords you the flexibility during design and debugging to choose the best way to handle these tasks—either in software or hardware.
Tools are available from FPGA vendors that enable a designer with a little knowledge of hardware to develop an embedded system for programmable logic, such as an FPGA. For example, the SOPC Builder from Altera (my employer) enables system designers to select and configure peripherals from an existing library as well as add user logic to create and tie peripherals together. With programmable logic and some hardware knowledge, software engineers can take advantage of the benefits of hardware to improve their systems.
A PWM controller produces a stream of pulses like those shown in Figure 1. Usually the period and pulse width are specified. The duty cycle is defined as the ratio of the pulse width (the on time) to the period. Figure 1 shows a PWM waveform with a duty cycle of about 33%.
Figure 1: A PWM waveform
PWM is used in many applications, most frequently to control analog circuitry. Because the digital signal varies continuously at a relatively fast rate (depending on the period, of course), the resulting signal will have an average voltage value, which can be used to control an analog device. For example, if a stream of PWM pulses is sent to a motor, it will turn at a rate proportional to the duty cycle (from 0% to 100%). If the duty cycle is increased, the motor will turn faster; likewise, if the duty cycle is decreased the motor will slow.
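The relationship between pulse width, period, and average voltage can be written out as a short calculation. The following is a minimal sketch in C, assuming an ideal output that swings between 0 V and a logic-high level; the function names pwm_duty_cycle and pwm_average_voltage are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Duty cycle as a fraction in [0, 1]: pulse width divided by period.
 * Returning 0 for a zero period is an assumption made for this sketch,
 * chosen only to avoid dividing by zero. */
double pwm_duty_cycle(uint32_t pulse_width, uint32_t period)
{
    if (period == 0)
        return 0.0;
    if (pulse_width > period)
        pulse_width = period;   /* the output cannot be high longer than one period */
    return (double)pulse_width / (double)period;
}

/* Average voltage of an ideal PWM signal that switches between
 * 0 V and v_high volts. */
double pwm_average_voltage(uint32_t pulse_width, uint32_t period, double v_high)
{
    return pwm_duty_cycle(pulse_width, period) * v_high;
}
```

For a waveform like the one in Figure 1, and assuming a 3.3 V logic-high level, pwm_average_voltage(1, 3, 3.3) evaluates to roughly 1.1 V.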
For more information about PWM, consult Michael Barr's Beginner's Corner article "Introduction to Pulse Width Modulation" (September 2001, p. 103).
Generally speaking, PWM is implemented in hardware because the output signal must be continuously updated: it goes high at the start of each period for the proper time, then low for the remainder of the period. Software usually just selects the period and duty cycle and perhaps occasionally changes the duty cycle to effect a behavioral change in whatever is attached to the PWM output.
There's no reason, however, that software couldn't be used to implement PWM, say by bit-banging a spare output pin. Writing such a PWM controller in software is a relatively trivial task and helps illustrate what we will do in Verilog shortly. Listing 1 shows the C code for PWM.
Listing 1: A bit-banging PWM controller implemented entirely in software
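void pwmTask(uint32_t pulse_width, uint32_t period)
{
    uint32_t time_on = pulse_width;
    uint32_t time_off = period - pulse_width;
    while (1) {
        pwm_output = 1;
        sleep(time_on);     /* delay routine: wait time_on time units */
        pwm_output = 0;
        sleep(time_off);    /* wait out the remainder of the period */
    }
}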
Based on the pulse_width and period arguments, the function calculates the amount of time the output will be high and low. The infinite loop then sets the output pin high, waits for time_on time units to elapse, sets the output low, waits for time_off, and then repeats the cycle for the next period.
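Because the real task never returns and drives a physical pin, its timing is hard to check on a desktop. One way to see the pattern it produces is to record the output level per time unit into a buffer instead. The sketch below does that; pwm_trace is a hypothetical helper, not part of the article's code, and it rejects settings where the pulse width exceeds the period.

```c
#include <stddef.h>
#include <stdint.h>

/* Record the level (1 or 0) that the bit-banged output would hold during
 * each of n successive time units. Returns the number of samples written,
 * or 0 if the settings are invalid. */
size_t pwm_trace(uint32_t pulse_width, uint32_t period, uint8_t *out, size_t n)
{
    uint32_t time_on = pulse_width;
    uint32_t time_off;
    uint32_t t;
    size_t i = 0;

    if (period == 0 || pulse_width > period)
        return 0;                       /* nothing sensible to generate */
    time_off = period - pulse_width;

    while (i < n) {
        for (t = 0; t < time_on && i < n; t++)
            out[i++] = 1;               /* output high for time_on units */
        for (t = 0; t < time_off && i < n; t++)
            out[i++] = 0;               /* output low for time_off units */
    }
    return i;
}
```

With a pulse width of 1 and a period of 3, six samples come out as 1, 0, 0, 1, 0, 0, which matches the roughly 33% duty cycle discussed earlier.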
Listing 2 shows a simple Verilog module implementing an 8-bit wide register with an asynchronous reset. The input of the register, in, is assigned to the output, out, upon the rising edge of the clock, unless the active-low reset signal clr_n is asserted, in which case the output is set to 0 on the falling edge of clr_n.
Listing 2: Verilog module for a register with asynchronous reset
module simple_register(in, out, clr_n, clk, a);
    // port declarations
    input        clr_n;
    input        clk;
    input  [7:0] in;
    output [7:0] out;
    output       a;

    // signal declarations
    reg  [7:0] out;
    wire       a;

    // implement a register with asynchronous clear
    always @(posedge clk or negedge clr_n)
    begin
        if (clr_n == 0)   // could also be written if (!clr_n)
            out <= 0;
        else
            out <= in;
    end

    // continuous assignment
    assign a = !out[0];
endmodule
Glancing at the Verilog listing, you should notice several similarities to the C programming language. A semicolon is used to end each statement and the comment delimiters are the same (both /* */ and // are recognized). An == operator is also used to test equality. Verilog's if..else is similar to that of C, except that the keywords begin and end are used instead of curly braces. In fact, the begin and end keywords are optional for single-statement blocks, just like C's curly braces. Both Verilog and C are case sensitive as well.
Of course, one key difference between hardware and software is how they "run." A hardware design consists of many elements all running in parallel. Once the device is powered on, every element of the hardware is always executing. Depending on the control logic and the data input, of course, some elements of the device may not change their outputs. However, they're always "running."
In contrast, only one small portion of an entire software design (even one with multiple software tasks defined) is being executed at any one time. If there's just one processor, only one instruction is actually being executed at a time. The rest of the software can be considered dormant, unlike the rest of the hardware. Variables may exist with a valid value, but most of the time they're not involved in any processing.
This difference in behavior translates to differences in the way we program hardware and software code. Software is executed serially, so that each line of code is executed only after the line before it is complete (except for nonlinearities on interrupts or at the behest of an operating system).
A Verilog module starts with the module keyword followed by the name of the module and the port list, which is a list of the names of all the inputs and outputs of the module. The next section contains the port declarations. Note that all of the inputs and outputs appear in both the port list in the first line of the module and in the port declarations section.
In Verilog, two types of internal signals are widely used: reg and wire. These types differ in function. All ports have a signal by the same name implicitly declared as a wire. Therefore, the line declaring a as a wire is not necessary. A reg will hold the last assigned value so it doesn't need to be driven at all times. Signals of type wire are used for combinational (unclocked) logic and sometimes to connect signals. Because a reg holds the last value driven, inputs cannot be declared as a reg. An input can change at any time asynchronous to any event in the Verilog module. The main difference, however, is that signals of type reg can only be assigned a value in procedural blocks (discussed later) while signals of type wire can only be assigned a value outside of procedural blocks. Both signal types can appear on the right-hand side of the assignment operator inside or outside of any procedural block.
It's important to understand that using the reg keyword doesn't necessarily mean the compiler will create a register. The code in Listing 2 has one internal signal of type reg that's 8 bits wide and called out. This module infers a register because of the way the always block (a type of procedural block) is written. Notice that the signal a is a wire and thus is assigned a value only in the continuous assignment statement while out, a reg, is assigned a value only in the always block.
An always block is a type of procedural block used to update signals only when something changes. The group of expressions inside the parentheses of the always statement is called the sensitivity list; it's of the form:
(expression or expression ...)
The code inside the always block is executed whenever any event in its sensitivity list occurs. The Verilog keywords for rising edge and falling edge are posedge and negedge, respectively. These are often used in sensitivity lists. In the example shown, if the rising edge of the clk signal or the falling edge of the clr_n signal occurs, the statements inside the always block will be executed.
To infer a register, the output should be updated on the rising edge of the clock (falling edge would work too, but the rising edge is more common). Adding negedge clr_n makes the register reset upon the falling edge of the clr_n signal. Not all sensitivity lists will contain the keywords posedge or negedge, though, so there won't always be an actual register in the resulting hardware.
Inside the always block, the first statement tests whether the clr_n signal is low. If it is, the next line of code sets out to 0. Together with negedge clr_n in the sensitivity list, these lines of code implement the asynchronous reset portion of the register. If negedge clr_n were removed from the sensitivity list, leaving:
always @(posedge clk)
then the same if statement would be evaluated only on a rising clock edge, and it would be a synchronous reset that depends on the clock.
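The difference between the two reset styles can be modeled in plain C. The sketch below is not Verilog and is not taken from the original listings: the names reg_async_clear and reg_sync_clear are hypothetical, and each call stands for one simulation step in which a rising clock edge either does or does not occur.

```c
#include <stdint.h>

/* Asynchronous clear: a low clr_n forces the register to 0 at once,
 * whether or not a clock edge occurs in this step. */
uint8_t reg_async_clear(uint8_t out, uint8_t in, int clk_rising, int clr_n)
{
    if (!clr_n)
        return 0;          /* clear takes priority over the clock */
    if (clk_rising)
        return in;         /* normal register behavior: capture the input */
    return out;            /* no edge: hold the stored value */
}

/* Synchronous clear: clr_n is only looked at on a rising clock edge. */
uint8_t reg_sync_clear(uint8_t out, uint8_t in, int clk_rising, int clr_n)
{
    if (!clk_rising)
        return out;        /* nothing changes between clock edges */
    return clr_n ? in : 0; /* on the edge, either clear or capture */
}
```

With clr_n low and no clock edge, the asynchronous version returns 0 while the synchronous version keeps the old value until the next rising edge.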
You may have noticed that the assignment operators inside the always block are different from the one used in the continuous assignment statement that begins with the assign keyword. The <= operator is used for nonblocking assignments while the = operator is used for blocking assignments.
In a group of blocking assignments, the first assignment is evaluated and assigned before the next blocking assignment is executed. This process is just like C's serial execution of statements. With nonblocking assignments, though, the right-hand sides of all the assignments are evaluated first, and then all the left-hand sides are updated together. Continuous assignment statements must use the = operator (the compiler will give an error otherwise).
To make the code less prone to errors, it's recommended that you use nonblocking assignments for all assignments in an always block with sequential logic (for example, logic that you want implemented as registers). Most always blocks should use nonblocking assignment statements. If the always block has all combinatorial logic, then you'll want to use blocking assignments.
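A classic way to see the difference is a pair of assignments that exchange two values. The following C sketch imitates both styles; the type pair_t and the two step functions are hypothetical names used only for this illustration. The blocking version corresponds to a = b; b = a; and the nonblocking version to a <= b; b <= a; in Verilog.

```c
#include <stdint.h>

typedef struct {
    uint8_t a;
    uint8_t b;
} pair_t;

/* Blocking style: each statement finishes before the next one starts,
 * so the second statement sees the value just written to a. */
pair_t step_blocking(pair_t s)
{
    s.a = s.b;
    s.b = s.a;
    return s;
}

/* Nonblocking style: every right-hand side is sampled from the current
 * state first, and only then are the left-hand sides updated. */
pair_t step_nonblocking(pair_t s)
{
    pair_t next;
    next.a = s.b;
    next.b = s.a;
    return next;
}
```

Starting from a = 1 and b = 2, the blocking version ends with both values equal to 2, while the nonblocking version swaps them, which is how two registers clocked by the same edge behave.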
One of the first tasks when writing a memory-mapped hardware module is to decide what the register map will look like from the software perspective. In the case of PWM, you want to be able to set the period and pulse width in software. In hardware, making a counter that counts system clock cycles is easy. Therefore, there will be two registers, the pulse_width and the period, both measured in clock cycles. Table 1 shows the register map for the PWM.
Table 1: Register map for PWM
| Address | Register | Size | Description |
|---|---|---|---|
| 0 | period | 32 bits | Number of clock cycles for one period |
| 1 | pulse_width | 32 bits | Number of clock cycles the output will be high |
Next, choose the ports for the PWM, most of which are already determined by the bus architecture. Table 2 has a brief description of the signals for a generic memory-mapped PWM. Note that a popular naming convention for active-low signals, which are fairly common for control signals, is to add "_n" to the signal name. The signals write_n and clr_n in Table 2 are active low: they are asserted when driven to 0.
Table 2: Ports for PWM
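| Signal | Direction | Description |
|---|---|---|
| clk | Input | System clock |
| write_data[31:0] | Input | Write data (for registers in register map) |
| cs | Input | Chip select |
| write_n | Input | Write enable, active low |
| addr | Input | Address (to access registers in register map) |
| clr_n | Input | Clear, active low |
| read_data[31:0] | Output | Read data output |
| pwm_out | Output | PWM output |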
Now that we have defined the interface of the hardware module, we can start writing the Verilog code. An example implementation is shown in Listing 3.
Listing 3: PWM hardware implementation in Verilog
module pwm (clk, write_data, cs, write_n, addr, clr_n, read_data, pwm_out);
    // port declarations
    input         clk;
    input  [31:0] write_data;
    input         cs;
    input         write_n;
    input         addr;
    input         clr_n;
    output [31:0] read_data;
    output        pwm_out;

    // signal declarations
    reg  [31:0] period;
    reg  [31:0] pulse_width;
    reg  [31:0] counter;
    reg         off;
    reg  [31:0] read_data;
    wire        period_en, pulse_width_en; // write enables

    // Define contents of period and pulse_width registers
    // including write access for these registers
    always @(posedge clk or negedge clr_n)
    begin
        if (clr_n == 0)
        begin
            period <= 32'h 00000000;
            pulse_width <= 32'h 00000000;
        end
        else
        begin
            if (period_en)
                period <= write_data[31:0];
            else
                period <= period;
            if (pulse_width_en)
                pulse_width <= write_data[31:0];
            else
                pulse_width <= pulse_width;
        end
    end

    // read access for period and pulse_width registers
    always @(addr or period or pulse_width)
    begin
        if (addr == 0)
            read_data = period;
        else
            read_data = pulse_width;
    end

    // counter which continually counts up to period
    always @(posedge clk or negedge clr_n)
    begin
        if (clr_n == 0)
            counter <= 0;
        else
            if (counter >= period - 1) // count from 0 to (period-1)
                counter <= 0;
            else
                counter <= counter + 1;
    end

    // Turns output on while counter is less than pulse_width; otherwise
    // turns output off.
    // !off is connected to PWM output
    always @(posedge clk or negedge clr_n)
    begin
        if (clr_n == 0)
            off <= 0;
        else
            if (counter >= pulse_width)
                off <= 1;
            else
                if (counter == 0)
                    off <= 0;
                else
                    off <= off;
    end

    // write enable signals for writing to period and pulse_width registers
    assign period_en = cs & !write_n & !addr;
    assign pulse_width_en = cs & !write_n & addr;

    // PWM output
    assign pwm_out = !off;
endmodule
The first signals are the port declarations, which were described in Table 2. After the port declarations come the internal signal declarations. The memory-mapped registers that make up the software interface to control the PWM are declared reg. The code allows for only 32-bit accesses to these memory-mapped registers. If you need 8-bit or 16-bit access, then you would split the registers into four 8-bit registers and add logic for byte enable signals. The Verilog code to implement this is straightforward. All the signals with assigned values in the always blocks are also declared reg. The signals declared wire are the write enables for the registers period and pulse_width. These signals are assigned values using continuous assignment statements.
The rest of the listing contains the actual code. There are four always blocks and several continuous assignment statements at the end. Each always block describes the behavior for one signal or a group of signals that have the same basic behavior (in other words, use the same control logic). This is a clean way of writing Verilog code that keeps the code readable and less prone to errors. The three clocked always blocks have reset logic that sets their signal(s) to 0 when the clr_n signal is asserted (set to 0); the read block is purely combinational and needs none. While not strictly necessary, this is a good design practice so that every signal has a known value upon reset.
The first always block describes the behavior of the registers in the register map. The value on the write_data input is written into the period or pulse_width register if the appropriate enable signal is asserted. That is the only way to change the values of either register. The write enable signals are defined in the continuous assignment statements at the bottom of the file. The write enables for the period and pulse_width registers are asserted when the main write enable signal and the chip select signal are asserted; the addr bit should be set to 0 for period and 1 for pulse_width.
The second always block defines reading the registers in the register map. The period register will be at the base address of the peripheral, and the pulse_width register will be at the next 32-bit word.
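The write-enable decoding and the read multiplexer can be mirrored in C to check the logic on a desktop. This is a behavioral sketch only, assuming one call to pwm_bus_clock per rising clock edge; pwm_regs_t, pwm_bus_clock, and pwm_bus_read are hypothetical names that do not appear in the Verilog.

```c
#include <stdint.h>

typedef struct {
    uint32_t period;
    uint32_t pulse_width;
} pwm_regs_t;

/* Model of one rising clock edge for the register-write logic.
 * cs is active high, write_n is active low, addr selects the register. */
void pwm_bus_clock(pwm_regs_t *r, int cs, int write_n, int addr,
                   uint32_t write_data)
{
    int period_en      = cs && !write_n && !addr;
    int pulse_width_en = cs && !write_n &&  addr;

    if (period_en)
        r->period = write_data;
    if (pulse_width_en)
        r->pulse_width = write_data;
    /* otherwise both registers hold their previous values */
}

/* Model of the combinational read path: addr 0 selects period,
 * addr 1 selects pulse_width. */
uint32_t pwm_bus_read(const pwm_regs_t *r, int addr)
{
    return (addr == 0) ? r->period : r->pulse_width;
}
```

A write only takes effect when cs is 1 and write_n is 0; with either signal deasserted, the call leaves both registers unchanged.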
The third and fourth always blocks work together to determine the output of the PWM. The third always block implements a counter that continually counts up to the value in the period register, resets to 0, and begins counting again. The fourth always block compares this counter value to the pulse_width register. While the counter value is less than the pulse_width, the PWM output is kept high; otherwise it's set low.
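To check the interaction of the counter and the off signal without a simulator, the two clocked blocks can be stepped in C. The sketch below is a behavioral model under stated assumptions: pwm_core_t and pwm_core_clock are hypothetical names, each call represents one rising clock edge with clr_n deasserted, and the next values are computed from the old state before either is stored, which imitates nonblocking assignment.

```c
#include <stdint.h>

typedef struct {
    uint32_t period;
    uint32_t pulse_width;
    uint32_t counter;   /* free-running count, 0 to period-1 */
    int      off;       /* 1 while the output should be low */
} pwm_core_t;

/* Advance the model by one rising clock edge and return the level
 * of the PWM output (!off) after that edge. */
int pwm_core_clock(pwm_core_t *p)
{
    uint32_t next_counter;
    int next_off;

    /* counter block: wrap to 0 after reaching period-1 */
    if (p->counter >= p->period - 1)
        next_counter = 0;
    else
        next_counter = p->counter + 1;

    /* off block: evaluated against the old counter value */
    if (p->counter >= p->pulse_width)
        next_off = 1;
    else if (p->counter == 0)
        next_off = 0;
    else
        next_off = p->off;

    /* commit both updates together */
    p->counter = next_counter;
    p->off = next_off;
    return !p->off;
}
```

Starting from the reset state with a period of 5 and a pulse width of 4, ten calls return a high level eight times, so the output is high for four of every five clock cycles. Because off is registered, the output lags the counter comparison by one clock.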
One thing to keep in mind is that every signal should be given a value under all conditions. This goes back to one of the fundamental behaviors of hardware: it's always running. For example, in the last always block (the one that describes the off signal) the last line of code assigns off to itself. This may seem strange at first, but it makes the hold behavior explicit. In a clocked block like this one, a reg keeps its value when no assignment executes, so the line is redundant but harmless; in a combinational always block, a branch that leaves a signal unassigned would cause the synthesis tool to infer an unintended latch. An easy way to keep track of this is to make sure that every time a signal is assigned a value in an if statement, it is assigned a value in the corresponding else statement as well.
Now that the hardware is complete, the PWM can be controlled via software using the registers in the register map. You can use a simple data structure along with a pointer to connect to the registers in the PWM.
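typedef volatile struct
{
    uint32_t period;       /* offset 0: clock cycles per period */
    uint32_t pulse_width;  /* next 32-bit word: clock cycles the output is high */
} PWM;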
For example, the PWM could be hooked to an LED. A variable called pLED of type PWM * could be initialized to point to the PWM base address. This abstracts the hardware into a data structure. Writing to pLED->period will set or change the period. Writing to pLED->pulse_width will change the duty cycle and cause the brightness of the LED to increase or decrease. If a blinking LED is desired, the period need only be lengthened, so that the human eye perceives the on and off periods as distinct.
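A small helper can hide the clock-cycle arithmetic from the rest of the application. The sketch below assumes the two-register layout from Table 1; the helper name pwm_set_duty and its percent parameter are hypothetical and not part of the original design. It can be exercised against an ordinary struct in RAM before pointing it at real hardware.

```c
#include <stdint.h>

/* Register layout assumed from Table 1: period first, then pulse_width. */
typedef volatile struct {
    uint32_t period;
    uint32_t pulse_width;
} PWM;

/* Program a period (in clock cycles) and a duty cycle in whole percent.
 * Values above 100 are clamped to 100. */
void pwm_set_duty(PWM *pwm, uint32_t period, uint32_t percent)
{
    if (percent > 100)
        percent = 100;
    pwm->period = period;
    /* widen to 64 bits so period * percent cannot overflow */
    pwm->pulse_width = (uint32_t)(((uint64_t)period * percent) / 100);
}
```

For instance, pwm_set_duty(pLED, 1000, 25) would write 1000 to the period register and 250 to the pulse_width register, dimming the LED to a quarter of full brightness.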
The Verilog PWM implementation shown in Listing 3 was tested as a peripheral for Altera's Nios processor system and accessed via software using a C struct like the one I previously described. Altera's SOPC Builder creates macros that facilitate performing co-simulation in ModelSim, a hardware simulator from Mentor Graphics. Using the ModelSim simulator, the behavior of the PWM signals, along with the rest of the system's signals, can be observed while the system is executing C code.
Listing 4 shows the C code that was used to generate the waveform in Figure 2. The waveform shows the behavior of the pertinent PWM signals. The C code writes to the PWM registers to create a PWM output with a period of five cycles and a pulse width of four. Notice that at the beginning of the waveform, the cs and write_n signals are asserted twice since we're writing to both the period and pulse_width registers. (The address signal is low when writing to the period register and high when writing to the pulse_width register.)
Listing 4: Test software used to produce waveforms in Figure 2
PWM * const pLED = ...

pLED->period = 5;
pLED->pulse_width = 4;

/* NOP instructions are executed here to add delay before the next write */

pLED->pulse_width = 2;
Figure 2: Waveform for software-controlled PWM hardware
After the new values have been written to the registers, the pwm_output signal begins to reflect the change. Then, just to add some delay so we can see the output, some NOP instructions are executed by the C code. Finally, the pulse width is changed to two cycles, and the PWM waveform changes accordingly while still having a period of five cycles.
Best of both worlds
Part of architecting an embedded system is partitioning the system into hardware and software modules to take advantage of the benefits of each. As development tools evolve, interchanging software and hardware modules is becoming more transparent to the designer.
Once you understand the concepts discussed in this article, you'll have the knowledge to develop hardware on an FPGA that can be hooked up as a memory-mapped peripheral in a microprocessor system and interfaced by simply writing software. Because certain algorithms run much faster in hardware, converting an algorithm from software to hardware may greatly increase system performance. Known as hardware acceleration, the ability to do this is key to making effective use of configurable processors implemented in programmable logic. At long last, even a software engineer has the power to improve system performance and efficiency through hardware acceleration.
Lara Simsic is an applications engineer at Altera. She has developed embedded hardware and software for five years and has an EE degree from the University of Dayton. Contact her at firstname.lastname@example.org.