EE 2310

COMPUTER ORGANIZATION AND DESIGN

FALL 1999

CURRENT HOMEWORK ASSIGNMENT

Reading:
Patterson and Hennessy, Chapter 7
Review Exercises (DO NOT hand in):
Waldron, Exercises for Chapters 7 and 8
Exercises (due December 2, 1999 for both sections):
  1. [30] How many clock periods does the following SPIM code segment take given a) no pipeline, b) a 5-stage pipeline assuming one clock period per instruction but no data forwarding, and c) a 5-stage pipeline with data forwarding as described in P&H, Chapter 6, Section 4?
             or   $t1,$0,$a2          
             or   $t3,$0,$a0         
             or   $t4,$0,$a1          
             lw   $t5,0($t7)        
             lw   $t6,0($t8)        
             mul  $t2,$t5,$t6     
             addi $t1,$t1,-1      
             add  $t3,$a3,$t3      
             add  $t4,$a3,$t4      
                    
    
  2. [20] Find the data hazards in the following SPIM code segment, and show the hazards on a pipeline diagram, as in P&H, Fig. 6.44:
        lw   $t1,i1
        addi $t1,$t1,100
        or   $t2,$t3,$t1
        add  $a0,$a1,$t2
        ori  $a0,$a0,42
        add  $t5,$a0,$t2
    
  3. [10] Reorder the following code segment to remove the data hazards. Assume that data forwarding takes place:
         lw  $t0,24($a0) 
         sub $t4,$t4,$t0 
         sub $t8,$t8,$t3 
         add $t6,$t6,$t5 
         mul $t7,$t7,$t1 
    
  4. [10] What is the CPI for the reordered sequence of instructions in the preceding problem?
  5. [20] Find the control hazards in the following SPIM code segment, and show the hazards on a pipeline diagram:
            la   $t0, ar2 
            lw   $t1, size
            lw   $t2, nrows
            lw   $t3, ncols
            addi $t4, $t2, -1  # nrmax
            addi $t5, $t3, -1  # ncmax
            ori  $t6, $0, 0    # initialize row index to 0
            lwc1 $f0, val
            mfc1 $s4, $f0      # copy the bits of val from $f0 into $s4
    rloop:  mul  $t9, $t6, $t3 # multiply rindex by ncols
            mul  $t9, $t9, $t1 # multiply by size of one array element to get roffset
            ori  $t7, $0, 0    # initialize column index to 0
    cloop:  mul  $s0, $t7, $t1 # multiply cindex by size to get coffset
            add  $s1, $s0, $t9 # offset of ar2[rindex][cindex] = roffset + coffset 
            add  $t8, $s1, $t0 # address of ar2[rindex][cindex] = offset + base 
            sw   $s4, 0($t8)   # store val in ar2[rindex][cindex]
            addi $t7, $t7, 1   # increment the column index
            sub  $s2, $t5, $t7 # nc = ncmax - cindex
            bgez $s2, cloop    # branch back to cloop if nc >= 0
            addi $t6, $t6, 1   # increment the row index
            sub  $s3, $t4, $t6 # nr = nrmax - rindex
            bgez $s3, rloop    # branch back to rloop if nr >= 0 
            ori  $v0, $0, 10   # reach here if row loop is done
            syscall            # end of program!
    
  6. [10] Which instruction in the following code segment could do useful work in the branch delay slot, without changing the computational results in any way?
              or    $t0,$0,$a0          # Reg. t0 points to the array element
              or    $t1,$0,$a2          # Reg. t1 is a counter
    loop:     sw    $a3,0($t0)          # Store the value into the array element
              add   $t0,$a1,$t0         # Increment the pointer by the value of size
              addi  $t1,$t1,-1          # Decrement the counter
              bgtz  $t1, loop           # branch back to loop if counter > 0
                                        #  (since we store at the head of the
                                        #   loop, we compute one more address 
                                        #   than necessary just to reduce 
                                        #   the number of compares & branches)
    beamup: jr $ra                      # Beam me up....
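
Note: the clock-period and CPI bookkeeping that Problems 1 and 4 ask for can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the assignment. The function names are made up for this note. The pipeline is assumed to start one instruction per clock period whenever it is not stalled, the unpipelined machine is assumed to spend one clock period in each stage for every instruction, and the stall count has to be worked out by hand from the hazards. The numbers in the example are arbitrary and are not taken from any exercise.

```python
def unpipelined_cycles(n_instructions, n_stages=5):
    """Clock periods with no pipeline, ASSUMING every instruction
    passes through all n_stages at one clock period per stage."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages=5, stall_cycles=0):
    """Clock periods for a pipeline: n_stages to finish the first
    instruction, one more for each later instruction, plus any
    stall cycles (counted by hand from the hazards)."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1) + stall_cycles


def cpi(total_cycles, n_instructions):
    """Average clock periods per instruction."""
    return total_cycles / n_instructions


if __name__ == "__main__":
    n = 4        # arbitrary instruction count, not from the homework
    stalls = 2   # arbitrary stall count, not from the homework
    cycles = pipelined_cycles(n, stall_cycles=stalls)
    print(unpipelined_cycles(n))  # 20
    print(cycles)                 # 10
    print(cpi(cycles, n))         # 2.5
```

For a short code segment the pipeline fill time dominates, so the CPI computed this way is well above 1 even with no stalls; it approaches 1 only as the instruction count grows.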
    

Solutions
Assignments 1-8