The main goals of this undergraduate research mentoring program are i) to provide opportunities to explore diverse research topics in computer architecture, firmware, middleware and operating systems, ii) to guide students in how to perform insightful research, iii) to teach them how to use diverse academic research tools such as Gem5, GPGPUsim and NANDFlashSim, and iv) to provide a chance to connect with graduate students and get more practice in performing research. As one example of the mentoring program, Prof. Jung and his students hold a series of research seminars for about 4 hours per week. Registered undergraduate students explore GPU, SSD, multicore, NUMA, NUCA and emerging memory system research topics presented at the top-tier conferences in computer architecture (e.g., ISCA, ASPLOS, MICRO, HPCA), and will perform computer architecture research with Prof. Jung's help.


| Name | Ahmad Rashed | Tasnim Khan | Than Kywe |
| Sy. | Junior | Sophomore | Senior (Veteran) |
| Discip. | Computer Engineering | Computer Science | CE&CS |
| Int. | GPU | Computer Architecture | Computer Architecture |

| Name | Karl Taht | Jayesh H. Joshi |
| Discip. | EE&CS | EE (UT-Austin) |
| Int. | Computer Architecture | Computer Architecture |


Seminar Time: Jan. 9

  • Neighbor-Cell Assisted Error Correction for MLC NAND Flash Memories by Wonil [ slides ]
  • Managing GPU Concurrency in Heterogeneous Architectures by Shuwen [ slides ]

Seminar Time: Jan. 16

  • FIRM: Fair and High-Performance Memory Control for Persistent Memory Systems by Mustafa [ slides ]
  • I-CASH: Intelligently Coupled Array of SSD and HDD by Gieseo [ slides ]

Seminar Time: Jan. 23

  • Consistent and Durable Data Structures for Non-Volatile Byte-Addressable Memory by Lubaba

Seminar Time: Jan. 30

  • Data Retention in MLC NAND Flash Memory: Characterization, Optimization, and Recovery by Wonil
  • Enabling Preemptive Multiprogramming on GPUs by Shuwen

Seminar Time: Feb. 6

  • Architectural Support for Address Translation on GPUs by Jie
  • MISE: Providing Performance Predictability and Improving Fairness in Shared Main Memory Systems by Lubaba

Seminar Time: Feb. 13

  • Reducing Read Latency of Phase Change Memory via Early Read and Turbo Read by Mustafa
  • Performance Impact and Interplay of SSD Parallelism through Advanced Commands, Allocation Strategy and Data Granularity by Gieseo

Seminar Time: Feb. 20

  • Willow: A User-Programmable SSD by Wonil
  • Divergence-Aware Warp Scheduling by Shuwen

Seminar Time: Feb. 27

  • Enabling Preemptive Multiprogramming on GPUs by Jie
  • SSD Characterization: From Energy Consumption's Perspective by Lubaba

Seminar Time: Mar. 6

  • Overcoming the Challenges of Cross-Point Resistive Memory Architectures by Mustafa
  • None by Gieseo

Seminar Time: Mar. 13

  • FlashVM: Virtual Memory Management on Flash by Wonil
  • The Case for GPGPU Spatial Multitasking by Shuwen

Seminar Time: Mar. 20

  • Single-Graph Multiple Flows: Energy Efficient Design Alternative for GPGPUs by Jie
  • Improving Cache Performance by Exploiting Read-Write Disparity by Lubaba

Seminar Time: Mar. 27

  • Adaptive-Latency DRAM: Optimizing DRAM Timing for the Common-Case by Mustafa
  • None by Gieseo

Seminar Time: Apr. 3

  • Unioning of the Buffer Cache and Journaling Layers with Non-volatile Memory by Wonil
  • Maximizing SIMD Resource Utilization in GPGPUs with SIMD Lane Permutation by Shuwen

Seminar Time: Apr. 10



Seminar Time: Feb. 14

  • Neither More Nor Less: Optimizing Thread-level Parallelism for GPGPUs by Robin [ slides ]
  • Approximate Storage in Solid-State Memories by Wonil [ slides ]

Seminar Time: Feb. 21

  • SDF: Software-Defined Flash for Web-Scale Internet Storage Systems by Jie [ slides ]
  • OWL: Cooperative Thread Array Aware Scheduling Techniques for Improving GPGPU Performance by Chandu [ slides ]

Seminar Time: Feb. 28

  • 3D GPU Architecture using Cache Stacking: Performance, Cost, Power and Thermal Analysis by Shuwen [ slides ]
  • Deduplication in SSDs: Model and Quantitative Analysis by Wonil [ slides ]
  • AC-DIMM: Associative Computing with STT-MRAM by Mustafa [ slides ]

Seminar Time: Mar. 7

  • Improving GPU Performance via Large Warps and Two-Level Warp Scheduling Applications by Robin [ slides ]
  • iGPU: Exception Support and Speculative Execution on GPUs by Jie [ slides ]
  • Relaxing Non-Volatility for Fast and Energy-Efficient STT-RAM Caches by Youngbin [ slides ]

Seminar Time: Mar. 14

  • Orchestrated Scheduling and Prefetching for GPGPUs by Shuwen [ slides ]
  • Lifetime Improvement of NAND Flash-based Storage Systems Using Dynamic Program and Erase Scaling by Wonil [ slides ]

Seminar Time: Mar. 21

  • Energy Efficient GPU Transactional Memory via Space-Time Optimizations by Robin [ slides ]
  • De-indirection for Flash-based SSDs with Nameless Writes by Jie [ slides ]
  • Design of Non-destructive Single-sawtooth Pulse Based Readout for STT-RAM by NVM-SPICE by Youngbin [ slides ]

Seminar Time: Mar. 28

  • The Dual-Path Execution Model for Efficient GPU Control Flow by Shuwen [ slides ]
  • MapReduce: Simplified Data Processing on Large Clusters by Wonil [ slides ]

Seminar Time: April 4

  • Evaluating STT-RAM as an Energy-Efficient Main Memory Alternative by Robin [ slides ]

Seminar Time: April 11

  • Cache-Conscious Wavefront Scheduling by Shuwen [ slides ]
  • A First Study on Self-Healing Solid-State Drives by Mustafa [ slides ]

Seminar Time: April 25

  • Design Paradigm for Robust Spin Torque Transfer Magnetic RAM (STT MRAM) From Circuit/Architecture Perspective by Youngbin [ slides ]


Seminar Time: May 16

  • MRPB: Memory Request Prioritization for Massively Parallel Processors by Shuwen [ slides ]
  • Spin-transfer torque switching in magnetic tunnel junctions and spin-transfer torque random access memory by Mustafa [ slides ]
  • Moneta: A High-performance Storage Array Architecture for Next-generation, Non-volatile Memories by Wonil [ slides ]

Seminar Time: May 23

  • Optimizing NAND Flash-based SSDs via Retention Relaxation by Youngbin [ slides ]
  • Exploiting Concurrent Kernel Execution on Graphic Processing Units by Robin [ slides ]

Seminar Time: May 30

  • Improving GPGPU Resource Utilization and Performance Through Alternative Thread Block Scheduling by Shuwen [ slides ]
  • RAIDR: Retention‐Aware Intelligent DRAM Refresh by Mustafa [ slides ]

Seminar Time: June 6

  • Area, Power, and Latency Considerations of STT-MRAM to Substitute for Main Memory by Youngbin [ slides ]
  • GPUdrive: Reconsidering Storage Accesses for GPU Acceleration by Mustafa [ slides ]
  • Power, Energy and Thermal Considerations in SSD-Based I/O Acceleration by Jie [ slides ]

Seminar Time: June 13

  • SIMD Divergence Optimization through Intra-Warp Compaction by Shuwen [ slides ]
  • DC Express: Shortest Latency Protocol for Reading Phase Change Memory over PCI Express by Wonil [ slides ]

Seminar Time: June 20

  • GPUdmm: A High-Performance and Memory-Oblivious GPU Architecture Using Dynamic Memory Management by Jie [ slides ]
  • Phase Change memory technology by Youngbin [ slides ]

Seminar Time: June 27

  • On-the-Fly Elimination of Dynamic Irregularities for GPU Computing by Shuwen [ slides ]
  • Co-Architecting Controllers and DRAM to Enhance DRAM Process Scaling by Mustafa [ slides ]
  • SOS: Software-based Out-of-Order Scheduling for High-Performance NAND Flash-Based SSDs by Wonil [ slides ]

Seminar Time: July 11

  • Supporting x86-64 Address Translation for 100s of GPU Lanes by Jie [ slides ]
  • ADAMS: Asymmetric Differential STT-RAM Cell Structure For Reliable and High-performance Applications by Youngbin [ slides ]

Seminar Time: July 18

  • An Efficient STT‐RAM Last Level Cache Architecture for GPUs by Mustafa [ slides ]
  • Flash on Rails: Consistent Flash Performance through Redundancy by Wonil [ slides ]

Seminar Time: July 25

  • Orchestrating Cache Management and Memory Scheduling for GPGPU Applications by Jie [ slides ]
  • A Scalable Multi-Path Microarchitecture for Efficient GPU Control Flow by Shuwen [ slides ]

Seminar Time: Aug. 8

  • CAPRI: Prediction of Compaction-Adequacy for Handling Control-Divergence in GPGPU Architectures by Shuwen [ slides ]
  • Double barrier magnetic tunnel junctions with Write/Read mode select layer by Mustafa [ slides ]
  • Multi Retention Level STT-RAM Cache Designs with a Dynamic Refresh Scheme by Jie [ slides ]


Seminar Time: Sep. 12

  • TAP: A TLP-Aware Cache Management Policy for a CPU-GPU Heterogeneous Architecture by Shuwen [ slides ]
  • Reducing SSD Read Latency via NAND Flash Program and Erase Suspension by Wonil [ slides ]

Seminar Time: Sep. 19

  • The Direct-to-Data (D2D) Cache: Navigating the Cache Hierarchy with a Single Lookup by Jie [ slides ]
  • RowClone: Fast and Energy-Efficient In-DRAM Bulk Data Copy and Initialization by Mustafa [ slides ]

Seminar Time: Sep. 26

  • Improving DRAM Performance by Parallelizing Refreshes with Accesses by Lubaba [ slides ]

Seminar Time: Oct. 3

  • When Poll is Better than Interrupt by Wonil [ slides ]
  • A Locality-Aware Memory Hierarchy for Energy-Efficient GPU Architectures by Shuwen [ slides ]

Seminar Time: Oct. 10

  • Practical Nonvolatile Multilevel-Cell Phase Change Memory by Lubaba [ slides ]

Seminar Time: Oct. 17

  • STAG: Spintronic-Tape Architecture for GPGPU Cache Hierarchies by Jie [ slides ]
  • Tiered-Latency DRAM: A Low Latency and Low Cost DRAM Architecture by Mustafa [ slides ]

Seminar Time: Oct. 24

  • Beyond block I/O: Rethinking traditional storage primitives by Wonil [ slides ]
  • Improving DRAM Performance by Parallelizing Refreshes with Accesses by Lubaba [ slides ]

Seminar Time: Oct. 31

  • Making Non-Volatile Nanomagnet Logic Non-Volatile by Lubaba [ slides ]

Seminar Time: Nov. 7

  • A Case for Refresh Pausing in DRAM Memory Systems by Mustafa [ slides ]

Seminar Time: Nov. 14

  • Cache Coherence for GPU Architectures by Shuwen [ slides ]
  • Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors by Lubaba [ slides ]

Seminar Time: Nov. 21

  • Precision-Aware Soft Error Protection for GPUs by Jie [ slides ]

Seminar Time: Dec. 5

  • Adaptive Cache Management for Energy-efficient GPU Computing by Jie [ slides ]
  • Half-DRAM: A High-bandwidth and Low-power DRAM Architecture from the Rethinking of Fine-grained Activation by Mustafa [ slides ]