GALS System Design:
Side Channel Attack Secure Cryptographic Accelerators

Frank Kagan Gürkaynak

This is the web version of my thesis. It was converted from the sources of the original file using TTH, some Perl, and some hand editing.

Extended Table of Contents

1  Introduction
2  GALS System Design
    2.1  Design Styles
        2.1.1  Synchronous Design
        2.1.2  Asynchronous (Self-timed) Design
        2.1.3  GALS
    2.2  The GALS Methodology
        2.2.1  Port Controller Types
        2.2.2  Local Clock Generator
        2.2.3  Timing Constraints
    2.3  GALS-Based Solutions
        2.3.1  Low Power
        2.3.2  High Performance
        2.3.3  Ease of Integration
        2.3.4  Secure Applications
3  Cryptographic Accelerators
    3.1  A Cryptology Primer
    3.2  Advanced Encryption Standard (AES)
    3.3  AES operations
        3.3.1  AddRoundKey
        3.3.2  SubBytes and InvSubBytes
        3.3.3  ShiftRows and InvShiftRows
        3.3.4  MixColumns and InvMixColumns
        3.3.5  Key Expansion
    3.4  AES Hardware Implementations
        3.4.1  Datapath Width
        3.4.2  Encryption/Decryption
        3.4.3  The AES Round Organization
        3.4.4  Roundkey Generation
        3.4.5  Comparison of AES chips
    3.5  Cryptographic Security
        3.5.1  Side Channel Attacks
        3.5.2  Differential Power Analysis
        3.5.3  Countermeasures Against Side Channel Attacks
        3.5.4  Implementation Issues
4  Secure AES Implementation Using GALS
    4.1  Partitioning
    4.2  DPA Countermeasures
        4.2.1  Noise Generation
        4.2.2  Operation Re-Ordering
        4.2.3  GALS Modules
        4.2.4  Variable Clock Periods
        4.2.5  Security Effort
    4.3  Realization and Results
        4.3.1  David
        4.3.2  Goliath
        4.3.3  Interface
        4.3.4  Random Number Generation
        4.3.5  Reference Design
        4.3.6  Physical Implementation
        4.3.7  Simulation Results
        4.3.8  Measurement Results
5  Designing GALS Systems
    5.1  Design Automation Issues
    5.2  Designing Asynchronous Finite State Machines
        5.2.1  Port Controllers in Acacia
        5.2.2  Data Exchange between David and Goliath
        5.2.3  Data Exchange between Goliath and Synchronous Interface
    5.3  Testing Acacia
    5.4  Adapting Modules for GALS
    5.5  Related Research Directions
        5.5.1  Network-on-Chip Systems
        5.5.2  Dynamic Voltage and Frequency Scaling
        5.5.3  Latency-Insensitive Design
6  Conclusion
    6.1  Cryptographic Hardware Design
    6.2  GALS Design Methodology
    6.3  Final Words
A  'Guessing' Effort for Keys
B  List of Abbreviations


Integrated circuit manufacturing technology improves almost daily, enabling designers to construct circuits that are both smaller and faster. While this increases performance and allows more functions to be integrated onto micro-chips, it also poses significant challenges to designers.
Conventional digital circuits rely on a global clock signal to function. These circuits are called synchronous, as the timing of all operations of the circuit is derived from the global clock signal. As a result of the technological improvements with each new generation of integrated circuits, both the clock rate and the number of clock connections within the micro-chip continue to increase. Reliably distributing the clock signal over the micro-chip has become one of the leading challenges of modern digital system design.
The Globally-Asynchronous Locally-Synchronous (GALS) system design has been developed to address this problem. A GALS system consists of several sub-designs, called GALS modules, that have their own local clock generators. Each module by itself is synchronous and can be designed using a conventional design methodology. What is required is a reliable method to exchange data between these independent GALS modules. Instead of a global clock signal, GALS systems use an asynchronous handshaking protocol between GALS modules. Each GALS module contains additional control circuitry that briefly pauses the local clock to ensure data integrity during these transfers.
The feasibility of the GALS design methodology, and an extension of the methodology to support multi-point connections between GALS modules, have been investigated in two previous Ph.D. theses by J. Muttersbach and T. Villiger. In this thesis, the GALS methodology has been applied to improve the security of cryptographic systems.
Cryptographic systems are an integral part of modern digital society, providing solutions to secure information from unauthorized access. In its most basic form, a cryptographic algorithm uses a secret key (a series of 0's and 1's) to transform information so that it can only be deciphered by others who have the same secret key.
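The idea of a shared secret key can be illustrated with a deliberately simple sketch. The example below is not AES and offers no real security; it only shows that the same key transforms a message into an unreadable form and back again. The function name `xor_transform` and the sample message are assumptions made for this illustration.

```python
import secrets


def xor_transform(data: bytes, key: bytes) -> bytes:
    """Toy cipher (illustration only, NOT secure and NOT AES).

    XORs every data byte with the key byte at the same position,
    repeating the key if the data is longer than the key.
    Applying the function twice with the same key restores the data.
    """
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# A 128-bit secret key: a random series of 0's and 1's.
key = secrets.token_bytes(16)

message = b"GALS meets AES"
ciphertext = xor_transform(message, key)      # "encrypt" with the secret key
recovered = xor_transform(ciphertext, key)    # "decrypt" with the same key

# Hypothetical wrong key: every key byte differs from the real one.
wrong_key = bytes(k ^ 1 for k in key)
garbage = xor_transform(ciphertext, wrong_key)
```

Only a holder of `key` obtains `recovered == message`; decrypting with `wrong_key` yields different bytes.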
There are several well-established algorithms, like the Advanced Encryption Standard (AES), that provide a very high level of security. However, once such an algorithm is implemented, in either hardware or software, it acquires several physical properties (heat, power consumption, etc.) that can be monitored during operation. Starting in 1999, it was shown that it is possible to extract the secret key of a cryptographic system by only monitoring the power consumption. This is a very serious problem, and a number of countermeasures were immediately developed against these so-called side channel attacks.
In this thesis, the design of a GALS-based AES implementation is presented. The design consists of three independent GALS modules, each of which has a local clock generator that is able to change its period randomly. By combining this architecture with several well-known countermeasures against side channel attacks, the security of the AES implementation has been improved considerably.
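The effect of a randomly varying clock period can be sketched with a small timing model. The code below is only an illustration under stated assumptions: the set of clock periods, the number of cycles, and the function name `cycle_end_times` are hypothetical and are not taken from the thesis. It shows that with a fixed period a given operation always completes at the same instant, while with a randomly chosen period the same operation lands at different instants from run to run, which makes it harder to align power measurements.

```python
import random


def cycle_end_times(periods_ns, n_cycles, rng=None):
    """Return the time (in ns) at which each clock cycle ends.

    If rng is None, every cycle uses periods_ns[0] (fixed clock).
    Otherwise each cycle picks one of periods_ns at random.
    """
    t = 0.0
    times = []
    for _ in range(n_cycles):
        period = periods_ns[0] if rng is None else rng.choice(periods_ns)
        t += period
        times.append(t)
    return times


# Hypothetical values, chosen only for illustration.
PERIODS_NS = [4.0, 4.5, 5.0, 5.5]
N_CYCLES = 10
N_RUNS = 100

# Completion time of the 10th cycle over many runs with a fixed clock.
fixed = {cycle_end_times(PERIODS_NS, N_CYCLES)[-1] for _ in range(N_RUNS)}

# The same observation when the period is re-drawn every cycle.
rng = random.Random(1)
varied = {cycle_end_times(PERIODS_NS, N_CYCLES, rng)[-1] for _ in range(N_RUNS)}

print("distinct completion times, fixed clock: ", len(fixed))
print("distinct completion times, random clock:", len(varied))
```

In this model an observer who records many runs sees a single completion time for the fixed clock but a spread of completion times for the random clock.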
This work represents the first application of GALS to improve the side-channel security of a cryptographic system. A mature GALS design flow, which is mainly based on industry standard electronic design automation tools, has been used to fabricate the circuit. Measurement results showed that the performance metrics (throughput, area, power consumption) of the GALS integration are comparable to circuits that were designed using conventional synchronous methods.

File translated from TEX by TTH, version 3.77.
On 20 Dec 2006, 15:44.