
Evangeline

A project for me to learn how to run transformer-based LLMs on FPGAs, using a Xilinx Zynq UltraScale+ MPSoC ZCU102 evaluation kit.

Overview

An exploratory project focused on the implementation and acceleration of transformer-based Large Language Models (LLMs) on Field-Programmable Gate Arrays (FPGAs). The project utilizes a Xilinx Zynq UltraScale+ MPSoC ZCU102 evaluation kit and involves a complete toolchain from high-level model definitions (llama.cpp) to low-level hardware design.
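As a rough sketch of the software end of that toolchain: llama.cpp builds with CMake, so it can be cross-compiled for the ZCU102's Arm Cortex-A53 cores from a host machine. The toolchain names and paths below are illustrative assumptions, not the project's actual build configuration.

```shell
# Hypothetical cross-build of llama.cpp for the ZCU102's aarch64 cores.
# Assumes an aarch64-linux-gnu cross toolchain is installed on the host.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build \
  -DCMAKE_SYSTEM_NAME=Linux \
  -DCMAKE_SYSTEM_PROCESSOR=aarch64 \
  -DCMAKE_C_COMPILER=aarch64-linux-gnu-gcc \
  -DCMAKE_CXX_COMPILER=aarch64-linux-gnu-g++
cmake --build build --config Release -j
# The resulting binaries and a quantized GGUF model are then copied
# into the board's root filesystem.
```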

Key Features

1. FPGA-based acceleration of transformer models.
2. Integration with the llama.cpp and ggml libraries.
3. Complete hardware-to-software stack for the ZCU102 evaluation kit.
4. Custom hardware platforms, Linux images, and device tree configurations.
5. Helper scripts for automated builds and deployment.
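To give a flavor of the device tree side, here is a minimal sketch of how a programmable-logic accelerator might be exposed to Linux as an AXI-mapped peripheral. The node name, `compatible` string, and address range are placeholders for illustration, not the project's actual configuration.

```dts
/ {
    amba_pl: amba_pl@0 {
        #address-cells = <2>;
        #size-cells = <2>;
        compatible = "simple-bus";
        ranges;

        /* Hypothetical PL accelerator node; the compatible string
           and register window are illustrative placeholders. */
        llm_accel@a0000000 {
            compatible = "xlnx,llm-accel-1.0";
            reg = <0x0 0xa0000000 0x0 0x10000>;
        };
    };
};
```

A matching driver (or UIO binding) on the Linux side would then claim the node by its `compatible` string and map the register window.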

Project Gallery

Evangeline Setup

Here’s my FPGA setup for the Evangeline project. The ZCU102 board is connected to my laptop via a serial connection for programming and debugging. I’m using a serial terminal to interact with the Linux OS running on the FPGA, where I’m testing transformer models with llama.cpp. It’s been a challenging but rewarding experience getting everything to work together, from the hardware design in Vitis to running inference on the FPGA.
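A typical session along those lines might look like the following. The device name is an assumption (it depends on the host's USB-serial enumeration); 115200 baud is the usual ZCU102 UART setting, and the model path and prompt are placeholders.

```shell
# Open a serial console to the ZCU102 from the laptop
# (device name varies by host; 115200 8N1 is the board's usual setting)
screen /dev/ttyUSB0 115200

# Then, in the Linux shell on the board, run inference with
# llama.cpp's CLI against a quantized GGUF model (paths illustrative)
./llama-cli -m ./models/model.gguf -p "Hello" -n 32
```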

Technologies

FPGA · LLMs · Transformers · Vitis · C++ · llama.cpp · Zynq UltraScale+ MPSoC