
Evangeline

A project for me to learn how to run transformer-based LLMs on FPGAs, using a Xilinx Zynq UltraScale+ MPSoC ZCU102 evaluation kit.

Overview

An exploratory project focused on the implementation and acceleration of transformer-based Large Language Models (LLMs) on Field-Programmable Gate Arrays (FPGAs). The project utilizes a Xilinx Zynq UltraScale+ MPSoC ZCU102 evaluation kit and involves a complete toolchain from high-level model definitions (llama.cpp) to low-level hardware design.
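As a rough sketch of the software end of that toolchain: llama.cpp builds with CMake, so it can be cross-compiled for the ZCU102's Arm Cortex-A53 cores from a host machine. The toolchain names and paths below are illustrative assumptions, not the project's actual build configuration.

```shell
# Hypothetical cross-build of llama.cpp for the ZCU102's aarch64 cores.
# Assumes an aarch64-linux-gnu cross toolchain is installed on the host.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build \
  -DCMAKE_SYSTEM_NAME=Linux \
  -DCMAKE_SYSTEM_PROCESSOR=aarch64 \
  -DCMAKE_C_COMPILER=aarch64-linux-gnu-gcc \
  -DCMAKE_CXX_COMPILER=aarch64-linux-gnu-g++
cmake --build build --config Release -j
# The resulting binaries and a quantized GGUF model are then copied
# into the board's root filesystem.
```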

Key Features

1. FPGA-based acceleration of transformer models.
2. Integration with the llama.cpp and ggml libraries.
3. Complete hardware-to-software stack for the ZCU102 evaluation kit.
4. Custom hardware platforms, Linux images, and device tree configurations.
5. Helper scripts for automated builds and deployment.
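To give a flavor of the device tree side, here is a minimal sketch of how a programmable-logic accelerator might be exposed to Linux as an AXI-mapped peripheral. The node name, `compatible` string, and address range are placeholders for illustration, not the project's actual configuration.

```dts
/ {
    amba_pl: amba_pl@0 {
        #address-cells = <2>;
        #size-cells = <2>;
        compatible = "simple-bus";
        ranges;

        /* Hypothetical PL accelerator node; the compatible string
           and register window are illustrative placeholders. */
        llm_accel@a0000000 {
            compatible = "xlnx,llm-accel-1.0";
            reg = <0x0 0xa0000000 0x0 0x10000>;
        };
    };
};
```

A matching driver (or UIO binding) on the Linux side would then claim the node by its `compatible` string and map the register window.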

Project Gallery

Evangeline Setup

Here’s my FPGA setup for the Evangeline project. The ZCU102 board is connected to my laptop via a serial connection for programming and debugging. I’m using a serial terminal to interact with the Linux OS running on the FPGA, where I’m testing transformer models with llama.cpp. It’s been a challenging but rewarding experience getting everything to work together, from the hardware design in Vitis to running inference on the FPGA.
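A typical session along those lines might look like the following. The device name is an assumption (it depends on the host's USB-serial enumeration); 115200 baud is the usual ZCU102 UART setting, and the model path and prompt are placeholders.

```shell
# Open a serial console to the ZCU102 from the laptop
# (device name varies by host; 115200 8N1 is the board's usual setting)
screen /dev/ttyUSB0 115200

# Then, in the Linux shell on the board, run inference with
# llama.cpp's CLI against a quantized GGUF model (paths illustrative)
./llama-cli -m ./models/model.gguf -p "Hello" -n 32
```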

Technologies

FPGA · LLMs · Transformers · Vitis · C++ · llama.cpp · Zynq UltraScale+ MPSoC