SightHouse: Automated Function Identification for Reverse Engineering

Published On: April 3, 2026

SightHouse is an open-source tool developed by Quarkslab that helps reverse engineers automatically identify known functions inside binaries. It retrieves information and metadata from a program and matches its functions against previously analyzed code.


What Problem Does It Solve?

Reverse engineering modern software is complex due to:

  • Large codebases with thousands of functions
  • Heavy use of third-party libraries
  • Repetitive or AI-generated code patterns

This makes it difficult to distinguish custom logic from reused components, so analysts often waste time on irrelevant code.

SightHouse tackles this by automating function identification using similarity matching.


Deployment & Usage

SightHouse is available on PyPI.

# Install the SRE clients only
pip install sighthouse-client
# Install the frontend only
pip install sighthouse-frontend
# Install the pipeline only
pip install sighthouse-pipeline
# Or install everything (quoted so shells such as zsh do not expand the brackets)
pip install "sighthouse[all]"

From source

You can also install it from the git repository:

# Clone the repository
git clone https://github.com/quarkslab/sighthouse && cd sighthouse
# `make install` creates a new virtual environment and installs SightHouse in it
make install

How SightHouse Works

SightHouse compares functions from a target binary with a large database of known function signatures. When a match is found:

  • The function is identified
  • Its original name and source are suggested
  • Metadata is added directly inside the reverse engineering tools
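
The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not SightHouse's or BSim's actual API: it assumes each function has already been reduced to a sparse feature vector, and the names `KnownFunction`, `cosine` and `identify`, the feature ids, and the 0.9 threshold are all hypothetical.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class KnownFunction:
    name: str       # original function name
    source: str     # library or project the function came from
    features: dict  # sparse feature vector: feature id -> weight


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(feature, 0.0) for feature, weight in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(target: dict, database: list, threshold: float = 0.9):
    """Return a name/source suggestion for the best match, or None."""
    best, best_score = None, 0.0
    for known in database:
        score = cosine(target, known.features)
        if score > best_score:
            best, best_score = known, score
    if best is None or best_score < threshold:
        return None  # nothing similar enough: probably custom code
    return {"name": best.name, "source": best.source, "score": best_score}


# Toy database with made-up feature ids and weights.
database = [
    KnownFunction("memcpy", "libc", {"f1": 1.0, "f2": 2.0}),
    KnownFunction("sha256_update", "openssl", {"f3": 1.0, "f4": 1.0}),
]
```

Calling `identify({"f1": 1.0, "f2": 2.0}, database)` suggests `memcpy` from `libc`, while a vector that shares no features with the database returns `None`. A linear scan like this only works for toy data; a real deployment would rely on an indexed database for the search.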

It supports integration with:

  • IDA Pro
  • Ghidra
  • Binary Ninja

Core Technology

SightHouse builds on work on the binary similarity problem, which involves detecting similar functions by comparing:

  • Assembly code
  • Intermediate representations
  • Raw binary patterns

After benchmarking multiple approaches, the tool uses BSim (from Ghidra) as its core engine because:

  • It is scalable
  • It works across architectures
  • It supports large databases via PostgreSQL/Elasticsearch
  • It provides stable performance for production use

Architecture Overview

SightHouse is built with three main components:

  1. SRE Clients (Plugins)
    • Integrate into tools like Ghidra and IDA
    • Send binaries for analysis
  2. Frontend API Server
    • Acts as the central system handling requests
    • Runs analysis using Ghidra + BSim
  3. Signature Pipeline
    • Automatically builds and updates the signature database
    • Scrapes projects, compiles them, and extracts function signatures
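
To make the signature pipeline's stages concrete, here is a toy sketch of a scrape, compile, extract and store chain. Everything in it is an assumption made for illustration: the function names are hypothetical, "compiling" is simulated by encoding source text to bytes, and a SHA-256 digest stands in for a real similarity signature such as the ones BSim produces.

```python
import hashlib


def scrape(index: dict):
    """Stage 1: yield (project name, sources) pairs from a toy package index."""
    yield from sorted(index.items())


def compile_project(project: tuple) -> tuple:
    """Stage 2: stand-in for a compiler, turning each function into bytes."""
    name, sources = project
    return name, {fn: src.encode("utf-8") for fn, src in sources.items()}


def extract_signatures(built: tuple) -> list:
    """Stage 3: stand-in for the analyzer, producing one digest per function."""
    name, binaries = built
    return [
        (name, fn, hashlib.sha256(code).hexdigest()[:16])
        for fn, code in sorted(binaries.items())
    ]


def run_pipeline(index: dict, store: dict) -> dict:
    """Chain the stages and record signature -> (project, function)."""
    for project in scrape(index):
        for project_name, fn, signature in extract_signatures(compile_project(project)):
            store[signature] = (project_name, fn)
    return store


# Toy index: project name -> {function name: source text}.
index = {
    "zlib": {"inflate": "int inflate(void) { return 0; }",
             "crc32": "int crc32(void) { return 1; }"},
    "libpng": {"png_read": "int png_read(void) { return 2; }"},
}
store = run_pipeline(index, {})
```

Because the store is keyed by signature, running the pipeline again over the same projects updates existing entries instead of duplicating them, which is the property an automatically refreshed signature database needs.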

👉 The diagram on page 3 shows this pipeline workflow, including scrapers, compilers, analyzers, and storage systems working together.


