Bayyinah

Pre-LLM file integrity verification · v1.1.1

Every file has a surface and a substrate. The surface is what your reader, viewer, or inbox displays. The substrate is the actual bytes underneath — metadata, hidden text, embedded scripts, alternate streams. Most of the time the two agree. When they don't, it's usually deliberate.

A contract that displays one figure and contains another. A PDF that opens cleanly while carrying instructions a language model would silently obey. An email whose visible sender and routing headers don't match. A spreadsheet whose hidden sheet does the actual math.

Bayyinah pulls every layer of a file apart and reports whether the surface matches the substrate. It is a single-purpose scanner. It makes no moral judgement of its own — it surfaces the gap and lets the reader perform the recognition.

Try it now

Drop in any file. Bayyinah returns a structured integrity report in a couple of seconds.

Start with a document you already trust: it should score 1.0 with no findings. Then try one from a less-trusted source.

Verdict legend
sahih: sound. Score 1.0, no findings, scan complete.
mushtabih: doubtful. Score 0.7-1.0, some signal, nothing verified.
mukhfi: concealed. Score 0.3-0.7, concealment detected.
munafiq: severe concealment. Score below 0.3 with at least one verified mechanism.
mughlaq: closed. Scan incomplete or errored, no verdict can be issued.

These labels classify the report, not the document's author or intent. Bayyinah surfaces the gap; the reader performs the recognition.
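The legend can be read as a small decision procedure. The sketch below is one possible reading, not Bayyinah's actual implementation: the function name and parameters are hypothetical, and the handling of boundary values (0.3, 0.7) and of cases the legend leaves open (for example, a score below 0.3 with no verified mechanism) is an assumption marked in the comments.

```python
from typing import Optional


def verdict(score: Optional[float], finding_count: int,
            has_verified: bool, scan_complete: bool = True) -> str:
    """Map report fields to a legend label (illustrative reading only)."""
    # An incomplete or errored scan never receives a verdict.
    if not scan_complete or score is None:
        return "mughlaq"
    # Severe concealment needs a low score AND a verified mechanism.
    if score < 0.3 and has_verified:
        return "munafiq"
    # Assumption: 0.3 itself, and low scores without a verified
    # mechanism, fall into the "concealed" band.
    if score < 0.7:
        return "mukhfi"
    # Assumption: 0.7 itself is "doubtful", and so is a score of 1.0
    # that still carries findings.
    if score < 1.0 or finding_count > 0:
        return "mushtabih"
    return "sahih"
```

Checking the incomplete-scan case first mirrors the legend's statement that no verdict can be issued for it, whatever the score field holds.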

Why it matters now

Modern AI systems ingest files all day. Every document Q&A tool, every email summarizer, every customer-support agent reads files written by humans and by other models. When a file hides something, the model consumes the hidden content along with the visible content. Most AI-safety work focuses on what the model says back. Bayyinah addresses the input layer: vet the file before any model touches it.

What it covers

23 file kinds, grouped by family:

API

The same scanner is exposed over HTTP for integration:

curl -X POST -F "[email protected]" https://bayyinah.dev/scan

The response is the full IntegrityReport as JSON. Each finding carries its mechanism, tier, confidence, severity, location, and inversion-recovery. See the repo for the full endpoint list.
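On the consuming side, the JSON can be loaded into typed records before any further processing. This is a minimal sketch under stated assumptions: the key names (`findings`, `inversion_recovery`, and so on) are guesses derived from the field list above, the value types are left open, and the sample payload is invented for illustration. Check the repo for the real schema.

```python
import json
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Finding:
    # Field names follow the list above; the JSON keys are assumed.
    mechanism: str
    tier: str
    confidence: Any = None
    severity: Any = None
    location: Any = None
    inversion_recovery: Any = None


def parse_findings(report_json: str) -> List[Finding]:
    """Turn a report's JSON text into a list of Finding records."""
    report = json.loads(report_json)
    findings = []
    for raw in report.get("findings", []):  # "findings" key is assumed
        findings.append(Finding(
            mechanism=raw["mechanism"],
            tier=raw["tier"],
            confidence=raw.get("confidence"),
            severity=raw.get("severity"),
            location=raw.get("location"),
            inversion_recovery=raw.get("inversion_recovery"),
        ))
    return findings


# Hypothetical payload, made up for this example; not real scanner output.
SAMPLE_REPORT = """
{
  "score": 0.55,
  "findings": [
    {
      "mechanism": "hidden_text",
      "tier": "Verified",
      "confidence": 0.9,
      "severity": 0.4,
      "location": "page 2",
      "inversion_recovery": "placeholder for recovered content"
    }
  ]
}
"""

if __name__ == "__main__":
    for finding in parse_findings(SAMPLE_REPORT):
        print(finding.tier, finding.mechanism, finding.location)
```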

Status

Version: 1.1.1 (production-stable)
License: Apache 2.0
Source: github.com/BayyinahEnterprise/Bayyinah-Integrity-Scanner
DOI: 10.5281/zenodo.19677111
Demo limit: 25 MiB per upload, no auth

Every finding Bayyinah emits is tagged with a tier: Verified (unambiguous concealment), Structural (pattern of concealment, context may justify), or Interpretive (suspicious, context-dependent). The integrity score is a single continuous number; readers are expected to inspect the findings, not stop at the score.
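Because the score alone is not meant to be the stopping point, a reader-side tool might group findings by tier so the least ambiguous ones are read first. The sketch below assumes each finding is a dict with a `tier` key holding one of the three tier names; that key name, the case normalization, and the function name are illustrative assumptions.

```python
from typing import Dict, Iterable, List

# Reading order: least ambiguous tier first.
TIER_ORDER = ("Verified", "Structural", "Interpretive")


def group_by_tier(findings: Iterable[dict]) -> Dict[str, List[dict]]:
    """Bucket findings by tier, keeping the three known tiers in order."""
    groups: Dict[str, List[dict]] = {tier: [] for tier in TIER_ORDER}
    for finding in findings:
        # Assumption: tier arrives as a string; normalize its case.
        tier = str(finding.get("tier", "")).strip().capitalize()
        # Unknown tiers are kept under their own key, not dropped.
        groups.setdefault(tier, []).append(finding)
    return groups
```

Iterating over the returned dict visits Verified findings first, then Structural, then Interpretive, followed by any unrecognized tier.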