I Used an LLM to Analyze 140,000 UFO Reports. The Aliens Are Real…
…and they’re interrupting dogs.
Anže Kravanja · 11 min read
A few weeks ago, I got a (seemingly) brilliant idea. “I’ll find UFO/UAP reports and analyze them with an LLM,” I declared to my empty room. The logic was simple: there are mountains of reports filled with rich, descriptive text. Instead of manually reading every single one to find patterns, I could outsource the heavy lifting to a Large Language Model.
It’s never been easier to figure out if we are alone in the universe. Thank you, LLMs.
(Disclaimer: I may not have definitively proven we’re not alone, but the journey was a blast, and the findings were… unexpected.)
The Data
Thankfully, the U.S. has the National UFO Reporting Center (NUFORC), a public database filled with decades of sighting reports. After downloading 140,000 of them, I had my raw material.
Each report is a gem, containing fields like Shape, Duration, and Characteristics. But the real treasure was buried in the free-text Summary and Text fields. This is where the witnesses pour their hearts out, and it's where I pointed my digital shovel.
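Here is a minimal sketch of what that digging could look like. It assumes each report has been loaded as a Python dict keyed by the field names above. The helper name `free_text` and the example record are made up for illustration; they are not the actual pipeline or real NUFORC data.

```python
def free_text(report: dict) -> str:
    """Join the narrative fields of one report into a single string.

    Assumes the free text lives under the "Summary" and "Text" keys;
    missing or empty fields are skipped.
    """
    parts = (report.get("Summary"), report.get("Text"))
    return "\n\n".join(
        p.strip() for p in parts if isinstance(p, str) and p.strip()
    )


# Hypothetical record for illustration only, not taken from the dataset.
example = {
    "Shape": "Light",
    "Summary": "Bright light over the lake.",
    "Text": "  It hovered for a minute, then vanished.  ",
}

print(free_text(example))
```

Keeping the structured fields (Shape, Duration, and so on) apart from the narrative means the LLM only ever sees the part a witness actually wrote.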
{
"Sighting": 114864,
"Occurred": "2014-09-21 13:00:00 Local",
"Location": "Huntsville, TX, USA",
"Shape"…