67: How a Repo Named 67hack Won on 6/7
First, the dumb part
We named the team repo 67hack. No deep reason. It was the hackathon, we needed a name, someone typed 67hack, and it stuck. Then we won first place. On June 7th. 6/7.
I’m not going to pretend that means anything. But if you’ve spent any time online lately you know “six seven” has become the meme that refuses to die, and there was something perfect about a repo called 67hack taking the top prize on 6/7. We laughed about it. Then we went and got dinner like normal people who had not been awake for most of the previous 48 hours.
Okay. The real story.
What we won
We took 1st Place, Best All-Around Application at vibeFORWARD: M2—Agents in New York. The last two hackathon posts on this blog were solo runs, me alone with too much caffeine trying to manifest an idea before sunrise. This one was different, and better, because this time it was a team.
FRTC — hunts the ring no threshold caught
Five of us built FRTC — Fraud Ring Triage Copilot: me, Buddhsen Tripathi, Joy van Oranje, Saisrijith Reddy Maramreddy, and Olena Teslia. I’ve done enough of these alone to tell you the honest truth: a good team doesn’t just split the work, it raises the ceiling on what you’ll even attempt. We attempted a lot.
The problem nobody flags
Our track gave us 90 days of transactions from a fictional community bank. Five thousand transactions, around three hundred accounts, and somewhere inside, a fraud ring that never once tripped an alert.
That last part is the whole problem. Most fraud detection fires on thresholds: transfer over X, flag it. But a real ring knows exactly where those lines are drawn. They keep every transfer just under the limit and spread the money across mule accounts, so no single transaction ever looks wrong. The fraud only exists in the relationships between accounts, and a rule that looks at one transaction at a time will never see it.
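A minimal sketch of why the per-transaction rule stays silent, using made-up transfers rather than the hackathon dataset. The `LIMIT` value, the account IDs, and both helper functions are hypothetical; the only point is that every row passes the threshold check while the per-receiver totals do not look ordinary at all.

```python
from collections import defaultdict

LIMIT = 1000.00  # hypothetical per-transfer alert threshold

# (sender, receiver, amount): invented transfers, each kept under LIMIT
transfers = [
    ("A1", "M1", 950.00),
    ("A1", "M2", 900.00),
    ("A2", "M1", 980.00),
    ("A2", "M2", 940.00),
    ("M1", "X9", 990.00),
    ("M2", "X9", 985.00),
    ("B7", "C3", 120.00),  # an ordinary customer payment
]


def threshold_alerts(rows, limit=LIMIT):
    """Per-transaction rule: flag any single transfer at or above the limit."""
    return [row for row in rows if row[2] >= limit]


def inflow_by_receiver(rows):
    """Relationship view: total received and distinct sender count per account."""
    total = defaultdict(float)
    senders = defaultdict(set)
    for sender, receiver, amount in rows:
        total[receiver] += amount
        senders[receiver].add(sender)
    return {acct: (round(total[acct], 2), len(senders[acct])) for acct in total}


alerts = threshold_alerts(transfers)    # empty: no single transfer trips the rule
inflow = inflow_by_receiver(transfers)  # the pattern only shows up in aggregate
```

Here `alerts` comes back empty, while `inflow` shows the mule accounts and the collector each receiving nearly twice the limit from two different senders.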
So we stopped trying to catch transactions. We went after the shape of the ring.
How FRTC actually works
The design that won is a hybrid, and I think the hybrid is why it won.
A purely statistical system can find suspicious clusters but can’t explain them. A pile of language model agents can reason and explain but will cheerfully contradict each other and hallucinate an account into or out of the ring. We wanted the strengths of both without trusting either one too much.
So FRTC works in three moves. First, an unsupervised engine (NumPy and networkx, no language models) assigns every account an anomaly score and builds a coordination graph from shared devices, account-opening cohorts, and transfer structure. That surfaces a candidate ring. Second, six specialist agents and one adversarial Skeptic drill into that candidate over shared memory, each supporting or challenging it account by account. Third, a synthesizer fuses everything, confirms the ring, and streams the verdict to a UI where you can watch the reasoning unfold in real time.
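To make the first move concrete, here is a rough sketch of a coordination graph, written with only the standard library so it runs anywhere. It is not FRTC's engine: the account records, the one-point-per-shared-signal weighting, and the `min_weight` and `min_size` cutoffs are all assumptions for illustration, and the anomaly scoring step is left out entirely.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical account records: device fingerprint and account-opening week.
accounts = {
    "A1": {"device": "d1", "opened": "2024-W03"},
    "A2": {"device": "d1", "opened": "2024-W03"},
    "A3": {"device": "d2", "opened": "2024-W03"},
    "B1": {"device": "d7", "opened": "2023-W40"},
    "B2": {"device": "d8", "opened": "2022-W11"},
}
# Hypothetical peer transfers as (sender, receiver) pairs.
transfers = [("A1", "A3"), ("A2", "A3"), ("B1", "B2")]


def build_graph(accounts, transfers):
    """Weight each account pair by how many coordination signals it shares."""
    weight = defaultdict(int)
    for key in ("device", "opened"):
        groups = defaultdict(list)
        for acct, info in accounts.items():
            groups[info[key]].append(acct)
        for members in groups.values():
            for a, b in combinations(sorted(members), 2):
                weight[(a, b)] += 1
    for a, b in transfers:
        weight[tuple(sorted((a, b)))] += 1
    return weight


def candidate_rings(weight, min_weight=2, min_size=3):
    """Connected components over edges carrying at least min_weight signals."""
    adj = defaultdict(set)
    for (a, b), w in weight.items():
        if w >= min_weight:
            adj[a].add(b)
            adj[b].add(a)
    seen, rings = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:  # iterative depth-first walk of one component
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        if len(comp) >= min_size:
            rings.append(sorted(comp))
    return rings


rings = candidate_rings(build_graph(accounts, transfers))
```

In this toy data the three `A` accounts end up in one candidate ring because each pair shares at least two signals, while `B1` and `B2` are linked by a single transfer and stay below the cutoff.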
The agents reasoning in real time, with shared memory on the right
The detail I’m proudest of is small and load-bearing. Ring membership stays anchored to the engine’s candidate, and the Skeptic is only allowed to prune an account that has no concrete tie to the rest. That one rule means the language models can argue all they want, but their variance can never quietly drop a genuine member. The models reason. Python does the math. Neither gets the final word alone.
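That pruning rule is simple enough to sketch in a few lines. This is an illustration under my own assumptions, not the code we shipped: `apply_skeptic`, the `ties` representation, and the sample accounts are all hypothetical. The shape is what matters: the Skeptic can only request removals, and a removal only goes through when the account has no concrete tie to the rest of the candidate.

```python
def apply_skeptic(candidate, ties, prune_requests):
    """Return the confirmed ring after applying only the allowed prunes.

    candidate: set of account ids proposed by the engine
    ties: set of frozenset({a, b}) pairs with a concrete link (device, transfer, ...)
    prune_requests: account ids the Skeptic wants removed
    """
    confirmed = set(candidate)
    for acct in prune_requests:
        if acct not in confirmed:
            continue  # requests about accounts outside the candidate are ignored
        others = confirmed - {acct}
        has_tie = any(frozenset((acct, other)) in ties for other in others)
        if not has_tie:
            confirmed.discard(acct)  # allowed: nothing concrete links it to the rest
    return confirmed


candidate = {"A1", "A2", "A3", "Z9"}
ties = {frozenset(("A1", "A2")), frozenset(("A2", "A3"))}
confirmed = apply_skeptic(candidate, ties, prune_requests=["A3", "Z9", "Q0"])
```

With these inputs, the request to drop `A3` is refused because it is tied to `A2`, `Z9` is pruned because nothing links it to anyone, and `Q0` is ignored because it was never in the candidate. The function never adds an account, so membership stays anchored to what the engine proposed.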
The payoff
It found the ring. Ten accounts, $161,750.90 moving across 250 peer transfers, and it matched the benchmark answer key to the cent.
The suspicious activity report it writes itself
The match felt good, but the moment I actually trusted it was when we pointed it at a completely different synthetic dataset, one with a different ring of a different size hiding a different amount, and it found that one too at 100% precision and recall. Nothing was baked in. The detectors adapt to whatever data they’re handed, and the case report is derived from that data rather than recited from a fixture. That’s the difference between a demo that works once and a system that actually works.
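For anyone who wants the definitions behind "100% precision and recall", here is a small sketch over sets of account IDs. The account names are invented and this is not our evaluation script. Precision is the share of flagged accounts that really belong to the ring, and recall is the share of the real ring that got flagged.

```python
def precision_recall(predicted, actual):
    """Compare a predicted ring against an answer key, both given as account ids."""
    predicted, actual = set(predicted), set(actual)
    true_pos = len(predicted & actual)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(actual) if actual else 0.0
    return precision, recall


answer_key = {"A1", "A2", "A3", "A4"}  # hypothetical ground-truth ring

exact = precision_recall({"A1", "A2", "A3", "A4"}, answer_key)
missed_one = precision_recall({"A1", "A2", "A3"}, answer_key)
extra_one = precision_recall({"A1", "A2", "A3", "A4", "B9"}, answer_key)
```

Only an exact match scores 1.0 on both. Missing a member lowers recall, and flagging an innocent account lowers precision.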
A quieter win
Two moments from that day stuck with me, and neither had anything to do with the leaderboard.
We needed FRTC deployed, fast. Over the past year I’d built a set of DigitalOcean skills for Claude, the kind of unglamorous tooling that only pays off when someone else picks it up and it just works. My teammate Buddhsen asked Claude to install those skills, dropped in an API key, and one-shot deployed the whole app. No hassle, no me hovering over his shoulder.
The other one came right at the end. A teammate had never run Claude Code before, a completely cold start with zero API usage. We pulled down my claude-dotfiles repo, ran the setup, and inside an hour they had a fully autonomous Claude Code terminal humming. From never having touched it to shipping with it, in under sixty minutes.
Watching people deploy on infrastructure I’ve lived on for a decade, and spin up tooling I’d built and half forgotten, was its own kind of win. The best tools disappear into someone else’s victory.
What I’m taking with me
1st Place — vibeFORWARD: M2—Agents, NYC
I’ve learned the hard way that the flashiest agent demo is usually the one you trust least, because somewhere in it a model is making a decision nobody can check. The thing that won here wasn’t the agents being clever. It was being deliberate about where we let them decide and where we didn’t. That lesson is going to outlast this trophy.
And building it with four people who each saw a different angle of the problem? That was the actual prize. The 6/7 thing was just a bonus the universe threw in.
You can explore FRTC live at the demo, read the writeup on Devpost, and dig into the code on GitHub.