Bot Hunting Is All About the Vibes

Christopher Bouzy is trying to stay ahead of the bots. As the person behind Bot Sentinel, a popular bot-detection system, he and his team continuously update their machine learning models out of fear that they will get “stale.” The task? Sorting 3.2 million tweets from suspended accounts into two folders: “Bot” or “Not.”

To detect bots, Bot Sentinel’s models must first learn what problematic behavior is through exposure to data. By feeding the model tweets in two distinct categories (bot or not a bot), Bouzy lets it calibrate itself and allegedly find the very essence of what, he thinks, makes a tweet problematic.

Training data is the heart of any machine learning model. In the burgeoning field of bot detection, how bot hunters define and label tweets determines the way their systems interpret and classify bot-like behavior. According to experts, this can be more of an art than a science. “At the end of the day, it is about a vibe when you are doing the labeling,” Bouzy says. “It’s not just about the words in the tweet, context matters.”

He’s a Bot, She’s a Bot, Everyone’s a Bot

Before anyone can hunt bots, they need to figure out what a bot is—and that answer changes depending on who you ask. The internet is full of people accusing each other of being bots over petty political disagreements. Trolls are called bots. People with no profile picture and few tweets or followers are called bots. Even among professional bot hunters, the answers differ.

Bouzy defines bots as “problematic accounts” and trains Bot Sentinel to weed them out. Indiana University informatics and computer science professor Filippo Menczer says the tool he helps develop, Botometer, defines bots as accounts that are at least partially controlled by software. Kathleen Carley is a computer science professor at the Institute for Software Research at Carnegie Mellon University who has helped develop two bot-detection tools: BotHunter and BotBuster. Carley defines a bot as “an account that is run using completely automated software,” a definition that aligns with Twitter’s own. “A bot is an automated account—nothing more or less,” the company wrote in a May 2020 blog post about platform manipulation.

Just as the definitions differ, the results these tools produce don’t always align. An account flagged as a bot by Botometer, for example, might come back as perfectly humanlike on Bot Sentinel, and vice versa.

Some of this is by design. Unlike Botometer, which aims to identify automated or partially automated accounts, Bot Sentinel is hunting accounts that engage in toxic trolling. According to Bouzy, you know these accounts when you see them. They can be automated or human-controlled, and they engage in harassment or disinformation and violate Twitter’s terms of service. “Just the worst of the worst,” Bouzy says.

Botometer is maintained by Kaicheng Yang, a PhD candidate in informatics at the Observatory on Social Media at Indiana University who created the tool with Menczer. The tool also uses machine learning to classify bots, but when Yang is training his models, he’s not necessarily looking for harassment or terms of service violations. He’s just looking for bots. According to Yang, when he labels his training data he asks himself one question: “Do I believe the tweet is coming from a person or from an algorithm?”
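
How often two detectors disagree can be put into numbers. The sketch below is illustrative only: the verdict lists are invented, the function name is hypothetical, and neither Botometer nor Bot Sentinel is queried. It computes the raw agreement rate and Cohen's kappa, a standard statistic that corrects agreement for chance.

```python
def agreement_stats(labels_a, labels_b):
    """Return (observed agreement, Cohen's kappa) for two lists of verdicts."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(labels_a)
    # Share of accounts on which both tools gave the same verdict.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, given how often each tool uses each label.
    categories = set(labels_a) | set(labels_b)
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    if expected == 1:
        return observed, 1.0  # both tools always give the same single label
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Invented verdicts for the same eight accounts from two hypothetical tools.
tool_a = ["bot", "bot", "not", "not", "bot", "not", "not", "not"]
tool_b = ["bot", "not", "not", "bot", "bot", "not", "bot", "not"]

observed, kappa = agreement_stats(tool_a, tool_b)
print(f"agreement={observed:.3f} kappa={kappa:.2f}")  # agreement=0.625 kappa=0.25
```

A low kappa here would not necessarily mean either tool is wrong. If the two tools are built on different definitions of a bot, as described above, some disagreement is expected.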

How to Train an Algorithm

Not only is there no consensus on how to define a bot, but there’s no single clear criterion or signal any researcher can point to that accurately predicts whether an account is a bot. Bot hunters believe that exposing an algorithm to thousands or millions of bot accounts helps a computer detect bot-like behavior. But the objective efficiency of any bot-detection system is muddied by the fact that humans still have to make judgment calls about what data to use to build it.

Take Botometer, for example. Yang says Botometer is trained on tweets from around 20,000 accounts. While some of these accounts self-identify as bots, the majority are manually categorized by Yang and a team of researchers before being crunched by the algorithm. (Menczer says some of the accounts used to train Botometer come from data sets from other peer-reviewed research. “We try to use all the data that we can get our hands on, as long as it comes from a reputable source,” he says.)
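
The effect of those labeling judgment calls can be shown with a toy model. The sketch below is a minimal naive Bayes text classifier written from scratch. It is not how Bot Sentinel or Botometer works, and every tweet, label, and function name in it is invented for illustration. The same hostile tweet is labeled "bot" under one hypothetical labeler's definition and "not" under another's, and the trained model's verdict on a similar new tweet flips accordingly.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train(examples):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return {"label_counts": label_counts, "word_counts": word_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest naive Bayes log-probability."""
    total = sum(model["label_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, count in model["label_counts"].items():
        score = math.log(count / total)  # prior: share of training tweets
        n_words = sum(model["word_counts"][label].values())
        for word in tokenize(text):
            # Add-one (Laplace) smoothing so unseen words do not zero out a label.
            seen = model["word_counts"][label][word]
            score += math.log((seen + 1) / (n_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented training tweets that both hypothetical labelers agree on.
shared = [
    ("click here to win free followers now", "bot"),
    ("free followers click this link now", "bot"),
    ("had a great coffee with my sister today", "not"),
    ("my train is late again today", "not"),
]
# A hostile tweet written by a human: the label depends on the definition used.
disputed = "you are all idiots and liars"

model_toxic_is_bot = train(shared + [(disputed, "bot")])  # "problematic account"
model_toxic_is_not = train(shared + [(disputed, "not")])  # "automated account"

new_tweet = "idiots and liars everywhere"
print(predict(model_toxic_is_bot, new_tweet))  # bot
print(predict(model_toxic_is_not, new_tweet))  # not
```

Nothing about the algorithm changes between the two runs. Only one human label does, which is the sense in which the training data, and the judgment behind it, shapes what the system calls a bot.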
