Michael L. Burnham

Data

PolNLI
Over 200,000 political documents from social media, news outlets, bills, court cases, event descriptions, congressional newsletters, and more. Over 800 unique classification tasks spanning topic classification, event extraction, stance detection, and hate speech detection. High-quality labels validated by both human coders and GPT-4.

PoliStance: Affect
Training data for teaching LLMs political stance classification. This dataset focuses on identifying approval/disapproval of politicians.

PoliStance: Affect Quote Tweets
Training data for teaching LLMs political stance classification. This dataset is a particularly challenging set of quote tweets, each containing two opinions from two authors.

PoliStance: Issue Tweets
Training data for teaching LLMs political stance classification. Tweets from senators expressing policy preferences.

Supreme Court Case Summaries
Summaries of Supreme Court cases combined with issue labels from the Supreme Court Database.

COVID-19 Twitter Data
~1 million tweets about COVID-19 from ~26,000 users, collected between September 2020 and February 2021. Ideology is estimated for each user via Tweetscores.