Bias Detection in Social Media Content
End-to-end bias detection and real-time analytics for large-scale social media data
Bias Detection, Social Media, Large Language Model, Azure, Data Streaming, Data Engineering
Project Snapshot
- Timeframe: Sep 2024 – May 2025
- Category: Research, Machine Learning, Social Media
- Role: Researcher & Machine Learning Engineer
Project Description
- This project investigates gender and racial bias in social media content through an end-to-end machine learning and data engineering pipeline. An Azure-based medallion architecture, deployed with Azure Resource Manager, supports both batch ingestion (the SBIC dataset) and hourly streaming ingestion from the Bluesky API. A BERT-based classifier trained on the batch data achieved an average accuracy of 88.62% and was deployed via Azure Functions and Azure ML endpoints to enable real-time inference on streaming content. For analysis and interpretation, interactive Power BI dashboards were built for both batch and streaming data, featuring six-hour refresh cycles, cross-report linking, and Key Influencers visuals for near real-time demographic and bias insights.
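To make the medallion layering concrete, here is a minimal, illustrative sketch in plain Python of how raw posts (bronze) might be cleaned (silver) and then aggregated into per-hour label counts (gold) for a dashboard. It is not the project's actual code: the record fields (`id`, `text`, `created_at`), the function names, and `placeholder_classifier` are all assumptions made for illustration, and the keyword rule merely stands in for the deployed BERT endpoint.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class SilverPost:
    """A cleaned post; the field names are hypothetical."""
    post_id: str
    text: str
    created_at: datetime


def to_silver(bronze_records: Iterable[dict]) -> List[SilverPost]:
    """Bronze -> silver: drop incomplete or duplicate records, normalize whitespace."""
    seen = set()
    silver = []
    for rec in bronze_records:
        post_id = rec.get("id")
        text = " ".join((rec.get("text") or "").split())
        created = rec.get("created_at")
        if not post_id or not text or not created or post_id in seen:
            continue  # skip records that are incomplete or already ingested
        seen.add(post_id)
        silver.append(SilverPost(post_id, text, datetime.fromisoformat(created)))
    return silver


def to_gold(
    silver: Iterable[SilverPost], classify: Callable[[str], str]
) -> Dict[str, Counter]:
    """Silver -> gold: count predicted labels per UTC hour."""
    gold: Dict[str, Counter] = {}
    for post in silver:
        hour = post.created_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:00Z")
        gold.setdefault(hour, Counter())[classify(post.text)] += 1
    return gold


def placeholder_classifier(text: str) -> str:
    """Hypothetical stand-in for the model endpoint; NOT a real bias detector."""
    return "flagged" if "stereotype" in text.lower() else "not_flagged"


# Example bronze batch (made-up records, timezone-aware ISO 8601 timestamps).
bronze = [
    {"id": "a", "text": "  A  stereotype example ", "created_at": "2025-01-01T10:15:00+00:00"},
    {"id": "a", "text": "  A  stereotype example ", "created_at": "2025-01-01T10:15:00+00:00"},
    {"id": "b", "text": "Neutral post", "created_at": "2025-01-01T10:45:00+00:00"},
    {"id": "c", "text": "", "created_at": "2025-01-01T10:50:00+00:00"},
    {"id": "d", "text": "Another neutral post", "created_at": "2025-01-01T11:05:00+00:00"},
]

silver = to_silver(bronze)
gold = to_gold(silver, placeholder_classifier)
```

In a deployment like the one described, `classify` would presumably wrap a call to the hosted model endpoint, and the gold table would be what the dashboards read. Passing the classifier in as a function keeps the layering logic testable without the model.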
