Bias Detection in Social Media Content

Timeframe: Sep 2024 – May 2025
Category: Research, Machine Learning, Social Media
Role: Researcher & Machine Learning Engineer

End-to-end bias detection and real-time analytics for large-scale social media data

Bias Detection, Social Media, Large Language Model, Azure, Data Streaming, Data Engineering

Project Snapshot

This project investigates gender and racial bias in social media content through an end-to-end machine learning and data engineering pipeline. An Azure-based medallion architecture, deployed with Azure Resource Manager, supports both batch ingestion (the SBIC dataset) and hourly streaming ingestion from the Bluesky API. A BERT-based classifier trained on the batch data achieved an average accuracy of 88.62% and was deployed via Azure Functions and Azure ML endpoints to run inference on incoming streaming content. For analysis and interpretation, interactive Power BI dashboards were built for both the batch and streaming data, with six-hour refresh cycles, cross-report linking, and Key Influencers visuals that surface demographic and bias insights in near real time.
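As a rough illustration of how posts might move through the medallion layers described above, the sketch below cleans raw records (bronze to silver), attaches a label (silver to gold), and aggregates counts per hour for reporting. It is a minimal sketch under stated assumptions, not the project's actual code: the field names (`id`, `text`, `created_at`), the function names, and the label values are all hypothetical, and `keyword_stub` is only a placeholder for the deployed BERT classifier, which in the real pipeline would be called through an Azure ML endpoint.

```python
from collections import Counter
from datetime import datetime
from typing import Callable, Iterable


def to_silver(bronze: Iterable[dict]) -> list[dict]:
    """Bronze -> silver: drop empty or duplicate posts, normalise whitespace."""
    seen_ids = set()
    silver = []
    for post in bronze:
        # Collapse runs of whitespace and trim the ends.
        text = " ".join(str(post.get("text", "")).split())
        post_id = post.get("id")
        if not text or post_id is None or post_id in seen_ids:
            continue
        seen_ids.add(post_id)
        silver.append(
            {"id": post_id, "text": text, "created_at": post["created_at"]}
        )
    return silver


def to_gold(silver: list[dict], classify: Callable[[str], str]) -> list[dict]:
    """Silver -> gold: attach a label produced by the supplied classifier."""
    return [{**post, "label": classify(post["text"])} for post in silver]


def hourly_counts(gold: list[dict]) -> Counter:
    """Count posts per (hour, label) pair, e.g. as input for a dashboard."""
    counts = Counter()
    for post in gold:
        hour = datetime.fromisoformat(post["created_at"]).strftime(
            "%Y-%m-%d %H:00"
        )
        counts[(hour, post["label"])] += 1
    return counts


def keyword_stub(text: str) -> str:
    """Hypothetical stand-in for the classifier; NOT the BERT model."""
    return "flagged" if "stereotype" in text.lower() else "not_flagged"


if __name__ == "__main__":
    # Made-up sample records in the assumed bronze schema.
    bronze = [
        {"id": "a", "text": "  A stereotype   example ",
         "created_at": "2025-01-01T10:15:00"},
        {"id": "a", "text": "A stereotype example",
         "created_at": "2025-01-01T10:15:00"},
        {"id": "c", "text": "Neutral post",
         "created_at": "2025-01-01T11:05:00"},
    ]
    gold = to_gold(to_silver(bronze), keyword_stub)
    for key, n in sorted(hourly_counts(gold).items()):
        print(key, n)
```

Passing the classifier in as a function keeps the layer logic independent of where the model runs, so the same transformation could in principle serve both the batch and the streaming path.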