Publications
2025
OBJECTIVE: Patients with chronic illness share their experiences in online communities, generating rich data on pain management. This study applied natural language processing methods, including large language models, to Reddit discussions from lupus communities to characterize multidimensional pain experiences framed in the biopsychosocial model.
METHODS: We extracted posts made to the r/Lupus and r/LupusSupport subreddits between June 9, 2010, and December 31, 2023. Pain-related posts were identified using a clinically informed pain lexicon. Topic modeling was used to identify thematic patterns, which were then compared to structured summaries generated by an LLM that was instruction fine-tuned using the biopsychosocial model of pain. Two reviewers conducted a content analysis of the LLM-generated summaries, evaluating thematic accuracy and coverage.
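As an illustration of the lexicon-based filtering step, the following is a minimal sketch, assuming whole-word, case-insensitive matching. The terms in `PAIN_LEXICON` and the function names are hypothetical placeholders; they are not the study's clinically informed lexicon or its actual pipeline.

```python
import re

# Hypothetical placeholder terms, NOT the lexicon used in the study.
PAIN_LEXICON = ["pain", "ache", "aching", "sore", "hurts", "joint pain"]


def build_pattern(terms):
    """Compile one case-insensitive, whole-word pattern from the lexicon."""
    # Longest terms first so multi-word phrases are tried before their parts.
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def filter_posts(posts, terms=PAIN_LEXICON):
    """Return only the posts that contain at least one lexicon term."""
    pattern = build_pattern(terms)
    return [post for post in posts if pattern.search(post)]


example_posts = [
    "My joints ache every morning",
    "Any tips for sunscreen?",
    "Painting my room today",
]
pain_posts = filter_posts(example_posts)
```

The word-boundary anchors keep a term such as "pain" from matching inside unrelated words such as "Painting"; a real lexicon would likely also need to handle misspellings and inflected forms.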
RESULTS: The Reddit data comprised 31,785 posts from 10,857 authors. We identified common pain complaints, management strategies, and sociocultural, affective, and nociplastic dimensions of pain. Instruction fine-tuned LLMs produced structured summaries with an average thematic accuracy score of 3.1 out of 4 (kappa = .09) and an average content coverage score of 2.9 out of 4 (kappa = .38). Sociocultural features were present in 123 posts (33.8%), including peer support and validation (n=106) and provider interactions or access issues (n=35). Nociplastic pain was present in 205 posts (56.3%).
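The kappa values above measure inter-reviewer agreement corrected for chance. The abstract does not say which kappa variant was used; the sketch below assumes unweighted Cohen's kappa for two raters, and the ratings in it are invented for illustration only, not study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty item set.")
    n = len(rater_a)
    # Observed agreement: share of items given identical scores.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if the raters scored independently, each keeping
    # their own marginal score distribution.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # undefined when both raters use a single score
    return (observed - expected) / (1 - expected)


# Hypothetical 1-4 ratings, invented for illustration (not study data).
perfect = cohen_kappa([1, 2, 3, 4], [1, 2, 3, 4])
chance_level = cohen_kappa([1, 1, 2, 2], [1, 2, 1, 2])
```

In the second example the raters agree on half of the items, which is exactly what chance predicts from their score distributions, so kappa is 0. For ordinal 1 to 4 scores, a weighted kappa that gives partial credit to near-misses is a common alternative.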
CONCLUSION: NLP methods can be used to extract rich, multidimensional insights about pain experiences from online communities focused on lupus. These approaches highlight the psychological, social, and cultural facets of pain that may be underrepresented in clinical settings, supporting more patient-centered approaches to care in rheumatology.