A multi-task Large Language Model (LLM) pipeline that fine-tunes FinBERT to detect financial sentiment nuances across news, social media, and retail investor forums.
| Metric / Dataset | FinBERT Full | LoRA |
|---|---|---|
| Overall Accuracy | 85.4% | 83.2% |
| Macro F1-Score | 0.83 | 0.80 |
| PhraseBank | 95.9% | 97.1% |
| Twitter | 83.3% | 80.5% |
| FiQA | 81.5% | 72.6% |
This project implements a Multi-Task FinBERT architecture with dual prediction heads:
| Component | Purpose |
|---|---|
| Classification Head | Predicts Negative/Neutral/Positive for news & social media |
| Regression Head | Predicts continuous sentiment scores for forum discussions |
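The dual-head design above can be sketched as a shared encoder feeding two task-specific linear layers. The class and method names here are hypothetical, and a tiny randomly initialised BERT stands in for FinBERT so the example runs without downloading weights.

```python
import torch
import torch.nn as nn
from transformers import BertConfig, BertModel

class MultiTaskSentimentModel(nn.Module):
    """Shared encoder with two task-specific heads (illustrative sketch)."""

    def __init__(self, encoder: BertModel, num_classes: int = 3):
        super().__init__()
        self.encoder = encoder
        hidden = encoder.config.hidden_size
        # Classification head: Negative/Neutral/Positive logits
        self.cls_head = nn.Linear(hidden, num_classes)
        # Regression head: continuous sentiment score
        self.reg_head = nn.Linear(hidden, 1)

    def forward(self, input_ids, attention_mask, task: str):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        pooled = out.last_hidden_state[:, 0]  # [CLS] token representation
        if task == "classification":
            return self.cls_head(pooled)
        return self.reg_head(pooled).squeeze(-1)

# Tiny random encoder in place of FinBERT, so the sketch is self-contained.
config = BertConfig(hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64,
                    vocab_size=100)
model = MultiTaskSentimentModel(BertModel(config))

ids = torch.randint(0, 100, (2, 8))
mask = torch.ones_like(ids)
logits = model(ids, mask, task="classification")  # shape (2, 3)
scores = model(ids, mask, task="regression")      # shape (2,)
```

Sharing the encoder lets the two tasks regularise each other, which is where the multi-task gains reported below come from.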
Why Multi-Task FinBERT: On the same data, the multi-task model outperformed a single-task FinBERT by +4.4% accuracy overall, with the largest gain on Twitter (+6.1%), where capturing sentiment nuance is critical.
We benchmarked multiple models (BERT-Base, DistilBERT, FinBERT) and training approaches (Full Fine-Tuning, LoRA). Training and evaluation draw on three datasets:
| Dataset | Type | Purpose |
|---|---|---|
| Financial PhraseBank | Professional News | Formal financial language |
| Twitter Financial News | Social Media | Short-form, informal commentary |
| FiQA Sentiment | Forum Discussions | Regression target for sentiment intensity |
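Because the first two datasets carry discrete labels while FiQA provides continuous intensities, training combines a classification loss with a regression loss. A minimal sketch of such a joint objective is below; equal task weighting is an assumption, and the project may weight or sample the tasks differently.

```python
import torch
import torch.nn as nn

ce = nn.CrossEntropyLoss()  # classification head: news & social media labels
mse = nn.MSELoss()          # regression head: forum sentiment intensity

# Dummy batch of 4 examples with 3 sentiment classes.
cls_logits = torch.randn(4, 3)
cls_labels = torch.tensor([0, 1, 2, 1])
reg_preds = torch.randn(4)
reg_targets = torch.rand(4)

# Equal weighting of the two tasks is an illustrative assumption.
loss = ce(cls_logits, cls_labels) + mse(reg_preds, reg_targets)
print(loss.item())
```

In a real training loop, each batch would route to the head matching its source dataset before the losses are accumulated.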
This project is being developed in three phases.
The project is built with PyTorch and HuggingFace Transformers. It includes a custom CLI for training and evaluation.
```bash
# Clone and setup
git clone https://github.com/pmatorras/financial-sentiment-llm.git
cd financial-sentiment-llm
python -m venv .venv
source .venv/bin/activate
pip install -e .

# Train the multi-task model
python -m finsentiment train

# Evaluate performance
python -m finsentiment evaluate

# Run the demo locally
pip install -e ".[demo]"
python app.py  # Opens at localhost:7860
```
For advanced configuration and flags, see the GitHub Repository.