This repository contains the codebase for the paper "AI Realtor: Towards Grounded Persuasive Language Generation for Automated Copywriting".
If you use this code as part of any published research, please acknowledge the following paper:
```bibtex
@article{wu2025grounded,
  title={AI Realtor: Towards Grounded Persuasive Language Generation for Automated Copywriting},
  author={Wu, Jibang and Yang, Chenghao and Wu, Yi and Mahns, Simon and Wang, Chaoqi and Zhu, Hao and Fang, Fei and Xu, Haifeng},
  journal={arXiv preprint arXiv:2502.16810},
  year={2025}
}
```

The datasets used in this research are available at:
- User Preference Data: https://huggingface.co/datasets/Sigma-Lab/AI_Realtor_User_Preference_Anonymized
- Listing Data: https://huggingface.co/datasets/Sigma-Lab/AI_Realtor_Listing_Data
Important: Users must agree to the license terms before accessing the datasets.
This project is intended only for educational and research purposes, not for commercial purposes.
Reasonable efforts have been made to process the data and remove or anonymize Personally Identifiable Information (PII). However, the complete absence of PII cannot be guaranteed. The User agrees to handle the Dataset with care and is solely responsible for:
- Ensuring their use of the Dataset complies with all applicable privacy laws and regulations (e.g., GDPR, CCPA).
- Any consequences arising from the use of any PII that may remain within the Dataset.
- Not attempting to re-identify any individuals from the anonymized data.
```bash
pip install -r requirements.txt
```

`requirements.txt` includes GPU/local-LLM dependencies such as `vllm`, which may require a CUDA-enabled Linux environment. For the credential-free artifact smoke test on a CPU-only machine, the following smaller dependency set is sufficient:

```bash
pip install datasets pandas numpy matplotlib seaborn scipy
```

Only scripts that call hosted LLM APIs require credentials. The smoke test below does not need an OpenAI key. If you have a key, set it as an environment variable rather than editing source files:
```bash
export OPENAI_API_KEY="your-openai-api-key"
export OPENAI_ORG_ID="your-openai-org-id"  # optional
```

On Windows PowerShell:
```powershell
$env:OPENAI_API_KEY = "your-openai-api-key"
$env:OPENAI_ORG_ID = "your-openai-org-id"  # optional
```

The Elasticsearch demos read optional connection settings from:
```bash
export ELASTICSEARCH_URL="https://localhost:9200/"
export ELASTICSEARCH_USERNAME="elastic"
export ELASTICSEARCH_PASSWORD="your-password"
```
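If you want to confirm these settings before running the Elasticsearch demos, a minimal connectivity check is sketched below. It assumes the demos use the official `elasticsearch` Python client and that the variable names above are the only configuration needed; it is not part of the repository's demo scripts.

```python
import os

from elasticsearch import Elasticsearch  # assumes the elasticsearch-py client is installed

# Read the optional connection settings documented above; fall back to local defaults.
url = os.environ.get("ELASTICSEARCH_URL", "https://localhost:9200/")
username = os.environ.get("ELASTICSEARCH_USERNAME", "elastic")
password = os.environ.get("ELASTICSEARCH_PASSWORD", "")

# Hypothetical connectivity check; verify_certs=False only because a local
# instance typically uses a self-signed certificate.
client = Elasticsearch(url, basic_auth=(username, password), verify_certs=False)
print("Elasticsearch reachable:", client.ping())
```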
python -c "from utils import get_original_all_features_data; get_original_all_features_data()"This will download the listing data from Hugging Face and save it to ./data/ai_realtor_listing_data.json.
Some research scripts also expect additional governed or derived artifacts that are not generated by the smoke test:
- `data/extracted_features.jsonl`: highlight-feature annotations used by `get_highlight_data()` and highlight-model prompting/evaluation scripts.
- `responses_latest.json`: anonymized user preference responses used by user-simulation and hallucination-detection scripts.
- `ratings.pkl`: the governed paper Elo artifact. The public smoke test can use `ratings.synthetic.pkl` instead.
The public Hugging Face user-preference dataset can be used to reconstruct the user-simulation input after accepting its license terms. If a script references one of the filenames above, place the corresponding file at that path or adjust the script argument where one is available.
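A quick way to see which of these inputs are already in place is to check for the files before launching a research script. This is a minimal sketch using only the paths listed above; whether a given script accepts an alternative path is script-specific.

```python
from pathlib import Path

# Governed or derived artifacts named above; none are produced by the smoke test.
expected_artifacts = [
    "data/extracted_features.jsonl",   # highlight-feature annotations
    "responses_latest.json",           # anonymized user-preference responses
    "ratings.pkl",                     # governed paper Elo artifact
    "ratings.synthetic.pkl",           # non-sensitive substitute for smoke testing
]

for name in expected_artifacts:
    status = "found" if Path(name).exists() else "missing"
    print(f"{status:>7}  {name}")
```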
The following commands exercise the public, non-API artifact path:
python -c "from utils import get_original_all_features_data; get_original_all_features_data()"
python benchmark/win_rate_plot.py
python benchmark/generate_synthetic_ratings.py --output ratings.synthetic.pkl
python benchmark/elo_plot.py --ratings-pkl ratings.synthetic.pklExpected outputs:
- `data/ai_realtor_listing_data.json` with 1,883 listing records.
- `comparison_win_rates_improved.pdf` from `benchmark/win_rate_plot.py`.
- `ratings.synthetic.pkl`, a non-sensitive ratings file used only for smoke testing.
- `elo_ratings_grouped.pdf` from `benchmark/elo_plot.py`.
The original `ratings.pkl` used for the paper's Elo visualization is not included in the public repository because it is derived from privacy/ethics-sensitive evaluation artifacts. If you have governed access to that file, place it at `ratings.pkl` or pass its path with `python benchmark/elo_plot.py --ratings-pkl path/to/ratings.pkl`.
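If you want a quick look at the synthetic ratings artifact before plotting, the sketch below loads it and reports its type and size. It assumes the file is a plain pickle readable without additional project imports; its internal structure is not documented in this README, so only generic information is printed.

```python
import pickle

# Load the synthetic ratings artifact produced by generate_synthetic_ratings.py.
# Assumption: a plain pickle that does not require project-specific classes.
with open("ratings.synthetic.pkl", "rb") as f:
    ratings = pickle.load(f)

print(type(ratings))
try:
    print(f"{len(ratings)} entries")
except TypeError:
    print("Object has no length; inspect it interactively.")
```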
Reviewers with an OpenAI key can additionally verify hosted-model paths. These commands make API calls and may incur cost, latency, and rate-limit delays. They should be treated as optional reproduction checks, not as part of the default smoke test.
First confirm the key is visible:
python -c "import os; assert os.environ.get('OPENAI_API_KEY'), 'OPENAI_API_KEY is not set'"A small end-to-end API sanity check is:
python rag_agents/preference_summary_from_ranking_demo.pyExpected outputs:
- Printed preference summaries for the four built-in example users.
- `preference_analysis_responses_binary_feedback.pkl`.
- `preference_analysis_responses_binary_feedback.csv`.
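Once the demo finishes, the CSV output can be inspected quickly with pandas. This is a minimal sketch; the column layout is not documented in this README, so it only reports the shape, column names, and first rows.

```python
import pandas as pd

# Output written by rag_agents/preference_summary_from_ranking_demo.py (per the list above).
df = pd.read_csv("preference_analysis_responses_binary_feedback.csv")

# Column names are not documented here, so just report what the demo produced.
print(df.shape)
print(list(df.columns))
print(df.head())
```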
To reproduce the OpenAI-based highlight prompting baseline, prepare `data/ai_realtor_listing_data.json` and `data/extracted_features.jsonl`, then run:
```bash
python highlight_model/prompt_baseline_gpt4.py --model gpt-4o
```

Expected outputs are checkpoint files under `prompting_baseline_outputs/gpt-4o/`, named like `highlight_model_prompting_gpt4_output_0.pt`. The script processes 10 batches and skips already-existing batch outputs, so it can be resumed.
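Because the script skips batches whose checkpoint already exists, you can see where a resumed run will pick up by listing the checkpoints. The sketch below assumes the ten batch files follow the naming pattern above with indices 0 through 9; adjust it if the script uses a different count or pattern.

```python
from pathlib import Path

# Checkpoint directory and naming pattern described above (assumed: batch indices 0-9).
out_dir = Path("prompting_baseline_outputs/gpt-4o")
done = sorted(
    int(p.stem.rsplit("_", 1)[-1])
    for p in out_dir.glob("highlight_model_prompting_gpt4_output_*.pt")
)

print(f"Completed batches: {done}")
print(f"A resumed run would still need: {sorted(set(range(10)) - set(done))}")
```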
To reproduce the OpenAI batch user-simulation path, prepare `responses_latest.json`, then run:
```bash
python user_simulation/predicting_preference_batch_api.py \
    --data responses_latest.json \
    --model_name gpt-4o-mini \
    --exp_name naive_few_shot \
    --eval_mode online
```

Expected behavior:
- The first run creates batch input files, submits OpenAI Batch API jobs, and writes checkpoints under `responses_latest_batch_api/`.
- Later runs poll existing batch jobs and download completed results.
- Once all batches complete, the script saves `batch_scores.pt...`, `batch_accuracy.pt...`, `accuracy_histogram.pdf`, and `shotwise_accuracy.pdf`.
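Between runs, you can check the state of the submitted jobs directly against the Batch API. This standalone sketch uses the official `openai` Python client (v1+) to list recent batch jobs and their statuses; it is not part of the repository's script, which does its own polling.

```python
from openai import OpenAI  # requires OPENAI_API_KEY in the environment

client = OpenAI()

# List recent Batch API jobs and their statuses; the script above polls these
# itself, so this is only a convenience for a quick look between runs.
for batch in client.batches.list(limit=10):
    print(batch.id, batch.status, batch.request_counts)
```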
```
├── benchmark/                # Evaluation and benchmarking scripts
├── hallucination_detection/  # Hallucination detection and evaluation
├── highlight_model/          # Highlight model training and inference
├── rag_agents/               # RAG-based agent implementations
├── user_simulation/          # User preference simulation and prediction
├── const.py                  # Constants and feature mappings
├── utils.py                  # Utility functions
└── requirements.txt          # Python dependencies
```
- `const.py`: Contains the desired feature names and mappings from original features to standardized ones.
- `utils.py`: Utility functions for data processing, feature normalization, and data loading.
- `user_simulation/`: Scripts for predicting user preferences and simulating user behavior.
- `highlight_model/`: Training and inference scripts for the highlight model that identifies important features.
- `rag_agents/`: Retrieval-Augmented Generation agents for generating persuasive real estate descriptions.
- `benchmark/`: Scripts for evaluating model performance using Elo ratings and win rates.
- `hallucination_detection/`: Tools for detecting and evaluating hallucination in generated content.
```python
from utils import get_original_all_features_data, get_highlight_data

# Load the listing data; run this first to gather the data needed by the rest of the project code.
all_features = get_original_all_features_data()
```

```bash
# Generate the win-rate plot
python benchmark/win_rate_plot.py
# Generate the Elo plot from a governed ratings artifact
python benchmark/elo_plot.py --ratings-pkl ratings.pkl
# Generate the Elo plot from a non-sensitive synthetic ratings artifact
python benchmark/generate_synthetic_ratings.py --output ratings.synthetic.pkl
python benchmark/elo_plot.py --ratings-pkl ratings.synthetic.pkl
```

The main dependencies include:
- PyTorch
- Transformers
- OpenAI
- Datasets
- Pandas
- NumPy
- Matplotlib
- And many others (see `requirements.txt` for the complete list)
This codebase is for research purposes. If you find issues or have suggestions, please open an issue or contact the authors.
For questions about this research, please refer to the paper or contact the authors directly.