yangalan123/AI-Realtor-Codebase

AI Realtor: Towards Grounded Persuasive Language Generation for Automated Copywriting

This repository contains the codebase for the paper "AI Realtor: Towards Grounded Persuasive Language Generation for Automated Copywriting".

Citation

If you use this code as part of any published research, please acknowledge the following paper:

@article{wu2025grounded,
  title={AI Realtor: Towards Grounded Persuasive Language Generation for Automated Copywriting},
  author={Wu, Jibang and Yang, Chenghao and Wu, Yi and Mahns, Simon and Wang, Chaoqi and Zhu, Hao and Fang, Fei and Xu, Haifeng},
  journal={arXiv preprint arXiv:2502.16810},
  year={2025}
}

Data Release

The datasets used in this research are released on Hugging Face; the listing data and the public user-preference data can be downloaded as described under Setup Instructions below.

Important: Users must agree to the license terms before accessing the datasets.

License and Usage

This project is intended solely for educational and research use, not for commercial purposes.

Privacy Disclaimer

Reasonable efforts have been made to process the data and remove or anonymize Personally Identifiable Information (PII). However, the complete absence of PII cannot be guaranteed. The User agrees to handle the Dataset with care and is solely responsible for:

  • Ensuring their use of the Dataset complies with all applicable privacy laws and regulations (e.g., GDPR, CCPA).
  • Any consequences arising from the use of any PII that may remain within the Dataset.
  • Not attempting to re-identify any individuals from the anonymized data.

Setup Instructions

1. Install Dependencies

pip install -r requirements.txt

requirements.txt includes GPU/local-LLM dependencies such as vllm, which may require a CUDA-enabled Linux environment. For the credential-free artifact smoke test on a CPU-only machine, the following smaller dependency set is sufficient:

pip install datasets pandas numpy matplotlib seaborn scipy

2. Configure API Keys

Only scripts that call hosted LLM APIs require credentials. The smoke test below does not need an OpenAI key. If you have a key, set it as an environment variable rather than editing source files:

export OPENAI_API_KEY="your-openai-api-key"
export OPENAI_ORG_ID="your-openai-org-id"  # optional

On Windows PowerShell:

$env:OPENAI_API_KEY = "your-openai-api-key"
$env:OPENAI_ORG_ID = "your-openai-org-id"  # optional

The Elasticsearch demos read optional connection settings from:

export ELASTICSEARCH_URL="https://localhost:9200/"
export ELASTICSEARCH_USERNAME="elastic"
export ELASTICSEARCH_PASSWORD="your-password"
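Because these settings are optional, a demo can fall back to defaults when they are unset. The sketch below shows one way to collect them; the variable names come from the exports above, while the helper itself and its defaults are illustrative, not code from the repository:

```python
import os

def load_es_settings(env=None):
    """Gather optional Elasticsearch settings with fallbacks.

    The variable names mirror the exports above; the default URL is the one
    shown in the example. Illustrative helper, not part of the repo's demos.
    """
    env = os.environ if env is None else env
    return {
        "url": env.get("ELASTICSEARCH_URL", "https://localhost:9200/"),
        "username": env.get("ELASTICSEARCH_USERNAME"),
        "password": env.get("ELASTICSEARCH_PASSWORD"),
    }
```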

3. Prepare Data Artifacts

Download the listing data with:

python -c "from utils import get_original_all_features_data; get_original_all_features_data()"

This will download the listing data from Hugging Face and save it to ./data/ai_realtor_listing_data.json.
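To confirm the download succeeded, the record count can be checked with a short helper. The path is the one written above; the assumption that the JSON top level is a list of records is mine:

```python
import json
from pathlib import Path

def count_listings(path="data/ai_realtor_listing_data.json"):
    """Return the number of listing records in the downloaded file.

    Assumes the JSON top level is a list of records; the credential-free
    smoke test expects 1,883 of them.
    """
    with Path(path).open(encoding="utf-8") as f:
        return len(json.load(f))
```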

Some research scripts also expect additional governed or derived artifacts that are not generated by the smoke test:

  • data/extracted_features.jsonl: highlight-feature annotations used by get_highlight_data() and highlight-model prompting/evaluation scripts.
  • responses_latest.json: anonymized user preference responses used by user-simulation and hallucination-detection scripts.
  • ratings.pkl: the governed paper Elo artifact. The public smoke test can use ratings.synthetic.pkl instead.

The public Hugging Face user-preference dataset can be used to reconstruct the user-simulation input after accepting its license terms. If a script references one of the filenames above, place the corresponding file at that path or adjust the script argument where one is available.
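Before running a research script, it can help to check which of these artifacts are already in place. The paths are the ones listed above; the helper itself is an illustrative sketch:

```python
from pathlib import Path

# Paths taken from the artifact list above; adjust for your checkout.
EXPECTED_ARTIFACTS = [
    "data/ai_realtor_listing_data.json",
    "data/extracted_features.jsonl",
    "responses_latest.json",
    "ratings.pkl",
]

def missing_artifacts(paths, root="."):
    """Return the subset of expected artifact paths not present under root."""
    base = Path(root)
    return [p for p in paths if not (base / p).exists()]
```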

4. Credential-Free Artifact Smoke Test

The following commands exercise the public, non-API artifact path:

python -c "from utils import get_original_all_features_data; get_original_all_features_data()"
python benchmark/win_rate_plot.py
python benchmark/generate_synthetic_ratings.py --output ratings.synthetic.pkl
python benchmark/elo_plot.py --ratings-pkl ratings.synthetic.pkl

Expected outputs:

  • data/ai_realtor_listing_data.json with 1,883 listing records.
  • comparison_win_rates_improved.pdf from benchmark/win_rate_plot.py.
  • ratings.synthetic.pkl, a non-sensitive ratings file used only for smoke testing.
  • elo_ratings_grouped.pdf from benchmark/elo_plot.py.

The original ratings.pkl used for the paper's Elo visualization is not included in the public repository because it is derived from privacy/ethics-sensitive evaluation artifacts. If you have governed access to that file, place it at ratings.pkl or pass its path with python benchmark/elo_plot.py --ratings-pkl path/to/ratings.pkl.
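Either ratings file can be inspected directly before plotting. The plain-pickle format is an assumption based on the .pkl extension; the internal structure of the ratings object is not documented here:

```python
import pickle
from pathlib import Path

def load_ratings(path="ratings.synthetic.pkl"):
    """Load a ratings artifact from disk.

    Plain pickle is assumed from the .pkl extension; the structure of the
    loaded object is whatever benchmark/generate_synthetic_ratings.py wrote.
    """
    with Path(path).open("rb") as f:
        return pickle.load(f)
```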

5. Optional OpenAI-Backed Checks

Reviewers with an OpenAI key can additionally verify hosted-model paths. These commands make API calls and may incur cost, latency, and rate-limit delays. They should be treated as optional reproduction checks, not as part of the default smoke test.

First confirm the key is visible:

python -c "import os; assert os.environ.get('OPENAI_API_KEY'), 'OPENAI_API_KEY is not set'"

A small end-to-end API sanity check is:

python rag_agents/preference_summary_from_ranking_demo.py

Expected outputs:

  • Printed preference summaries for the four built-in example users.
  • preference_analysis_responses_binary_feedback.pkl.
  • preference_analysis_responses_binary_feedback.csv.

To reproduce the OpenAI-based highlight prompting baseline, prepare data/ai_realtor_listing_data.json and data/extracted_features.jsonl, then run:

python highlight_model/prompt_baseline_gpt4.py --model gpt-4o

Expected outputs are checkpoint files under prompting_baseline_outputs/gpt-4o/, named like highlight_model_prompting_gpt4_output_0.pt. The script processes 10 batches and skips already-existing batch outputs, so it can be resumed.
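The resume behavior described above (skip any batch whose checkpoint already exists) follows a common pattern that can be sketched as follows. The filename template is taken from the example checkpoint name; the helper is illustrative, not the script's actual code:

```python
from pathlib import Path

def pending_batches(out_dir, n_batches=10,
                    template="highlight_model_prompting_gpt4_output_{}.pt"):
    """Return batch indices whose checkpoint file does not yet exist.

    Mirrors the skip-if-present resume behavior described above; the
    template follows the example checkpoint filename.
    """
    out = Path(out_dir)
    return [i for i in range(n_batches)
            if not (out / template.format(i)).exists()]
```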

To reproduce the OpenAI batch user-simulation path, prepare responses_latest.json, then run:

python user_simulation/predicting_preference_batch_api.py \
  --data responses_latest.json \
  --model_name gpt-4o-mini \
  --exp_name naive_few_shot \
  --eval_mode online

Expected behavior:

  • The first run creates batch input files, submits OpenAI Batch API jobs, and writes checkpoints under responses_latest_batch_api/.
  • Later runs poll existing batch jobs and download completed results.
  • Once all batches complete, the script saves batch_scores.pt..., batch_accuracy.pt..., accuracy_histogram.pdf, and shotwise_accuracy.pdf.
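The submit-then-poll lifecycle above can be sketched generically. The loop below takes a caller-supplied status callable instead of calling the OpenAI Batch API directly, so the terminal-state names and polling parameters here are illustrative assumptions, not the script's actual logic:

```python
import time

def poll_until_done(get_status, batch_ids, interval_s=30, max_polls=100):
    """Poll each batch until all report a terminal state.

    get_status(batch_id) -> status string is supplied by the caller (the
    real script queries the OpenAI Batch API). Returns True if everything
    finished within max_polls rounds, False otherwise.
    """
    pending = set(batch_ids)
    for _ in range(max_polls):
        pending = {b for b in pending
                   if get_status(b) not in ("completed", "failed", "expired")}
        if not pending:
            return True
        time.sleep(interval_s)
    return False
```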

Project Structure

├── benchmark/                    # Evaluation and benchmarking scripts
├── hallucination_detection/      # Hallucination detection and evaluation
├── highlight_model/              # Highlight model training and inference
├── rag_agents/                   # RAG-based agent implementations
├── user_simulation/              # User preference simulation and prediction
├── const.py                      # Constants and feature mappings
├── utils.py                      # Utility functions
└── requirements.txt              # Python dependencies

Key Components

Feature Processing

  • const.py: Contains the desired feature names and mappings from original features to standardized ones
  • utils.py: Utility functions for data processing, feature normalization, and data loading

User Simulation

  • user_simulation/: Contains scripts for predicting user preferences and simulating user behavior

Highlight Model

  • highlight_model/: Training and inference scripts for the highlight model that identifies important features

RAG Agents

  • rag_agents/: Retrieval-Augmented Generation agents for generating persuasive real estate descriptions

Evaluation

  • benchmark/: Scripts for evaluating model performance using ELO ratings and win rates
  • hallucination_detection/: Tools for detecting and evaluating hallucination in generated content

Usage Examples

Loading Data

from utils import get_original_all_features_data, get_highlight_data

# Load the listing data; run this first to download the artifacts the rest of the project expects.
all_features = get_original_all_features_data()

# get_highlight_data() additionally requires data/extracted_features.jsonl (see Prepare Data Artifacts above).

Running Visualization

# Generate the win-rate plot
python benchmark/win_rate_plot.py

# Generate the Elo plot from a governed ratings artifact
python benchmark/elo_plot.py --ratings-pkl ratings.pkl

# Generate the Elo plot from a non-sensitive synthetic ratings artifact
python benchmark/generate_synthetic_ratings.py --output ratings.synthetic.pkl
python benchmark/elo_plot.py --ratings-pkl ratings.synthetic.pkl

Requirements

The main dependencies include:

  • PyTorch
  • Transformers
  • OpenAI
  • Datasets
  • Pandas
  • NumPy
  • Matplotlib
  • And others; see requirements.txt for the complete list

Contributing

This codebase is for research purposes. If you find issues or have suggestions, please open an issue or contact the authors.

Contact

For questions about this research, please refer to the paper or contact the authors directly.
